In this case I have data and then a set of pickles created from the data
Nothing in mind, just wanted to know if there was one 🙂
In this case, particularly because of pickle protocol version differences between Python 3.7 and 3.8
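For context, a minimal sketch of the mismatch (file name and payload are made up): Python 3.8 added pickle protocol 5, so pickles written with the newest protocol on 3.8 won't load back on 3.7 - pinning the protocol when dumping sidesteps that.
```python
import pickle

data = {"weights": [0.1, 0.2]}  # placeholder payload

# Python 3.8 introduced protocol 5; dumping with HIGHEST_PROTOCOL on 3.8
# produces files 3.7 cannot read. Pinning to protocol 4 (available since
# Python 3.4) keeps the pickles loadable on both interpreters.
with open("model.pkl", "wb") as f:
    pickle.dump(data, f, protocol=4)
```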
forking and using the latest code fixes the boto issue at least
Think I will have to fork and play around with it 🙂
Yes, I have no experience with Triton - does it do lazy loading? I was wondering how it can handle 10s or 100s of models. If we load balance across a set of these engine containers with, say, 100 models, and all of these models get traffic but the distribution is not even, will each of those engine containers load all 100 models?
I am also not understanding how clearml-serving handles versioning for models in Triton.
Yeah please, if you can share some general active ones to discuss both the algo and engineering sides
Progress - boto3 is added now, but it fails:
For now I'm solving it by updating the git config locally before creating the task
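Roughly like this - a minimal sketch, assuming the workaround is rewriting the local remote URL before Task.init so the recorded repo is clonable (the URL is a placeholder):
```python
import subprocess
from clearml import Task

# Rewrite the local remote before creating the task, so the task's
# recorded repository points somewhere the agent can clone from.
subprocess.run(
    ["git", "config", "--local", "remote.origin.url",
     "https://github.com/my-org/my-repo.git"],  # placeholder URL
    check=True,
)

task = Task.init(project_name="examples", task_name="git-config-workaround")
```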
AgitatedDove14 - are there cases when it tries to skip steps?
I also have a pipelines.yaml which I convert to a pipeline
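Roughly like this, in case it helps - a minimal sketch of the conversion; the YAML schema (name/project/steps keys) is my own convention, not a ClearML format:
```python
import yaml
from clearml import PipelineController

with open("pipelines.yaml") as f:
    spec = yaml.safe_load(f)

pipe = PipelineController(
    name=spec["name"],
    project=spec["project"],
    version=spec.get("version", "1.0.0"),
)

# Each step in the YAML points at an existing task to clone as a step
for step in spec["steps"]:
    pipe.add_step(
        name=step["name"],
        base_task_project=step["task_project"],
        base_task_name=step["task_name"],
        parents=step.get("parents"),
    )

pipe.start()
```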
Anything that is shown in git status as untracked? So ignore anything .gitignored, and maybe a param or config to say "include untracked". Anyway, it's only a nice-to-have feature.
AgitatedDove14 - it does have boto, but the clearml-serving installation and code refer to an older commit hash, and hence the task was not using them - https://github.com/allegroai/clearml-serving/blob/main/clearml_serving/serving_service.py#L217
Will debug a bit more and see what’s up
Would be good to have frequentish releases if possible 🙂
AgitatedDove14 - apologies for the late reply. To give context, this is in a SageMaker notebook which has conda envs.
I use a lifecycle config like this to pip install a package (a .tar.gz downloaded from S3) in a conda env - https://github.com/aws-samples/amazon-sagemaker-notebook-instance-lifecycle-config-samples/blob/master/scripts/install-pip-package-single-environment/on-start.sh
In the notebook I can do things like create experiments and so on. Now the problem is in running the cloned experiment...
that or in clearml.conf or both
From the code, it's supposed to not cache if the task override is different? I also have a task_override that adds a version which changes each run
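To make it concrete - a sketch of what I'm doing, assuming PipelineController step caching; the project/step names are made up, and using `script.version_num` as the override is just my example:
```python
import time
from clearml import PipelineController

pipe = PipelineController(name="demo-pipeline", project="demo", version="1.0.0")

pipe.add_step(
    name="train",
    base_task_project="demo",
    base_task_name="train_base",
    cache_executed_step=True,
    # This override changes on every run, so my reading of the code is
    # that the cached step should be invalidated each time.
    task_overrides={"script.version_num": str(int(time.time()))},
)
```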
Got the engine running.
curl <serving-engine-ip>:8000/v2/models/keras_mnist/versions/1
What’s the serving-engine-ip supposed to be?
I am essentially creating an EphemeralDataset abstraction with a controlled lifecycle, such that the data is removed after a day in experiments. Additionally and optionally, data created during a step in a pipeline can be cleared once the pipeline completes
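Something along these lines - a minimal sketch of the idea on top of clearml.Dataset; the names (EphemeralDataset, ttl_hours, cleanup) are mine, and the expiry tag is just one way a scheduled job could find stale datasets:
```python
from datetime import datetime, timedelta, timezone
from clearml import Dataset

class EphemeralDataset:
    """A dataset that is meant to be deleted after a fixed TTL."""

    def __init__(self, project: str, name: str, ttl_hours: int = 24):
        self.dataset = Dataset.create(dataset_project=project, dataset_name=name)
        # Tag the expiry so a periodic cleanup job can find stale datasets
        expires = datetime.now(timezone.utc) + timedelta(hours=ttl_hours)
        self.dataset.add_tags([f"expires={expires.isoformat()}"])

    def add_and_upload(self, path: str):
        self.dataset.add_files(path)
        self.dataset.upload()
        self.dataset.finalize()

    def cleanup(self):
        # Called explicitly, e.g. once the pipeline completes
        Dataset.delete(dataset_id=self.dataset.id)
```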
That's cool AgitatedDove14 , will try it out and pester you a bit more. 🙂
AgitatedDove14 - either, based on the scenario
That makes sense - one part I am confused on is: the Triton engine container hosts all the models, right? Do we launch multiple groups of these in different projects?