As in: run a training experiment, then a test/validation experiment to choose the best model, etc., and also have a human validate sample results via annotations, all as part of a pipeline
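Something like this minimal sketch is what I have in mind (using PipelineController; the project and task names are just placeholders, and I'm assuming the train/validate/annotate steps already exist as ClearML tasks):

```python
from clearml.automation import PipelineController

pipe = PipelineController(
    name="train-validate-annotate",  # hypothetical pipeline name
    project="my_project",            # hypothetical project
    version="0.0.1",
)
pipe.add_step(
    name="train",
    base_task_project="my_project",
    base_task_name="training experiment",    # existing training task
)
pipe.add_step(
    name="validate",
    parents=["train"],
    base_task_project="my_project",
    base_task_name="validation experiment",  # chooses the best model
)
pipe.add_step(
    name="human_review",
    parents=["validate"],
    base_task_project="my_project",
    base_task_name="annotation review",      # human validates sample results
)
pipe.start()
```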
I would prefer controlled behavior rather than whatever version happens to be available being used. Here we triggered a bunch of jobs that all went fine, the evaluations were fine too, and then when we triggered an inference deploy it failed
Looks like Task.current_task() is indeed None in this case. A bit of the log is below, where I print(Task.current_task()) as the first step in the script:
Environment setup completed successfully
Starting Task Execution: None
It completed after hitting the max_job limit (10)
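For reference, this is roughly how the optimizer is set up (a sketch, not my exact code; I'm assuming the max_job limit I hit maps to the total_max_jobs argument of HyperParameterOptimizer, and the metric names and base task id are placeholders):

```python
from clearml.automation import (
    HyperParameterOptimizer,
    UniformParameterRange,
    RandomSearch,
)

optimizer = HyperParameterOptimizer(
    base_task_id="<base task id>",  # task cloned for each trial
    hyper_parameters=[
        UniformParameterRange("General/learning_rate",
                              min_value=1e-4, max_value=1e-1),
    ],
    objective_metric_title="validation",
    objective_metric_series="loss",
    objective_metric_sign="min",
    optimizer_class=RandomSearch,
    max_number_of_concurrent_tasks=2,
    total_max_jobs=10,  # the limit I hit above
)
optimizer.start()
optimizer.wait()   # blocks until all jobs finish or the limit is reached
optimizer.stop()
```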
irrespective of what I actually have installed when running the script?
Ok, I did a pip install -r requirements.txt and NOW it picks them up correctly
Can you point me at the relevant code in ClearML for the autoconnect, so that I can understand exactly what's happening?
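For context, the autoconnect behavior I mean is what Task.init enables by default. If I understand correctly it can be controlled like this (a sketch; the flag values and names are just examples, not my actual setup):

```python
from clearml import Task

# Requirements can be pinned explicitly instead of auto-detected.
# NOTE: add_requirements must be called before Task.init.
Task.add_requirements("requirements.txt")

# Task.init auto-connects frameworks, argparse, etc. by default;
# these flags let you opt out selectively (values are just examples).
task = Task.init(
    project_name="my_project",           # placeholder
    task_name="controlled autoconnect",  # placeholder
    auto_connect_frameworks={"matplotlib": False},
    auto_connect_arg_parser=True,
)
```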
Nope, that doesn’t seem to be it. Will debug a bit more.
Essentially: 1. run a task normally, 2. clone it, 3. edit it to contain only those two lines.
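(I did steps 2-3 via the UI, but for the record a programmatic equivalent would be something like this sketch, with placeholder names:)

```python
from clearml import Task

# Step 1 already ran normally; fetch it by name (placeholders).
original = Task.get_task(project_name="my_project", task_name="my task")

# Step 2: clone it.
cloned = Task.clone(source_task=original, name="my task (clone)")

# Step 3: in the UI I edited the script down to the two-line script;
# then enqueue the clone for an agent to pick up.
Task.enqueue(cloned, queue_name="default")
```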
Question - since this is a task, why is Task.current_task() None?
But that itself is running in a task right?
pipeline code itself is pretty standard
And anyway, once a model is published it can't be updated, right? Which means there will be at least multiple published model entries for the same model over time?
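So I'd expect to end up enumerating them, something like this sketch (Model.query_models is the call I'd try; project/model names are placeholders):

```python
from clearml import Model

# List all published entries of the "same" model accumulated over time.
published = Model.query_models(
    project_name="my_project",  # placeholder
    model_name="my_model",      # placeholder
    only_published=True,
)
for m in published:
    print(m.id, m.name)
```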
BTW AgitatedDove14 - that 1.0.0 is the Helm chart version, not necessarily the version of the app the chart deploys
Just that the task itself is still in the Running state
Planning to exec into the container and run it in a loop and see what happens
How can a task running like this know its own project name?
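In case it helps anyone else, once you do have the task object, something like this should work (a sketch; I believe the agent sets CLEARML_TASK_ID in the environment, so that's the fallback I'd try when current_task() is None):

```python
import os
from clearml import Task

task = Task.current_task()
if task is None:
    # Assumption: when running under an agent, CLEARML_TASK_ID is set.
    task = Task.get_task(task_id=os.environ["CLEARML_TASK_ID"])

print(task.get_project_name())  # the task's own project name
```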
It’s essentially this now:
from clearml import Task
print(Task.current_task())
Yes. I have no experience with Triton; does it do lazy loading? I was wondering how it can handle tens or hundreds of models. If we load-balance across a set of these engine containers with, say, 100 models, and all of those models get traffic but the distribution is not even, will each of those engine containers end up loading all 100 models?
Thanks AgitatedDove14. I've removed the Task.current_task() usage for now; I think I can do without it
Ok, so it's not implemented; that's what I was asking
Yeah, I meant the 1.0+ release, as I don't think the chart has been updated
Ok, but it doesn't work for me. Can you or AgitatedDove14 link me to the relevant code so that I can see what's wrong?
The Optimizer task is taking a lot of time to complete. Is it doing something here: