Also, how do pipelines compare here?
Pipelines are a type of Task, so like Tasks you can clone and enqueue them, or set them as the target of a trigger.
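For example, a minimal sketch of cloning and enqueuing a pipeline controller (the task ID and queue name are placeholders):
```python
from clearml import Task

# grab an existing pipeline controller Task, clone it, and push it into a queue
pipeline_task = Task.get_task(task_id='<pipeline_controller_task_id>')
cloned = Task.clone(source_task=pipeline_task, name='pipeline re-run')
Task.enqueue(cloned, queue_name='default')
```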
the most flexible solution would be to have some way of triggering the execution of a script in the parent task environment,
This is the exact idea of the TriggerScheduler.
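A minimal sketch of that idea (the project, queue names, and trigger condition here are assumptions):
```python
from clearml.automation import TriggerScheduler

trigger = TriggerScheduler(pooling_frequency_minutes=3)
# when a Task in the watched project completes, enqueue the given Task
trigger.add_task_trigger(
    schedule_task_id='<task_id_to_launch>',
    schedule_queue='default',
    trigger_project='<project_to_watch>',
    trigger_on_status=['completed'],
)
# run the trigger logic itself on an agent listening to the services queue
trigger.start_remotely(queue='services')
```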
What am I missing here?
Task.enqueue will execute immediately; I need to execute the task at a specific time.
Oh I see what you mean: trigger -> scheduled (cron-like) -> Task executed.
Is that correct?
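If so, the cron-like part maps to the TaskScheduler; a minimal sketch (the task ID, queue, and schedule are placeholders):
```python
from clearml.automation import TaskScheduler

scheduler = TaskScheduler()
# clone-and-enqueue the given Task every day at 09:30
scheduler.add_task(
    schedule_task_id='<task_id_to_launch>',
    queue='default',
    minute=30,
    hour=9,
)
scheduler.start_remotely(queue='services')
```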
Hi @<1523701601770934272:profile|GiganticMole91>
Do you mean something like GitOps, triggered by a PR / tag etc.?
@<1523701601770934272:profile|GiganticMole91> really nice!
but can we schedule a new task here?
@<1523701260895653888:profile|QuaintJellyfish58> do you mean schedule a Task from the scheduled function? If yes, you can do something similar to what @<1523701601770934272:profile|GiganticMole91> did: create/clone an existing Task, change its arguments, and push it into an execution queue. wdyt?
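A sketch of that pattern with a scheduled function (the task ID, parameter name, and queues are assumptions):
```python
from clearml import Task
from clearml.automation import TaskScheduler

def launch_modified_clone():
    # clone an existing Task, override an argument, and enqueue the clone
    base = Task.get_task(task_id='<base_task_id>')
    cloned = Task.clone(source_task=base)
    cloned.set_parameter('Args/steps', 100)  # hypothetical argument override
    Task.enqueue(cloned, queue_name='default')

scheduler = TaskScheduler()
scheduler.add_task(schedule_function=launch_modified_clone, minute=0, hour=6)
scheduler.start_remotely(queue='services')
```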
Oh I think that I understand what's going on, @<1523701260895653888:profile|QuaintJellyfish58>, let me check how to update the cron scheduler while it is running (I really like this idea, so if this is not already supported I'd like us to add this capability 🙂 )
I can verify the behavior; I think it has to do with the way the subparser was set up.
This was the only way for me to get it to run:
script.py test blah1 blah2 blah3 42
When I passed specific arguments (for example --steps) it ignored them...
Is this how it is intended to be used ?
@<1527459125401751552:profile|CloudyArcticwolf80> what are you seeing in the Args section ?
what exactly is not working ?
and does the code above reproduce the issue/bug? Because this obviously should not happen
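For reference, a minimal sketch of a subparser layout that reproduces this behavior (the argument names and defaults are assumptions): with argparse, options defined on the root parser, such as --steps, must appear before the subcommand, otherwise the subparser rejects them:
```python
import argparse

parser = argparse.ArgumentParser(prog='script.py')
parser.add_argument('--steps', type=int, default=10)  # hypothetical option

subparsers = parser.add_subparsers(dest='command')
test_parser = subparsers.add_parser('test')
test_parser.add_argument('names', nargs=3)   # e.g. blah1 blah2 blah3
test_parser.add_argument('value', type=int)  # e.g. 42

# works:  script.py --steps 5 test blah1 blah2 blah3 42
# fails:  script.py test blah1 blah2 blah3 42 --steps 5
args = parser.parse_args()
print(args)
```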
ReassuredTiger98 that is a good point; at the moment they are designed as "machine level" configs, but we do have built-in support to allow multiple configurations. The technical issue is that we have to read the configuration file before we initialize the Task object, which means we are still not aware of the git root (which I assume is where we could put a configuration file)
BTW: regarding the detect_with_conda_freeze
we hope that this flag is rarely used, as ClearML should auto-detect t...
ReassuredTiger98 All that said, how about opening an issue on GitHub (feature request)? If we get a bit of support from users, we could definitely add it
BTW: you can always set a different config file with an environment variable: CLEARML_CONFIG_FILE="path/to/config/file"
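For example (the config path and script name here are hypothetical):
```
CLEARML_CONFIG_FILE=/path/to/project-a.conf python train.py
```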
Hi AbruptHedgehog21
How can I add S3 credentials for an S3 bucket in example.env for clearml-serving-triton? I need to add the bucket name, keys, and endpoint.
Basically boto (s3) environment variables would just work:
https://clear.ml/docs/latest/docs/clearml_serving/clearml_serving#advanced-setup---s3gsazure-access-optional
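i.e. something like this in example.env (these are the standard boto3 variable names; the values are placeholders, and for a non-AWS endpoint newer boto3 versions also honor AWS_ENDPOINT_URL):
```
AWS_ACCESS_KEY_ID=<access_key>
AWS_SECRET_ACCESS_KEY=<secret_key>
AWS_DEFAULT_REGION=<region>
```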
AbruptHedgehog21 the bucket and the full link are registered on the model object itself, you can see them in the ui, under the models tab. The only thing you actually need to pass inside is the credentials. Make sense?
Just making sure I understand: you want to upload your models with ClearML to the Yandex S3-compatible storage?
How can I get the loaded model in the Preprocess class in ClearML Serving?
ComfortableShark77
You mean your preprocess class needs a python package or is it your own module ?
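If the question is about accessing the model object itself, here is a sketch of a custom Preprocess class along the lines of the examples in the clearml-serving repo (the joblib format and the input key are assumptions): the downloaded model file path is handed to load(), and whatever load() returns is the model used for inference:
```python
from typing import Any, Optional

class Preprocess(object):
    def __init__(self):
        # called once when the endpoint is created, not per request
        pass

    def load(self, local_file_name: str) -> Optional[Any]:
        # local_file_name is the model file downloaded by clearml-serving;
        # return the loaded model object
        import joblib  # assumption: a joblib-serialized model
        return joblib.load(local_file_name)

    def preprocess(self, body: dict, state: dict, collect_custom_statistics_fn=None) -> Any:
        # turn the request body into model input (the key name is hypothetical)
        return body.get("x")
```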
HugeArcticwolf77 you can add --services-mode to the agent, and it will basically keep spinning up Tasks in parallel (unfortunately the open source version does not include a way to limit it to a maximum number of concurrent Tasks)
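For example (the queue name is an assumption):
```
clearml-agent daemon --queue services --services-mode --docker
```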
Hi @<1603198134261911552:profile|ColossalReindeer77>
When you select poetry as the package manager, the agent passes control to poetry, which means poetry needs to decide on the correct torch wheel based on your CUDA version. I do not think poetry can do that automatically, but I do think you can specify an extra index URL for poetry to take the torch wheel from.
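A sketch in pyproject.toml (the CUDA version in the URL and the torch version are assumptions; priority = "explicit" needs Poetry 1.5+):
```toml
# extra package source pointing at the CUDA-specific PyTorch wheel index
[[tool.poetry.source]]
name = "pytorch-cu118"
url = "https://download.pytorch.org/whl/cu118"
priority = "explicit"

[tool.poetry.dependencies]
torch = { version = "^2.1", source = "pytorch-cu118" }
```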
- Triton server does not support saving models off to normal RAM for faster loading/unloading
Correct, the enterprise version also does not support RAM caching
Therefore, currently, we can deploy 100 models when only 5 can be concurrently loaded, but when they are unloaded/loaded (automatically by ClearML), it will take a few seconds because it is being read from the SSD, depending on the size.
Correct, there is also deserialization CPU time (imagine unpickling a 20GB file, this takes ...
Hi @<1523711619815706624:profile|StrangePelican34>
if I am trying to deploy 100 models on a GPU that can handle 5 concurrently,
The main limitation is Triton's ability to dynamically load / unload models. We know Nvidia is adding this capability, but I think it is still not out; once they support it, it should be transparent
I start the TaskScheduler, register a task, and stop the scheduler; how do I restart the TaskScheduler in a way that re-registers the tasks?
if it's aborted, just re-enqueue it?
(it serializes itself and stores its state on the Task object, so when re-launched it will deserialize from the last state)
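i.e. something like this (the task ID is a placeholder; the queue name is an assumption):
```python
from clearml import Task

# re-enqueue the stopped scheduler Task; it picks up its serialized state
scheduler_task = Task.get_task(task_id='<scheduler_task_id>')
Task.enqueue(scheduler_task, queue_name='services')
```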
but then an error message in the web-app pops up
Fetch parents failed
and the Scheduler task disappears
And the Task is still running? What's the clearml python version and webui version ?
Thanks for checking @<1545216070686609408:profile|EnthusiasticCow4> stable release will be out soon
Great, please feel free to share your thoughts here 🙂
Hi @<1545216070686609408:profile|EnthusiasticCow4>
My biggest concern is what happens if the TaskScheduler instance is shutdown.
Good question. Follow-up: what happens to the cron service machine if it fails?!
TaskScheduler instance is shutdown.
And yes you are correct if someone stops the TaskScheduler instance
it is the equivalent of stopping the cron service...
btw: we are working on moving some of the cron/trigger capabilities to the backend, it will not be as flexi...
Hi @<1545216070686609408:profile|EnthusiasticCow4> let me know if this one solves the issue
pip install clearml==1.14.2rc0
Hmm I guess we should better state that, I'll pass it on 🙂
Could you give an example of such configurations ?
(e.g. what would be diff from one to another)
ReassuredTiger98 no, but I might be missing something.
How do you mean project-specific?