Hmm, check if this one works: optimizer._get_child_tasks_ids(parent_task_id=optimizer._job_parent_id or optimizer._base_task_id, order_by=optimizer._objective_metric._get_last_metrics_encode_field(), additional_filters={'page_size': int(top_k), 'page': 0})
If it does, let's PR it as a dedicated function
Hi VexedKangaroo32, there is now an RC with a fix: pip install trains==0.13.4rc0
Let me know if it solved the problem
Hi UpsetTurkey67
"General/my_parameter_name" so that only this part of the configuration will be updated?
I'm assuming this is a hyperparameter, not a configuration object (i.e. task.connect, not task.connect_configuration). If this is the case, then yes.
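To illustrate what updating only "General/my_parameter_name" means, here is a minimal sketch in plain Python. It is not ClearML code: apply_override is a hypothetical helper, and the parameter values are made up. It only assumes that hyperparameters are addressed as "Section/name", as in the question above.

```python
from copy import deepcopy


def apply_override(sections, key, value):
    """Return a copy of `sections` where only the `Section/name` entry is replaced.

    Hypothetical helper for illustration; not part of the clearml package.
    """
    section, _, name = key.partition("/")
    if not name:
        raise ValueError(f"expected 'Section/name', got {key!r}")
    updated = deepcopy(sections)
    # Only the addressed entry changes; sibling parameters are left untouched.
    updated.setdefault(section, {})[name] = value
    return updated


# Made-up parameters, grouped by section.
params = {"General": {"my_parameter_name": 1, "batch_size": 32}}
result = apply_override(params, "General/my_parameter_name", 5)
```

The other entries in the same section (batch_size here) keep their original values, which is the behaviour the question is asking about.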
HelplessCrocodile8 I just tried it, everything seems to work (ubuntu 20.04)
What OS are you using? Which Python version? Is it conda?
(basically python abusing types/casting where the value can be both str/bool on the same argparser argument)
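For reference, this is roughly what such a dual str/bool argument can look like with the standard argparse module. It is only a sketch: the --resume flag and the str_or_bool converter are hypothetical names, not taken from the script being discussed.

```python
import argparse


def str_or_bool(text):
    """Convert common boolean spellings to bool, otherwise keep the raw string."""
    lowered = text.strip().lower()
    if lowered in ("true", "yes", "1"):
        return True
    if lowered in ("false", "no", "0"):
        return False
    return text


def build_parser():
    parser = argparse.ArgumentParser()
    # nargs="?" lets the flag appear with or without a value:
    #   absent            -> default (False)
    #   --resume          -> const (True)
    #   --resume <value>  -> str_or_bool(<value>), either bool or str
    parser.add_argument("--resume", nargs="?", const=True, default=False, type=str_or_bool)
    return parser


parser = build_parser()
flag_only = parser.parse_args(["--resume"]).resume
as_bool = parser.parse_args(["--resume", "false"]).resume
as_path = parser.parse_args(["--resume", "ckpt.pt"]).resume
absent = parser.parse_args([]).resume
```

Because the same argument can come back as either a bool or a str, anything that records or overrides it has to guess the intended type, which is where the casting trouble comes from.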
https://github.com/allegroai/clearml/blob/fcad50b6266f445424a1f1fb361f5a4bc5c7f6a3/examples/optimization/hyper-parameter-optimization/hyper_parameter_optimizer.py#L86
you can just pass the instance of the OptunaOptimizer, you created, and continue the study
As long as you import clearml in the main script, it should work. Regarding the Nvidia container, it should not interfere with any running processes; the only issue is the memory limit. BTW any reason not to spin an agent on a dedicated machine? What is the gpu used for on the clearml server machine?
Thanks MortifiedDove27! Let me see if I can reproduce it. If I understand the difference, it's the Task.init in a nested function, is that it?
BTW what's the hydra version? Python, and OS?
Hmm are you running the clearml-agent on this machine? (This is the orchestration module, it will spin the Tasks and the dockers on the gpus)
Martin, thank you very much for your time and dedication, I really appreciate it
My pleasure
Yes, I have the latest 1.0.5 version now, and it gives the same result in the UI as the previous version I used
Hmm, are you saying the auto hydra connection doesn't work? Is it the folder structure?
When is Task.init called?
See example here:
https://github.com/allegroai/clearml/blob/master/examples/frameworks/hydra/hydra_example.py
MortifiedDove27 did you update to the latest clearml python package?
The agent ip? Generally what's the expected pattern to deploy and scale this for multiple models?
Yes the agent's IP, and with multiple agents, one would probably use k8s for the nodes, then configure ingest. This is the next step for the clearml-serving, adding support for KFServing or manually configuring the ingest. wdyt?
default is clearml data server
Yes, the default is the clearml files server. What did you configure it to? (e.g. should be something like None)
Hi EnviousStarfish54
Color coding on the entire UI is stored per user (I think in your local cookies, but I might be wrong). Anyhow, any title/series combination will have the selected color regardless of the project.
This way you can configure once that loss is red and accuracy is green, etc.
In that case, when you create the Tasks for the step, do not specify any packages/requirements, then the agent will just use the "requirements.txt" from the repository.
If you need, you can also specify them when you create the Task itself, see https://github.com/allegroai/clearml/blob/912f6f5ba2328b26de042de03f02de5802df360f/clearml/task.py#L608
https://github.com/allegroai/clearml/blob/912f6f5ba2328b26de042de03f02de5802df360f/clearml/task.py#L609
Hi ShakyJellyfish91
Check mount default here:
https://github.com/allegroai/clearml-agent/blob/e416ab526ba9fe05daa977b34c9e46b50fb214a0/docs/clearml.conf#L186
Is this what you are after, or do you actually want to change the start up script?
If there was an SSL issue it should log to console right?
correct, also the agent is able to report, so I'm assuming configuration is correct
DepravedBee82 could you try to put the clearml import + Task.init at the top of your code?
I see... In the triton pod, when you run it, it should print the combined pbtxt. Can you print both the before and after ones, so that we can compare?
Thanks NonchalantDeer14!
BTW: how do you submit the multi GPU job? Is it multi-gpu or multi-node?
Nice
GreasyPenguin14 for future reference, the agent part in the clearml.conf is only created when you call clearml-agent init (no need for it for the python SDK). Full default configuration is here:
None
Thank you AttractiveWoodpecker16!
Removing the uncommitted changes so that you can launch it from an agent? Or is it visual only?
Hmm and how would you imagine a transparent integration here (the example looks like a lot of boilerplate code...)
I can't seem to figure out what the names should be from the pytorch example - where did INPUT__0 come from
This is actually the layer name in the model:
https://github.com/allegroai/clearml-serving/blob/4b52103636bc7430d4a6666ee85fd126fcb49e2e/examples/pytorch/train_pytorch_mnist.py#L24
Which is just the default name PyTorch gives the layer
https://discuss.pytorch.org/t/how-to-get-layer-names-in-a-network/134238
It appears I need to convert it into TorchScript?
Yes, this ...