What is the use case of accessing clearml.conf
during runtime?
@<1554638179657584640:profile|GlamorousParrot83> , can you also add the full log?
Not sure, let me know what works 🙂
In that case you can use the model ID. Please note that your suggestion wouldn't necessarily solve the problem, since a task can have two models with the same name in the same project...
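If it helps, a minimal sketch of fetching a model by its ID instead of its name (the ID string is a placeholder):

```
from clearml import InputModel

# Referencing the model by its unique ID avoids the ambiguity of
# two models sharing the same name (placeholder ID shown)
model = InputModel(model_id="<your-model-id>")
print(model.name, model.url)
```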
PunyWoodpecker71 , what do you mean by 'experiment detail of the a project'? Can you give me an example?
Hi @<1831502554446434304:profile|TestyKitten53> , what if you set it to true? Do you get the same errors?
Can you add the full log?
Hi @<1523701260895653888:profile|QuaintJellyfish58> , yes this is correct. You can also set your files_server in clearml.conf to point to an S3 bucket, so that debug samples are saved there as well
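For reference, a minimal clearml.conf sketch (bucket name and credentials are placeholders, adjust to your setup):

```
api {
    # Point the default files server at an S3 bucket so debug samples
    # and uploaded files land there (placeholder bucket/path)
    files_server: "s3://my-bucket/clearml"
}

sdk {
    aws {
        s3 {
            # Assumed credentials section; use your own keys/region
            key: "ACCESS_KEY"
            secret: "SECRET_KEY"
            region: "us-east-1"
        }
    }
}
```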
I think AnxiousSeal95 updates us when there is a new version or release 🙂
The project should have a system tag called 'hidden'. If you remove the tag via the API ( None ) that should solve the issue.
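Since the link above didn't come through, here's a rough sketch with the Python APIClient, assuming projects.update accepts the system_tags field and that you have the project ID at hand (placeholder shown):

```
from clearml.backend_api.session.client import APIClient

client = APIClient()

# Placeholder project ID - copy it from the project's URL in the UI
project_id = "<your-project-id>"

# Clear the 'hidden' system tag so the project shows up again
client.projects.update(project=project_id, system_tags=[])
```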
How was the project turned to hidden?
Hi @<1710827340621156352:profile|HungryFrog27> , what seems to be the issue?
Hi @<1597762318140182528:profile|EnchantingPenguin77> , do you have a code snippet that reproduces this? Where is that API call originating from?
Hi @<1555362936292118528:profile|AdventurousElephant3> , if you clone/reset the task, you can change the logging level to 'debug'
I guess that's a good point, but it's really only applicable if your training is CPU intensive. If your training is GPU intensive, most of the load goes to the GPU, so running on a VM (EC2 instances, for example) shouldn't make much of a difference, but this is worth testing.
I found this article talking about performance
https://blog.equinix.com/blog/2022/01/04/3-reasons-why-you-should-consider-running-containers-on-bare-metal/
But it doesn't really say what the difference in performance is...
@<1523703961872240640:profile|CrookedWalrus33> , you can use the UI as a reference. Open the dev tools (F12) and watch the network tab (filter by XHR).
For example, in the scalars/plots/debug samples tabs, the relevant calls seem to be:
events.get_task_single_value_metrics
events.scalar_metrics_iter_histogram
events.get_task_plots
events.get_task_metrics
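If you'd rather call these from Python than replay the raw requests, a minimal sketch using the SDK's APIClient (the task ID is a placeholder) could look like this:

```
from clearml.backend_api.session.client import APIClient

client = APIClient()
task_id = "<your-task-id>"  # placeholder, copy it from the experiment's URL

# Scalar histogram data, roughly what the Scalars tab requests
scalars = client.events.scalar_metrics_iter_histogram(task=task_id)

# Plot events, roughly what the Plots tab requests
plots = client.events.get_task_plots(task=task_id)
```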
Hi @<1535069219354316800:profile|PerplexedRaccoon19> , I'm not sure I follow - can you please elaborate on what you mean by doing the evaluations within ClearML?
Hi David,
What version of ClearML server & SDK are you using?
In compare view you need to switch to 'Last Values' to see these scalars. Please see screenshot
Since the "grand" dataset will inherit from the child versions, you wouldn't need to duplicate the data
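As a rough sketch (project/dataset names and IDs are placeholders), creating the "grand" dataset on top of the child versions could look like:

```
from clearml import Dataset

# Placeholder child dataset IDs
child_ids = ["<child-dataset-id-1>", "<child-dataset-id-2>"]

# The new dataset inherits the files of its parents, so the child
# data is referenced rather than copied
grand = Dataset.create(
    dataset_name="grand-dataset",
    dataset_project="datasets/my-project",
    parent_datasets=child_ids,
)

# Add only new/changed files, then upload and finalize
grand.add_files(path="./new_data")
grand.upload()
grand.finalize()
```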
Also in applications I see an option for subnet ID & security group
Hi @<1541954607595393024:profile|BattyCrocodile47> , how does ClearML react when you run the scripts this way? The repository is logged as usual?
Hi @<1570220844972511232:profile|ObnoxiousBluewhale25> , you can use the output_uri parameter in Task.init to set a predetermined output destination for models and artifacts
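For example (the project/task names and the S3 path are placeholders):

```
from clearml import Task

# output_uri redirects model checkpoints and artifacts to the given
# destination instead of the default files server
task = Task.init(
    project_name="examples",
    task_name="train-with-remote-output",
    output_uri="s3://my-bucket/clearml-output",
)
```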
Hi RoundMosquito25 , where is this error coming from? API server?
DepressedChimpanzee34 , Hi 🙂
Let's break this one down:
In the 'queues & workers' window, if you switch to 'queues', you can see all the workers assigned to a specific queue.
In the workers window, you can see which workers are active and which are not. Is this enough or do you think something else is needed?
You can see the resources used by each worker in the workers window. Is that what you mean?
You can already do that! Simply drag and drop experiments in the queue window
I'm...
TimelyPenguin76 , what do you think?