Hi, if I want to run many hyperparameter tuning trials in parallel (e.g. 50 trials concurrently), what's the best way to do it in the cloud? I have an AWS account but not much knowledge of things like Kubernetes. Should I use the premium version of ClearML? Is there a simple, practical tutorial on how to set up a cloud system for ClearML?

AgitatedDove41 Hi!

If I understand correctly, you would like to run the training on AWS?

Yes, on AWS or on anything as long as it's not too hard for someone who doesn't know much DevOps stuff.
Ideally I want it to automatically spawn new compute instances when required and terminate the instances when not in use.
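
To make that scale-up/scale-down behavior concrete, here is a toy sketch of the decision an autoscaler might make on each polling cycle. It is not ClearML's or AWS's actual implementation; the function name `plan_scaling` and all of its parameters are hypothetical, and real autoscalers typically wait for an idle timeout before terminating anything.

```python
from typing import Tuple


def plan_scaling(pending_trials: int, busy: int, idle: int, max_instances: int) -> Tuple[int, int]:
    """Return (instances_to_spawn, instances_to_terminate) for one polling cycle.

    Hypothetical model: one trial per instance, no idle timeout.
    """
    if min(pending_trials, busy, idle, max_instances) < 0:
        raise ValueError("all arguments must be non-negative")

    # Idle instances pick up queued trials before anything new is spawned.
    absorbed = min(pending_trials, idle)
    still_pending = pending_trials - absorbed
    spare_idle = idle - absorbed

    # Spawn only as many instances as the budget cap allows.
    headroom = max(0, max_instances - (busy + idle))
    to_spawn = min(still_pending, headroom)

    # Idle instances that no queued trial needs can be shut down.
    to_terminate = spare_idle
    return to_spawn, to_terminate


# 50 queued trials, nothing running, cap of 100 instances: spawn 50.
print(plan_scaling(pending_trials=50, busy=0, idle=0, max_instances=100))
# Queue is empty and 50 instances sit idle: terminate all 50.
print(plan_scaling(pending_trials=0, busy=0, idle=50, max_instances=100))
```

The cap (`max_instances`) is what keeps a burst of 50-100 trials from turning into an unbounded bill.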

It could also be on ClearML. I'm just evaluating options at the moment. Is the free version suitable for running a big experiment like that (50-100 concurrent trials for maybe an hour, then stop)?
My workflow is that I like to try a bunch of parameters (50-100 trials at a time) and then develop an idea of what to do next.
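
As a rough sketch of that workflow, the snippet below draws one batch of 50 trial configurations by plain random search. The thread does not say which search strategy is used, so random search is an assumption here, and the helper `sample_trials` and the hyperparameter names and ranges are made up for illustration.

```python
import random


def sample_trials(search_space, n_trials, seed=0):
    """Draw n_trials random configurations from search_space.

    A (low, high) tuple is sampled uniformly as a float;
    a list is treated as a set of discrete choices.
    """
    rng = random.Random(seed)  # seeded so a batch can be reproduced
    trials = []
    for _ in range(n_trials):
        config = {}
        for name, spec in search_space.items():
            if isinstance(spec, tuple):
                low, high = spec
                config[name] = rng.uniform(low, high)
            else:
                config[name] = rng.choice(spec)
        trials.append(config)
    return trials


# Hypothetical search space, for illustration only.
SEARCH_SPACE = {
    "learning_rate": (1e-4, 1e-1),
    "batch_size": [32, 64, 128],
    "dropout": (0.0, 0.5),
}

batch = sample_trials(SEARCH_SPACE, n_trials=50)
print(len(batch), "trial configurations ready to enqueue")
```

Each dictionary in `batch` would then be submitted as one trial, and the results of the whole batch inform the next search space.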

AgitatedDove41 , you can run as many instances as you'd like 🙂

Please read this to see how it's done with AWS. I don't believe you need much DevOps knowledge 🙂

Thank you for the link, I'll learn how to do it.

AgitatedDove41 , forgot to add the link for the docs, here it is:


