Also, how are you saving your models?
connected_config = task.connect({})
Looks like you're connecting an empty config..
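For reference, a quick sketch of connecting a populated config so the values actually show up under the task (the project/parameter names and values here are just placeholders):
```
from clearml import Task

task = Task.init(project_name="examples", task_name="config demo")
config = {"learning_rate": 0.001, "batch_size": 32}
connected_config = task.connect(config)  # values become editable from the UI when the task is cloned
```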
I see. It appears to be an open bug. As a workaround for now, go into 'Projects' -> 'All Experiments' and then search for the ID with the search bar at the top right (magnifying glass icon)
WackyRabbit7, what version of the ClearML server did you install? Try changing the browser URL to <HOST>:8080/login instead of <HOST>:8080/dashboard.
It's an issue that was solved recently, not sure if the fix is out yet 🙂
Hi SubstantialElk6, I think you need to have Task.init() inside these subprocesses as well.
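A minimal sketch of what I mean, assuming the sub-processes are spawned with multiprocessing (project/task names are placeholders):
```
from multiprocessing import Process
from clearml import Task

def worker(idx):
    # Each sub-process creates its own Task so its output gets reported
    task = Task.init(project_name="examples", task_name=f"subprocess-{idx}")
    # ... actual work of the sub-process ...
    task.close()

if __name__ == "__main__":
    procs = [Process(target=worker, args=(i,)) for i in range(2)]
    for p in procs:
        p.start()
    for p in procs:
        p.join()
```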
Hi @<1523701457835003904:profile|AbruptHedgehog21>, I'm not sure I understand - how do you use set_base_docker and what do you expect to happen?
But it looks like you aren't running the agent in docker mode, at least from the log you provided. You need to add the --docker flag to the clearml-agent command when you run it
Can you try specifying the argument explicitly? For example: set_offline(offline_mode=False)
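Something like this (a minimal sketch, assuming the classmethod form on Task, called before Task.init()):
```
from clearml import Task

# Explicitly disable offline mode before initializing the task
Task.set_offline(offline_mode=False)
task = Task.init(project_name="examples", task_name="online run")
```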
As part of the docker compose setup, there is a container with a special agent that works specifically to manage services for the system, like pipeline controllers
ConvolutedSealion94, what if you add a sleep time of 15-20 seconds to the end of the script? I'm guessing that your entire script is just:
from clearml import Task
task = Task.init()
Correct?
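i.e. something along these lines (just a sketch of the suggestion, to give the background reporting threads time to flush before the process exits):
```
import time
from clearml import Task

task = Task.init()
time.sleep(20)  # give reporting a chance to flush before exiting
```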
Can you please elaborate a bit on your setup and what you're trying to achieve?
When going to the UI, open the developer tools (F12), go to 'All Experiments', and check what is called and what is returned for tasks.get_all_ex
Hi @<1533257411639382016:profile|RobustRat47> , what would you define as most metrics?
I don't think there is such a capability, but please open a GitHub feature request, I think it would be a cool feature!
GrievingTurkey78 , I'm not sure. Let me check.
Do you have CPU/GPU tracking reported in your task through both PyTorch Lightning AND ClearML?
Hi @<1534344462161940480:profile|QuaintSeal61> , are you doing something heavy on the system or just navigating the UI? Is there any chance you reached your API call limit?
Hi @<1670964687132430336:profile|SpicyFrog56> , can you please add the full log?
GreasyLeopard35, what happens if you try to run the command the agent is trying to run yourself?
And what was the result from 19:15 yesterday? The 401 error? Please note that's a different set of credentials
RotundSquirrel78, can you please check the webserver container logs to see if there were any errors?
Hi @<1710827340621156352:profile|HungryFrog27>, I'd suggest running the agent with the --debug flag for more information. Can you provide a full log of both the HPO task and one of the children?
Are you referring to the ec2 instance or the AWS autoscaler itself that is running?
Hi @<1644147961996775424:profile|HurtStarfish47>, technically 'published' is one step after 'finalized', similar to tasks
@<1719524641879363584:profile|ThankfulClams64>, there is a difference between models & tasks/experiments. Everything during training is automatically reported to the task/experiment, not the model. If you want to add anything to models themselves you have to add it manually. (Keep in mind that tasks/experiments are separate entities from models, although there is a connection between the two)
Once you manually add either metadata or metrics you will be able to add custom columns. This is not...
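For example, a minimal sketch of manually adding metadata to a model (assuming a recent SDK where Model objects expose set_metadata(); the project/model names are placeholders):
```
from clearml import Model

# Look up an existing model by project / name (placeholders)
models = Model.query_models(project_name="examples", model_name="my_model")
if models:
    model = models[0]
    # Attach custom metadata; it can then be used as a custom column in the models table
    model.set_metadata("dataset_version", "v2")
```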
Hi SoggyBeetle95 , did you try rolling back and re-trying the upgrade? Do you have a backup of the data?