So actually I don't need to play with this limit, I am OK with the default for now
Hi DeterminedCrab71 Version: 1.1.1-135 • 1.1.1 • 2.14
ok, what is your problem then?
, causing it to unregister from the server (and thus not remain there).
Do you mean that the agent actively notifies the server that it is going down? Or does the server infer that the agent is down after a timeout?
the reindexing operation showed no error and copied everything
I wouldn't do it; this is less code to maintain on your side, and honestly too much auto-magic makes it difficult for the user to control the environment (i.e. to understand what happens behind the scenes). I am not sure what switching back would solve; here the wheel should have been correct, it's just that the architecture of the card is incompatible
Awesome! Thanks!
Hey @<1523701205467926528:profile|AgitatedDove14> , Actually I just realised that I was confused by the fact that when the task is reset, because of the sorting it disappears, making it seem like it was deleted. I think it's a UX issue: When I click on reset:
- The popup shows "Deleting 100%"
- The task disappears from the list of tasks because of the sorting
This led me to think that there was a bug and the task was deleted
I think clearml-agent tries to execute /usr/bin/python3.6 to start the task, instead of using the Python that was used to start clearml-agent
interestingly, it works on one machine, but not on another one
AgitatedDove14 SuccessfulKoala55 I just saw that clearml-server 1.4.0 was released, congrats! Was this bug fixed with this new version?
How about the overhead of running the training in Docker on a VM?
Yes, perfect!!
AgitatedDove14 It was only on comparison as far as I remember
Yes! Not a strong use case though; rather I wanted to ask if it was supported somehow
Hi SuccessfulKoala55 , it's not really wrong, rather I don't understand it: the docker image with the args after it
Hi CostlyOstrich36 , I am not using Hydra, only OmegaConf, so you mean just calling OmegaConf.load should be enough?
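Just to make sure I understand, something like this minimal sketch (the config.yaml path and the lr key are placeholders I made up)?
from omegaconf import OmegaConf

# Load the YAML config file straight into an OmegaConf object
cfg = OmegaConf.load("config.yaml")  # placeholder path
# Values can then be accessed with attribute or key syntax
print(cfg.lr, cfg["lr"])  # "lr" is a hypothetical key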
This works well when I run the agent in virtualenv mode (remove --docker)
I am using an old version of the aws autoscaler, so the instance has the following user data executed:
echo "{clearml_conf}" >>/root/clearml.conf
...
python -m clearml_agent --config-file '/root/clearml.conf' daemon --detached --queue '{queue}' --docker --cpu-only
Ok, but that means this cleanup code should live somewhere else than inside the task itself, right? Otherwise it won't be executed since the task will be killed
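Something like this standalone cleanup job is what I have in mind, running outside the task being killed (just a sketch; the project name and status filter are assumptions on my side, using Task.get_tasks / task.delete from the clearml SDK):
from clearml import Task

# Standalone cleanup job, meant to run outside the task that gets killed
# (e.g. on a schedule), so it still executes after the task is aborted.
def cleanup_finished_tasks(project_name="my-project"):  # assumed project name
    tasks = Task.get_tasks(
        project_name=project_name,
        task_filter={"status": ["completed", "stopped"]},  # assumed filter
    )
    for t in tasks:
        # Delete the task together with its artifacts and models
        t.delete(delete_artifacts_and_models=True)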
Ok, I could reproduce with Firefox and Chromium. Steps:
- Add creds (either via the popup or in the settings)
- Go to /settings/webapp-configuration -> Creds should be there
- Hit F5
- Creds are gone
That would work for pytorch and clearml, yes, but what about my local package?
Thanks, the message is not logged in GCloud instance logs when using startup scripts, which is why I did not see it.