The reindexing operation showed no error and copied everything
I wouldn't do it: this way there is less code to maintain on your side, and honestly too much auto-magic makes it difficult for the user to control the environment (i.e. to understand what happens behind the scenes). I am not sure what switching back would solve; here the wheel should have been correct, it's just the architecture of the card that is incompatible
Awesome! Thanks!
Hey @<1523701205467926528:profile|AgitatedDove14> , actually I just realised that I was confused by the fact that when the task is reset, it disappears because of the sorting, making it seem like it was deleted. I think it's a UX issue: when I click on reset:
- The popup shows "Deleting 100%"
- The task disappears from the list of tasks because of the sorting
This led me to think that there was a bug and the task had been deleted
I think clearml-agent tries to execute /usr/bin/python3.6 to start the task, instead of using the python that was used to start clearml-agent
interestingly, it works on one machine, but not on another one
super, thanks SuccessfulKoala55 !
AgitatedDove14 SuccessfulKoala55 I just saw that clearml-server 1.4.0 was released, congrats! Was this bug fixed in this new version?
How about the overhead of running the training on docker on a VM?
Yes, perfect!!
AgitatedDove14 It was only on comparison as far as I remember
Yes! not a strong use case though, rather I wanted to ask if it was supported somehow
Hi SuccessfulKoala55 , it's not really wrong, rather I don't understand it: the docker image with the args after it
Hi CostlyOstrich36 , I am not using Hydra, only OmegaConf, so you mean just calling OmegaConf.load should be enough?
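Something like this is what I had in mind, just as a sketch (the config path here is made up):
```python
from omegaconf import OmegaConf

# Load the YAML config directly with OmegaConf, no Hydra involved.
# "conf/train.yaml" is only a placeholder path for this example.
cfg = OmegaConf.load("conf/train.yaml")

# Values are then available as attributes or keys.
print(OmegaConf.to_yaml(cfg))
```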
This works well when I run the agent in virtualenv mode (remove --docker)
I am using an old version of the aws autoscaler, so the instance has the following user data executed:
echo "{clearml_conf}" >> /root/clearml.conf
...
python -m clearml_agent --config-file '/root/clearml.conf' daemon --detached --queue '{queue}' --docker --cpu-only
Ok, but that means this cleanup code should live somewhere else than inside the task itself right? Otherwise it won't be executed since the task will be killed
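To make it concrete, I mean something like a standalone script (rough sketch; the project name and the filter are made up, and I'm not sure this covers every case):
```python
from clearml import Task

# Hypothetical cleanup script that lives outside the training task, e.g. run
# from cron or from a dedicated "cleanup" task, so it is not killed with it.
old_tasks = Task.get_tasks(
    project_name="my_project",              # made-up project name
    task_filter={"status": ["completed"]},  # only touch tasks that finished
)

for task in old_tasks:
    # delete() removes the task and, by default, its artifacts/models
    task.delete()
```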
Ok, I could reproduce with Firefox and Chromium. Steps:
- Add creds (either via the popup or in the settings)
- Go to /settings/webapp-configuration -> Creds should be there
- Hit F5
- Creds are gone
That would work for pytorch and clearml, yes, but what about my local package?
Thanks, the message is not logged in GCloud instance logs when using startup scripts, which is why I did not see it.
Ok thanks! And for this?
Would it be possible to support such a use case? (have the clearml-agent set up a different python version when a task needs it?)
Thanks AgitatedDove14 ! I created a project with a default output destination pointing to an S3 bucket, but I don't have local access to this bucket (only agents have access to it, for security reasons). Because of that, I cannot create a task in this project programmatically from my local machine, because it tries to access the bucket and fails. And there is no easy way to change the default output location (neither in the web UI nor in the SDK).
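The only per-task workaround I could think of is overriding the destination when creating the task (just a sketch, names and paths are made up, and I'm not sure it covers every case):
```python
from clearml import Task

# Sketch: point this one task at a destination I can actually reach locally,
# instead of the project's default s3 bucket (names/paths are examples).
task = Task.init(
    project_name="my_project",          # project whose default points to the bucket
    task_name="local_debug_run",
    output_uri="/tmp/clearml_outputs",  # overrides the project default for this task
)
```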
I was rather wondering why clearml was taking up space even though I configured it to use the /data volume. But as you described AgitatedDove14 it looks like an edge case, so I don't mind.
I am now trying with agent.extra_docker_arguments: ["--network='host'", ] instead of what I shared above