My agent shows the same as before:
```
...
Environment setup completed successfully
Starting Task Execution:
DONE: Running task 'aff7c6605b7243d38968f95b4351b127', exit status 0
```
So actually deleting from the client (e.g. a dataset with clearml-data) works.
Ah, perfect. Did not know this. Will try! Thanks again! 🙂
Ah, very cool! Then I will try this, too.
I got the idea from an error I got when the agent was configured to use pip and tried to install BLAS (for PyTorch I guess) and it threw an error.
It seems like the services-docker is always started with Ubuntu 18.04, even when I use `task.set_base_docker("continuumio/miniconda:latest -v /opt/clearml/data/fileserver/:{}".format(file_server_mount))`.
Do you mean `venv_update`?
Now the pip packages seems to ship with CUDA, so this does not seem to be a problem anymore.
I see. Thanks a lot!
I see. I was just wondering what the general approach is. I think PyTorch used to ship the pip package without CUDA packaged into it. So with conda it was nice to only install CUDA in the environment and not the host. But with pip, you had to use the host version as far as I know.
It is not explained there, but do you mean `CLEARML_API_ACCESS_KEY: ${CLEARML_API_ACCESS_KEY:-}` and `CLEARML_API_SECRET_KEY: ${CLEARML_API_SECRET_KEY:-}`?
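For context, this is roughly where I would expect those variables to live, as a hypothetical docker-compose excerpt (service name assumed; the `:-` suffix just means "default to empty if unset"):

```yaml
# Hypothetical docker-compose.yml excerpt for the ClearML server's
# services agent container; only the environment section is sketched.
services:
  agent-services:
    environment:
      CLEARML_API_ACCESS_KEY: ${CLEARML_API_ACCESS_KEY:-}
      CLEARML_API_SECRET_KEY: ${CLEARML_API_SECRET_KEY:-}
```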
How can I get the agent log?
I was wrong: I think it uses the `agent.cuda_version`, not the local env CUDA version.
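So pinning it explicitly should work; a sketch of the relevant `clearml.conf` snippet (value format is my assumption):

```
# clearml.conf -- agent section (sketch)
agent {
    # force the CUDA version the agent resolves packages against,
    # instead of whatever is (or isn't) installed locally
    cuda_version: 11.1
}
```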
I use fixed users!
I just updated my server to 1.0 and now the services agent is stuck in restarting:
Nvm. I think I understood. When the file has never been added to repository it is not tracked.
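A minimal demonstration of that git behavior (the temp directory is just for the demo):

```shell
# A file that was never `git add`ed shows up as untracked.
demo_dir=$(mktemp -d)
cd "$demo_dir"
git init -q .
echo "hello" > new_file.txt
git status --porcelain new_file.txt   # "?? new_file.txt" -> untracked
git add new_file.txt
git status --porcelain new_file.txt   # "A  new_file.txt" -> now tracked
```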
It shows some logs, but nothing of relevance I think. Only infos and warnings about deprecated stuff that is still used ;D ...
With remote execution it is `command="[...]"`, but locally it is `command='train'`, like it is supposed to be.
What do you mean by "Why not add the extra_index_url to the installed packages part of the script?"?
What I get for `args` when I print it locally is not the same as what ClearML extracts from `args`.
In my case I use the conda freeze option and do not even have CUDA installed on the agents.
However, to use conda as package manager I need a docker image that provides conda.
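Roughly what I mean, as a `clearml.conf` sketch (key names taken from the default agent config, to the best of my knowledge):

```
# clearml.conf -- agent section (sketch)
agent {
    package_manager {
        # use conda instead of pip to build the task environment
        type: conda,
    },
    # an image that already ships conda, so the agent can create the env
    default_docker: {
        image: "continuumio/miniconda:latest"
    }
}
```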
And in the web UI, the artifacts section is still empty.
I don't know actually. But Pytorch documentation says it can make a difference: https://pytorch.org/docs/stable/distributions.html#torch.distributions.distribution.Distribution.set_default_validate_args
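For reference, a small sketch of what toggling it looks like (assuming torch is installed; this just demonstrates the documented switch):

```python
from torch.distributions import Distribution, Normal

# With validation on (the default), invalid parameters raise immediately.
Distribution.set_default_validate_args(True)
try:
    Normal(loc=0.0, scale=-1.0)  # negative scale is invalid
except ValueError:
    print("caught invalid scale")

# Turning validation off skips those checks, which can shave overhead in
# tight sampling loops -- at the cost of silently accepting bad inputs.
Distribution.set_default_validate_args(False)
d = Normal(loc=0.0, scale=-1.0)  # constructs without error, but is nonsense
```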
However, because of the `import carla` it is added to the task requirements and clearml-agent tries to install it, although it is meant to be included at runtime.
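One workaround I'm considering (a generic sketch, not ClearML-specific): defer the import to call time, on the assumption that the requirement analysis only picks up top-level imports.

```python
import importlib

def load_runtime_module(name):
    """Import a module only when it is actually needed at runtime.

    The idea (an assumption about how requirements are auto-detected) is
    that avoiding a top-level `import carla` keeps the package out of the
    collected task requirements; it is then expected to exist at runtime.
    """
    try:
        return importlib.import_module(name)
    except ImportError:
        return None  # caller decides how to handle the missing package

# e.g. call carla = load_runtime_module("carla") inside main(),
# not at module import time
```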
Yes, that looks alright. Similar to before. Local execution works.