(only works for pytorch because they have different wheels for different cuda versions)
Ah, perfect. Did not know this. Will try! Thanks again! 🙂
You'll need to set the agent key and secret using environment variables, as explained here (in step #11): https://allegro.ai/clearml/docs/docs/deploying_clearml/clearml_server_linux_mac.html#deploying
It is not explained there, but do you mean `CLEARML_API_ACCESS_KEY: ${CLEARML_API_ACCESS_KEY:-}` and `CLEARML_API_SECRET_KEY: ${CLEARML_API_SECRET_KEY:-}`?
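If so, a minimal sketch of how those could be populated before bringing the stack up (the key values and compose file path here are placeholders):
```bash
# placeholder values - use the credentials generated in the profile page
export CLEARML_API_ACCESS_KEY="<agent access key>"
export CLEARML_API_SECRET_KEY="<agent secret key>"

# docker-compose substitutes them into the agent-services container environment
docker-compose -f /opt/clearml/docker-compose.yml up -d
```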
However, to use conda as package manager I need a docker image that provides conda.
Oh, so I think I know what might have happened
I was wrong: I think it uses the agent.cuda_version, not the local env CUDA version.
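If so, it should be possible to pin that in clearml.conf rather than relying on auto-detection; a sketch (the version numbers are just examples):
```
agent {
    # force the CUDA/cuDNN version the agent resolves packages against,
    # instead of auto-detecting it from the local environment
    cuda_version: 11.1
    cudnn_version: 8.0
}
```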
Is it also possible to specify a different user/api_token for different hosts? For example, I have a GitHub and a private GitLab that I both want to be able to access.
ReassuredTiger98 my apologies, I just realized you can use ~/.git-credentials for that. The agent will automatically map the host .git-credentials into the docker :)
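For illustration, a ~/.git-credentials with entries for both hosts might look like this (the usernames, tokens and GitLab host are placeholders):
```
https://github-user:ghp_exampletoken@github.com
https://gitlab-user:glpat-exampletoken@gitlab.example.com
```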
I got the idea from an error that came up when the agent was configured to use pip and tried to install BLAS (for PyTorch, I guess).
I just updated my server to 1.0 and now the services agent is stuck in restarting:
No (this is deprecated and was removed because it was confusing)
https://github.com/allegroai/clearml-agent/blob/cec6420c8f40d92ab1cd6cbe5ca8f24cf351abd8/docs/clearml.conf#L101
It seems like the services-docker is always started with Ubuntu 18.04, even when I use `task.set_base_docker("continuumio/miniconda:latest -v /opt/clearml/data/fileserver/:{}".format(file_server_mount))`
Ah, very cool! Then I will try this, too.
You can simply generate another set of credentials in the profile page, and set them up in these environment variables.
Alternatively, you can add another fixed user, and use its username/password for these values
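For reference, a sketch of adding such a fixed user in the server's apiserver.conf (the username/password are placeholders):
```
auth {
    fixed_users {
        enabled: true
        users: [
            {
                username: "agent"
                password: "change-me"
                name: "Agent Services"
            }
        ]
    }
}
```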
Also, what kind of authentication are you using? Fixed users?
In that case I suggest you turn on the venv cache; it will accelerate conda environment building because it caches the entire conda env.
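A sketch of what turning it on in the agent's clearml.conf might look like (the limits and cache path are just example values):
```
agent {
    venvs_cache: {
        # maximum number of cached environments
        max_entries: 10
        # minimum free disk space (GB) required to add a cache entry
        free_space_threshold_gb: 2.0
        # set a path to enable the cache
        path: ~/.clearml/venvs-cache
    }
}
```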
```
docker-compose ps
          Name                        Command                State                    Ports
clearml-agent-services   /usr/agent/entrypoint.sh       Restarting
clearml-apiserver        /opt/clearml/wrapper.sh ap ... Up         0.0.0.0:8008->8008/tcp, 8080/tcp, 8081/tcp
clearml-elastic          /usr/local/bin/docker-entr ... Up         9200/tcp, 9300/tcp
clearml-fileserver       /opt/clearml/wrapper.sh fi ... Up         8008/tcp, 8080/tcp, 0.0.0.0:8081->8081/tcp
clearml-mongo            docker-entrypoint.sh --set ... Up         27017/tcp
clearml-redis            docker-entrypoint.sh redis ... Up         6379/tcp
clearml-webserver        /opt/clearml/wrapper.sh we ... Up         0.0.0.0:8080->80/tcp, 8008/tcp, 8080/tcp, 8081/tcp
```
I see. I was just wondering what the general approach is. I think PyTorch used to ship the pip package without CUDA packaged into it. So with conda it was nice to only install CUDA in the environment and not the host. But with pip, you had to use the host version as far as I know.
In my case I use the conda freeze option and do not even have CUDA installed on the agents.
BTW: the agent will resolve pytorch based on the installed CUDA version.
Now the pip packages seems to ship with CUDA, so this does not seem to be a problem anymore.
In the new version, we made it so that the default agent credentials embedded in the ClearML Server are disabled if the server is not in open mode (i.e. it requires a username/password to log in). This is because having those default credentials available in this mode basically means anyone without a password can send commands to the server (since these credentials are hard-coded)
Oh, you're right - I'll make sure we add it there 😄