One question: Does clearml resolve the CUDA version from the driver or from conda?
Okay. It works now. I don't know what went wrong before. Probably a user error 😅
When experimenting, we use an entrypoint script to which we pass the specific experiment.
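For illustration, such an entrypoint could look roughly like the sketch below. The experiment names and the `run_*` functions are made-up placeholders, not our actual code; the only point is that one script dispatches on the experiment name it receives.

```python
import argparse


# Hypothetical experiments; names and bodies are placeholders for illustration.
def run_baseline() -> str:
    return "ran baseline"


def run_ablation() -> str:
    return "ran ablation"


# Maps the name given on the command line to the function that runs it.
EXPERIMENTS = {"baseline": run_baseline, "ablation": run_ablation}


def main(argv=None) -> str:
    """Parse the experiment name and run the matching experiment."""
    parser = argparse.ArgumentParser(description="Single entrypoint for all experiments")
    parser.add_argument("experiment", choices=sorted(EXPERIMENTS))
    args = parser.parse_args(argv)
    return EXPERIMENTS[args.experiment]()
```

In a real script, `main()` would be called at the bottom of the file, so that something like `python entrypoint.py baseline` selects the experiment (the file name is also just an assumption).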
Quick question: Where again does clearml place the venv? I wanna take a look into it after the task has failed
My driver says "CUDA Version: 11.2" (I am not even sure this is correct, since I do not remember installing CUDA on this machine, but idk) and there is no pytorch build for 11.2, so maybe it falls back to the CPU version?
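For reference, the "CUDA Version" line in the `nvidia-smi` banner is the highest CUDA version the driver supports, not an installed toolkit, which would explain seeing it without ever installing CUDA. A minimal sketch for reading that value is below; the helper names are my own, and this is not a claim about how clearml detects the version internally.

```python
import re
import shutil
import subprocess
from typing import Optional


def parse_driver_cuda_version(smi_output: str) -> Optional[str]:
    """Extract the 'CUDA Version: X.Y' field from nvidia-smi banner text."""
    match = re.search(r"CUDA Version:\s*(\d+\.\d+)", smi_output)
    return match.group(1) if match else None


def driver_cuda_version() -> Optional[str]:
    """Return the CUDA version reported by the driver, or None if unavailable."""
    if shutil.which("nvidia-smi") is None:
        return None  # no NVIDIA driver tools on this machine
    result = subprocess.run(
        ["nvidia-smi"], capture_output=True, text=True, check=False
    )
    return parse_driver_cuda_version(result.stdout)
```

Calling `driver_cuda_version()` on the agent machine shows what the driver reports; comparing it with the `cudatoolkit` pin in the environment file shows whether the two disagree.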
I installed my local conda environment from an environment.yml without issues, so maybe clearml makes some changes that lead to conflicts, which in turn lead to the CPU version being installed.
channels:
- defaults
- conda-forge
- pytorch
dependencies:
- cudatoolkit==11.1.1
- pytorch==1.8.0
Gives CPU version
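One way to confirm which build actually ended up in the environment is to ask PyTorch itself. This is only a sketch with a helper name I made up; it relies on `torch.version.cuda` being `None` for CPU-only builds.

```python
def describe_torch_build() -> str:
    """Report whether the installed PyTorch is a CPU-only or a CUDA build."""
    try:
        import torch
    except ImportError:
        return "torch is not installed in this environment"
    if torch.version.cuda is None:
        # CPU-only builds are compiled without CUDA support.
        return f"torch {torch.__version__}: CPU-only build"
    return (
        f"torch {torch.__version__}: built for CUDA {torch.version.cuda}, "
        f"GPU usable: {torch.cuda.is_available()}"
    )
```

Running `print(describe_torch_build())` inside the venv or conda env that the agent created should show whether the CPU build was picked.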
Interesting. It will probably only matter for very small experiments, or for experiments where validation is run very infrequently.
What's the reason for the shift?
Hi KindChimpanzee37, I was asking more about the general idea of making these settings task-specific, but thank you for the suggestion anyway. I will definitely apply it.
Wait, nvm. I just tried it again and now it worked.
Mhhm, now conda env creation takes forever, probably because it is resolving conflicts. At least that is what happened when I tried to install my environment manually.
Is this not something completely different?
This will just change the way the local repository is analyzed, but nothing about the agent.
It could be that the clearml-server behaves badly while the cleanup is ongoing, or even afterwards.
Okay, it seems like it just takes some time to delete and to be reflected in the WebUI. So when I try to delete again, a deletion process already seems to be running in the background.
Maybe deletion happens "async" and is not reflected in parts of clearml? It seems that if I try to delete often enough, at some point it is successful.
I created a GitHub issue because the problem with the slow deletion still exists. https://github.com/allegroai/clearml/issues/586#issue-1142916619