You can provide it in the extra configurations sections
DeliciousBluewhale87 , Hi 🙂
You mean you created a dataset task on a certain server and you want to move that dataset task to another server?
Hi @<1664079296102141952:profile|DangerousStarfish38> , I think the issue is resolving the versions of torch. Are you using an older python version on the agent?
I mean, what Python version did you initially run it with locally?
@<1664079296102141952:profile|DangerousStarfish38> , are you running different python versions on the different machines? Remote vs local
Hi @<1547028031053238272:profile|MassiveGoldfish6> , what version of clearml & pytorch-lightning? Does this happen to you with the example as well? Are you on a self deployed or the community server?
you can edit the requirements section directly
Also, in the original experiment, what pytorch version is detected?
ExasperatedCrocodile76 , did you run the original experiment on a Linux machine with pip, while the remote machine is Linux with the conda package manager?
I think that might be the issue. Transferring from pip to Conda package managers can sometimes be problematic. Try to manually edit the requirements to reflect the install settings listed at https://pytorch.org/
or add requirements manually via code
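A rough sketch of the "via code" option, in case it helps. It assumes `Task.add_requirements(package_name, package_version)` from the ClearML SDK, called before `Task.init`. The pinned versions and the `apply_pins` helper are made up for illustration, so swap in whatever https://pytorch.org/ lists for your setup.

```python
from typing import Callable, Iterable, List, Optional, Tuple

# Hypothetical pins, for illustration only; use the versions that match your setup.
PINS: List[Tuple[str, str]] = [("torch", "==2.1.0"), ("torchvision", "==0.16.0")]


def apply_pins(
    pins: Iterable[Tuple[str, str]],
    add: Optional[Callable[[str, str], None]] = None,
) -> List[str]:
    """Register each (package, version spec) pair and return the pinned lines."""
    if add is None:
        # Imported lazily so the helper can be exercised without ClearML installed.
        from clearml import Task

        add = Task.add_requirements
    applied = []
    for name, version in pins:
        add(name, version)
        applied.append(f"{name}{version}")
    return applied
```

In a training script this would be called as `apply_pins(PINS)` right before `Task.init(...)`, so the pins end up in the task's requirements section.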
I think these env variables might be relevant to you:
CLEARML_AGENT_SKIP_PIP_VENV_INSTALL
CLEARML_AGENT_SKIP_PYTHON_ENV_INSTALL
https://clear.ml/docs/latest/docs/clearml_agent/clearml_agent_env_var
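Something like this could be used to build the environment for the agent process. It is only a sketch and rests on my reading of the docs page above: that `CLEARML_AGENT_SKIP_PIP_VENV_INSTALL` takes the path of an existing Python binary, and that `CLEARML_AGENT_SKIP_PYTHON_ENV_INSTALL` is switched on with `1`. Please double check both there. The `agent_env` helper and the example path are hypothetical.

```python
import os
from typing import Dict, Mapping, Optional


def agent_env(
    python_bin: Optional[str] = None,
    skip_env_install: bool = False,
    base: Optional[Mapping[str, str]] = None,
) -> Dict[str, str]:
    """Return a copy of the environment with the agent skip flags applied."""
    env = dict(os.environ if base is None else base)
    if python_bin:
        # Assumption: the value is the Python binary of an existing environment.
        env["CLEARML_AGENT_SKIP_PIP_VENV_INSTALL"] = python_bin
    if skip_env_install:
        # Assumption: "1" tells the agent to use the Python environment as is.
        env["CLEARML_AGENT_SKIP_PYTHON_ENV_INSTALL"] = "1"
    return env
```

The result can then be passed to the agent, e.g. `subprocess.run(["clearml-agent", "daemon", "--queue", "default"], env=agent_env(python_bin="/opt/venv/bin/python"))`.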
It looks like you are running on the community server. Can you right click each relevant experiment in the experiments table, click on 'Share', and send the links here?
What is the base task you are using? It looks like you're using one of the examples 🙂
Any chance you could provide a share-able link if you're running on the community server?
Please do. You can download the entire log from the UI 🙂
AbruptWorm50 , what optimization method are you using?
Disabling the VCS cache means the cloned git folder will no longer be cached.
You can filter by 'Running' experiments in ClearML, search for one that hasn't reported for a while, and start investigating those.
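If you'd rather script the "hasn't reported for a while" check than eyeball it in the UI, a minimal sketch could look like this. The `find_stale` helper is hypothetical and works on plain task id to last-update timestamps. With the SDK I'd assume those can be collected from `Task.get_tasks(task_filter={"status": ["in_progress"]})` by reading each task's `data.last_update`, but treat that as an assumption and make sure the timestamps are timezone-aware first.

```python
from datetime import datetime, timedelta, timezone
from typing import List, Mapping, Optional


def find_stale(
    last_updates: Mapping[str, datetime],
    max_idle: timedelta,
    now: Optional[datetime] = None,
) -> List[str]:
    """Return the ids of tasks whose last update is older than max_idle."""
    now = now or datetime.now(timezone.utc)
    return sorted(
        task_id for task_id, ts in last_updates.items() if now - ts > max_idle
    )
```

For example, `find_stale(updates, timedelta(hours=1))` would list the running tasks that have been silent for more than an hour.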
Is there anything special about the parent dataset?
Regarding your questions:
Disable VCS cache - https://github.com/allegroai/clearml-agent/blob/master/docs/clearml.conf#L120
I think the lock is created when running an experiment. Maybe it hung, so the lock never got released.
wdyt?
Hi AbruptWorm50 ,
You can check in the UI: is the model missing any data there? Can you download it properly?
AbruptWorm50 , that sounds like an interesting idea for a feature, please open a github issue 🙂
AbruptWorm50 , regarding the debug samples:
When were they registered?
Were you able to view them before?
Where are they registered?
AbruptWorm50 , I also see the HPO app is missing, I'm told this is under investigation.
Simply hover over one of the tags, and the small 'x' will come up 🙂