
Reputation
Badges 1
56 × Eureka!And an example of the missing comparison:
the two experiments 2. plot on the first one 3. plot on the second 4. comparison plot only shows other plots (only the confusion matrices)
I think I found the problem, if the file is untracked by git, it is not saved by clearml
However I have another problem, my git repo is installed with pip install -e .
and I import it in my script, but on a task executed by a clearml-agent the module appears not to be installed ?
Oh ok I thought it would be relative to the server, how do i run this migration ?
The script is inside a git repo (and it's the one I launch, I would get an importerror if it was something else missing)
Hello, sorry the second is for models and not images
And the comparison for the confusion matrices without the name of the experiments
Ok thanks I will check first for permission issues
WebApp: 1.2.0-153 • Server: 1.2.0-153 • API: 2.16
Hmm it's both better and worse, it does detect pyfunctional now (in INSTALLED PACKAGES and I can see it installed in the console logs) but it fails onimport torch ModuleNotFoundError: No module named 'torch'
In the logs:
` Found PyTorch version torch==1.7.1 matching CUDA version 110
2021-04-21 15:15:11
Found PyTorch version torchvision==0.8.2 matching CUDA version 110
Collecting torch==1.7.1+cu110
File was already downloaded /home/ubuntu/.clearml/pip-download-cache/cu110/torch-1.7.1+cu110...
It works with post_packages
Yes the setup.py imports torch unfortunately https://github.com/mapillary/inplace_abn/blob/master/setup.py
I think didn't understand, if I'm not at the root of the repo, I have to specify the working dir ?
I'm not using clearml-agent here, I use clearml.Task.init.
The exit(1) (or raised exception) is from a subprocess.
clearml==1.1.3
torch==1.9.0+cu111, torchvision==0.10, lightning not installed
python3.8
debian 10
I will try reproducing with a smaller code, it was a training with detectron2 which uses torch.,multiprocessing.spawn and torch.distributed.init_process_group
https://github.com/facebookresearch/detectron2/blob/c47167e4ac236a36895c294735a908b75f659f96/tools/train_net.py#L163
https...
Hmm apparently if I launch the script from the root of the repo (CWD: myrepo python train/classif-custom/train.py
) it works, but from its dir it doesn't work (CWD: myrepo/train/classif-custom python train.py
)
Hello AgitatedDove14 it does not throw an exception, but in the ui the link is broken so the image does not show
I welcome the day clearml saves relative urls by default ^^ it is supported by browsers (i.e. fetching /someurl is relative to the current hostname) so maybe only the clearml client would need to be updated right ? to push images with a relative url instead of the clearml server url.
The task is registered and is started by the agent, the env seems to be installed well, but then it fails on /home/ubuntu/.clearml/venvs-builds/3.8/bin/python: can't open file 'fastai_classifier.py': [Errno 2] No such file or directory
Do you have an idea of what could be wrong ? The agent launch the script in the wrong working dir ? The repo is not copied ? (This script is inside a private git repo, that clearml detects correctly).
I also tried launching the script from the root of th...
Yes I think it needs pytorch, but pytorch failed to install previously ?
Is there a way to store relative urls in clearml ? We can't connect to our server with a public address, it only works with the internal dns from GCE
ok so I reproduced it with this, it happens when I have colors (I got the error first with an exception printed with stackprinter None )
Task.init(project_name="test", task_name="test", reuse_last_task_id=False)
print("this is a test <hello world> rest of the text")
print("this is a test <hello world> rest of the text", file=sys.stderr)
print(colorama.Fore.RED + "this is a test <hello world> rest of the text" + colorama.Style.RESET_ALL)
![i...