With matplotlib I only get the suptitle
Hmm it's both better and worse: it does detect pyfunctional now (in INSTALLED PACKAGES, and I can see it installed in the console logs), but it fails on import torch with ModuleNotFoundError: No module named 'torch'
In the logs:
Found PyTorch version torch==1.7.1 matching CUDA version 110
2021-04-21 15:15:11
Found PyTorch version torchvision==0.8.2 matching CUDA version 110
Collecting torch==1.7.1+cu110
File was already downloaded /home/ubuntu/.clearml/pip-download-cache/cu110/torch-1.7.1+cu110...
It works with post_packages
I suppose the images are in db.task but I can't find them
ok so I reproduced it with this, it happens when I have colors (I first got the error with an exception printed with stackprinter)
import sys
import colorama
from clearml import Task

Task.init(project_name="test", task_name="test", reuse_last_task_id=False)
print("this is a test <hello world> rest of the text")
print("this is a test <hello world> rest of the text", file=sys.stderr)
print(colorama.Fore.RED + "this is a test <hello world> rest of the text" + colorama.Style.RESET_ALL)
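For anyone who wants to look at the colored line without colorama installed, here is a minimal sketch. colorama.Fore.RED and colorama.Style.RESET_ALL are the ANSI escape sequences ESC[31m and ESC[0m, so the same string can be built by hand. strip_ansi is a hypothetical helper name, not part of clearml or colorama; it only shows what the text looks like once the color codes are removed.

```python
import re

# ANSI SGR sequences; these are the values behind colorama.Fore.RED
# and colorama.Style.RESET_ALL.
RED = "\x1b[31m"
RESET = "\x1b[0m"

# Matches SGR color/style sequences such as ESC[31m or ESC[0m.
ANSI_SGR = re.compile(r"\x1b\[[0-9;]*m")


def strip_ansi(text: str) -> str:
    """Hypothetical helper: remove ANSI color codes from a log line."""
    return ANSI_SGR.sub("", text)


colored = RED + "this is a test <hello world> rest of the text" + RESET
plain = strip_ansi(colored)

print(repr(colored))  # shows the raw escape codes around the text
print(plain)          # the same line without color codes
```

Comparing the repr of the colored string with the plain one makes it easy to see exactly which extra characters reach the console log.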
we still don't know what was happening with the VM + docker compose + load balancers
quick video of the search not working
The task is registered and is started by the agent, the env seems to be installed well, but then it fails on /home/ubuntu/.clearml/venvs-builds/3.8/bin/python: can't open file 'fastai_classifier.py': [Errno 2] No such file or directory
Do you have an idea of what could be wrong? Does the agent launch the script in the wrong working dir? Is the repo not copied? (This script is inside a private git repo, which clearml detects correctly.)
I also tried launching the script from the root of th...
Hello AgitatedDove14 it does not throw an exception, but in the ui the link is broken so the image does not show
I'm not using clearml-agent here, I use clearml.Task.init.
The exit(1) (or raised exception) is from a subprocess.
clearml==1.1.3
torch==1.9.0+cu111, torchvision==0.10, lightning not installed
python3.8
debian 10
I will try reproducing with smaller code; it was a training with detectron2, which uses torch.multiprocessing.spawn and torch.distributed.init_process_group
https://github.com/facebookresearch/detectron2/blob/c47167e4ac236a36895c294735a908b75f659f96/tools/train_net.py#L163
https...
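As a smaller stand-in for the situation described above, here is a minimal sketch using only the standard library: a parent process starts a child that exits with status 1, and the parent inspects the return code. It does not use torch.multiprocessing.spawn, detectron2 or clearml; it only illustrates the case where the failure happens in a subprocess while the parent keeps running.

```python
import subprocess
import sys

# Child process that exits with status 1, standing in for a failing worker.
child_code = "import sys; sys.exit(1)"

# Run the child with the same interpreter as the parent.
result = subprocess.run([sys.executable, "-c", child_code])

# The parent is still alive here; it only sees the child's return code.
print("child exit code:", result.returncode)
parent_should_fail = result.returncode != 0
print("parent should report failure:", parent_should_fail)
```

Whether the parent propagates that non-zero code is up to the parent script, which is the part worth checking in a real reproduction.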
I welcome the day clearml saves relative urls by default ^^ it is supported by browsers (i.e. fetching /someurl is relative to the current hostname) so maybe only the clearml client would need to be updated right ? to push images with a relative url instead of the clearml server url.
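To illustrate the browser behaviour mentioned above, a minimal sketch with urllib.parse.urljoin, which follows the same resolution rules for relative references. The host names and the /files/... path are made up for the example and are not real clearml URLs.

```python
from urllib.parse import urljoin

# Hypothetical page URL and stored image links, for illustration only.
page_url = "https://clearml.example.com/projects/abc/experiments/123"
stored_relative = "/files/test/image.png"
stored_absolute = "https://old-host.example.com/files/test/image.png"

# A path starting with "/" is resolved against the host of the current page.
resolved_relative = urljoin(page_url, stored_relative)

# An absolute URL ignores the current page and keeps pointing at the old host.
resolved_absolute = urljoin(page_url, stored_absolute)

print(resolved_relative)
print(resolved_absolute)
```

With the relative form, moving the server to another hostname would not break the stored link, whereas the absolute form keeps pointing at the old host.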
Is there a way to check how clearml gets the installed packages of the current env ?
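For comparison, here is a minimal sketch that lists what the current interpreter itself sees as installed, using importlib.metadata from the standard library (Python 3.8+). This is not a description of how clearml builds the INSTALLED PACKAGES list; installed_packages is a hypothetical helper that just gives a baseline to compare against.

```python
from importlib import metadata


def installed_packages() -> dict:
    """Hypothetical helper: map lower-cased package name to version."""
    packages = {}
    for dist in metadata.distributions():
        name = dist.metadata["Name"]
        if name:  # skip distributions with broken metadata
            packages[name.lower()] = dist.version
    return packages


packages = installed_packages()

# Print in a requirements-like format, similar to pip freeze.
for name in sorted(packages):
    print(f"{name}=={packages[name]}")

# Quick check for a specific package in the current env.
print("torch installed:", "torch" in packages)
```

Running this in the same environment as the script shows whether a package missing from the detected list is actually installed locally.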
Yes I think it needs pytorch, but pytorch failed to install previously ?
Ok, btw I used https://allegro.ai/clearml/docs/docs/deploying_clearml/clearml_agent_install_configure.html which was not updated so I didn't know there was a priority_packages and post_packages
I used scripts like https://github.com/allegroai/clearml-server/issues/83 previously for images but it doesn't migrate artifacts urls
Hello, sorry the second is for models and not images
hello, yes it’s like typos, I want to compare some experiments that were created by different versions of a script for instance, and the metrics names changed so I can’t compare it on clearml UI
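To make the request concrete, a minimal sketch of the kind of renaming I mean, done by hand outside the UI. The metric names and values are invented for the example, and normalize_metrics is a hypothetical helper, not a clearml API.

```python
# Hypothetical mapping from old metric names to the current ones.
RENAMES = {
    "val_acc": "validation/accuracy",
    "val_loss": "validation/loss",
}


def normalize_metrics(metrics: dict, renames: dict) -> dict:
    """Hypothetical helper: rename metrics so two runs share the same keys."""
    return {renames.get(name, name): value for name, value in metrics.items()}


# Made-up values from two versions of the same script.
old_run = {"val_acc": 0.91, "val_loss": 0.31}
new_run = {"validation/accuracy": 0.93, "validation/loss": 0.27}

old_normalized = normalize_metrics(old_run, RENAMES)

# Only metrics present under the same name in both runs can be compared.
comparable = sorted(set(old_normalized) & set(new_run))
print(comparable)
```

Without the renaming step the two runs share no metric names, so nothing lines up in a comparison.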
Does clearml-agent install the repo with pip install -e . if it should be? (i.e. my local repo is installed with pip install -e . where I launch my script, which calls Task.init and .execute_remotely()).
Hmm apparently if I launch the script from the root of the repo (CWD: myrepo, command: python train/classif-custom/train.py) it works, but from its own dir it doesn't work (CWD: myrepo/train/classif-custom, command: python train.py)
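The two launches point at the same file and only the working directory differs, which this minimal sketch spells out. The /work prefix is made up, and script_relative_to_repo is a hypothetical helper; it does not reproduce how clearml records the working dir or entry point.

```python
from pathlib import PurePosixPath


def script_relative_to_repo(repo_root: str, cwd: str, script: str) -> str:
    """Hypothetical helper: path of the script relative to the repo root."""
    full_path = PurePosixPath(cwd) / script
    return str(full_path.relative_to(PurePosixPath(repo_root)))


# Launch from the repo root: python train/classif-custom/train.py
from_root = script_relative_to_repo(
    "/work/myrepo", "/work/myrepo", "train/classif-custom/train.py"
)

# Launch from the script's own directory: python train.py
from_dir = script_relative_to_repo(
    "/work/myrepo", "/work/myrepo/train/classif-custom", "train.py"
)

print(from_root)
print(from_dir)
```

Both calls resolve to the same repo-relative path, so a remote run needs the working dir and the script name together to find the file again.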
WebApp: 1.2.0-153 • Server: 1.2.0-153 • API: 2.16
And an example of the missing comparison:
1. the two experiments
2. plot on the first one
3. plot on the second
4. comparison plot only shows other plots (only the confusion matrices)