The script is inside a git repo (and it's the one I launch; I would get an ImportError if something else were missing)
and Ctrl-F (in the browser) doesn't work because the lines below are not loaded (even when you scroll, it removes the lines that are no longer visible, so you can't Ctrl-F them)
so if anybody needs this someday (migrating your hostname, which is saved inside your experiments, e.g. debug images and plots with images), you need this: https://github.com/allegroai/clearml-server/issues/83
but it's slow; you can restrict the query to the items that actually need updating, with:
` # on index events-training_debug_image-yourid
OLDHOST/ should be something like
or
NEWHOST/ same
"script": {
"source": "ctx._source.url = ctx._source.url.replace('OLDHOST/', 'NEWHO...
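For anyone scripting this, here is a minimal sketch of building such a request body in Python. The `prefix` query on `url` is my assumption about how to restrict the update to documents that still contain the old host (whether it matches depends on the field mapping), and the function name is hypothetical. The replacement values are passed as script `params` instead of being inlined in the source string. The resulting body would be sent to `_update_by_query` on the index named above.

```python
import json


def build_host_migration_body(old_host: str, new_host: str) -> dict:
    """Build an _update_by_query body that rewrites the host prefix in `url`."""
    return {
        # Assumption: `url` is mapped so that a prefix query can match it.
        # This limits the update to documents that still use the old host.
        "query": {"prefix": {"url": old_host}},
        "script": {
            "lang": "painless",
            # Same replace as in the snippet above, with values passed as params.
            "source": "ctx._source.url = ctx._source.url.replace(params.old_host, params.new_host)",
            "params": {"old_host": old_host, "new_host": new_host},
        },
    }


# OLDHOST/ and NEWHOST/ are placeholders, exactly as in the message above.
body = build_host_migration_body("OLDHOST/", "NEWHOST/")
print(json.dumps(body, indent=2))
```

It's worth trying the query part alone first (as a plain search) to confirm it only matches the documents you expect before running the update.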
Yes the setup.py imports torch unfortunately https://github.com/mapillary/inplace_abn/blob/master/setup.py
I think I didn't understand: if I'm not at the root of the repo, do I have to specify the working dir?
oookay so we found that for Kubernetes, if we allow only TLS v1.3 on the ingress controller, clearml-init breaks with:
2022-03-04 10:32:02,814 - clearml.session - WARNING - SSLError Retrying HTTPSConnectionPool(host='http://api.clear-ml.dev.monk.ai', port=443): Max retries exceeded with url: /auth.login (Caused by SSLError(SSLError(1, '[SSL: TLSV1_ALERT_PROTOCOL_VERSION] tlsv1 alert protocol version (_ssl.c:1129)')))
or sometimes just "could not verify credentials"
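A quick way to see what the client side can negotiate is to ask Python's own `ssl` module. This is only a diagnostic sketch using the standard library; the function name is mine, and it says nothing about what clearml itself configures on its connections.

```python
import ssl


def tls_support_report() -> dict:
    """Summarise the TLS capabilities of the local Python/OpenSSL build."""
    ctx = ssl.create_default_context()
    return {
        "openssl_version": ssl.OPENSSL_VERSION,
        # True only if the linked OpenSSL was built with TLS 1.3 support.
        "has_tls_1_3": ssl.HAS_TLSv1_3,
        # Bounds a default client context will accept during the handshake.
        "min_version": ctx.minimum_version.name,
        "max_version": ctx.maximum_version.name,
    }


report = tls_support_report()
for key, value in report.items():
    print(f"{key}: {value}")
```

If `has_tls_1_3` is false on the machine running `clearml-init`, a TLS 1.3-only ingress would reject the handshake with an alert like the one in the log.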
Thanks! I think .execute_remotely() is exactly what I need
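For reference, a minimal sketch of how that call is typically used. The wrapper function, the project/task names and the `default` queue are placeholders I made up. With `exit_process=True` the local process stops once the task is enqueued, and when an agent later runs the script the call is skipped and execution continues.

```python
def run_on_agent(project_name: str, task_name: str, queue_name: str = "default"):
    """Register the current script as a task and hand it over to an agent queue."""
    # Imported lazily so this file can be loaded on machines without clearml installed.
    from clearml import Task

    task = Task.init(project_name=project_name, task_name=task_name)
    # Locally: enqueue the task on `queue_name` and exit this process.
    # On the agent: this call does nothing and the script keeps running.
    task.execute_remotely(queue_name=queue_name, exit_process=True)
    return task


# Hypothetical usage at the top of a training script:
# run_on_agent("my_project", "my_experiment", queue_name="default")
```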
managed a workaround thanks to the API doc, if someone encounters the same bug:
from clearml import Task

# `project` and `archived` come from the calling code
tasks = []
page = 0
while True:
    page_tasks = Task._query_tasks(project_name=project, system_tags=[] if archived else ['-archived'], page=page, page_size=500)
    tasks += page_tasks
    page += 1
    # a short page means there is nothing left to fetch
    if len(page_tasks) < 500:
        break
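The same loop can be factored into a small generic helper. In this sketch `fetch_page` is a hypothetical callable standing in for the `Task._query_tasks(...)` call, and the fake backend exists only to show the stopping condition without needing a server.

```python
from typing import Callable, List, TypeVar

T = TypeVar("T")


def fetch_all_pages(fetch_page: Callable[[int, int], List[T]], page_size: int = 500) -> List[T]:
    """Collect results page by page until a page comes back shorter than page_size."""
    items: List[T] = []
    page = 0
    while True:
        batch = fetch_page(page, page_size)
        items += batch
        page += 1
        # A short (or empty) page means the backend has nothing more to return.
        if len(batch) < page_size:
            break
    return items


# Fake backend holding 1203 items, standing in for the real query.
data = list(range(1203))


def fake_fetch(page: int, page_size: int) -> List[int]:
    start = page * page_size
    return data[start:start + page_size]


all_items = fetch_all_pages(fake_fetch, page_size=500)
print(len(all_items))  # 1203
```

When the total is an exact multiple of the page size, the loop makes one extra request that returns an empty page, which is what ends it.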
Is there a way to store relative URLs in clearml? We can't connect to our server with a public address; it only works with the internal DNS from GCE
We tried with a docker-compose on a GCE VM + load balancers, and then in kube, and we get the same error: clearml-init returns Error: could not verify credentials: key=241... secret=NhC...
However I have another problem: my git repo is installed with pip install -e . and I import it in my script, but in a task executed by a clearml-agent the module appears not to be installed?
not exactly, I want to launch the script (create a new experiment, not clone an existing one in the UI). How can I do that?
And the comparison for the confusion matrices without the name of the experiments
Oh ok, I thought it would be relative to the server. How do I run this migration?
Also it would be awesome if the front-end integrated a small reverse proxy to have everything on one address. I don't know if this is somewhere on the roadmap? Or what are the advantages of having 3 separate addresses?
we managed to upgrade it, but the volume claim thing changed somehow and it created new disks. I will back up from the old disks and upload to the new ones to migrate, but the backup procedure is not detailed for Kubernetes. Do you have info on this?
should I only do MongoDB?
I think I found the problem: if the file is untracked by git, it is not saved by clearml
The task is registered and is started by the agent, and the env seems to install fine, but then it fails with /home/ubuntu/.clearml/venvs-builds/3.8/bin/python: can't open file 'fastai_classifier.py': [Errno 2] No such file or directory
Do you have an idea of what could be wrong? Does the agent launch the script in the wrong working dir? Is the repo not copied? (This script is inside a private git repo, which clearml detects correctly.)
I also tried launching the script from the root of th...
It works with post_packages
made a PR to help a bit with loading console logs
logs can be huge but are currently loaded 7kB at a time
100+ parameters is quite a lot indeed, but it's reached very quickly when using frameworks like detectron2, where you configure the model in the configuration (+ dataloader, datasets, evaluators, augmentation, optimizer, lr_scheduling). Anyway, the search is broken as soon as one line you search for is not currently visible, so already with 20+ ...