if I squash, this will rewrite the datasets, right? I want a new dataset, but keep the existing ones there
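For reference, a minimal sketch of how squashing looks with the SDK; the dataset name and ids are placeholders, and as far as I know Dataset.squash creates a new flattened dataset while leaving the source versions in place:

```python
from clearml import Dataset

# squash two existing versions into a brand new dataset;
# the source datasets are not deleted
merged = Dataset.squash(
    dataset_name="my_dataset_squashed",                    # placeholder name
    dataset_ids=["<parent_dataset_id>", "<child_dataset_id>"],  # placeholder ids
)
print(merged.id)
```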
so I can run the experiments and see them, but no plots are saved because there is an upload problem when uploading to localhost:8085
please remove rmdatasets == 0.0.1
mmm, can you try the following:
- create a new folder with no git repo, and copy those two notebooks
- launch the notebook with the base task and copy the task id
- launch the notebook with the hyperopt task, modifying the TEMPLATE_TASK_ID variable accordingly (see the sketch below)
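For reference, a minimal sketch of what the hyperopt notebook might look like; the project name, metric names, and parameter range are placeholders, only TEMPLATE_TASK_ID comes from the steps above:

```python
from clearml import Task
from clearml.automation import DiscreteParameterRange, HyperParameterOptimizer

TEMPLATE_TASK_ID = "<task id copied from the base-task notebook>"

task = Task.init(project_name="examples", task_name="hyperopt controller",
                 task_type=Task.TaskTypes.optimizer)

optimizer = HyperParameterOptimizer(
    base_task_id=TEMPLATE_TASK_ID,  # the base task cloned for each trial
    hyper_parameters=[
        DiscreteParameterRange("General/epochs", values=[10, 50, 100]),
    ],
    objective_metric_title="validation",  # placeholder metric
    objective_metric_series="loss",
    objective_metric_sign="min",
)
optimizer.start_locally()  # run trials on this machine instead of an agent queue
optimizer.wait()
optimizer.stop()
```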
oh ok, I was wondering if this could have been an issue:
agent.venvs_cache.free_space_threshold_gb = 2.0
Hi! What the error is saying is that it is looking for the ctbc/image_classification_CIFAR10.py file in your repo.
So when you created the task you were inside a git repo, and ClearML assumed that all the files in it were committed and pushed. However, your repo https://github.com/gradient-ai/PyTorch.git doesn’t contain these files
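If you don’t want to push those files, one hedged workaround (assuming a recent SDK version that has Task.force_store_standalone_script) is to store the script body itself instead of a repo reference; the project and task names below are placeholders:

```python
from clearml import Task

# must be called before Task.init() so the script itself is uploaded,
# rather than a pointer to a file inside the git repo
Task.force_store_standalone_script()
task = Task.init(project_name="ctbc", task_name="image_classification_CIFAR10")
```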
great! And I saw that there were some system packages needed for opencv that were installed automatically, which could be turned off. Now I’m just wondering if I could remove the pip install at the very beginning, so it starts straight away
I also changed the permissions of /usr/share/elasticsearch according to this post: https://techoverflow.net/2020/04/18/how-to-fix-elasticsearch-docker-accessdeniedexception-usr-share-elasticsearch-data-nodes/ , but I’m getting the same error
I mean in the clearml-server docker
would the same experiment be called in either clearml server?
It is not directly cached in the ~/.clearml folder. There are some directories inside (one for storage, one for pip, another for venvs, etc.)
So in your case it would be stored in ~/.clearml/cache/storage_manager/datasets/ds_{ds_id}/my_file.json
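For example, a minimal sketch of fetching that cached copy through the SDK (the dataset id is a placeholder):

```python
from clearml import Dataset

ds = Dataset.get(dataset_id="<ds_id>")
# resolves to something like ~/.clearml/cache/storage_manager/datasets/ds_<ds_id>
local_dir = ds.get_local_copy()
print(local_dir)
```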
but would installing git+ <user>/rmdatasets install rmdatasets == 0.0.1?
Aren’t they redundant?
Hey! When you say it wasn’t enough, what do you mean? Can you launch the web UI?
Thanks for the answer. You’re right. I forgot to add that this task runs inside a docker container and I’m currently only mapping the $PWD (ml folder) into the /app folder in the container.
there is no /usr/share/elasticsearch/logs/clearml.log file (neither inside the container nor on my server)
another thing: I had to change 8081 to 8085 since it was already in use
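If the fileserver ended up on 8085, note that the SDK also has to be pointed at it, otherwise uploads still target the default 8081. A minimal sketch for artifacts/models, assuming the remapped port; the project and task names are placeholders (api.files_server in clearml.conf plays the equivalent role for debug samples):

```python
from clearml import Task

task = Task.init(
    project_name="examples",
    task_name="remapped fileserver",
    output_uri="http://localhost:8085",  # the remapped fileserver port
)
```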
the problem was the docker image, which had as its entrypoint a bash script with python train.py --epochs=300 hardcoded, so I guess it was never actually running the task setup from clearml.
Right, you don’t want ClearML to track that package, but there isn’t much you can do there, I believe. I was trying to tackle how to run your code with an agent given those dependencies.
I think that if you clone the experiment and remove that line in the dependencies section in the UI, you should be able to launch it correctly (as long as your clearml-agent has the correct permissions)
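The same clone-and-edit can also be scripted instead of done in the UI; a hedged sketch, assuming your SDK version has Task.set_packages (the task id, package list, and queue name are placeholders):

```python
from clearml import Task

# clone the original experiment
cloned = Task.clone(source_task="<original_task_id>", name="clone without rmdatasets")
# overwrite the recorded pip requirements, leaving out the local-only package
cloned.set_packages(["torch", "torchvision"])  # illustrative list, minus rmdatasets
# send the edited clone to an agent queue
Task.enqueue(cloned, queue_name="default")
```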
and btw:
if " @" in line: line = line.replace(" @", "https://")
should be the same as
...
no, because every user that is trying to write to the bucket has the same credentials
not that much, I was just wondering if it was possible :-)
before, the repo was already inside the docker image, but now the agent is running inside the docker (so it sets up a virtualenv, clones the repo, and installs the packages)
can you share your clearml.conf file (remove the critical information first)?
Currently I’m changing /opt/ to my home folder
oh, but docker ps shows me 8081 ports for the webserver, apiserver and fileserver containers
```
CONTAINER ID   IMAGE                      COMMAND                  CREATED         STATUS         PORTS                                                            NAMES
0b3f563d04af   allegroai/clearml:latest   "/opt/clearml/wrappe…"   7 minutes ago   Up 7 minutes   8008/tcp, 8080-8081/tcp, 0.0.0.0:8080->80/tcp, :::8080->80/tcp   clear...
```
That’s why I’m suggesting he do that 🙂