can you share your clearml.conf file? it should do that automatically if you set the development.default_output_uri key to “s3://{your_bucket}”
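For a single task, the same destination can also be passed in code rather than through clearml.conf. A minimal sketch, assuming the clearml package is installed and S3 credentials are already configured; the helper names s3_output_uri and init_task_with_s3 are made up for illustration, and the bucket name is a placeholder:

```python
def s3_output_uri(bucket: str, prefix: str = "") -> str:
    """Build an s3:// URI from a bucket name and an optional key prefix."""
    bucket = bucket.strip("/")
    prefix = prefix.strip("/")
    return f"s3://{bucket}/{prefix}" if prefix else f"s3://{bucket}"


def init_task_with_s3(bucket: str, project_name: str, task_name: str):
    """Start a ClearML task whose outputs are uploaded to the given bucket.

    clearml is imported lazily so the URI helper above works without it.
    """
    from clearml import Task

    return Task.init(
        project_name=project_name,
        task_name=task_name,
        output_uri=s3_output_uri(bucket),
    )
```

Setting development.default_output_uri in clearml.conf remains the simpler option when every task should upload to the same bucket.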
would it be possible to change the dataset.add_files call to some function that moves your files to a common folder (local or cloud), and then use the last step in the DAG to create the dataset from that folder?
Yes, I’m suggesting that MagnificentWorm7 do that, instead of adding the files to a ClearML dataset in each step
That’s why I’m suggesting he do that 🙂
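Roughly what I have in mind, as a sketch rather than a drop-in solution: each step copies its outputs into a shared staging folder, and only the last step builds the dataset. The function names stage_files and create_dataset_from_staging are hypothetical, the staging folder is assumed to be a local path, and the last function assumes the clearml package is installed:

```python
import shutil
from pathlib import Path
from typing import Iterable, List


def stage_files(files: Iterable[str], staging_dir: str) -> List[Path]:
    """Copy the given files into a common staging folder (created if missing)."""
    staging = Path(staging_dir)
    staging.mkdir(parents=True, exist_ok=True)
    copied = []
    for src in files:
        dest = staging / Path(src).name
        shutil.copy2(src, dest)  # copy2 also preserves timestamps
        copied.append(dest)
    return copied


def create_dataset_from_staging(staging_dir: str, dataset_name: str,
                                dataset_project: str) -> str:
    """Final DAG step: create one ClearML dataset from the staged folder."""
    from clearml import Dataset  # lazy import, only needed in the last step

    dataset = Dataset.create(dataset_name=dataset_name,
                             dataset_project=dataset_project)
    dataset.add_files(path=staging_dir)
    dataset.upload()
    dataset.finalize()
    return dataset.id
```

Files with the same name from different steps would overwrite each other in this sketch, so a per-step subfolder may be needed.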
I also changed the permissions of /usr/share/elasticsearch according to this post: https://techoverflow.net/2020/04/18/how-to-fix-elasticsearch-docker-accessdeniedexception-usr-share-elasticsearch-data-nodes/ but I’m getting the same error
so I can run the experiments, I can see them, but no plots are saved because there is an upload problem when uploading to localhost:8085
another thing: I had to change 8081 to 8085 since it was already used
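A quick way to check whether a port such as 8081 is already taken before remapping it. This is a minimal sketch using only the Python standard library; the helper names port_is_free and first_free_port are made up for illustration, and the result only reflects the state of the port at the moment of the call:

```python
import socket
from typing import Iterable, Optional


def port_is_free(port: int, host: str = "127.0.0.1") -> bool:
    """Return True if a TCP socket can currently be bound to host:port."""
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as sock:
        try:
            sock.bind((host, port))
        except OSError:
            # Typically "address already in use" (or a permission error).
            return False
    return True


def first_free_port(candidates: Iterable[int],
                    host: str = "127.0.0.1") -> Optional[int]:
    """Return the first candidate port that can be bound, or None."""
    for port in candidates:
        if port_is_free(port, host):
            return port
    return None


if __name__ == "__main__":
    # 8081 and 8085 are the two ports mentioned in this thread.
    print(first_free_port([8081, 8085]))
```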
oh, but docker ps shows me port 8081 for the webserver, apiserver and fileserver containers
` CONTAINER ID IMAGE COMMAND CREATED STATUS PORTS NAMES
0b3f563d04af allegroai/clearml:latest "/opt/clearml/wrappe…" 7 minutes ago Up 7 minutes 8008/tcp, 8080-8081/tcp, 0.0.0.0:8080->80/tcp, :::8080->80/tcp clear...
Hi! The error is saying that it is looking for the ctbc/image_classification_CIFAR10.py file in your repo.
So when you created the task you were inside a git repo, and ClearML assumed that all the files in it were committed and pushed. However, your repo https://github.com/gradient-ai/PyTorch.git doesn’t contain these files
great! and I saw that there were some system packages needed for opencv that were installed automatically that could be turned off. Now I’m just wondering if I could remove the PIP install at the very beginning, so it starts straightaway
how do I mount my local ssh folder into /root/.ssh/ in the docker container when running clearml-agent?
also, is there a way for it to not install the requirements, and simply run the task?
so I removed the entrypoint, and now I can see that it tries to install the packages, but it fails because it can’t download the repo
before the repo was already in the docker, but now it is running the agent inside the docker (so setting a virtualenv, and cloning the repo, and installing the packages)
Thanks for the answer. You’re right. I forgot to add that this task runs inside a docker container and I’m currently only mapping the $PWD ( ml folder) into the /app folder in the container.
so when inside the docker, I don’t see the git repo and that’s why ClearML doesn’t see it
mmm, can you try the following:
1. create a new folder with no git repo, and copy those two notebooks
2. launch the notebook with the base task and copy the task id
3. launch the notebook with the hyperopt task, modifying the TEMPLATE_TASK_ID variable accordingly
right, I’m saying I had to do that on my Mac. In your case you would have to point it somewhere else. Please check where openblas is installed on your Ubuntu machine
if I were to run an agent that would need to install pandas at some point, I’d run it as: OPENBLAS="$(brew --prefix openblas)" clearml-agent daemon --queue default
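The same launch can be scripted so the openblas location is passed in explicitly instead of coming from brew. A small sketch, assuming clearml-agent is on the PATH; agent_env and launch_agent are hypothetical helper names, and the prefix passed in is whatever path openblas is installed under on your machine:

```python
import os
import subprocess
from typing import Dict, Optional

# Same command as in the message above.
AGENT_CMD = ["clearml-agent", "daemon", "--queue", "default"]


def agent_env(openblas_prefix: str,
              base: Optional[Dict[str, str]] = None) -> Dict[str, str]:
    """Return a copy of the environment with OPENBLAS set to the given prefix."""
    env = dict(os.environ if base is None else base)
    env["OPENBLAS"] = openblas_prefix
    return env


def launch_agent(openblas_prefix: str) -> subprocess.Popen:
    """Start the agent daemon with OPENBLAS exported to the child process only."""
    return subprocess.Popen(AGENT_CMD, env=agent_env(openblas_prefix))
```

The variable is set only for the agent process, so nothing changes in the rest of the shell session.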
not that much, I was just wondering if it was possible :-)
Hi AgitatedDove14 , I’m talking about the following pip install.
After that pip install, it displays agent’s conf, shows installed packages, and launches the task (no installation)
` Running in Docker mode (v19.03 and above) - using default docker image: spoter ['-e CLEARML_AGENT_SKIP_PYTHON_ENV_INSTALL=1', '-e CLEARML_AGENT_SKIP_PIP_VENV_INSTALL=1']
Running task '3ebb680b17874cda8dc7878ddf6fa735'
Storing stdout and stderr log to '/tmp/.clearml_agent_out.tsu2tddl.txt', '/tmp/.clearml_agent_o...
the problem was the docker image, which had as its entrypoint a bash script with python train.py --epochs=300 hardcoded, so I guess it was never actually running the task setup from clearml.
I could map the root folder of the repo into the container, but that would mean everything ends up in there
please remove rmdatasets == 0.0.1