Thanks for the answer. You’re right. I forgot to add that this task runs inside a docker container, and I’m currently only mapping the $PWD (the ml folder) into the /app folder in the container.
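To illustrate, a minimal sketch of the mapping I mean (the image name here is just a placeholder, not my actual setup):
` docker run --rm -it -v "$PWD":/app -w /app some-image:latest `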
if I were to run an agent that needed to install pandas at some point, I’d run it like this:
` OPENBLAS="$(brew --prefix openblas)" clearml-agent daemon --queue default `
so I can run the experiments and I can see them, but no plots are saved because there is an upload problem when uploading to localhost:8085
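In case it matters, I’d expect the SDK to need the remapped fileserver port in clearml.conf, something like this (my assumption, based on the standard api section):
` api { files_server: "http://localhost:8085" } `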
I could map the root folder of the repo into the container, but that would mean everything ends up in there
oh, but docker ps shows me 8081 ports for the webserver, apiserver and fileserver containers
` CONTAINER ID   IMAGE                      COMMAND                   CREATED         STATUS         PORTS                                                             NAMES
0b3f563d04af   allegroai/clearml:latest   "/opt/clearml/wrappe…"    7 minutes ago   Up 7 minutes   8008/tcp, 8080-8081/tcp, 0.0.0.0:8080->80/tcp, :::8080->80/tcp    clear... `
there is no /usr/share/elasticsearch/logs/clearml.log file (neither inside the container nor on my server)
Yes, I’m suggesting MagnificentWorm7 do that instead of adding the files to a ClearML dataset at each step.
That’s why I’m suggesting it to him 🙂
but would installing git+ <user>/rmdatasets also install rmdatasets == 0.0.1 ?
Aren’t they redundant?
would the same experiment be called in either clearml server?
I also changed the permissions of /usr/share/elasticsearch according to this post: https://techoverflow.net/2020/04/18/how-to-fix-elasticsearch-docker-accessdeniedexception-usr-share-elasticsearch-data-nodes/ , but I’m getting the same error
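For reference, the fix from that post boils down to chowning the Elasticsearch data dir on the host to UID 1000, e.g. (the path assumes the default /opt/clearml layout; mine is under my home folder instead):
` sudo chown -R 1000:1000 /opt/clearml/data/elastic_7 `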
To give more context: he is running a hyperparameter optimization script that internally clones a base task, runs the clone with certain params, and checks whether a metric increases or decreases. It is when the agent tries to run this cloned task that the error is raised.
` ERROR: Could not install packages due to an EnvironmentError: [Errno 28] No space left on device
clearml_agent: ERROR: Could not install task requirements!
Command '['~/.clearml/venvs-builds/3.8/bin/python', '-m', 'pip', '--disable-pip-v... `
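For reference, a rough sketch of the kind of optimization script he is running (task names, the parameter range and the metric are made up; the Optuna optimizer is just one possible choice):
` from clearml import Task
from clearml.automation import HyperParameterOptimizer, UniformParameterRange
from clearml.automation.optuna import OptimizerOptuna

# controller task that drives the optimization
task = Task.init(project_name="examples", task_name="HPO controller",
                 task_type=Task.TaskTypes.optimizer)

optimizer = HyperParameterOptimizer(
    base_task_id="<base-task-id>",  # the base task cloned for every trial
    hyper_parameters=[UniformParameterRange("General/lr", 1e-4, 1e-1)],
    objective_metric_title="validation",  # metric checked after each trial
    objective_metric_series="loss",
    objective_metric_sign="min",
    optimizer_class=OptimizerOptuna,
    execution_queue="default",  # the queue the agent pulls the clones from
)
optimizer.start()
optimizer.wait()
optimizer.stop() `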
Right, you don’t want ClearML to track that package, but there isn’t much you can do about it, I believe. What I was trying to tackle was how to run your code with an agent given those dependencies.
I think that if you clone the experiment and remove that line in the dependencies section in the UI, you should be able to launch it correctly (as long as your clearml-agent has the correct permissions).
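If the UI route is awkward, the same flow can be scripted (a sketch; the task ID and queue name are placeholders, and you’d still clean up the clone’s installed packages before enqueuing it):
` from clearml import Task

# clone the experiment, then edit its installed packages before running
cloned = Task.clone(source_task="<original-task-id>", name="clone without local dep")
Task.enqueue(cloned, queue_name="default") `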
and btw,
` if "@" in line: line = line.replace("@", "https://") `
should be the same as
...
Before, the repo was already inside the Docker image; now the agent itself runs inside the Docker container (so it sets up a virtualenv, clones the repo, and installs the packages).
exactly, somewhere inside the running Docker container
another thing: I had to change 8081 to 8085, since 8081 was already in use
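For anyone else hitting this, the change is just the host side of the fileserver port mapping in docker-compose.yml (relevant lines only, assuming the default compose file):
` fileserver:
    ports:
    - "8085:8081"  # host 8085 -> container 8081 `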
Currently I’m replacing /opt/ with my home folder
I don’t see an agent section there 😕
Can you move your current clearml.conf file to another location and run clearml-agent init ?
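Something like this (assuming the default config location):
` mv ~/clearml.conf ~/clearml.conf.bak && clearml-agent init `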
Hi ExasperatedCrocodile76 , I guess you were able to install scikit-learn and run it locally, and now you want to try it with an agent on the same machine.
The error is that it can’t find OpenBLAS:
` Run-time dependency openblas found: NO (tried pkgconfig and cmake)
Run-time dependency openblas found: NO (tried pkgconfig)
../../scipy/meson.build:130:0: ERROR: Dependency "OpenBLAS" not found, tried pkgconfig `
My question is: did you export some env variabl...
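In case it helps, on macOS with Homebrew the usual way to let pkg-config find OpenBLAS is along these lines (an assumption for this setup, same spirit as the OPENBLAS variable above):
` export PKG_CONFIG_PATH="$(brew --prefix openblas)/lib/pkgconfig" `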
Hi! What the error is saying is that it is looking for the ctbc/image_classification_CIFAR10.py file in your repo.
So when you created the task you were inside a git repo, and ClearML assumed that all your files in it were committed and pushed. However, your repo https://github.com/gradient-ai/PyTorch.git doesn’t contain these files.
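So the fix should just be committing and pushing the missing file before creating the task, something like:
` git add ctbc/image_classification_CIFAR10.py
git commit -m "add training script"
git push `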
I agree, but setting the agent’s env variable TMPDIR didn’t seem to have any effect (check the log above; it is still using /tmp)
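For the record, this is roughly how I tried to set it when launching the daemon (the path is a placeholder):
` TMPDIR=/path/with/space clearml-agent daemon --queue default `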
it would be easier for a sysadmin to centralize the bucket credentials in the clearml-server, without the need to distribute them: every user on the server would use the same credentials, and they wouldn’t need to know them. Makes sense?
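Right now every user has to carry them in their own clearml.conf, along these lines (values are placeholders):
` sdk {
    aws {
        s3 {
            key: "<access-key>"
            secret: "<secret-key>"
            region: "<region>"
        }
    }
} `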
any idea what could be the issue, @SuccessfulKoala55 ?
no, because every user that is trying to write to the bucket has the same credentials