This last solution @<1523701205467926528:profile|AgitatedDove14> proposed did the trick!
To give more context, he is running a hyperparameter optimization script that internally clones a base task, runs it with certain params, and checks whether a metric increases or decreases. The error is raised when the agent tries to run this task.
ERROR: Could not install packages due to an EnvironmentError: [Errno 28] No space left on device
clearml_agent: ERROR: Could not install task requirements!
Command '['~/.clearml/venvs-builds/3.8/bin/python', '-m', 'pip', '--disable-pip-version-check', 'install', '-r', '/tmp/cached-reqsot4de9w6.txt']' returned non-zero exit status 1.
Is it possible that the agent is somehow limiting the space for the environment creation @<1523701087100473344:profile|SuccessfulKoala55> ?
Because if he runs the same command in a console the install works
@<1523703080200179712:profile|NastySeahorse61> so glad you managed to solve it 🎊 🚀
I don’t see an agent section there 😕
Can you move your current `clearml.conf` file to another location and run `clearml-agent init`?
Sure @<1523701087100473344:profile|SuccessfulKoala55> ! Here it is!
I agree, but setting the agent’s env variable TMPDIR didn’t seem to have any effect (check the log above, it is still using `/tmp`)
Then the only other option is that `/tmp` is out of space (pip uses it to uncompress the .whl files, then deletes them)
wdyt?
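A quick way to check this is to compare the free space on the temp folder and on the home folder. This is only a sketch: `free_kb` is a made-up helper name, and the two paths are the defaults mentioned in this thread (`/tmp` unless TMPDIR is set, and the home folder that holds `~/.clearml`).

```shell
#!/bin/sh
# Print the free space (in KB) on the filesystem that holds the given path.
# -P forces single-line POSIX output, -k reports 1024-byte blocks.
free_kb() {
    df -Pk "$1" | awk 'NR==2 {print $4}'
}

# pip unpacks wheels under the temp folder; the agent builds venvs under ~/.clearml
for dir in "${TMPDIR:-/tmp}" "${HOME:-/}"; do
    echo "$dir: $(free_kb "$dir") KB free"
done
```

If the temp folder shows very little free space while the home folder has plenty, that would point at the temp folder rather than the venv location.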
I had tried adding those environment variables, but not in the agents 🙈
agree, but setting the agent’s env variable TMPDIR
I think this needs to be passed to the docker with `-e TMPDIR=/new/tmp` as additional container args:
see example
wdyt?
any idea what could be the issue @<1523701087100473344:profile|SuccessfulKoala55> ?
and then run my script from terminal normally... (in the case of the environment variable I passed it before the python command)
oh sorry my bad, then you probably need to define all the OS environment variables for the Python temp folder for the agent (the Task process itself is a child process, so it will inherit them)
TMPDIR=/new/tmp TMP=/new/tmp TEMP=/new/tmp clearml-agent daemon ...
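To see why prefixing the variables on the command line is enough, here is a small sketch of the inheritance. Nothing in it is ClearML-specific: the outer `sh` stands in for the agent, the inner `sh` for the Task process, and `show_child_tmpdir` is a hypothetical helper name.

```shell
#!/bin/sh
# Variables prefixed to a command are exported to that process and to
# every child it spawns.
show_child_tmpdir() {
    # outer sh = stand-in for the agent, inner sh = stand-in for the Task process
    TMPDIR="$1" TMP="$1" TEMP="$1" sh -c 'sh -c "echo \$TMPDIR"'
}

# Throwaway directory playing the role of /new/tmp from the message above
NEW_TMP="$(mktemp -d)"
show_child_tmpdir "$NEW_TMP"
rmdir "$NEW_TMP"
```

The grandchild prints the directory that was set on the outermost command, which is the same mechanism the Task process relies on when the agent is started with these variables.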
Hi Lema Gabriel, thank you very much for your answer. I'm just using the defaults... Should I change something in the configuration?
I'll attach my config just in case
Thanks so much for all your help @<1523701205467926528:profile|AgitatedDove14> @<1523702868694011904:profile|AbruptCow41> @<1523701087100473344:profile|SuccessfulKoala55>
Then by default it is the home folder (`~/.clearml`) that is missing free space
Hi @<1523701087100473344:profile|SuccessfulKoala55> . I'm trying to run an optimization task, based on a previous experiment. I ran the agent like this:
clearml-agent daemon --queue my_queue -d
Also, I suggested changing the TMPDIR env variable, since /tmp/ didn’t have a lot of space.
agent.environment.TMPDIR = ****
is it ok to see **** instead of the actual path?
ERROR: Could not install packages due to an EnvironmentError:
[Errno 28] No space left on device
BTW: @<1523703080200179712:profile|NastySeahorse61> this sounds like docker running out of space on the main disk (`/var/`), where it stores all the images and temp file systems
This will cause your code to fail, as any runtime change to the container file system will raise this out-of-disk-space error
oh ok, I was wondering if this could have been an issue: agent.venvs_cache.free_space_threshold_gb = 2.0
Well, the agent actually can't limit this space even if we wanted to 🙂
Right, but there is a lot of free space (257 GB) in the home folder
@<1523703080200179712:profile|NastySeahorse61> / @<1523702868694011904:profile|AbruptCow41>
Is there a way to avoid having each task create a new environment?
You can just define CLEARML_AGENT_SKIP_PYTHON_ENV_INSTALL=1
it will just use whatever you have there (notice it will totally ignore requirements.txt and "installed packages" on the Task)
BTW I would recommend turning on the venv caching. This is per docker/python/packages caching, so the next time you use the exact same requirements it just pulls the venv from the cache and attaches it to the container.
Un-comment this line
@<1523703080200179712:profile|NastySeahorse61> how are you running the agent? What is the command line? And how are you passing the environment variable you mentioned?
line 120 says unmark to enable venv caching (it comes commented by default, but since I’m copying my conf it isn’t commented there)
can you share your `clearml.conf` file (remove the critical information first)?
Can you share the agent's/task's full log when running this task?