AgitatedDove14
hmmm... they are important, but only when starting the process. any specific suggestion ?
(and they are deleted after the Task is done, so they are temp)
Ah, then no, sounds temporary. If they're only relevant when starting the process though, I would suggest deleting them immediately when they're no longer needed, and not wait for the end of the task (if possible, of course)
my question is how to recover, must i recreate the agents or there is another way?
Yes you have to recreate the Task (I assume they failed, no?!)
So how do I solve the problem? Should I just relaunch the agents? Because they can't execute jobs now
Are you running in docker mode ?
If so you can actually delete mapped files (they will still be available inside the docker), just make sure you delete them X hours after they were created, and you should be fine.
wdyt?
I would suggest deleting them immediately when they're no longer needed,
This is the idea for the next RC, it will delete them after it is done using 🙂
I have a process that cleans the
/tmp
each day,
WackyRabbit7 the files (configuration etc.) that are mapped into the containers are stored there.
They should clean themselves, that said, we have noticed that the services-mode skips this cleanup, and it will be solved on the next RC of clearml-agent.
Make sense ?
Ohh... I would not delete them then ... 😞
Maybe kind of heuristics (files created a week ago can be deleted?!)
Maybe they shouldn't be placed under /tmp
if they're mission critical, but rather the clearml cache folder? 🤔
if they're mission critical, but rather the clearml cache folder?
hmmm... they are important, but only when starting the process. any specific suggestion ?
(and they are deleted after the Task is done, so they are temp)
I had to restart the agent and now everything is fine
I'll just exclude .cfg files from the deletion, my question is how to recover, must i recreate the agents or there is another way?
That's awesome, but my problem right now is that I have my own cronjob deleting the contents of /tmp
each interval, and it deletes the cfg files... So I understand I must skip deleting them from now on
So how do I solve the problem? Should I just relaunch the agents? Because they can't execute jobs now