Disabling the VCS cache means the cloned git folder will no longer be cached. You can filter by 'Running' experiments in ClearML, search for ones that haven't reported for a while, and start investigating those.
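To make the "hasn't reported for a while" check concrete, here is a minimal sketch in plain Python. It assumes you have already collected the running experiments into a list of records; the field names (`id`, `status`, `last_update`), the `in_progress` status string, and the sample data are illustrative assumptions, not output from a real server.

```python
from datetime import datetime, timedelta, timezone


def find_stale(tasks, max_silence, now=None):
    """Return ids of running tasks whose last report is older than max_silence.

    `tasks` is a list of dicts with hypothetical keys:
    "id" (str), "status" (str), "last_update" (timezone-aware datetime).
    """
    now = now or datetime.now(timezone.utc)
    stale = []
    for task in tasks:
        # Only experiments that still claim to be running are interesting.
        if task["status"] != "in_progress":
            continue
        if now - task["last_update"] > max_silence:
            stale.append(task["id"])
    return stale


if __name__ == "__main__":
    # Made-up records, purely for illustration.
    now = datetime(2022, 5, 1, 12, 0, tzinfo=timezone.utc)
    sample = [
        {"id": "a", "status": "in_progress", "last_update": now - timedelta(days=7)},
        {"id": "b", "status": "in_progress", "last_update": now - timedelta(minutes=5)},
        {"id": "c", "status": "completed", "last_update": now - timedelta(days=30)},
    ]
    print(find_stale(sample, timedelta(hours=6), now))
```

An experiment that shows up here is only a candidate for being hung; its console log is still the place to confirm it.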
AbruptWorm50, that's strange. I'll take a look as well. What version of clearml are you using?
Regarding your questions:
To disable the VCS cache, set agent.vcs_cache.enabled to false in clearml.conf - https://github.com/allegroai/clearml-agent/blob/master/docs/clearml.conf#L120
I think the lock is created when an experiment runs; maybe that experiment hung, so the lock was never released.
wdyt?
EDIT
I have disabled the VCS cache, and it seems that multiple cache files are still created when running a new task. The lock is also still created once a new experiment is run: first image - cache after removing the lock, second image - a few seconds later, after running a new task. I also attached the log output of the task (with ### replacing non-relevant details).
Yeah, this is a lock which is always in our cache. I can't figure out why it's there, but when I delete the lock and the other files, they always reappear when I run a new clearml task.
Another thing I should note: I recently had an error whose fix was to run git config --global --add safe.directory /root/.clearml/vcs-cache/r__ (git repo name).d7f
Ever since, whenever I run a new task, a new file appears in the cache with a name of the form <git repo name.lock file name_a bunch of numbers>
My questions are:
- how can I avoid creating tens of new cache files?
- do you happen to know why this lock is created and how it is connected to the above error (in the link - regarding "failing to clone.. ")?
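To see how many lock-like files accumulate and how old each one is, a small read-only sketch like the following may help. It assumes the cache lives under ~/.clearml/vcs-cache (the folder mentioned above, relative to the agent user's home) and treats any file whose name contains ".lock" as a lock file; both are assumptions, and the function name is made up for illustration. It does not delete anything.

```python
import time
from pathlib import Path


def list_lock_files(cache_dir, min_age_seconds=0.0, now=None):
    """Return (path, age_in_seconds) pairs for lock-like files, oldest first.

    A file counts as lock-like if its name contains ".lock" (an assumption
    based on the file names described in this thread).
    """
    cache_dir = Path(cache_dir).expanduser()
    now = time.time() if now is None else now
    found = []
    if not cache_dir.is_dir():
        return found
    for entry in cache_dir.iterdir():
        if entry.is_file() and ".lock" in entry.name:
            # Age is measured from the file's last modification time.
            age = now - entry.stat().st_mtime
            if age >= min_age_seconds:
                found.append((entry, age))
    # Largest age first, so long-lived locks are listed at the top.
    found.sort(key=lambda item: item[1], reverse=True)
    return found


if __name__ == "__main__":
    for path, age in list_lock_files("~/.clearml/vcs-cache"):
        print(f"{path.name}\t{age / 3600:.1f} h old")
```

Running it before and after starting a task shows whether a new file is created per task, and a lock that is much older than any running experiment points at a process that hung or exited without releasing it.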
If I understand your question: this is not something that we defined or created. It is created once a ClearML task is run, and it stays there until the lock is deleted (which is something we do to handle another error I posted about here).
Hi AbruptWorm50,
The cached files are used by ClearML - Here is an example:
https://github.com/allegroai/clearml-agent/blob/master/docs/clearml.conf#L120
Regarding the first question - what is the lock from the 24th of April? It seems that this process is what is blocking cache usage.
What will happen if I disable the cache? Is there a way to find out which experiment is hung and why, in order to avoid this?
Is the lock something that occurs on your machine regardless of ClearML?
EDIT CostlyOstrich36
third image - cache after running another task with new cache file created even though cache is disabled