I just wanna avoid ClearML leaving files lingering around. Btw: a better default behavior in my opinion would be to delete tasks only after their files have been deleted, and to delete the task anyway only with the force option!
Yea, when the server handles the deletes everything's fine, and imo that is how it should always have been.
I don't think it is a viable option. You are looking at the best case, but I think you should expect the worst from the users 🙂 Also, I would rather know there is a problem and have some clutter than hide it and never be able to fix it, because I cannot identify which artifacts are still in use without spending a lot of time comparing artifact IDs.
I guess this is the current way to do it: https://github.com/tensorflow/tensorboard/issues/39#issuecomment-568917607 so I would say: Yes, it supports gif.
@<1576381444509405184:profile|ManiacalLizard2> Yes, exactly. I just didn't know how, but now it is all working 🙂
And yes, I have multiple credentials in the clearml.conf of the agents. It's not a good solution, but since I am currently limited to the free version of ClearML, it is the best I could do.
Ah, perfect. Did not know this. Will try! Thanks again! 🙂
Maybe this is something that is only possible with the vault of the enterprise version?
MortifiedDove27 Sure did, but I do not understand it very well. Otherwise I would not be asking here for an intuitive explanation 🙂 Maybe you can explain it to me?
I see. Thank you very much. For my current problem, giving priority according to queue priority would kinda solve it. For experimentation I will sometimes enqueue a task and then later enqueue another one of a different kind, but what happens is that, even though this could be trivially solved, I have to wait for the first one to finish. I guess this is only a problem for people with small "clusters" where SLURM does not make sense, but no scheduling at all is also suboptimal.
However, I...
==> 2021-03-11 13:54:59 <==
# cmd: /home/tim/miniconda3/condabin/conda create --yes --mkdir --prefix /home/tim/.clearml/venvs-builds/3.8 python=3.8
# conda version: 4.9.2
+defaults/linux-64::_libgcc_mutex-0.1-main
+defaults/linux-64::ca-certificates-2021.1.19-h06a4308_1
+defaults/linux-64::certifi-2020.12.5-py38h06a4308_0
+defaults/linux-64::ld_impl_linux-64-2.33.1-h53a641e_7
+defaults/linux-64::libedit-3.1.20191231-h14c3975_1
+defaults/linux-64::libffi-3.3-he6710b0_2
+defaults/linux-64...
drwxr-xr-x 10 root root 4096 Jul 31 2020 .
drwxr-xr-x 14 root root 4096 Jul 31 2020 ..
drwxr-xr-x 2 root root 4096 Feb 4 13:52 bin
drwxr-xr-x 2 root root 4096 Jul 31 2020 etc
drwxr-xr-x 2 root root 4096 Jul 31 2020 games
drwxr-xr-x 2 root root 4096 Jul 31 2020 include
drwxr-xr-x 4 root root 4096 Feb 3 13:40 lib
lrwxrwxrwx 1 root root 9 Dez 10 14:29 man -> share/man
drwxr-xr-x 2 root root 4096 Jul 31 2020 sbin
drwxr-xr-x 7 root root 4096 Jul 31 2020 share
drwxr-xr-x ...
But this seems like something that is not related to clearml 🙂 Anyways, thanks again for the explanations!
Ah, now I see. This sounds like a good solution.
Okay, no worries. I will check first. Thanks for helping!
Quick question: Where again does clearml place the venv? I wanna take a look into it after the task has failed
Sure, no problem!
Currently, my solution is to create an "agent-git" account; users can give read access to this account, which the clearml-agent then uses to clone. However, I find access tokens to be a better solution. Unfortunately, clearml-agent removes the token from the git URL.
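For reference, this is roughly what I set in the agent-side clearml.conf for the "agent-git" account; just a minimal sketch with placeholder values, the token variant is what I would prefer to use:
```
# agent-side clearml.conf (sketch, placeholder values)
agent {
    # credentials clearml-agent uses to clone private repositories
    git_user: "agent-git"
    git_pass: "my-access-token-or-password"
}
```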
Local execution output:
```
ClearML Task: created new task id=855948f5d73c47e2ae37bb821385e15b
======> WARNING! Git diff to large to store (2190kb), skipping uncommitted changes <======
ClearML results page:
uploading artifact
done uploading artifact
2021-02-05 16:24:56,112 - clearml.Task - INFO - Waiting to finish uploads
2021-02-05 16:24:58,499 - clearml.Task - INFO - Finished uploading
```
btw: With the ssh agent forwarding I do not have any issues ( https://github.com/allegroai/clearml-agent/issues/45 )
Oh, interesting!
So a pip version on a per-task basis makes sense ;D?
@<1576381444509405184:profile|ManiacalLizard2> Just so I understand correctly:
You are saying that in your local, user-specific clearml.conf you set api.files_server, but in your remote clearml-agent clearml.conf you left it empty?
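I.e. something like this in the local clearml.conf, while the agent's clearml.conf keeps its default files_server? (Just a sketch of what I mean; the bucket and URLs are placeholders.)
```
# local, user-specific ~/clearml.conf (sketch, placeholder values)
api {
    api_server: https://api.clear.ml
    web_server: https://app.clear.ml
    # point file/artifact uploads at my own bucket
    files_server: "s3://my-bucket/clearml"
}
```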
Maybe this opens up another question, which is more about how clearml-agent is supposed to be used. The "pure" way would be to make the docker image provide everything, and clearml-agent should do no setup at all.
What I currently do instead is let the docker image provide all system dependencies and let clearml-agent set up all the python dependencies. This allows me to reuse a docker image for many different experiments. However, then it would make sense to have as many configs as possib...
So clearml 1.0.1, clearml-agent 1.0.0, and clearml-server from master.
I am currently on the Open Source version, so no Vault. The environment variables are not meant to be used on a per-task basis, right?
I am referring to the UI. The default cleanup service should work with S3 with a correctly configured clearml service agent if I understand the workings correctly.
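In my understanding, "correctly configured" mainly means giving that agent S3 credentials in its clearml.conf; a rough sketch with placeholder values, in case I have that right:
```
# clearml.conf of the agent running the cleanup service (sketch, placeholder values)
sdk {
    aws {
        s3 {
            key: "AKIAXXXXXXXXXXXXXXXX"
            secret: "xxxxxxxxxxxxxxxxxxxxxxxx"
            region: "eu-central-1"
        }
    }
}
```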
I'll add creating an issue to my todo list.
Thanks for answering. I don't quite get your explanation. You mean if I have 100 experiments and I start up another one (experiment "101"), then the logs of experiment "0" will get replaced?
I think I still don't get how clearml is supposed to work/be used. Why wouldn't the following work currently?
Example:
```
task = Task.init(...)
if not running_remotely:
    task_dict = task.export_task()
    requirements = task_dict["script"]["requirements"]["pip"].splitlines()
    requirement_torch = [r for r in requirements if r.startswith("torch==")]
    requirements.remove(requirement_torch[0])
    requirements.append("torch >= 1.8.1")
    task_dict["script"]["requirements"]["pip"] = "\n".join(requirements)
    ...
```