Tested with clearml-agent 1.0.1rc4/1.2.2 and clearml 1.3.2
I just want to avoid ClearML leaving files lingering around. Btw: a better default behavior in my opinion would be to delete a task only after its files have been deleted, and to delete the task anyway only with the force option!
I will read up on the services documentation then. Thank you very much for the help 🙂
No. Here is a better example. I have two types of workstations: Type X can execute tasks of type A and B. Type Y can execute tasks of type B. This could be the case if type X workstations have for example more VRAM, newer drivers, etc...
I have two queues. Queue A and Queue B. I submit tasks of type A to queue A and tasks of type B to queue B.
Here is what can happen:
Enqueue the first task of type B. Workstations of type X will run this task. Enqueue the second task of type A. Workstation ...
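Roughly, the agents in this setup would be started like the sketch below; the queue names are just placeholders, and listing a queue first is what I mean by giving it higher priority:
```bash
# Type X workstation: can run both task types, so it listens on both queues.
# Queues are polled in the order listed, so queue_a has higher priority here.
clearml-agent daemon --queue queue_a queue_b

# Type Y workstation: can only run type-B tasks, so it only listens on queue B.
clearml-agent daemon --queue queue_b
```
With this layout the problem is that the type X agent may grab a type B task while the type Y agent sits idle.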
I see. Thank you very much. For my current problem, assigning agents according to queue priority would kinda solve it. For experimentation I will sometimes enqueue a task and then later enqueue another one of a different kind, and even though this could be scheduled trivially, I have to wait for the first one to finish. I guess this is only a problem for people with small "clusters" where SLURM does not make sense, but no scheduling at all is also suboptimal.
However, I...
Makes sense, but it is not optimal if one of the agents is only able to handle tasks of a single queue (e.g. if the second agent can only work on tasks of type B).
To summarize: the scheduler should assign a task first to the agent that gives that task's queue the highest priority.
I think sometimes there can be dependencies that require a newer pip version or something like that. I am not sure though. Why can we even change the pip version in the clearml.conf?
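For reference, this is the part of clearml.conf I mean; a rough sketch, and the exact version pin is only an example:
```
# clearml.conf -- agent section (values are only examples)
agent {
    package_manager {
        # package manager the agent uses when building the task environment
        type: pip
        # pip version the agent installs into the task's virtualenv
        pip_version: "<20.2"
    }
}
```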
Nvm, that does not seem to be a problem. I added part of the logs to the post above. It shows that some packages are picked up from conda.
That I understand. But I think (old) pip versions will sometimes fail to resolve a package, whereas the other way around is probably not a problem.
Unfortunately, I do not know that. Must be before October 2021 at least. I know I asked here how to use the preinstalled version and AgitatedDove14 helped me to get it to work. But I cannot find the old thread 😕
Oh, interesting!
So a pip version on a per-task basis makes sense ;D?
I just manually went into the docker container and ran python -m venv env --system-site-packages
and activated the virtual env.
When I run pip list
then, it correctly shows the preinstalled packages including torch 1.12.0a0+2c916ef
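So, put together, what I did inside the container was roughly this (paths from memory):
```bash
# inside the running docker container
python -m venv env --system-site-packages   # venv that can also see the image's preinstalled packages
source env/bin/activate
pip list                                    # lists the preinstalled packages, e.g. torch 1.12.0a0+2c916ef
```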
The one I posted on top: 22.03-py3
😄
Yea, but doesn't this feature make sense on a task level? If I remember correctly, some dependencies will sometimes require different pip versions, and dependencies are defined on a per-task basis.
btw: I am pretty sure this used to work, but then stopped working some time ago.
Thank you very much for the fast work!
One last question: Is it possible to set the pip_version task-dependent?
- solves it. I did not know this is possible.
Thank you very much!
Could you guide me to the documentation for using the docker file? I am not able to find it. I only found task.set_base_docker, and I am not sure what it does.
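From what I can tell it is used something like the sketch below; the image and arguments are placeholders, and the exact signature may depend on the clearml version:
```python
from clearml import Task

task = Task.init(project_name="examples", task_name="docker base image demo")

# Tell the agent which docker image to run this task in when it is executed
# remotely (image and arguments below are placeholders).
task.set_base_docker(
    docker_image="nvcr.io/nvidia/pytorch:22.03-py3",
    docker_arguments="--ipc=host",
)
```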
Perfect, thank you 🙂
Alright, thanks. Would be a nice feature 🙂
clearml==0.17.4
```
task dca2e3ded7fc4c28b342f912395ab9bc pulled from a238067927d04283842bc14cbdebdd86 by worker redacted-desktop:0
Running task 'dca2e3ded7fc4c28b342f912395ab9bc'
Storing stdout and stderr log to '/tmp/.clearml_agent_out.vjg4k7cj.txt', '/tmp/.clearml_agent_out.vjg4k7cj.txt'
Current configuration (clearml_agent v0.17.1, location: /tmp/.clearml_agent.us8pq3jj.cfg):
agent.worker_id = redacted-desktop:0
agent.worker_name = redacted-desktop
agent.force_git_ssh...
```
I see. Thanks for explaining!
My agent shows the same as before:
```
...
Environment setup completed successfully
Starting Task Execution:
DONE: Running task 'aff7c6605b7243d38968f95b4351b127', exit status 0
```
So actually deleting from the client (e.g. a dataset with clearml-data) works.
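I.e. something along these lines (the dataset id is a placeholder, and flags may differ between clearml versions):
```bash
# delete a dataset (and its stored files) from the client side
clearml-data delete --id <dataset_id>
```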
Ah, perfect. Did not know this. Will try! Thanks again! 🙂
Ah, very cool! Then I will try this, too.
I got the idea from an error that occurred when the agent was configured to use pip and tried to install BLAS (for PyTorch, I guess).