Can you tell me which Python version is running on the agent/docker and which docker image?
Perfect, thanks! The only issue left is that it seems like .ssh is used even when I provide SSH_AUTH_SOCK. I created an issue here: https://github.com/allegroai/clearml-agent/issues/45
Okay, I didn't know that. I just saw that VSCode seems to use a similar setup for their docker devcontainers.
Is there a way for me to configure/add the run arguments for the docker run call?
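Something like this is what I have in mind (just a sketch, assuming a recent clearml SDK where task.set_base_docker accepts docker_arguments; the image and flags below are placeholders, and I believe there is also an agent.extra_docker_arguments setting in clearml.conf):
```
from clearml import Task

# Sketch: attach extra `docker run` arguments to a task.
# Assumes set_base_docker accepts docker_image / docker_arguments;
# the flags below are only placeholders.
task = Task.init(project_name="examples", task_name="docker-args-sketch")
task.set_base_docker(
    docker_image="nvidia/cuda:11.8.0-base-ubuntu22.04",
    docker_arguments="--ipc=host -v /data:/data",
)
```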
What exactly do you mean by docker run permissions?
Yes, but this seems pretty reasonable to assume imo.
I am using https://hub.docker.com/layers/nvidia/cuda/11.8.0-base-ubuntu22.04/images/sha256-88b85c6edd089acdf0cb7f3be020a1e812b009bafaf92c1715ab6677bd997ef1?context=explore which has Python 3.10.6 if I remember correctly.
Hi TimelyMouse69, thank you for your answer.
I use 3.10.8 locally and 3.10.6 remotely. Everything is run in a docker container, locally and remotely on the docker-agent (exactly the same docker image).
Thank you for looking into the disappearing dev. It seems like this should be the reason why pip tries to install a stable version of 1.14, which only exists as a nightly build.
Bonus question: Is there some clearml-agent mode that does not do "some magic" and instead just installs exactly what is shown in the "INSTALLED PACKAGES" editor in the web UI?
Here is some code that shows exactly what goes wrong. I only do local execution. It does not seem to be related to remote execution as I thought, but rather to clearml.Task:
```
args = parser.parse_args()
print(args) # FIRST OUTPUT
command = args.command
enqueue = args.enqueue
track_remote = args.track_remote
preset_name = args.preset
type_name = args.type
environment_name = args.environment
nvidia_docker = args.nvidia_docker
# Initialize ClearML Tas...
```
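For completeness, a minimal self-contained sketch of the pattern above (the parser definition and the Task.init call are cut off in my snippet, so the argument names and project/task names here are assumptions):
```
import argparse

from clearml import Task

# Rebuild the parser from the attributes used above (names assumed).
parser = argparse.ArgumentParser()
parser.add_argument("--command")
parser.add_argument("--enqueue")
parser.add_argument("--track-remote", action="store_true")
parser.add_argument("--preset")
parser.add_argument("--type")
parser.add_argument("--environment")
parser.add_argument("--nvidia-docker", action="store_true")

args = parser.parse_args()
print(args)  # FIRST OUTPUT

# ClearML auto-connects argparse arguments; when a task is executed by an
# agent, the parsed values may be replaced by the ones stored in the UI.
task = Task.init(project_name="examples", task_name="argparse-sketch")

print(args)  # SECOND OUTPUT: compare with the first print to see what changed
```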
I just want to avoid ClearML leaving files lingering around. Btw: in my opinion a better default behavior would be to delete a task only after its files have been deleted, and to delete the task anyway only with the force option!
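For reference, this is roughly the deletion call I have in mind (a sketch; the task ID is a placeholder and the keyword arguments reflect my understanding of the current SDK, which may differ between versions):
```
from clearml import Task

# Sketch: delete a task together with its artifacts and models.
task = Task.get_task(task_id="<task-id>")
task.delete(delete_artifacts_and_models=True, raise_on_error=False)
```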
I will read up on the services documentation then. Thank you very much for the help 🙂
No. Here is a better example. I have two types of workstations: type X can execute tasks of type A and B, and type Y can execute tasks of type B. This could be the case if, for example, type X workstations have more VRAM, newer drivers, etc.
I have two queues. Queue A and Queue B. I submit tasks of type A to queue A and tasks of type B to queue B.
Here is what can happen:
Enqueue the first task of type B. Workstations of type X will run this task. Enqueue the second task of type A. Workstation ...
I see. Thank you very much. For my current problem, assigning agents according to queue priority would mostly solve it. For experimentation I will sometimes enqueue a task and later enqueue another one of a different kind, but even though this could be trivially scheduled, I have to wait for the first one to finish. I guess this is only a problem for people with small "clusters" where SLURM does not make sense, but no scheduling at all is also suboptimal.
However, I...
To summarize: the scheduler should first assign tasks to the agent that gives their queue the highest priority.
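To illustrate the enqueueing side in code (a sketch; queue names and task IDs are placeholders, and my understanding is that an agent listening to several queues polls them in the order they are given, which is the priority behavior I mean):
```
from clearml import Task

# Sketch: enqueue two kinds of tasks into two different queues.
task_a = Task.get_task(task_id="<task-of-type-A>")
task_b = Task.get_task(task_id="<task-of-type-B>")

Task.enqueue(task_a, queue_name="queue_a")  # only type X workstations pull from here
Task.enqueue(task_b, queue_name="queue_b")  # both workstation types pull from here
```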
I think sometimes there can be dependencies that require a newer pip version or something like that. I am not sure though. Why can we even change the pip version in the clearml.conf?
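For reference, this is the clearml.conf section I mean (as far as I understand it; the version spec is only an example):
```
agent {
    package_manager {
        # Pin the pip version the agent uses when building the task environment
        pip_version: "<23"
    }
}
```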
Nvm, that does not seem to be a problem. I added part of the logs to the post above. It shows that some packages are found via conda.
That I understand. But I think (old) pip versions will sometimes not resolve a package. Probably not the case the other way around.
Unfortunately, I do not know that. It must have been before October 2021 at least. I know I asked here how to use the preinstalled version and AgitatedDove14 helped me get it to work. But I cannot find the old thread 😕
Oh, interesting!
So a pip version on a per-task basis makes sense ;D?
I just manually went into the docker container, ran python -m venv env --system-site-packages, and activated the virtual env. When I then run pip list, it correctly shows the preinstalled packages, including torch 1.12.0a0+2c916ef.
The one I posted at the top: 22.03-py3
😄
Yea, but doesn't this feature make sense on a task level? If I remember correctly, some dependencies will sometimes require different pip versions, and dependencies are defined per task.
btw: I am pretty sure this used to work, but then stopped working some time ago.
Thank you very much for the fast work!
One last question: is it possible to set the pip_version per task?
- solves it. I did not know this was possible.
Thank you very much!
Could you guide me to the documentation for using the docker file? I am not able to find it. I only found task.set_base_docker, and I am not sure what it does.
Perfect, thank you 🙂
Alright, thanks. Would be a nice feature 🙂