Hi BurlyRaccoon64, where is this config file located? Also, when you trigger the experiment from the UI, where is it executed? Do you have clearml-agent configured somewhere? If so, where is it running?
I have a GPU machine on the local network where clearml-agent is running. I send tasks for execution to the queue configured on the agent (either through the UI, or through a script that calls Task.execute_remotely(queue_name=...) running on another machine in the same network). The config file is located in the /home/{username} folder on the machine where the agent is running.
Do you perhaps have some arguments as part of the docker image itself?
The image was built with the following command: docker build --build-arg PAT=$(shell echo ${PAT}) -t $(IMAGE_NAME) .
Do you think the use of the --build-arg argument may be a problem here? I was thinking that the ARGUMENTS parameter is used in combination with docker run to start a container and has nothing to do with image build arguments.
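To make the distinction concrete, here is a minimal Python sketch of how an image name and a list of extra arguments could be combined into a docker run command line. The helper name build_docker_run_command and the example values are hypothetical, and this is not clearml-agent's actual code. It only illustrates that such arguments sit on the docker run side, while --build-arg is consumed earlier by docker build and never appears here.

```python
import shlex
from typing import List, Sequence


def build_docker_run_command(image: str, arguments: Sequence[str]) -> List[str]:
    """Hypothetical helper: put run-time arguments between `docker run` and the image."""
    # Illustration only; clearml-agent's real command construction is not shown here.
    return ["docker", "run", *arguments, image]


# Example values (assumptions for illustration).
run_cmd = build_docker_run_command(
    "{our_custom_image_name}",
    ["--ipc=host", "-e", "CLEARML_AGENT_SKIP_PYTHON_ENV_INSTALL=1"],
)

# shlex.join quotes any element that needs quoting for a POSIX shell.
print(shlex.join(run_cmd))
```

Build-time values such as PAT are baked into the image when it is built, so they would not be expected to interact with the run-time argument list.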
When I trigger such an experiment through the UI after that, it only uses IMAGE from the config file.
BurlyRaccoon64 I was actually asking about the value you have configured in the default image in the config file
Ahh, sorry about that. I have both image and arguments values in the config file:
default_docker: {
    image: {our_custom_image_name}
    arguments: ["--ipc=host", "-v", "/home/{username}/clearml.conf:/workdir/clearml.conf", "-v", "/home/{username}/.ssh:/root/.ssh", "-e", "CLEARML_AGENT_SKIP_PYTHON_ENV_INSTALL=1"]
}
And, as I said, when the Task is cloned and sent for execution in the UI (and it initially has nothing in the IMAGE or ARGUMENTS fields on the execution tab), only the IMAGE field is parsed from the config file and filled with {our_custom_image_name}, but the ARGUMENTS field stays empty.
But when a new task is created in the code with Task.init(), both fields are parsed correctly.
And the original task (the one you clone) has the correct image/arguments?
Yes, and if I clone it as it is, everything works as expected, but if I clone it and clear both IMAGE and ARGUMENTS, only IMAGE will be parsed when such a task is sent for execution. You may ask why I need to clear anything if it works fine with plainly cloned tasks. The reason is that we have some old template tasks, added by other members of my team before we switched to a custom image and an agent running in docker mode, with empty IMAGE and ARGUMENTS fields, and they won't be executed correctly because of the issue above.
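One possible stopgap for such template tasks, not something suggested in this thread, would be to write the image and arguments onto the task explicitly before enqueueing it. The sketch below assumes the clearml SDK's Task.get_task and Task.set_base_docker (with the docker_image and docker_arguments keywords) and a task that is still editable (draft). The helper names fill_empty_container and arguments_as_string are hypothetical.

```python
import shlex
from typing import Sequence


def arguments_as_string(arguments: Sequence[str]) -> str:
    """Join a list of docker arguments into one shell-quoted string."""
    return shlex.join(arguments)


def fill_empty_container(task_id: str, image: str, arguments: Sequence[str]) -> None:
    """Hypothetical workaround: set image and arguments on an existing task."""
    # Imported lazily so this module can be loaded without clearml installed.
    from clearml import Task

    task = Task.get_task(task_id=task_id)
    # Assumption: the task is in a state that still allows editing its container section.
    task.set_base_docker(
        docker_image=image,
        docker_arguments=arguments_as_string(arguments),
    )


print(arguments_as_string(["--ipc=host", "-e", "CLEARML_AGENT_SKIP_PYTHON_ENV_INSTALL=1"]))
```

Calling fill_empty_container with a cloned template's task ID and the values from default_docker would sidestep the empty ARGUMENTS field, at the cost of duplicating the agent's defaults in a script.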
I'm trying to understand if the clear itself is bad... After you do the clear, can you perhaps check the task's entry in mongodb?
Hi again SuccessfulKoala55. Sorry for the late response. Thanks for your help so far! I understand that it's a weird problem and it probably won't be resolved in this discussion, but just in case, I've checked the mongodb entry for the task after cloning, after clearing the fields, and after sending it for execution.
After clone: { "_id" : "b94657b00adf41dd971426f51d7b9373", "container" : { "image" : "{image_name}", "arguments" : "--ipc=host -v /home/{username}/clearml.conf:/workdir/clearml.conf -e CLEARML_AGENT_SKIP_PYTHON_ENV_INSTALL=1", "setup_shell_script" : "" } }
After clear: { "_id" : "b94657b00adf41dd971426f51d7b9373", "container" : { "image" : "", "arguments" : "", "setup_shell_script" : "" } }
After enqueue: { "_id" : "b94657b00adf41dd971426f51d7b9373", "container" : { "image" : "{image_name}", "arguments" : "", "setup_shell_script" : "" } }
(actual image name and username are replaced with {image_name} and {username} in the message)
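For readers comparing the three entries, a small Python sketch can list which container fields differ between snapshots. The dictionaries copy the container values quoted above (with the same placeholders), and the helper changed_fields is a hypothetical name used only for this illustration.

```python
from typing import Dict, List

# "container" sub-documents as quoted in the three mongodb entries above.
after_clone: Dict[str, str] = {
    "image": "{image_name}",
    "arguments": "--ipc=host -v /home/{username}/clearml.conf:/workdir/clearml.conf "
                 "-e CLEARML_AGENT_SKIP_PYTHON_ENV_INSTALL=1",
    "setup_shell_script": "",
}
after_clear: Dict[str, str] = {"image": "", "arguments": "", "setup_shell_script": ""}
after_enqueue: Dict[str, str] = {"image": "{image_name}", "arguments": "", "setup_shell_script": ""}


def changed_fields(before: Dict[str, str], after: Dict[str, str]) -> List[str]:
    """Return the sorted names of fields whose values differ between two snapshots."""
    keys = before.keys() | after.keys()
    return sorted(key for key in keys if before.get(key) != after.get(key))


print(changed_fields(after_clear, after_enqueue))  # ['image']
print(changed_fields(after_clone, after_enqueue))  # ['arguments']
```

The first comparison shows that enqueueing restored only image. The second shows that arguments is the one field that differs from the working cloned state.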
I've sent it just in case. Anyway, I appreciate that you've already spent a lot of your time on me, so it's ok to drop it I guess.