Hi @<1749965229388730368:profile|UnevenDeer21> , can you add the log of the job that failed?
Also, note that you can set these arguments from the web UI at the task level as well: open the Execution tab, then the Container section.
@<1523701070390366208:profile|CostlyOstrich36> Thanks, I know about editing from the web UI, but for some arguments, such as network and ipc, I want to set defaults in clearml-agent so our end users don't need to worry about them. I changed agent.default_docker.arguments in the clearml-agent configuration to
["--network=host", "--ipc=host"]
Then, when a user initializes a task with a custom image (different from agent.default_docker.image), I check the console log and see that the docker run command contains neither the network nor the ipc option.
Here is the relevant part of clearml.conf:
default_docker: {
    # default docker image to use when running in docker mode
    image: "python3.10-cuda12.2:latest"
    # optional arguments to pass to docker image
    arguments: ["--network=host", "--ipc=host"]
}
And here is the code the user ran to launch the task (with the python3.8 image):
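from clearml import Task

task = Task.init(project_name='my_project', task_name='my_remote_task')
# Custom image, different from agent.default_docker.image
task.set_base_docker('python3.8-cuda12.2:latest')
# The queue name must be passed as a string
task.execute_remotely(queue_name='Default')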
When the task runs, this is the docker run command:
Executing: ['docker', 'run', '-t', '--gpus', 'all', '-l', 'clearml-worker-id=ubuntu-s-3090ti:0', '-l', 'clearml-parent-worker-id=ubuntu-s-3090ti:0', '-e', 'CLEARML_WORKER_ID=ubuntu-s-3090ti:0', '-e', 'CLEARML_DOCKER_IMAGE=registry.torus.ai/taureau/python3.8-cuda12.2:latest', '-e', 'CLEARML_TASK_ID=49d14d9ff5f944adad588ab5f05ecebe', '-v', '/root/.gitconfig:/root/.gitconfig', '-v', '/tmp/.clearml_agent.fk2xon5d.cfg:/tmp/clearml.conf', '-e', 'CLEARML_CONFIG_FILE=/tmp/clearml.conf', '-v', '/tmp/clearml_agent.ssh.j2xj7jb_:/.ssh', '-v', '/root/.clearml/apt-cache:/var/cache/apt/archives', '-v', '/root/.clearml/pip-cache:/root/.cache/pip', '-v', '/root/.clearml/pip-download-cache:/root/.clearml/pip-download-cache', '-v', '/root/.clearml/cache:/clearml_agent_cache', '-v', '/root/.clearml/vcs-cache:/root/.clearml/vcs-cache', '-v', '/root/.clearml/venvs-cache:/root/.clearml/venvs-cache', '--rm', 'python3.8-cuda12.2:latest', 'bash', '-c', 'echo \'Binary::apt::APT::Keep-Downloaded-Packages "true";\' > /etc/apt/apt.conf.d/docker-clean ; chown -R root /root/.cache/pip ; export DEBIAN_FRONTEND=noninteractive ; export CLEARML_APT_INSTALL="$CLEARML_APT_INSTALL libsm6 libxext6 libxrender-dev libglib2.0-0" ; cp -Rf /.ssh -T ~/.ssh ; [ ! -z $(which git) ] || export CLEARML_APT_INSTALL="$CLEARML_APT_INSTALL git" ; declare LOCAL_PYTHON ; [ ! -z $LOCAL_PYTHON ] || for i in {15..5}; do which python3.$i && python3.$i -m pip --version && export LOCAL_PYTHON=$(which python3.$i) && break ; done ; [ ! -z $LOCAL_PYTHON ] || export CLEARML_APT_INSTALL="$CLEARML_APT_INSTALL python3-pip" ; [ -z "$CLEARML_APT_INSTALL" ] || (apt-get update -y ; apt-get install -y $CLEARML_APT_INSTALL) ; [ ! 
-z $LOCAL_PYTHON ] || export LOCAL_PYTHON=python3 ; $LOCAL_PYTHON -m pip install -U "pip<20.2 ; python_version < \'3.10\'" "pip<22.3 ; python_version >= \'3.10\'" ; $LOCAL_PYTHON -m pip install -U clearml-agent ; cp /tmp/clearml.conf ~/default_clearml.conf ; NVIDIA_VISIBLE_DEVICES=all $LOCAL_PYTHON -u -m clearml_agent execute --disable-monitoring --id 49d14d9ff5f944adad588ab5f05ecebe']
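To make the check above repeatable, here is a minimal sketch that parses an "Executing: [...]" line from the console log and reports which expected docker flags are missing. The helper name missing_docker_flags and the shortened sample line are hypothetical, made up for illustration; the sketch only assumes the log prints the command as a Python-style list, as it does above.

```python
import ast


def missing_docker_flags(log_line, expected_flags):
    """Return the expected flags that are absent from a logged docker command."""
    # The agent log prints the command as a Python-style list after "Executing: ".
    _, found, payload = log_line.partition("Executing: ")
    if not found:
        raise ValueError("no 'Executing: ' prefix found in log line")
    command = ast.literal_eval(payload.strip())
    tokens = set(command)
    # Also accept the two-token spelling, e.g. '--network', 'host'.
    pairs = {f"{a}={b}" for a, b in zip(command, command[1:])}
    return [f for f in expected_flags if f not in tokens and f not in pairs]


# Abbreviated, illustrative version of the log line above (not the full command).
sample = (
    "Executing: ['docker', 'run', '-t', '--gpus', 'all', '--rm', "
    "'python3.8-cuda12.2:latest', 'bash', '-c', 'echo hi']"
)
print(missing_docker_flags(sample, ["--network=host", "--ipc=host"]))
# prints ['--network=host', '--ipc=host']
```

An empty list would mean every expected flag made it into the docker run command; here both flags are reported missing, which matches what I see in the log.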