It's actually in your documentation. Apparently it has been removed since 0.17.
https://allegro.ai/clearml/docs/docs/release_notes/ver_0_17.html#clearml-agent-0-17-2
And these are my logs. It tried to install something and hit a permission-denied error. It wouldn't have if it had obeyed force_repo_requirements_txt.
1620664917916 Kahs-MacBook-Pro.local info ClearML Task: created new task id=024a421c0e174650a1c7ff64af756c26 ClearML results page:
`
1620664920359 Kahs-MacBook-Pro.local info ClearML Mon...
Which clearml.conf is it referring to? I'm executing on my client, and the task is then executed remotely by the agent. Both of them have a ~/clearml.conf.
We are using the k8s glue to spawn the job. Could you explain in detail, step by step, what happens when the above code executes?
AgitatedDove14, will these be fixed?
Passing env via the code
Passing env via template yaml
Executing task id [228caa5d25d94ac5aa10fa7e1d02f03c]:
repository = https://192.168.50.88:18443/tkahsion/pytorchmnist
branch = master
version_num = cfb833bcc70f3e10d3b6a96cfad3225ed682382b
tag =
docker_cmd = nvidia/cuda:10.1-runtime-ubuntu18.04
entry_point = pytorch_mnist.py
working_dir = .
Warning: could not locate requested Python version 3.9, reverting to version 3.6
Using base prefix '/usr'
New python executable in /root/.clearml/venvs-builds/3.6/bin/python3.6
Also creating executable i...
After some churning, this is the answer: change it in the clearml.conf generated by clearml-agent init.
` default_docker: {
# default docker image to use when running in docker mode
image: "nvidia/cuda:10.1-runtime-ubuntu18.04"
# optional arguments to pass to docker image
# arguments: ["--ipc=host", ]
arguments: ["--env GIT_SSL_NO_VERIFY=true",]
} `
Sorry, I take that back. I just realised that this argument only takes effect when running the agent itself; when you enqueue a task to this agent, the argument is not passed on to the container that the agent spawns.
The docker image has the same issue: it reverts to nvidia/cuda:10.1-runtime-ubuntu18.04 despite my setting something else.
Hi AgitatedDove14, I dug a bit deeper. I saw this in the installed packages
of the original completed task. When the task is cloned, the list is copied over, hence the problem. Can I ask how ClearML creates the list of installed packages? Why are some of them (e.g. attr) being pulled from @ file:///tmp/build/80754af9/attrs_1604765588209/work?
` absl-py==0.11.0
alabaster==0.7.12
antlr4-python3-runtime==4.8
apex==0.1
appdirs==1.4.4
argon2-cffi==20.1.0
ascii-graph==1.5.1
async-gener...
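On the `@ file:///...` entries: pip freeze prints the `name @ URL` direct-reference form for packages that were installed from a local path or URL rather than from an index, which is commonly seen with conda-installed packages. Those build paths usually do not exist on the machine or container where the environment is recreated, so it can help to spot them before cloning. Below is a minimal Python sketch for that, not ClearML's own logic. The helper name `find_local_file_requirements` is hypothetical, and the `attrs` line is reconstructed from the path quoted above purely for illustration.

```python
import re

# pip's direct-reference form looks like "<name> @ <url>".
DIRECT_REF = re.compile(r"^\s*([A-Za-z0-9][A-Za-z0-9._-]*)\s*@\s*(\S+)\s*$")


def find_local_file_requirements(lines):
    """Return (name, url) pairs for requirements that point at a file:// URL.

    Hypothetical helper for illustration; it only inspects the text of each line.
    """
    found = []
    for line in lines:
        match = DIRECT_REF.match(line)
        if match and match.group(2).startswith("file://"):
            found.append((match.group(1), match.group(2)))
    return found


# Sample input: the pinned lines come from the list above; the "attrs" line is
# an assumed reconstruction based on the quoted path.
sample = [
    "absl-py==0.11.0",
    "alabaster==0.7.12",
    "attrs @ file:///tmp/build/80754af9/attrs_1604765588209/work",
]

local_refs = find_local_file_requirements(sample)
for name, url in local_refs:
    print(f"{name}: {url}")
```

Any entry this reports would need to be replaced with a normal pinned version (or removed) in the cloned task's installed packages before enqueuing.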