Ok, I guess I will have to kill the whole thing and refresh it.
AgitatedDove14 , would you elaborate on this resolution process?
Although I think you can also pull specific chunks of dataset
How do you do that with clearml-data?
Unfortunately it's not. The problem previously encountered with the docker method surfaced again. In this case, the BASE DOCKER IMAGE setting `nvidia/cuda:10.1-runtime-ubuntu18.04 --env GIT_SSL_NO_VERIFY=true` is not taking effect with the k8s glue.
It's 0.17-63.
It doesn't appear on the profile page.
Hi, how may I call Task.init() within these subprocesses without write access to the third-party scripts and Python executables?
Is this an env var?
CLEARML_CONFIG_FILE
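For anyone following along, here is a minimal sketch of handing CLEARML_CONFIG_FILE to a child process without editing the script it runs. The helper names and the config path are hypothetical, and the child command only echoes the variable back; whether a given third-party executable actually picks up the config this way is an assumption to verify.

```python
import os
import subprocess
import sys


def env_with_clearml_config(config_path, base_env=None):
    """Return a copy of the environment with CLEARML_CONFIG_FILE set."""
    env = dict(os.environ if base_env is None else base_env)
    env["CLEARML_CONFIG_FILE"] = str(config_path)
    return env


def run_child(cmd, config_path):
    """Run a child process that inherits the modified environment."""
    return subprocess.run(
        cmd,
        env=env_with_clearml_config(config_path),
        capture_output=True,
        text=True,
        check=True,
    )


if __name__ == "__main__":
    # Hypothetical path, for illustration only.
    result = run_child(
        [sys.executable, "-c", "import os; print(os.environ['CLEARML_CONFIG_FILE'])"],
        "/tmp/custom_clearml.conf",
    )
    print(result.stdout.strip())
```

The parent's own environment is left untouched; only the spawned process sees the variable.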
Hi AgitatedDove14 , thanks.
In this case I am running the k8s glue (machine glue), which will then spawn pods on the Kubernetes worker (machine worker). So when you say direct access, are you referring to the glue machine or the K8s worker machine?
Hi,
It did, nvidia/cuda:10.1-runtime-ubuntu18.04.
So if I need to set this every time, what is the following config for? And how do I pass in new env parameters?
` default_docker: {
# default docker image to use when running in docker mode
image: "dockerrepo/mydocker:custom"
# optional arguments to pass to docker image
# arguments: ["--ipc=host", ]
arguments: ["--env GIT_SSL_NO_VERIFY=true",]
} `
I also see this in my logs. Note that the config is read in, but it is still printing the supposedly hidden keys in the logs and UI. `agent.hide_docker_command_env_vars.enabled = true agent.hide_docker_command_env_vars.extra_keys.0='TRAINS_AGENT_GIT_USER' ..... docker_cmd=harbor.ai/public/detectron2:v3 --env TRAINS_AGENT_GIT_USER=gituser`
Ok, I get the logic now. extra_docker_shell_script executes before clearml-agent talks to the clearml server.
I see. Is there a more elaborate codeset that describes the above interactions?
Thanks AgitatedDove14 , will take a look.
Ok. That brings me back to the spawned pod. At this point, clearml-agent and its config would be a contributing factor. Is the absence of /tmp/.clearml_agent.xxxxxx.cfg an issue?
Hi, the latest k8sglue-example.py was last committed about 4 months ago. Are you referring to that version?
That didn't work either...
It's running as a long-running pod on K8s. I'm using log -f to track its stdout.
Hi TimelyPenguin76 , I am adding a debug sample to an existing task using the above method. What should I put for the iteration? I do not want to overwrite existing ones, but I do not know what the last count is. This applies to both scalar and media reporting.
Hi ResponsiveHedgehong88 , I was trying to do the same thing but the loggerhook doesn't seem to work. The console log and scalar logs didn't come out when I registered it via init.py and loaded it via log_config. Are you able to share how you configured it?
So I kept trying, but I'm stuck on this when I run python k8s_glue_example.py
TypeError: __init__() got an unexpected keyword argument 'base_pod_num'
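An error like this usually means the example script and the installed class come from different versions. As a rough way to check before calling, the sketch below inspects whether a constructor accepts a given keyword. OldGlue and NewGlue are hypothetical stand-ins, not the real clearml-agent classes; only the keyword name base_pod_num is taken from the traceback.

```python
import inspect


def accepts_keyword(callable_obj, name):
    """Return True if callable_obj can be called with `name` as a keyword."""
    params = inspect.signature(callable_obj).parameters
    # A **kwargs parameter accepts any keyword.
    if any(p.kind is inspect.Parameter.VAR_KEYWORD for p in params.values()):
        return True
    param = params.get(name)
    return param is not None and param.kind in (
        inspect.Parameter.POSITIONAL_OR_KEYWORD,
        inspect.Parameter.KEYWORD_ONLY,
    )


class OldGlue:
    """Hypothetical older class that lacks the newer keyword."""

    def __init__(self, debug=False):
        self.debug = debug


class NewGlue:
    """Hypothetical newer class that accepts the keyword."""

    def __init__(self, debug=False, base_pod_num=1):
        self.debug = debug
        self.base_pod_num = base_pod_num


if __name__ == "__main__":
    for cls in (OldGlue, NewGlue):
        print(cls.__name__, accepts_keyword(cls, "base_pod_num"))
```

Running the same check against the class actually imported in your environment shows whether the installed package is older than the example script expects.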
Hi, any advice on this? thanks.
docker exec clearml-elastic curl zsh: no matches found:
Just to put a ping for those on this side of the timezone to look at. Thanks.
Hi, by deployment strategies I meant canary, blue-green, etc. I figured this should be handled by clearml-serving, and maybe Seldon as well.
Try setting `docker_force_pull: true` under the agent section of your agent's clearml.conf.
Hi AgitatedDove14 , I've got the same error. It would appear that the code references clearml_agent/helper/base.py, which I believe is part of clearml-agent v0.17.1. Could that be the issue?
Thanks. That seems to work. I have a question: does it save the best model or the model from the last epoch?