Great, it is quite important for my use case. If you could also allow task.get_reported_console_output() to accept a log level as input (or a minimal log level), I'd be grateful.
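For illustration, this is roughly the post-processing I do by hand today (the task ID and minimum level are placeholders); a log-level argument on the call itself would make this unnecessary:
```python
from clearml import Task

# Placeholder task ID.
task = Task.get_task(task_id="<task-id>")

# Client-side approximation of the requested "minimal log level" filter:
# keep only lines that mention WARNING or above.
LEVELS = ["DEBUG", "INFO", "WARNING", "ERROR", "CRITICAL"]
allowed = LEVELS[LEVELS.index("WARNING"):]

for report in task.get_reported_console_output(number_of_reports=10):
    for line in report.splitlines():
        if any(level in line for level in allowed):
            print(line)
```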
SuccessfulKoala55 , in the meantime, while I try that, I've encountered something weird. I am running a ClearML agent with the following:
clearml-agent daemon --detached --docker --gpus 0,1,2,3 --dynamic-gpus --queue kenny_1_gpu_queue=1
But for some reason, although all the GPUs are free and no other agent is running on the machine, only one task is executed at a time instead of four. Why is that?
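For context, the tasks were enqueued roughly like this (the task IDs are placeholders), so I would expect up to four of them to run concurrently, one GPU each:
```python
from clearml import Task

# Placeholder task IDs; each task needs a single GPU from the dynamic pool.
for task_id in ["<task-1>", "<task-2>", "<task-3>", "<task-4>"]:
    Task.enqueue(Task.get_task(task_id=task_id), queue_name="kenny_1_gpu_queue")
```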
I am not sure what you mean. This is text; when I grab it from the artifact via Python and print it, the newlines are printed as expected.
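Concretely, this is roughly how I read it back (the task ID and artifact name are placeholders):
```python
from clearml import Task

# Placeholder task ID and artifact name.
task = Task.get_task(task_id="<task-id>")
text = task.artifacts["my_text"].get()

# Printing the retrieved string shows the newlines as expected.
print(text)
```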
Latest allegro POC server (saips)
I am also running from an NVIDIA container and I get:
ERROR: No matching distribution found for tensorflow==2.4.0+nv
clearml_agent: ERROR: Could not install task requirements!
The Docker image is nvcr.io/nvidia/tensorflow:21.10-tf2-py3
What should I do?
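My guess is that the +nv local build recorded in the installed packages simply doesn't exist on PyPI, so the agent can't resolve it inside the container. Would overriding the recorded requirement before Task.init be the right direction? A rough sketch, assuming add_requirements can drop the version pin:
```python
from clearml import Task

# Assumption: forcing the requirement without a version pin, so the agent does not
# try to install the "+nv" build that only exists inside the NVIDIA container.
Task.add_requirements("tensorflow", "")

task = Task.init(project_name="my_project", task_name="tf_in_nvidia_container")
```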
I know you can download the data as JSON from the Plots tab in the UI, but I want to get the data programmatically.
Hi SuccessfulKoala55 ,
I want to mark a task as failed. I read in the docs I can use mark_failed .
How should I use it correctly with task.close()?
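Right now I'm doing something along these lines, and I'm not sure whether mark_failed should come before close, or whether close is even needed afterwards:
```python
from clearml import Task

task = Task.init(project_name="my_project", task_name="example")

try:
    run_training()  # placeholder for the actual work
except Exception as ex:
    # Mark the task as failed with the reason, then close it.
    task.mark_failed(status_reason=str(ex))
    task.close()
    raise
```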
Thanks. But I am not talking about scalars. I am talking about plots I've reported to ClearML using .report_histogram or .report_scatter2d or .report_table
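To be explicit, what I'm after is pulling those plot events back in Python, something like the sketch below (I'm assuming the events endpoint the web UI uses for the Plots tab is also reachable through the APIClient; the task ID is a placeholder):
```python
from clearml.backend_api.session.client import APIClient

client = APIClient()

# Assumption: events.get_task_plots returns the same plot events the UI renders.
response = client.events.get_task_plots(task="<task-id>")

for plot_event in response.plots:
    # Assumption: each entry is a dict with the metric/variant names and the
    # reported Plotly figure (e.g. under "plot_str").
    print(plot_event.get("metric"), plot_event.get("variant"))
```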
Well, for the first task it grabs, it opens a separate WORKER:gpu0 worker entry as expected, while the agent itself stays as WORKER:dgpu0,1,2,3.
But the other tasks in the queue won't start, and after the first task completes, the following tasks are not run on WORKER:gpu0 but on WORKER:dgpu0,1,2,3 instead, using only 1 GPU (the task's execution info says it runs on WORKER:gpu0).
clearml-agent daemon --detached --gpus 0,1,2 --dynamic-gpus --queue 2_gpu_queue=2 --docker --stop
I was told not to kill the process; also, finding it on my own seems very un-user-friendly.
If possible, I'd like a command line, like the one I just sent, that recognizes the specific worker that was brought up in this manner and kills only it.
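If there's no dedicated flag for that, even being able to look the worker up reliably would already help; I'm assuming something along these lines is how I'd find it, and then stop it by its ID rather than killing the process:
```python
from clearml.backend_api.session.client import APIClient

client = APIClient()

# Assumption: dynamically spawned GPU workers are listed alongside the parent
# agent, with the GPU index in their name (e.g. a ":gpu0" suffix).
for worker in client.workers.get_all():
    print(worker.id)
```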
AgitatedDove14 , could it be that the GitHub repository is not synchronized? I can only find versions up to 1.2.0.rc3 in it.