
project name is: RemoteStorage06/saips06/rdekel/hackathon_baselines/DATA_DIR/
No strange characters as far as I can tell
But this is not the data I want
A task can also have plots - for example 2d scatter plots and histograms
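For context, a short sketch of how such plots are typically reported from Python (the project name, task name and data below are made up for illustration):

import numpy as np
from clearml import Task

task = Task.init(project_name="examples", task_name="plot_reporting")  # hypothetical names
logger = task.get_logger()

# a histogram and a 2D scatter, both attached to the task's plots
logger.report_histogram(title="value_distribution", series="run_a",
                        values=np.random.rand(100), iteration=0)
logger.report_scatter2d(title="xy_scatter", series="run_a",
                        scatter=np.random.rand(50, 2), iteration=0)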
Yes, fail it and then close it
Hi SuccessfulKoala55 ,
I want to mark a task as failed. I read in the docs that I can use mark_failed.
How should I use it correctly with task.close()?
My own agent.
I want to clarify:
I was asking whether such a feature exists (one that limits the number of simultaneous service tasks that can be brought up when using services mode) and, if so, how I can utilize it.
It should be possible somehow, as they are attached to the Task and displayed in the Task's results tab
SuccessfulKoala55 On another note, I'm also getting
ERROR: Could not find a version that satisfies the requirement pandas==1.3.4 (from versions: 0.1, 0.2, 0.3.0, 0.4.0, 0.4.1, 0.4.2, 0.4.3, 0.5.0, 0.6.0, 0.6.1, 0.7.0, 0.7.1, 0.7.2, 0.7.3, 0.8.0, 0.8.1, 0.9.0, 0.9.1, 0.10.0, 0.10.1, 0.11.0, 0.12.0, 0.13.0, 0.13.1, 0.14.0, 0.14.1, 0.15.0, 0.15.1, 0.15.2, 0.16.0, 0.16.1, 0.16.2, 0.17.0, 0.17.1, 0.18.0, 0.18.1, 0.19.0, 0.19.1, 0.19.2, 0.20.0, 0.20.1, 0.20.2, 0.20.3, 0.21.0, 0.21.1, 0.22.0, 0.23....
with a self-hosted clearml server
SuccessfulKoala55 , in the meantime, while trying that, I encountered something weird. I am running a clearml agent with the following:
clearml-agent daemon --detached --docker --gpus 0,1,2,3 --dynamic-gpus --queue kenny_1_gpu_queue=1
But for some reason, although all the GPUs are free and no other agent is running on the machine, only one task is executed at a time instead of 4. Why is that?
Thanks, but I am not talking about scalars. I am talking about the plots I've reported to ClearML using .report_histogram, .report_scatter2d, or .report_table
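A sketch of pulling those plots back out of an existing task (the task id is a placeholder, and it assumes Task.get_reported_plots() is available in your clearml version, with each returned event carrying the Plotly figure as a JSON string):

import json
from clearml import Task

task = Task.get_task(task_id="<task_id>")  # placeholder id

# assumption: get_reported_plots() exists in this clearml version and
# each entry stores the Plotly figure JSON under 'plot_str'
for plot_event in task.get_reported_plots():
    figure = json.loads(plot_event["plot_str"])
    print(figure.get("layout", {}).get("title"), len(figure.get("data", [])))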
Well, the requirements were filled in automatically, not by me
TimelyPenguin76
Wouldn't task.mark_failed() followed by task.close() work?
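A minimal sketch of that sequence (run_experiment and the names are made up; status_reason is optional):

from clearml import Task

task = Task.init(project_name="examples", task_name="flaky_run")  # hypothetical names
try:
    run_experiment()  # stand-in for the actual work
except Exception as ex:
    # mark the task as failed on the server, then release local resources
    task.mark_failed(status_reason=str(ex))
    task.close()
    raise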
Latest allegro POC server (saips)
Try making two tasks, both with the same project name (while the project name contains '//'), and you will get the same error.
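A minimal reproduction sketch (the project and task names are made up; the double slash in the project name is the point):

from clearml import Task

# creating two tasks under a project path containing '//'
t1 = Task.init(project_name="my_project//sub_project", task_name="first")
t1.close()
t2 = Task.init(project_name="my_project//sub_project", task_name="second")
t2.close()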
Well, for the first task it grabs, it opens a separate WORKER:gpu0 worker entry as expected, while the agent itself stays as WORKER:dgpu0,1,2,3
But the other tasks in the queue won't start, and after the first task completes, the following tasks are not run on WORKER:gpu0 but on WORKER:dgpu0,1,2,3 instead, using only 1 GPU (the task execution log says it runs on WORKER:gpu0)
I want to access their data
I was told not to kill the process; also, finding it on my own seems very un-user-friendly