Ok, I got the following error when uploading the table as an artifact: ValueError('Task object can only be updated if created or in_progress')
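For reference, the error message suggests the task was already closed or completed when the upload ran. A minimal sketch of the ordering that should work (project/task names are hypothetical):
` from clearml import Task
import pandas as pd

# Hypothetical names, just for illustration
task = Task.init(project_name="examples", task_name="artifact upload")
df = pd.DataFrame({"col": [1, 2, 3]})

# upload_artifact must run while the task is still created/in_progress;
# uploading after close() would trigger this ValueError
task.upload_artifact(name="table", artifact_object=df)
task.close() `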
can it be that the merge op takes so much filesystem cache that the rest of the system becomes unresponsive?
I am running on bare metal, and cuda seems to be installed at /usr/lib/x86_64-linux-gnu/libcuda.so.460.39
Hi PompousParrot44, you could have a Controller task running in the services queue that periodically schedules the task you want to run
AgitatedDove14 any chance you found something interesting? 🙂
AgitatedDove14 I think it's on me to take the pytorch distributed example in the clearml repo and try to reproduce the bug, then pass it over to you 🙂
clearml doesn't change the matplotlib backend under the hood, right? Just making sure 🙂
AgitatedDove14 I see other RCs on PyPI but no corresponding tags in the clearml-agent repo? Are these releases legit?
yes, here is the error (the space at the end of the line is there)
` Applying uncommitted changes
Executing: ('git', 'apply'): b'error: corrupt patch at line 13\n'
Failed applying diff
trains_agent: ERROR: Failed applying git diff:
diff --git a/configs/2.2.2_from_scratch.yaml b/configs/2.2.2_from_scratch.yaml
index 9fece48..5816f78 100644
--- a/configs/2.2.2_from_scratch.yaml
+++ b/configs/2.2.2_from_scratch.yaml
@@ -136,7 +136,7 @@ data_processing:
optimizer:
type: 'RMSprop'
args:
- lr: 2.5e...
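One way to debug this locally (a sketch; task.diff is a hypothetical file containing the uncommitted changes copied from the task) is to validate the patch without applying it:
` import subprocess

# --check validates the patch without applying it, --verbose reports
# exactly where it fails
result = subprocess.run(
    ["git", "apply", "--check", "--verbose", "task.diff"],
    capture_output=True,
    text=True,
)
print(result.returncode, result.stderr) `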
Awesome, thanks!
Is there any logic on the server side that could change the iteration number?
No, I want to launch the second step after the first one is finished and all its artifacts are uploaded
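Something like this could work (a rough sketch with hypothetical task IDs; wait_for_status blocks until the task reaches one of the given states):
` from clearml import Task

first_task = Task.get_task(task_id="<first_step_task_id>")  # hypothetical ID
# Block until the first step completes, raise if it fails
first_task.wait_for_status(
    status=(Task.TaskStatusEnum.completed,),
    raise_on_status=(Task.TaskStatusEnum.failed,),
)

second_task = Task.get_task(task_id="<second_step_task_id>")  # hypothetical ID
Task.enqueue(second_task, queue_name="default") `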
alright I am starting to get a better picture of this puzzle
But we can easily extend, right?
Usually one or two tags. Indeed, task IDs are not so convenient, but only because they are not displayed on the page, so I have to go back to another page to check the ID of each experiment. Maybe just showing the ID of each experiment on the SCALARS page would already be great, wdyt?
I don't think there is an example for this use case in the repo currently, but the code should be fairly simple (below is a rough draft of what it could look like)
` from clearml import Task
import time

controller_task = Task.init(...)
# Relaunch this controller task on an agent listening to the services queue
controller_task.execute_remotely(queue_name="services", clone=False, exit_process=True)
while True:
    periodic_task = Task.clone(template_task_id)
    # Change parameters of periodic_task if necessary
    Task.enqueue(periodic_task, queue_name="default")
    time.sleep(TRIGGER_TASK_INTERVAL_SECS) `
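A note on the draft above: with clone=False, execute_remotely enqueues the controller task itself (rather than a copy) on the services queue and exits the local process, so the while loop ends up running on the services agent. template_task_id and TRIGGER_TASK_INTERVAL_SECS are placeholders to fill in.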
Also, maybe we are not on the same page: by "clean up", I mean killing a detached subprocess on the machine executing the agent
Hi CostlyOstrich36, most of the time I want to compare two experiments in the DEBUG SAMPLES section, so if I click on one sample to enlarge it, I cannot see the others. Also, once I close the panel, the iteration number is not updated
Sorry, I didn't get that 🙂
AgitatedDove14 Unfortunately no, I already had the problem before using the function; I added it hoping it would fix the issue, but it didn't
btw I monkey patched ignite's function global_step_from_engine to print the iteration and passed the modified function to the ClearMLLogger.attach_output_handler(…, global_step_transform=patched_global_step_from_engine(engine)). It prints the correct iteration number when calling ClearMLLogger.OutputHandler.__call__.
` def __call__(self, engine: Engine, logger: ClearMLLogger, event_name: Union[str, Events]) -> None:
    if not isinstance(logger, ClearMLLogger):
        ...
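In case it is useful, the patched helper looked roughly like this (a sketch from memory mirroring ignite's global_step_from_engine contract, not the exact code I ran):
` from ignite.engine import Engine

def patched_global_step_from_engine(engine: Engine):
    # Same contract as ignite's global_step_from_engine: return a callable
    # mapping (engine, event_name) to a global step, plus a debug print
    def wrapper(_engine: Engine, event_name) -> int:
        step = engine.state.get_event_attrib_value(event_name)
        print(f"global step for {event_name}: {step}")
        return step

    return wrapper `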
I ended up dropping omegaconf altogether
Otherwise I can try loading the file with a custom loader, saving it as a temp file, and passing the temp file to connect_configuration; it will return another temp file with the overwritten config, which I can then pass to OmegaConf, roughly as sketched below
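A sketch of that workaround (assuming connect_configuration accepts a file path and returns the local, possibly-overridden copy; names are hypothetical):
` import tempfile
from pathlib import Path

from clearml import Task
from omegaconf import OmegaConf

task = Task.init(project_name="examples", task_name="omegaconf workaround")

# Stand-in for the custom-loader step
cfg = OmegaConf.load("config.yaml")
with tempfile.NamedTemporaryFile("w", suffix=".yaml", delete=False) as f:
    tmp_path = Path(f.name)
OmegaConf.save(cfg, tmp_path)

# When executed remotely this returns the (possibly overwritten) local copy
overridden_path = task.connect_configuration(tmp_path, name="config")
cfg = OmegaConf.load(overridden_path) `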
