@<1523701070390366208:profile|CostlyOstrich36> just opened an issue on this: None
This is an example of the console output of a task aborted via the webUI:
Epoch 1/29 ━╸━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 699/16945 0:04:53 • 1:55:25 2.35it/s v_num: 0.000
2024-09-16 12:52:57,263 - clearml.Task - WARNING - ### TASK STOPPING - USER ABORTED - LAUNCHING CALLBACK (timeout 30.0 sec) ###
[2024-09-16 12:52:57,284][core.callbacks.model_checkpoint][INFO] - Marking task as `in_progress`
[2024-09-16 12:52:57,309][core.callbacks.model_checkpoint][INFO] - Saving last checkpoint
Epoch 1/29 ━╸━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 701/16945 0:04:54 • 1:55:03 2.35it/s v_num: 0.000
[2024-09-16 12:52:58,214][core.callbacks.model_checkpoint][INFO] - Marking task as `stopped` again
2024-09-16 12:52:58,260 - clearml.storage - INFO - Uploading: 49.56MB to /tmp/.clearml.upload_model_0zr9bxdd.tmp
0% | 0.00/49.56 MB [00:00]:
Epoch 1/29 ━╸━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 701/16945 0:04:54 • 1:55:03 2.35it/s v_num: 0.000
2024-09-16 12:52:58,330 - clearml.Task - WARNING - ### TASK STOPPING - USER ABORTED - CALLBACK COMPLETED (1.07 sec) ###
2024-09-16 12:52:58,330 - clearml.Task - WARNING - ### TASK STOPPED - USER ABORTED - STATUS CHANGED ###
######### 30% | 15.00/49.56 MB [00:00<00:00, 138.05MB/s]:
##################7 61% | 30.00/49.56 MB [00:00<00:00, 48.89MB/s]:
############################1 91% | 45.00/49.56 MB [00:00<00:00, 65.85MB/s]:
############################### 100% | 49.56/49.56 MB [00:00<00:00, 55.91MB/s]:
Epoch 1/29 ━╸━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 702/16945 0:04:54 • 1:55:13 2.35it/s v_num: 0.000
2024-09-16 12:52:59,154 - clearml.Task - INFO - Completed model upload to
If I don't mark the task as `in_progress`, the output looks like this:
Epoch 0/29 ━━━━━━━━╺━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 508/2532 0:03:34 • 0:13:55 2.42it/s v_num: 0.000
2024-09-13 12:12:26,973 - clearml.Task - WARNING - ### TASK STOPPING - USER ABORTED - LAUNCHING CALLBACK (timeout 30.0 sec) ###
[2024-09-13 12:12:26,974][core.callbacks.model_checkpoint][INFO] - Saving checkpoint
Epoch 0/29 ━━━━━━━━╺━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 509/2532 0:03:34 • 0:13:56 2.42it/s v_num: 0.000
2024-09-13 12:12:27,581 - clearml.model - WARNING - Could not update last created model in Task b281b21329e3470ebc8959e831f28ff8, Task status 'stopped' cannot be updated
Epoch 0/29 ━━━━━━━━╺━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 510/2532 0:03:35 • 0:13:57 2.42it/s v_num: 0.000
2024-09-13 12:12:27,678 - clearml.storage - INFO - Uploading: 49.56MB to /tmp/.clearml.upload_model_mdw9vemq.tmp
0% | 0.00/49.56 MB [00:00]:
Epoch 0/29 ━━━━━━━━╺━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 510/2532 0:03:35 • 0:13:57 2.42it/s v_num: 0.000
2024-09-13 12:12:27,700 - clearml.Task - WARNING - ### TASK STOPPING - USER ABORTED - CALLBACK COMPLETED (0.73 sec) ###
2024-09-13 12:12:27,701 - clearml.Task - WARNING - ### TASK STOPPED - USER ABORTED - STATUS CHANGED ###
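For anyone reading along, the workaround visible in the first log (mark the task `in_progress`, save the checkpoint, mark it `stopped` again) could be sketched roughly like this. It is only an illustration, not the actual `core.callbacks.model_checkpoint` code: `RecordingTask` and `make_abort_callback` are hypothetical names, and the stand-in task only assumes `mark_started(force=...)` / `mark_stopped(force=...)` methods similar to the ones on a ClearML `Task`.

```python
from typing import Callable, List


class RecordingTask:
    """Hypothetical stand-in for a ClearML task; it only records status changes."""

    def __init__(self) -> None:
        self.events: List[str] = []

    def mark_started(self, force: bool = False) -> None:
        self.events.append("in_progress")

    def mark_stopped(self, force: bool = False) -> None:
        self.events.append("stopped")


def make_abort_callback(task, save_checkpoint: Callable[[], None]) -> Callable[[], None]:
    """Build a callback that temporarily reopens the task while saving."""

    def on_abort() -> None:
        # Assumption: the server rejects model updates while the task is
        # 'stopped', so move it back to 'in_progress' first.
        task.mark_started(force=True)
        try:
            save_checkpoint()
        finally:
            # Always restore the 'stopped' status, even if saving fails.
            task.mark_stopped(force=True)

    return on_abort


if __name__ == "__main__":
    demo_task = RecordingTask()
    callback = make_abort_callback(demo_task, lambda: demo_task.events.append("saved"))
    callback()
    print(demo_task.events)
```

With a real task, a callback like this would presumably be passed to `Task.register_abort_callback`, which seems to be what emits the `LAUNCHING CALLBACK (timeout 30.0 sec)` line; treat that wiring as an assumption and check it against your ClearML version.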
I just tried, and the result is the same. The other method only triggers on exceptions.
But was it definitely aborted via the web UI? Is it possible that your other method is interfering with this somehow? Can you disable it and check the behaviour?
Hi @<1523701070390366208:profile|CostlyOstrich36> , the task is being aborted via the web UI. I have another method that catches local interrupts (exceptions such as keyboard interrupts and crashes). The behaviour is the same whether the task runs via an agent or locally from the CLI.
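To make the "other method" concrete: a handler for local interrupts might look something like the sketch below. This is a guess at the general shape, not the poster's code; `checkpoint_on_interrupt` and `save_checkpoint` are hypothetical names. It saves a checkpoint when the wrapped block raises, then re-raises, so it would never fire on a web UI abort that does not raise inside the training loop.

```python
from contextlib import contextmanager
from typing import Callable, Iterator


@contextmanager
def checkpoint_on_interrupt(save_checkpoint: Callable[[], None]) -> Iterator[None]:
    """Save a checkpoint if the wrapped block is interrupted or crashes."""
    try:
        yield
    except (KeyboardInterrupt, Exception):
        # KeyboardInterrupt does not derive from Exception, so list it explicitly.
        save_checkpoint()
        raise


if __name__ == "__main__":
    try:
        with checkpoint_on_interrupt(lambda: print("saving last checkpoint")):
            raise KeyboardInterrupt  # simulate Ctrl+C during training
    except KeyboardInterrupt:
        print("interrupted")
```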
Hi @<1523701601770934272:profile|GiganticMole91> , how is the task being stopped in your case? Is it aborted via the web UI or through some other method? Is the task running via the agent?