Answered

[Task gets interrupted / aborted / reset when in offline mode]
For local testing, we have added a --no-clearml option to our code that sets task.set_offline(True) directly after the task is created.
However, this results in the process getting interrupted, the outputs show:
2022-11-04 02:31:24,897 - clearml.Task - WARNING - Task d627ee8da785410d91bce2309a4c1b8a was reset! if state is consistent we shall ... SOME OTHER LOGS ...
2022-11-04 02:31:26,899 - clearml.Task - WARNING - Task d627ee8da785410d91bce2309a4c1b8a was reset! if state is consistent we shall terminate.
2022-11-04 02:31:28,900 - clearml.Task - WARNING - Task d627ee8da785410d91bce2309a4c1b8a was reset! if state is consistent we shall terminate.
2022-11-04 02:31:30,902 - clearml.Task - WARNING - ### TASK STOPPED - USER ABORTED - RESET ###
All function calls on the task are wrapped like if not task.is_offline(): task.XXX(), and it does not seem to matter when set_offline() is called. The same program runs through when offline mode is not set, of course. Any ideas?

  
  
Posted one year ago

Answers 12


Any idea where that could come from? Could we turn off the local logging as well - in these kinds of runs we don’t need it?

It is supposed to create it automatically... I tested with other examples (clearml version 1.7.3rc1) and everything seems to work.
What am I missing? How do we recreate the issue? Can you verify it is still not working with the latest RC?

  
  
Posted one year ago

Hi AgitatedDove14 , so it took some time but I’ve finally managed to reproduce. The issue seems to be related to writing images via Tensorboard:
` from torch.utils.tensorboard import SummaryWriter
import torch
from clearml import Task, Logger

if __name__ == "__main__":
    task = Task.init(project_name="ClearML-Debug", task_name="[Mac] TB Logger, offline")
    tb_logger = SummaryWriter(log_dir="tb_logger/demo/")
    # Random image in height x width x channels layout
    image_tensor = torch.rand(256, 256, 3)
    # Log the same image at ten consecutive steps through TensorBoard
    for step in range(10):
        tb_logger.add_image("images/image123/img", image_tensor, step, dataformats="HWC")

    task.flush(wait_for_uploads=True) `
Again the errors show up as

2022-11-09 09:47:27,602 - clearml.metrics - WARNING - Failed uploading to /Users/manuel/.clearml/cache/offline/offline-028b2df9167049eba4bdce7c6f89f39e/data (Target path "/Users/manuel/.clearml/cache/offline/offline-028b2df9167049eba4bdce7c6f89f39e/data" does not exist)
Any idea about that? I’m also happy to open an issue on GitHub with the details if you like 🙂
By the way no rush about this - we will turn off TB logging in the meantime.
Thanks for all the help!

  
  
Posted one year ago

I meant maybe me activating offline mode, somehow changes something else in the runtime and that in turn leads to the interruption. Let me try to build a minimal reproducible version 🙂

  
  
Posted one year ago

By the way, if we don’t wrap other calls in is_offline() we get errors like “DateTime object is not serializable”, but that’s a secondary issue.

  
  
Posted one year ago

It might be broken for me, as I said the program works without the offline mode but gets interrupted and shows the results from above with offline mode.

How could I reproduce this issue?

But there might be another issue in between of course - any idea how to debug?

I think I missed this one. What exactly is the issue?

  
  
Posted one year ago

Let me try to build a minimal reproducible version

Thank you!

  
  
Posted one year ago

For local testing, we have added a

ScantChimpanzee51 there is already an environment variable for that, you can just set CLEARML_OFFLINE_MODE 🙂

By the way, if we don’t wrap other calls in is_offline() we get errors like “DateTime object is not serializable”, but that’s a secondary issue.

I think this was fixed. Can you verify with the latest RC, 1.7.3rc0? If this still happens, can you share the code?

However, this results in the process getting interrupted, the outputs show:

Are you saying offline mode is broken?

  
  
Posted one year ago

It might be broken for me, as I said the program works without the offline mode but gets interrupted and shows the results from above with offline mode. But there might be another issue in between of course - any idea how to debug?
The environment variable is good to know, I will try with that as well and report back.

  
  
Posted one year ago

Happy to and thanks!

  
  
Posted one year ago

BTW, this one seems to work ....
` from time import sleep
from clearml import Task

# Offline mode is switched on before the task is created
Task.set_offline(True)
task = Task.init(project_name="debug", task_name="offline test")

print("starting")

for i in range(300):
    print(f"{i}")
    sleep(1)

print("done") `
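The example above calls Task.set_offline(True) before Task.init(), whereas the question sets offline mode after the task is created. A minimal sketch of the --no-clearml flag wired up in that order follows. The helper names parse_args, init_task and main are hypothetical, and whether the ordering alone avoids the reset is an assumption, not something confirmed in this thread.

```python
import argparse


def parse_args(argv=None):
    """Parse the command line; --no-clearml is the flag named in the question."""
    parser = argparse.ArgumentParser()
    parser.add_argument(
        "--no-clearml",
        action="store_true",
        help="run in ClearML offline mode (local testing)",
    )
    return parser.parse_args(argv)


def init_task(no_clearml):
    """Create the task, switching to offline mode first when requested."""
    # Imported lazily so the flag parsing works even without clearml installed.
    from clearml import Task

    if no_clearml:
        # Assumption: offline mode must be set before the task exists,
        # mirroring the working example above.
        Task.set_offline(True)
    return Task.init(project_name="debug", task_name="offline test")


def main(argv=None):
    args = parse_args(argv)
    task = init_task(args.no_clearml)
    print("starting")
    return task
```

Calling main(["--no-clearml"]) from the script entry point would then run the offline path; without the flag the task is created as usual.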

  
  
Posted one year ago

So AgitatedDove14 if we use the CLEARML_OFFLINE_MODE environment variable instead the program runs through again.
The only thing is that now we get errors of the form
0%| | 0/18 [00:00<?, ?image/s]ClearML running in offline mode, session stored in /home/manuel/.clearml/cache/offline/offline-167ceb1cd3c946df8abc7206b781b486
2022-11-07 07:49:06,986 - clearml.metrics - WARNING - Failed uploading to /home/manuel/.clearml/cache/offline/offline-167ceb1cd3c946df8abc7206b781b486/data (Target path "/home/manuel/.clearml/cache/offline/offline-167ceb1cd3c946df8abc7206b781b486/data" does not exist)
I’ve checked the path and it does exist, except for the data subdirectory, i.e. /home/manuel/.clearml/cache/offline/offline-167ceb1cd3c946df8abc7206b781b486/ exists but there is no data directory in it. Any idea where that could come from? Could we turn off the local logging as well? In these kinds of runs we don’t need it.
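To check the observation above across several runs, a small diagnostic along these lines could list the offline session folders that lack a data subdirectory. This is only a sketch: sessions_missing_data is a hypothetical helper, the "offline-" prefix and the cache location are taken from the log lines in this thread, and it only inspects the filesystem without touching ClearML.

```python
from pathlib import Path


def sessions_missing_data(offline_root):
    """Return offline session folders that have no 'data' subdirectory."""
    root = Path(offline_root).expanduser()
    if not root.is_dir():
        return []
    return sorted(
        p
        for p in root.iterdir()
        # Session folders in the logs above are named "offline-<id>".
        if p.is_dir()
        and p.name.startswith("offline-")
        and not (p / "data").is_dir()
    )
```

For example, sessions_missing_data("~/.clearml/cache/offline") would return the affected session paths, which might be useful detail for a GitHub issue.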

  
  
Posted one year ago

Thanks ScantChimpanzee51 !
Let me see what I can find, should be easy enough to fix now 🙂

  
  
Posted one year ago