Examples: query, "exact match", wildcard*, wild?ard, wild*rd
Fuzzy search: cake~ (finds cakes, bake)
Term boost: "red velvet"^4, chocolate^2
Field grouping: tags:(+work -"fun-stuff")
Escaping: Escape characters +-&|!(){}[]^"~*?:\ with \, e.g. \+
Range search: properties.timestamp:[1587729413488 TO *] (inclusive), properties.title:{A TO Z}(excluding A and Z)
Combinations: chocolate AND vanilla, chocolate OR vanilla, (chocolate OR vanilla) NOT "vanilla pudding"
Field search: properties.title:"The Title" AND text
Unanswered
When I Run Experiments I Set


I can confirm that marking out Task.init(…) fixes it.

You can reproduce simply by taking the ClearML PyTorch MNIST example https://github.com/allegroai/clearml/blob/master/examples/frameworks/pytorch/pytorch_mnist.py .

To clearly see it happening, it’s easiest if you get the GPU allocated before calling task = Task.init(…) and to avoid crashing because you’re missing the task variable, you can embed just before and after Task.init(…) using IPython . You also need the process ID of the main process to use to check against sudo fuser -v /dev/nvidia* .

Summarizing, I move task = Task.init(…) to just before the for epoch in range(…) loop and replace it with
import psutil current_process_pid = psutil.Process().pid print(current_process_pid) # e.g 12971 import IPython; IPython.embed() task = Task.init(project_name='examples', task_name='pytorch mnist train') import IPython; IPython.embed()
You can then run the example until it reaches the embed and check that the main process printed is only visible on your designated device. Then you can quite the embed to see the Task.init giving the problem after which you are waiting in the second embed. You can then quit that one to see training work fine.

You can then try the whole thing again without Task.init but you need to remove reporting in that case (otherwise you get

Logger.current_logger().report_scalar( AttributeError: 'NoneType' object has no attribute 'report_scalar'
I haven’t tested on any other versions than trains 0.16.4 so I don’t know if it happens in the new clearml package.

  
  
Posted 3 years ago
87 Views
0 Answers
3 years ago
one year ago