Unanswered
Hi! I am currently using ClearML (with remote execution) to train an object detection model with


Hi AgitatedDove14, so sorry, I have to re-open this issue because the same problem still happens when I incorporate ClearML into my detectron2 training in our setup. We are using the K8s glue agent, and I am sending training jobs to be executed remotely. For single-GPU training, everything works as intended: the TensorBoard graphs show up auto-magically on the ClearML dashboard.

However, when I train with multiple GPUs (same machine), the TensorBoard graphs do not show up on the ClearML dashboard. Everything else still trains correctly, and the TensorBoard logs written in the K8s container are correct as well. The logging also shows up normally on the ClearML dashboard, which shows that the training process is "connected" to ClearML. Also, when I explicitly report scalars in the training process, they do not show up either.
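
For readers following along, explicit scalar reporting of the kind mentioned above might look roughly like the sketch below. It is not the code from the attached zip: the helper names `report_train_scalar` and `get_clearml_logger` and the `rank` argument are hypothetical, and the sketch only assumes the standard `Task.current_task()`, `Task.get_logger()` and `Logger.report_scalar(title, series, value, iteration)` calls from the `clearml` package.

```python
def report_train_scalar(logger, name, value, iteration, rank=0):
    """Send one scalar to a ClearML-style logger.

    `logger` is anything exposing report_scalar(title, series, value, iteration).
    The series name carries the worker rank (a hypothetical convention) so that
    values coming from different GPU processes can be told apart.
    """
    logger.report_scalar(
        title=name,
        series=f"rank{rank}",
        value=float(value),
        iteration=int(iteration),
    )


def get_clearml_logger():
    """Return the logger of the task attached to this process, or None if there is none."""
    # Imported lazily so report_train_scalar can be used without clearml installed.
    from clearml import Task

    task = Task.current_task()
    return task.get_logger() if task is not None else None
```

In a multi-GPU launch each worker runs as a separate process, so checking whether `get_clearml_logger()` returns `None` inside a worker is one way to see whether that particular process is attached to the task.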

I've attached a zip file that contains 2 folders (single-gpu, multi-gpu). They contain the respective code and logs (as well as screenshots of the ClearML dashboard).

Thank you so much! Looking forward to your reply.

  
  
Posted 2 years ago
108 Views
0 Answers