NonchalantDeer14
Moderator
3 Questions, 18 Answers
  Active since 10 January 2023
  Last activity one year ago

Reputation

0

Badges 1

18 × Eureka!
0 Votes 3 Answers 542 Views
Hi, I am currently trying to train with https://github.com/open-mmlab/mmdetection using ClearML, and executing remotely. The recommended way of training mult...
2 years ago
0 Votes 7 Answers 634 Views
Hi, another issue is faced when using mmdetection/mmcv with clearml. The automatic uploading of checkpoint meets the following error: clearml.storage - ERROR...
2 years ago
0 Votes 19 Answers 594 Views
Hi! I am currently using clearml (with remote execution), to train an object detection model with https://github.com/facebookresearch/detectron2 . It was wor...
2 years ago
0 Hi! I Am Currently Using Clearml (With Remote Execution), To Train An Object Detection Model With

AgitatedDove14 I see! I will try adding Task.current_task() and see how it goes.

That said, I already have a Task.get_task() in the main function which each subprocess runs. Is that not enough to trigger clearml? https://github.com/levan92/det2_clearml/blob/2634d2c6f898f8946f5b3379dba929635d81d0a9/trainer.py#L206

2 years ago
0 Hi! I Am Currently Using Clearml (With Remote Execution), To Train An Object Detection Model With

Yup, I could view the tensorboard logs through a local tensorboard, with all the metrics in.

2 years ago
0 Hi! I Am Currently Using Clearml (With Remote Execution), To Train An Object Detection Model With

Hi AgitatedDove14, so sorry, I have to re-open this issue, as the same problem still happens when I incorporate clearml in my detectron2 training in our setup. We are using the K8s-glue agent, and I am sending training jobs to be executed remotely. For single-GPU training, everything works as intended: tensorboard graphs show up auto-magically on the clearml dashboard.

However, when training with multi-GPU (same machine), the tensorboard graphs do not show up on the clearml dashboar...

2 years ago
0 Hi! I Am Currently Using Clearml (With Remote Execution), To Train An Object Detection Model With

TimelyPenguin76 AgitatedDove14 so sorry for pressing, just bumping this up, do you all have any ideas why this happens? Otherwise I will have to proceed with using the clearml task logging to manually report the metrics
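
For reference, the manual fallback mentioned above could look roughly like the sketch below. It assumes a task was already created with `Task.init()` and uses `Task.current_task().get_logger().report_scalar(...)`. The helper name `report_metrics`, the `"title/series"` naming convention for metric keys, and the optional `logger` argument are hypothetical choices for illustration, not part of clearml or detectron2.

```python
def report_metrics(metrics, iteration, logger=None):
    """Report every scalar in `metrics` to ClearML and return the rows sent.

    `metrics` maps names such as "loss/total" to numbers. The "title/series"
    split is an assumed convention for this sketch only.
    """
    if logger is None:
        # Assumes clearml is installed and a task exists in this process.
        from clearml import Task
        logger = Task.current_task().get_logger()

    rows = []
    for name, value in metrics.items():
        # "loss/total" -> title "loss", series "total"; "lr" -> title and series "lr"
        title, _, series = name.partition("/")
        series = series or title
        logger.report_scalar(
            title=title, series=series, value=float(value), iteration=iteration
        )
        rows.append((title, series, float(value), iteration))
    return rows
```

Calling `report_metrics({"loss/total": 0.5, "lr": 0.01}, iteration=3)` inside the training loop would send two scalars for iteration 3. Passing a stand-in `logger` makes the helper easy to check without a ClearML server.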

2 years ago
0 Hi! I Am Currently Using Clearml (With Remote Execution), To Train An Object Detection Model With

Hi AgitatedDove14 sorry for the late reply. Yes, pod does get allocated 2 gpus. "script path" is "train_net_clearml.py"

2 years ago
0 Hi! I Am Currently Using Clearml (With Remote Execution), To Train An Object Detection Model With

Oh! Thank you for pointing that out! Didn’t notice that. Yes, it turns out I had specified that version in my requirements.txt. Once I changed it to the latest version of clearml, the tensorboard graphs show up in the dashboard.

2 years ago
0 Hi! I Am Currently Using Clearml (With Remote Execution), To Train An Object Detection Model With

AgitatedDove14 you can ignore my last question, I've tried it out on a minimal example here: https://github.com/levan92/clearml_test_mp

I've ascertained that I need Task.current_task() in order to trigger clearml ( Task.get_task() is not enough). Thank you!
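
A minimal sketch of that pattern, assuming the parent process (or the agent) has already created the task: each subprocess calls `Task.current_task()` before it starts logging. The names `attach_to_clearml`, `worker`, and `launch` are hypothetical, and the import guard is only there so the sketch also runs on a machine without clearml installed.

```python
import multiprocessing as mp


def attach_to_clearml():
    """Return the current ClearML task for this process, or None if unavailable."""
    try:
        from clearml import Task
    except ImportError:
        # clearml is not installed here; the sketch still runs without it
        return None
    # In a subprocess, this is the call that hooks the process into the
    # task created by the parent; Task.get_task() alone did not do that.
    return Task.current_task()


def worker(rank, world_size):
    """Entry point of one training subprocess (one per GPU in the original setup)."""
    task = attach_to_clearml()
    # ... build the trainer and run the training loop here ...
    return {"rank": rank, "world_size": world_size, "clearml_attached": task is not None}


def launch(world_size):
    """Start `world_size` worker processes and wait for all of them to finish."""
    procs = [mp.Process(target=worker, args=(rank, world_size)) for rank in range(world_size)]
    for proc in procs:
        proc.start()
    for proc in procs:
        proc.join()
    return [proc.exitcode for proc in procs]
```

In a real script, `launch(2)` would be called under an `if __name__ == "__main__":` guard after `Task.init(...)` in the parent; with detectron2, the same `Task.current_task()` call would go at the top of the function that each spawned process runs.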

2 years ago
0 Hi! I Am Currently Using Clearml (With Remote Execution), To Train An Object Detection Model With

I submitted the job through the bash script "train_coco.sh", which basically runs the python script "train_net_clearml.py" with various arguments.

2 years ago
0 Hi, I Am Currently Trying To Train With

you can take a look at the log, that's what I see on the UI

2 years ago