PanickyBee11
Moderator
3 Questions, 8 Answers
  Active since 17 April 2023
  Last activity 6 months ago
Reputation: 0
Badges: 1 (8 × Eureka!)
0 Votes 5 Answers 587 Views
Hi, I'm using the Hugging Face Trainer. Is there a way to capture grad_norm per layer? Thanks!
6 months ago
0 Votes 9 Answers 2K Views
Hi, I'm training on multi-node, but ClearML captures only a single machine's utilization (memory/CPU/etc.). I assume it captures node 0. Is there a way to make it repo...
2 years ago
0 Votes 3 Answers 1K Views
Is it possible to run in offline mode and still save the machine monitoring metrics? By default it is monitored for me in online mode but not in offline mode.
2 years ago
Hi, I'm using huggingface trainer, is there a way to capture grad_norm per layer? Thanks!

I guess they don't. Is there an easy way to add callbacks to the HF Trainer for reporting extra info?

6 months ago
Hi, I'm using huggingface trainer, is there a way to capture grad_norm per layer? Thanks!

I mean that the HF Trainer by default reports a single grad_norm scalar for the whole model to ClearML. I wonder if I can extend this to report grad_norm per layer.
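
To make the per-layer idea concrete, here is a minimal sketch of the norm computation itself. It is not something the HF Trainer or ClearML ships; `per_layer_grad_norms` is a hypothetical helper name, and grouping parameters by dropping the last component of their name (e.g. `weight` or `bias`) is an assumption about what "layer" means. It only relies on each parameter having a `.grad` with a `.norm(2)` method, which is how PyTorch tensors behave.

```python
import math
from collections import defaultdict


def per_layer_grad_norms(named_parameters):
    """Return {layer_name: L2 norm of all gradients belonging to that layer}.

    `named_parameters` is an iterable of (name, parameter) pairs, for example
    the result of `model.named_parameters()` on a PyTorch module.
    """
    squared = defaultdict(float)
    for name, param in named_parameters:
        grad = getattr(param, "grad", None)
        if grad is None:
            # Frozen or unused parameters have no gradient; skip them.
            continue
        # Assumption: the layer is the parameter name without its last
        # component, so "encoder.0.weight" and "encoder.0.bias" share a layer.
        layer = ".".join(name.split(".")[:-1]) or name
        # Norms combine as sqrt(sum of squares), so accumulate squares.
        squared[layer] += float(grad.norm(2)) ** 2
    return {layer: math.sqrt(total) for layer, total in squared.items()}
```

The function has to be called after the backward pass and before the gradients are zeroed. One possible place is a custom `transformers.TrainerCallback`; recent transformers releases document an `on_pre_optimizer_step` event for monitoring gradients, but please check that your installed version has it. Each returned value could then be sent with ClearML's `Logger.report_scalar`, using the layer name as the series.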

6 months ago
Hi, I'm training on multi-node, clearml captures only a single machine utility (memory/cpu/etc.). I assume it captures node 0. Is there a way to make it report all nodes?

It's launched with torchrun: https://pytorch.org/docs/stable/elastic/run.html

I think a prefix would be great. It would also make reporting scalars easier in general, since users would no longer need to add the rank label manually. It would also be great to support averaging across all nodes at the UI level. Currently we need a barrier to sync all nodes before reporting a scalar, which makes it slower.
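
For reference, this is roughly what the manual rank labelling looks like today. It is only a sketch: `rank_prefixed` and `report_scalar_with_rank` are hypothetical helper names, and the `rank_<n>/` title format is my own choice. It assumes the job is launched with torchrun, which sets the `RANK` environment variable for each worker, and that `logger` exposes `report_scalar(title, series, value, iteration)` the way ClearML's `Logger` does.

```python
import os


def rank_prefixed(title, env=None):
    """Prefix a scalar title with the worker's global rank, if one is set."""
    env = os.environ if env is None else env
    rank = env.get("RANK")  # set by torchrun for every worker process
    if rank is None:
        # Not a distributed run: leave the title unchanged.
        return title
    return f"rank_{rank}/{title}"


def report_scalar_with_rank(logger, title, series, value, iteration, env=None):
    """Report a scalar under a rank-specific title, without any barrier."""
    logger.report_scalar(
        title=rank_prefixed(title, env),
        series=series,
        value=value,
        iteration=iteration,
    )
```

In a training script the `logger` argument would typically be `clearml.Logger.current_logger()`. Every rank reports independently, so no sync is needed, but the averaging across nodes still has to happen somewhere else.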

2 years ago
2 years ago