Hi, I'm training on multi-node, and ClearML captures only a single machine's utilization (memory/CPU/etc.). I assume it captures node 0. Is there a way to make it report all nodes?
multiple machines and reporting to the same task.
Out of curiosity, how do you launch it on multiple machines?
So the "funny" thing is, they all report on top of (overwriting) each other...
In order for them to report individually, it might be that you need multiple Tasks (i.e. one per machine)
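A minimal sketch of the one-Task-per-machine idea, not a confirmed recipe. It assumes the launcher exposes the machine's rank through an environment variable such as `NODE_RANK` or `GROUP_RANK` (which variable exists depends on how the job is launched), and the helper names `node_rank`, `task_name_for_node` and `init_task_for_this_node` are made up for illustration. Whether each machine really ends up with its own Task also depends on the launch setup, so treat this as a starting point.

```python
import os


def node_rank(env=None):
    """Return this machine's rank, or 0 if no rank variable is set.

    NODE_RANK / GROUP_RANK are assumptions about the launcher's environment.
    """
    env = os.environ if env is None else env
    for key in ("NODE_RANK", "GROUP_RANK"):
        if key in env:
            return int(env[key])
    return 0


def task_name_for_node(base_name, rank):
    """Build a distinct Task name per machine (hypothetical naming scheme)."""
    return f"{base_name} [node {rank}]"


def init_task_for_this_node(project_name, base_name):
    """Create one ClearML Task for this machine, named after its rank."""
    # Imported lazily so the helpers above can be used without clearml installed.
    from clearml import Task

    return Task.init(
        project_name=project_name,
        task_name=task_name_for_node(base_name, node_rank()),
    )
```

Each machine would call `init_task_for_this_node("my project", "multi-node training")` once at startup, so the resource graphs land in separate Tasks instead of overwriting one another.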
Maybe we could somehow add a prefix with the rank to the CPU/network etc. series? Or should it be a different "title", wdyt?
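To make the two options concrete, here is a rough sketch of reporting per-rank values manually through the logger's `report_scalar(title, series, value, iteration)` call. This is a workaround sketch, not how ClearML's built-in resource monitoring works. The helper names, the `"machine"` title and the `rank{N}_` prefix format are all hypothetical, and collecting the actual CPU/memory numbers is left to the caller.

```python
def scalar_key(metric, rank, per_node_title=False):
    """Return (title, series) for a metric reported by a given rank.

    per_node_title=False: one shared graph, series prefixed with the rank.
    per_node_title=True:  a separate graph ("title") per rank.
    """
    if per_node_title:
        return f"machine (rank {rank})", metric
    return "machine", f"rank{rank}_{metric}"


def report_node_stats(logger, rank, stats, iteration, per_node_title=False):
    """Report a dict of {metric: value} so that ranks do not overwrite each other."""
    for metric, value in stats.items():
        title, series = scalar_key(metric, rank, per_node_title)
        logger.report_scalar(
            title=title, series=series, value=value, iteration=iteration
        )
```

With a real Task, `logger` would be `Task.current_task().get_logger()`, and something like `report_node_stats(logger, rank, {"cpu_usage": 42.0}, iteration=step)` would be called periodically on every machine.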
one year ago