I don’t think it is; I was rather wondering how you handled it, to understand potential sources of slowdown in the training code
Why do you ask? Is your server sluggish?
Is there one?
No, I rather wanted to understand how it worked behind the scenes 🙂
The latest RC (0.17.5rc6) moved all logging into a separate subprocess to improve speed with PyTorch dataloaders
That’s awesome!
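(For intuition, a minimal sketch of that idea, using hypothetical code rather than ClearML's actual internals: log records go onto a multiprocessing queue and a separate process writes them out, so neither the training loop nor the PyTorch dataloader workers wait on log I/O.)

```python
# Hypothetical sketch of "move logging into a separate subprocess".
# Not ClearML's actual code: it only shows why the main process (and the
# PyTorch dataloader workers it spawns) stops paying for log I/O.
import multiprocessing as mp


def log_writer(q):
    # Runs in its own process; the slow I/O happens here, not in the trainer.
    while True:
        record = q.get()
        if record is None:  # sentinel: time to shut down
            break
        print(f"[log-subprocess] {record}", flush=True)


if __name__ == "__main__":
    log_queue = mp.Queue()
    writer = mp.Process(target=log_writer, args=(log_queue,), daemon=True)
    writer.start()

    for step in range(3):
        # Training / dataloading happens here; "logging" is just a cheap queue put.
        log_queue.put(f"step {step}: scalar reported")

    log_queue.put(None)  # flush and stop the writer
    writer.join()
```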
JitteryCoyote63
Are the calls from the agents made asynchronously / in a non-blocking separate thread?
You mean whether request processing on the apiserver is multi-threaded / multi-processed?
I mean, when sending data from the clearml-agents, does it block the training while sending metrics, or is it done in parallel to the main thread?
Multi-threaded, multi-process, multi-node 🙂
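(As an illustration of that pattern, not ClearML's real reporting code: the training loop only enqueues metrics, and a daemon thread drains the queue and performs the slow send, so reporting never blocks a training step.)

```python
# Illustration of non-blocking metric reporting, assuming a queue plus a
# background thread. Hypothetical code, not ClearML's actual reporter.
import queue
import threading
import time


def reporter(q):
    while True:
        item = q.get()
        if item is None:  # sentinel: stop reporting
            break
        time.sleep(0.1)  # stand-in for the network call to the server
        print(f"sent {item}")


metrics = queue.Queue()
worker = threading.Thread(target=reporter, args=(metrics,), daemon=True)
worker.start()

for step in range(5):
    loss = 1.0 / (step + 1)            # stand-in for real training work
    metrics.put(("loss", step, loss))  # returns immediately; training is not blocked

metrics.put(None)  # ask the reporter to drain and exit
worker.join()
```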