ExcitedFish86 regarding the <1 version - are you talking about ClearML Server or ClearML SDK?
Task.init should be called before PyTorch distributed is initialized; then in each process you need to call Task.current_task() to get the Task instance (and make sure the logs are tracked).
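A minimal sketch of that ordering, in case it helps. Assumptions: the project/task names, the `series_for_rank` helper, the "gloo" backend, and the address/port defaults are placeholders for illustration, not something from this thread.

```python
import os


def series_for_rank(rank: int) -> str:
    """Hypothetical helper: one scalar series per worker process."""
    return f"rank_{rank}"


def worker(rank: int, world_size: int) -> None:
    # Imported lazily so the file can be loaded without torch/clearml installed.
    import torch.distributed as dist
    from clearml import Task

    # Placeholder rendezvous settings for a single-machine run.
    os.environ.setdefault("MASTER_ADDR", "127.0.0.1")
    os.environ.setdefault("MASTER_PORT", "29500")
    dist.init_process_group(backend="gloo", rank=rank, world_size=world_size)

    # Pick up the Task created by the parent instead of calling Task.init again.
    task = Task.current_task()
    if task is not None:
        task.get_logger().report_scalar(
            title="loss", series=series_for_rank(rank), value=0.0, iteration=0
        )

    dist.destroy_process_group()


def main(world_size: int = 2) -> None:
    import torch.multiprocessing as mp
    from clearml import Task

    # Task.init runs once, before any worker process is spawned.
    Task.init(project_name="examples", task_name="ddp sketch")
    mp.spawn(worker, args=(world_size,), nprocs=world_size, join=True)
```

Call main() from under an `if __name__ == "__main__":` guard in your script, since the spawned workers re-import the module.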
An easier fix for now will probably be some kind of warning to the user that a task is created but not connected
Maybe we should rename it?! It actually creates a Task but will not auto-connect it...
I think so. IMHO all API calls should maybe reside in a different module since they usually happen inside some control code
That is a good point. Maybe if you do not have a "main" Task, we print the warning (with some flag to disable it)?
sounds great.
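Roughly what that warning could look like, as a sketch only. `warn_if_no_main_task` and its `suppress` flag are hypothetical names made up here, not part of the ClearML SDK; the caller would pass in whatever Task.current_task() returns.

```python
import warnings
from typing import Optional


def warn_if_no_main_task(main_task: Optional[object], suppress: bool = False) -> bool:
    """Hypothetical check: warn when a task is created with no main Task.

    Returns True if a warning was issued, False otherwise.
    """
    # Nothing to report if a main Task exists or the user disabled the warning.
    if main_task is not None or suppress:
        return False
    warnings.warn(
        "A task was created but is not connected to the running process.",
        UserWarning,
        stacklevel=2,
    )
    return True
```

Returning a bool keeps the check easy to test, and the flag gives control code that creates tasks on purpose a way to silence it.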
BTW the code is working now out-of-the-box. Just 2 magic lines: import + Task.init
Hi ExcitedFish86
In Pytorch-Lightning I use DDP
I think a fix for PyTorch multi-node / multi-process distribution was committed to 1.0.4rc1, could you verify it solves the issue? (rc1 should fix this specific issue)
BTW: no problem working with clearml-server < 1
ExcitedFish86 You came to ClearML because it's free, you stayed because of the magic 🎊 🎉