Do you tend to create separate experiments for each fold?
If you really want to parallelize the workload, then splitting it into multiple executions (i.e. passing the fold index of the same CV as an argument) makes sense; you can then compare / sort the results based on a specific metric. That said, if speed is not important, just having a single script that loops over all the CV folds might be easier to implement.
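A minimal sketch of the "one execution per fold" idea, assuming a script that receives the fold index on the command line. The flag names `--fold` and `--n-folds` and the strided split in `fold_indices` are made up for illustration; your existing CV code would supply its own splitter.

```python
import argparse


def parse_args(argv=None):
    # Hypothetical flags: which fold this execution handles, out of how many.
    parser = argparse.ArgumentParser(description="Run one cross-validation fold")
    parser.add_argument("--fold", type=int, required=True)
    parser.add_argument("--n-folds", type=int, default=5)
    args = parser.parse_args(argv)
    if not 0 <= args.fold < args.n_folds:
        parser.error("--fold must satisfy 0 <= fold < n-folds")
    return args


def fold_indices(n_samples, n_folds, fold):
    """Return (train, test) index lists for one fold using a simple strided split."""
    test = [i for i in range(n_samples) if i % n_folds == fold]
    train = [i for i in range(n_samples) if i % n_folds != fold]
    return train, test
```

Each parallel worker would then be launched with a different `--fold` value, and the per-fold results compared afterwards.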
Thanks AgitatedDove14. I am not using ClearML for scheduling/execution at this stage. I am evaluating ClearML for adding reporting to our current workflow. We have existing (parallelised) code for cross-validating models and I am playing with how best to log training/testing to ClearML. One thought is to initialise a new ClearML task in each fold to capture the iteration-level metrics, and then create another task/experiment at the end to capture the aggregated metrics across folds. Alternatively, I could simply dump all fold and aggregated metrics into a single experiment. I don't have a good feel yet as to the pros and cons and was wondering if anyone had any advice.
One thought is to initialise a new ClearML task in each fold to capture the iteration-level metrics, and then create another task/experiment at the end to capture the aggregated metrics across folds.
That is probably the easiest, and the most scalable.
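A rough sketch of that layout, assuming the per-fold validation scores are already available as lists. The project name `cv-demo`, the task names, and the helper `log_cross_validation` are placeholders, not anything from ClearML itself; the only ClearML calls used are `Task.init`, `Task.get_logger`, `Logger.report_scalar` and `Task.close`. The task factory is passed in as an argument so the logic can be tried out without a ClearML server.

```python
from statistics import mean, stdev


def clearml_task_factory(project_name, task_name):
    # Imported lazily so the rest of the sketch runs without ClearML installed.
    from clearml import Task
    # reuse_last_task_id=False asks for a fresh task rather than reusing an old one.
    return Task.init(project_name=project_name, task_name=task_name,
                     reuse_last_task_id=False)


def log_cross_validation(fold_histories, project_name="cv-demo",
                         task_factory=clearml_task_factory):
    """fold_histories: one list of per-iteration validation scores per fold (>= 2 folds)."""
    final_scores = []
    for fold, history in enumerate(fold_histories):
        # One task per fold holds the iteration-level metrics.
        task = task_factory(project_name, f"fold-{fold}")
        logger = task.get_logger()
        for iteration, score in enumerate(history):
            logger.report_scalar(title="validation", series="score",
                                 value=score, iteration=iteration)
        final_scores.append(history[-1])
        task.close()  # close before starting the next task in the same process

    # A separate task holds the metrics aggregated across folds.
    summary = {"mean": mean(final_scores), "std": stdev(final_scores)}
    task = task_factory(project_name, "cv-summary")
    logger = task.get_logger()
    for fold, score in enumerate(final_scores):
        logger.report_scalar(title="final score", series="per fold",
                             value=score, iteration=fold)
    for name, value in summary.items():
        logger.report_scalar(title="aggregate", series=name, value=value, iteration=0)
    task.close()
    return summary
```

If the folds already run in parallel processes, each process would create its own fold task, and only the summary step needs to collect the final scores in one place.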
BTW: with the new reporting feature, you can integrate the comparison of the CV directly into your final report 🙂