hard to say, maybe just “related experiments” in experiment info would be enough. I’ll think about it
more like collapse/expand, I guess. or pipelines that you can compose after running experiments to see that experiments are connected to each other
I guess. or pipelines that you can compose after running experiments to see that experiments are connected to each other
hmm what do you mean by "compose after running experiments" ? like a way to group them? what is the relation between one "item" to another ?
If this is a sequence of Tasks , are they executed by a controller ?
that's right
for example, there are tasks A, B, C
we run multiple experiments for A, finetune some of them in separate tasks, then choose one or more best checkpoints, run some experiments for task B, choose the best experiment, and finally run task C
so we get a chain of tasks: A - A-ft - B- C
ClearML pipeline doesn't quite work here because we would like to analyze results of each step before starting next task
but it would be great to see predecessors of each experiment in the chain
nope, that's the point, quite often we run experiments separately, but they are related to each other. currently there's no way to see that one experiment is using checkpoint from the previous experiment since we need to manually insert S3 link as a hyperparameter. it would be useful to see these connections. maybe instead of grouping we could see which experiments are using artifacts of this experiment
DilapidatedDucks58 so is this more like a pipeline DAG that is built ?
I'm assuming this is more than just grouping ?
(by that I mean, accessing a Tasks artifact does necessarily point to a "connection", no? Is it a single Task everyone is accessing, or a "type" of a Task ?
Is this process fixed, i.e. for a certain project we have a flow (1) executed Task of type A, then Task of type (B) using the artifacts fro Task (A). This implies we might have multiple Tasks of types A/B but they are always used this way. wdyt?
parents and children. maybe tags, maybe separate tab or section, idk. I wonder if anyone else is interested in this functionality, for us this is a very common case
it would be nice to group experiments within projects
DilapidatedDucks58 you mean is collapse/expand ? or in something like "sub-project" ?
Could you use tags for that? In that case you can easily filter on which group you're interested in, or do you have a more impactful UI change in mind to implement groups? 🙂
The built in HPO uses tags to group experiment runs together and actually use the original optimizer task ID as tag to be able to quickly go back and see where they came from. You can find an example in the ClearML Examples project.
DilapidatedDucks58 Nice!
but it would be great to see predecessors of each experiment in the chain
So maybe we should add "manual pipeline" to create the connection post execution ? is this a one time thing ?
Maybe a service creating these flow charts ?
Should we put them in the Project's readme ? Or in the Pipeline section (coming soon)
tags are somewhat fine for this, I guess, but there will be too many of them eventually, and they do not reflect sequential nature of the experiments