hard to say, maybe just "related experiments" in experiment info would be enough. I'll think about it
I'm so happy to see that this problem has been finally solved!
more like collapse/expand, I guess. or pipelines that you can compose after running experiments to see that experiments are connected to each other
maybe I should use explicit reporting instead of Tensorboard
I guess I could manually explore different containers and their content. as far as I remember, I had to update Elastic records when we moved to the new cloud provider in order to update model URLs
sorry, my bad, after some manipulations I made it work. I have to manually change HTTP to HTTPS in the config file for the Web and Files (not API) servers after initialization, but besides that it works
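A rough sketch of that manual edit, in case it helps someone script it. It assumes the config file has `web_server`, `api_server` and `files_server` entries with `http://` URLs; those key names, the sample hosts and the `force_https` helper are assumptions for illustration, not something confirmed in this thread.

```python
import re

# Assumed key names; only the Web and Files entries are rewritten, the API entry is left alone.
_HTTP_ENTRY = re.compile(
    r'^(\s*(?:web_server|files_server)\s*[:=]\s*"?)http://',
    re.MULTILINE,
)


def force_https(config_text: str) -> str:
    """Return config_text with http:// replaced by https:// for the Web and Files entries."""
    return _HTTP_ENTRY.sub(r"\1https://", config_text)


# Placeholder config fragment with made-up hosts and ports.
sample = """api {
    web_server: http://example.com:8080
    api_server: http://example.com:8008
    files_server: http://example.com:8081
}
"""
updated = force_https(sample)
print(updated)
```

Working on the text rather than the file keeps the sketch easy to check before overwriting anything.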
yeah, I am aware of trains-agent, we are planning to start using it soon, but still, copying original training command would be useful
Error 12 : Validation error (value '['13b46b9325954517ab99381d5f45237d', 'bc76c3a7f0f6431b8e064212e9bdd2c0', '5d2a57cd39b94250b8c8f52303ccef92', 'e4731ee5b33e41d992d6d3fdb2913045', '698d9231155e41fbb61f8f3faa605727', '2171b190507f40d1be35e222045c58ea', '55c81a5db0ad40bebf72fdcc1b3be2a4', '94fbdbe26ef242d793e18d955cb3de58', '7d8a6c8f2ae246478b39ae5e87def2ad', '141594c146fe495886d477d9a27c465f', '640f87b02dc94a4098a0aba4d855b8f5']' length is bigger than allowed maximum '10'.)
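The message reads as if the request takes at most 10 IDs and 11 were sent. A possible client-side workaround, assuming the limit is per request and the operation can be split across several requests, is to send the IDs in batches. `chunked` is a hypothetical helper and the IDs are placeholders.

```python
from typing import Iterator, List, Sequence


def chunked(ids: Sequence[str], max_len: int = 10) -> Iterator[List[str]]:
    """Yield consecutive batches of at most max_len IDs, preserving order."""
    if max_len < 1:
        raise ValueError("max_len must be positive")
    for start in range(0, len(ids), max_len):
        yield list(ids[start:start + max_len])


# Eleven placeholder IDs, the same count as in the failing request.
task_ids = [f"task-{i:02d}" for i in range(11)]
batches = list(chunked(task_ids))
for batch in batches:
    # Each batch would go into its own request (the request itself is not shown here).
    print(len(batch), batch[0], "...", batch[-1])
```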
new icons are slick, it would be even better if you could upload custom icons for the different projects
not quite. for example, I'm not sure which info is stored in Elastic and which is in MongoDB
1 - yes, of course =) but it would be awesome if you could customize the content - to include key metrics and hyperparameters, for example
3 - hooooooraaaay
the weird part is that the old job continues running when I recreate the worker and enqueue the new job
nope, that's the point, quite often we run experiments separately, but they are related to each other. currently there's no way to see that one experiment is using a checkpoint from the previous experiment since we need to manually insert an S3 link as a hyperparameter. it would be useful to see these connections. maybe instead of grouping we could see which experiments are using artifacts of this experiment
tags are somewhat fine for this, I guess, but there will be too many of them eventually, and they do not reflect sequential nature of the experiments
that's right
for example, there are tasks A, B, C
we run multiple experiments for A, finetune some of them in separate tasks, then choose one or more best checkpoints, run some experiments for task B, choose the best experiment, and finally run task C
so we get a chain of tasks: A - A-ft - B - C
ClearML pipeline doesn't quite work here because we would like to analyze the results of each step before starting the next task
but it would be great to see predecessors of each experiment in the chain
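To make the request concrete, here is a minimal sketch of the lookup I have in mind, in plain Python and not tied to any ClearML API. It assumes each experiment records the single experiment whose checkpoint it started from; the `parents` mapping and both helper names are hypothetical.

```python
from typing import Dict, List, Optional

# Hypothetical record: experiment name -> experiment whose checkpoint it used (None for a root).
parents: Dict[str, Optional[str]] = {
    "A": None,
    "A-ft": "A",
    "B": "A-ft",
    "C": "B",
}


def predecessors(name: str, parents: Dict[str, Optional[str]]) -> List[str]:
    """Return the chain of predecessors of name, nearest first."""
    chain: List[str] = []
    seen = {name}
    current = parents.get(name)
    while current is not None:
        if current in seen:
            raise ValueError(f"cycle detected at {current!r}")
        chain.append(current)
        seen.add(current)
        current = parents.get(current)
    return chain


def users_of(name: str, parents: Dict[str, Optional[str]]) -> List[str]:
    """Return the experiments that directly use artifacts of name."""
    return [child for child, parent in parents.items() if parent == name]


print(predecessors("C", parents))  # ['B', 'A-ft', 'A']
print(users_of("A", parents))      # ['A-ft']
```

The first helper answers "where did this experiment come from", the second answers "which experiments are using artifacts of this experiment".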