Great, thanks both! I suspect this might need an extra option to be passed via the SDK, to save the iteration scaling at logging time, which the UI can then use at rendering time.
(apologies for the delay @<1523701087100473344:profile|SuccessfulKoala55>, we got called into meetings. Really appreciate your responsiveness!)
Thanks @<1523701070390366208:profile|CostlyOstrich36>! Will do - and I might even peek under the hood to see if I can make a PR. What's the best repo for that? Is it the one for the ClearML Python package?
Thanks. That would be very helpful. Some of our graphs are logged by optimization step, whereas others are logged by epoch, so having them all labelled "Iterations" is not ideal.
Thanks @<1523703436166565888:profile|DeterminedCrab71>. Yes, I've seen the three options to plot different things. What I'm trying to do is for the "Iterations" plot to keep the same time series but just change the X label. In matplotlib that would be a call to xlabel.
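To make the analogy concrete, here is a minimal matplotlib sketch (dummy data) of what I mean: same series, same plot, only the x-axis label changes.

```python
import matplotlib.pyplot as plt

# Dummy series for illustration: same data, same plot,
# only the x-axis label differs from the default.
steps = list(range(100))
loss = [1.0 / (s + 1) for s in steps]

plt.plot(steps, loss)
plt.xlabel("optimizer step")  # what I'd like instead of "Iterations"
plt.ylabel("loss")
plt.show()
```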
Logging scalars also goes through ClearML's automatic logging. One problem is that this automatic logging seems to keep its own internal "iteration" counter for each scalar, as opposed to tracking, say, the optimizer's number of steps.
That could be fixed fairly simply in the ClearML Python lib by allowing a per-scalar iteration multiplier to be set.
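In the meantime, the workaround I can think of (a rough sketch, with placeholder project/series names and a dummy loss) is to bypass the automatic counter and report scalars explicitly, passing the optimizer step as the iteration value:

```python
from clearml import Task

task = Task.init(project_name="examples", task_name="manual scalar iterations")
logger = task.get_logger()

for optimizer_step in range(0, 1000, 10):
    loss = 1.0 / (optimizer_step + 1)  # dummy metric, stands in for the real loss
    # Reporting explicitly bypasses the per-scalar internal counter:
    # the x-axis value is whatever we pass as `iteration`.
    logger.report_scalar(
        title="train",
        series="loss",
        value=loss,
        iteration=optimizer_step,
    )
```

A per-scalar multiplier in the SDK would essentially do this scaling for us at logging time.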
Can the “multiple agents on a single queue” scenario, combined with the autoscaler, spawn multiple agents on a single EC2 instance? (thinking of e.g. 8 agents on an 8xGPU machine)
OK, so there's no way to have automatic dispatch to different, correctly-sized instances; it's only achievable by submitting to different queues?
Thanks @<1523701070390366208:profile|CostlyOstrich36> !
- I hadn’t found the multiple-resources option within the same autoscaler. Could you point me to the right place, please? Are they all used interchangeably based on availability, rather than on job needs?
- We thought of using separate queues (we already do that for CPU vs GPU), but having ClearML automatically dispatch to the right one based on a job specification would be more flexible. (for example, we could then think of dispatching dynami...
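For context, what we do today is essentially manual dispatch at enqueue time; a rough sketch (queue names and the memory threshold are made up) of the kind of "right-sizing" logic we'd love ClearML to handle from a job specification:

```python
from clearml import Task

def enqueue_by_needs(task: Task, needs_gpu: bool, gpu_mem_gb: int = 0) -> None:
    """Pick a queue from coarse job requirements (placeholder queue names)."""
    if not needs_gpu:
        queue = "cpu"
    elif gpu_mem_gb <= 16:
        queue = "gpu_small"
    else:
        queue = "gpu_large"
    Task.enqueue(task, queue_name=queue)

# e.g. enqueue_by_needs(my_task, needs_gpu=True, gpu_mem_gb=40)
```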
@<1523701205467926528:profile|AgitatedDove14> great! (I'm on the Pro version :) ).
@<1523701070390366208:profile|CostlyOstrich36> Any idea, please? We could use our 8xA100 machine as 8 workers, so 8 single-GPU jobs would each run faster than they would on a single 1xV100.
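In case it helps the discussion, this is roughly what I imagine for the 8-worker setup. A sketch only: it assumes the clearml-agent CLI is installed and configured on the machine, that the `--gpus`/`--detached` flags behave as I remember, and the queue name is a placeholder.

```python
import subprocess

# Spawn one clearml-agent per GPU on an 8-GPU box, all pulling from the same queue.
for gpu_index in range(8):
    subprocess.run(
        [
            "clearml-agent", "daemon",
            "--queue", "single_gpu",    # placeholder queue name
            "--gpus", str(gpu_index),   # pin this agent to one GPU
            "--detached",               # keep it running in the background
        ],
        check=True,
    )
```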
Tagging my colleague @<1529271085315395584:profile|AmusedCat74>, who ran into this issue with me.
It was a debugging session. We haven’t yet tried a “Standard”, non-debugging ClearML session.
And yes, I was also referring to tasks run by the Autoscaler (potentially via the HPO app), too.
Tagging my colleague @<1529271085315395584:profile|AmusedCat74> who needs this with me 🙂
This great tool is worth paying for!
Do Pipelines work with Hyperparameter search, and with single training jobs?
Yes, we love the HPO app, and are using it :)
(do you welcome PRs?)
The problem with logging as a 2D plot is that we lose the streaming: if I understand the documentation correctly, Logger.current_logger().report_scatter2d logs a single, frozen 2D plot once you already know the full X and Y data, and you would have to call it at each evaluation step.
Logging scalars, on the other hand, lets you log a growing time series, i.e. append to the existing series/plot at every "iteration", so you can monitor progress over time in one single plot. It's a much more logical setting.
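A minimal sketch of the difference I mean (project/series names and the metric are placeholders): report_scalar grows one curve call by call, whereas report_scatter2d takes the full X/Y data and produces a standalone plot.

```python
from clearml import Task

task = Task.init(project_name="examples", task_name="streaming vs frozen plots")
logger = task.get_logger()

xs, ys = [], []
for step in range(100):
    metric = (step / 100.0) ** 2  # dummy evaluation metric
    xs.append(step)
    ys.append(metric)

    # Streaming: each call appends one point to the same scalar series,
    # so the curve grows in the UI as evaluation progresses.
    logger.report_scalar(title="eval", series="metric", value=metric, iteration=step)

# Frozen: needs the full X/Y data up front and produces one standalone plot.
logger.report_scatter2d(
    title="eval (frozen)",
    series="metric",
    iteration=len(xs),
    scatter=[[x, y] for x, y in zip(xs, ys)],
    xaxis="step",
    yaxis="metric",
)
```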
Tagging my colleague @<1529271085315395584:profile|AmusedCat74> who made that report.
Dang, so unlike screenshots, reports do not survive task deletion :/