Ah okay 🙂 Was confused by what you quoted haha 🙂
Any simple ways around this for now? @<1523701070390366208:profile|CostlyOstrich36>
That's exactly what I meant AgitatedDove14 🙂 It's just that to access that comparison page, you have to make a comparison first. It would be handy to have a link (in the sidebar?) to an empty comparison
So caching results for steps with the same arguments is trivial. Ultimately I would say you can combine the task-based pipeline with a function-based pipeline to achieve such dynamic control as you specified in the first two scenarios.
About the third scenario I'm not sure. If the configuration has changed, shouldn't the relevant steps (the ones where the configuration changed and their dependent steps) be rerun?
In any case, I think if you stay away from the decorators, at the cost of a bi...
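Something along these lines should work (a rough, untested sketch; the project, step names and caching flags here are my own illustration, not from this thread):

```python
from clearml import PipelineController

def preprocess(raw_path: str) -> str:
    # Placeholder function step; replace with real preprocessing
    return raw_path + ".processed"

pipe = PipelineController(name="example-pipeline", project="examples", version="0.0.1")

# Function-based step, cached so identical arguments reuse the previous result
pipe.add_function_step(
    name="prep",
    function=preprocess,
    function_kwargs=dict(raw_path="/data/raw"),
    function_return=["processed_path"],
    cache_executed_step=True,
)

# Task-based step, cloned from an existing experiment
pipe.add_step(
    name="train",
    parents=["prep"],
    base_task_project="examples",
    base_task_name="training task",
    cache_executed_step=True,
)

pipe.start_locally(run_pipeline_steps_locally=True)
```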
Basically you have the details from the Dataset page, why should it be mixed with the others?
Because maybe it contains code and logs on how to prepare the dataset. Or maybe the user just wants increased visibility for the dataset itself in the tasks view.
Why would you need the Dataset Task itself? That is the main question.
For the same reason as above. Visibility and ease of access. Coupling relevant tasks and dataset in the same project makes it easier to understand that they're...
I think you're looking for the execute_remotely function?
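For reference, a minimal sketch of how execute_remotely is typically used (the project, task and queue names are just examples):

```python
from clearml import Task

task = Task.init(project_name="examples", task_name="remote run")
# On the local run this enqueues the task and exits; an agent then re-executes the script
task.execute_remotely(queue_name="default", clone=False, exit_process=True)

# Everything below this point only runs on the agent
```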
Hmmm, what 🙂
We'd be happy if ClearML captures that (since it uses e.g. pip, then we have the git + commit hash for reproducibility), as it claims it would 🙂
Any thoughts CostlyOstrich36?
Yes, I've found that too (as mentioned, I'm familiar with the repository). My issue is still that there is no documentation as to what this actually offers.
Is this simply a helm chart to run an agent on a single pod? Does it scale in any way? Basically - is it a simple agent (similar to on-premise agents, running in the background, but here on K8s), or is it a more advanced one that offers scaling features? What is it intended for, and how does it work?
The official documentation is very spa...
But... Which queue does it listen to, and which type of instances will it use, etc.?
We're using karpenter (more magic keywords for me), so my understanding is that it will manage the scaling part.
Yes exactly 🙂 Good news.
Anything else you'd recommend paying attention to when setting up the clearml-agent helm chart?
Much much appreciated 🙂
I guess it depends on what you'd like to configure.
Since we let the user choose parents, component name, etc., we cannot use the decorators. We also infer required packages at runtime (the auto-detection based on import statements fails with a non-trivial namespace) and need to set them on all components, so the decorators do not work for us.
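For context, this is roughly what we do instead of the decorators (sketch only; the packages and parents values are placeholders, and the exact add_function_step arguments may differ between versions):

```python
from clearml import PipelineController

def component(x: int) -> int:
    return x * 2

pipe = PipelineController(name="dynamic-pipeline", project="examples", version="0.0.1")
pipe.add_function_step(
    name="user_chosen_name",      # user-provided component name
    function=component,
    function_kwargs=dict(x=1),
    parents=[],                   # user-provided parents
    packages=["pandas==2.1.0", "scikit-learn"],  # inferred at runtime, set explicitly
)
```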
Right, so where can one find documentation about it?
The repo just has the variables without much explanation.
Either, honestly, would be great. I meant even just a link to a blank comparison and one can then add the experiments from that view
One more UI question TimelyPenguin76, if I may -- it seems one cannot simply report single integers. The report_scalar feature creates a plot of a single data point (or single iteration).
For example, if I want to report a scalar "final MAE" for easier comparison, it's kinda impossible
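This is what I mean (illustrative values; the call below just produces a one-point scalar plot):

```python
from clearml import Task

task = Task.init(project_name="examples", task_name="report final metric")
# Ends up as a scalar plot with a single point at iteration 0, not a standalone number
task.get_logger().report_scalar(title="final MAE", series="value", value=0.123, iteration=0)
```

(If your ClearML version exposes Logger.report_single_value, that may be the cleaner way to report a standalone number, but I haven't verified it here.)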
Also I can't select any tasks from the dashboard search results
I'd like to remove the hidden system tag from a project
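In case it helps, my rough understanding is that this can be done through the backend API. This is very much an assumption; the project name is hypothetical and the field names may differ:

```python
from clearml.backend_api.session.client import APIClient

client = APIClient()
project = client.projects.get_all(name="my-project")[0]   # hypothetical project name
# Assumes the project object exposes system_tags and that projects.update accepts it
tags = [t for t in (project.system_tags or []) if t != "hidden"]
client.projects.update(project=project.id, system_tags=tags)
```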
Hm, I did not specify any specific versions previously. What was the previous default?
Yes, you're correct, I misread the exception.
Maybe it hasn't completed uploading? At least for Datasets one needs to explicitly wait IIRC
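For artifacts, something like this should block until the upload is done (sketch; the artifact name and content are made up):

```python
from clearml import Task

task = Task.init(project_name="examples", task_name="upload artifact")
# wait_on_upload blocks until the artifact has actually been uploaded
task.upload_artifact(name="results", artifact_object={"mae": 0.1}, wait_on_upload=True)
```

For Datasets, IIRC the usual pattern is dataset.upload() followed by dataset.finalize().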
Generally, really. I've struggled recently (and in the past), because the documentation seems:
- Very complete wrt the available SDK (though the formatting is sometimes off)
- Very lacking wrt how things interact with one another
A lot of what I need I actually find by plunging into the source code.
I think ClearML would benefit a lot if it adopted a documentation structure similar to the numpy ecosystem (numpy, pandas, scipy, scikit-image, scikit-bio, scikit-learn, etc.)
I'm trying to build an easy SDK that would fit DS work and fit the concept of clearml pipelines.
In doing so, I'm planning to define various Step classes that the user can then experiment with, providing Steps as input to other steps, etc.
Then I'd like for the user to be able to run any such step, either locally or remotely. Locally is trivial. Remotely is the issue. I understand I'll need to upload additional data to the remote instance, and pull a specific artifact back to the notebo...
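Roughly the shape I have in mind (not a real implementation, just to illustrate the local/remote split; the project name and queue are placeholders):

```python
from clearml import Task


class Step:
    """Wraps a user function so it can run locally or be re-launched on an agent."""

    def __init__(self, name, fn, parents=None):
        self.name = name
        self.fn = fn
        self.parents = parents or []

    def run(self, *args, remotely=False, queue="default", **kwargs):
        if remotely:
            task = Task.init(project_name="steps", task_name=self.name)
            # Exits the local process; an agent re-runs this code and executes fn there
            task.execute_remotely(queue_name=queue, clone=False, exit_process=True)
        return self.fn(*args, **kwargs)
```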
Yes, thanks AgitatedDove14! It's just that the configuration object passed onwards was a bit confusing.
Is there a planned documentation overhaul? 🤔
It should store it on the fileserver, perhaps you're missing a configuration option somewhere?
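If it helps, the option I'd check first is output_uri on Task.init (or sdk.development.default_output_uri in clearml.conf). The URL below is just a placeholder:

```python
from clearml import Task

task = Task.init(
    project_name="examples",
    task_name="store on fileserver",
    output_uri="http://localhost:8081",  # point this at your fileserver
)
```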
First bullet point - yes, exactly
Second bullet point - all of it, really. The SDK documentation and the examples.
For example, the Task object is heavily overloaded and its documentation would benefit from being separated into logical units of work. It would also make it easier for the ClearML team to spot any formatting issues.
Any linked GitHub example is welcome, but some visualization/inline code with an explanation is also very much welcome.