Yes, that's what I thought, thanks for confirming.
No, that does not seem to work. I get:
task.execute_remotely(queue_name="default")
2024-01-24 11:28:23,894 - clearml - WARNING - Calling task.execute_remotely is only supported on main Task (created with Task.init)
Defaulting to self.enqueue(queue_name=default)
Any follow-up thoughts, @<1523701070390366208:profile|CostlyOstrich36> , or maybe @<1523701087100473344:profile|SuccessfulKoala55> ? 🤔
Thanks @<1537605940121964544:profile|EnthusiasticShrimp49> ! That’s definitely the route I was hoping to go, but the `create_function_task` is still a bit of a mystery, as I’d like to use an entire class with relevant logic and proper serialization for inputs, and potentially I’ll need to add more “helper functions” (as in the case of `DataTransformationStep`, for example). Any thoughts on that? 🤔
I can elaborate in more detail if you have the time, but generally the code is just defined in some source files.
I’ve been trying to play around with pipelines for this purpose, but as suspected, it fails finding the definition for the pickled object…
Then I wonder:
- How to achieve this? The pipeline controller seems to only work with functions, not classes, so running smaller steps remotely seems more difficult than I imagined. I was already prepared to upload artifacts myself etc., but now I’m not sure?
- Do I really need to recreate the pipeline every time from scratch? Or can I remove/edit steps? It’s mostly used as a… controller for notebook-based executions and experimentations, before the actual pipeline is known. That is, it will ...
Hey @<1537605940121964544:profile|EnthusiasticShrimp49> ! You’re mostly correct. The `Step` classes will be predefined (of course developers are encouraged to add/modify as needed), but as in the `DataTransformationStep`, there may be user-defined functions specified. That’s not a problem though, I can provide these functions with the `helper_functions` argument.
- The `.add_function_step` is indeed a failing point. I can’t really create a task from the notebook because calling `Ta...
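For anyone following along, here is a rough sketch of passing user-defined helpers through `helper_functions` when registering a function step. The names `scale`, `transform_step`, `build_pipeline`, and the project/pipeline names are all made up for illustration, and `build_pipeline` assumes clearml is installed and a server is configured. It does not address the class-based case discussed above.

```python
def scale(values, factor):
    """Hypothetical user-defined helper used inside the step."""
    return [v * factor for v in values]


def transform_step(values, factor=2):
    """Hypothetical step body; it relies on the `scale` helper above."""
    return scale(values, factor)


def build_pipeline():
    """Register the step. Requires clearml and a configured server."""
    from clearml import PipelineController  # lazy import, not needed to test the step locally

    pipe = PipelineController(
        name="demo-pipeline",      # assumed name
        project="demo-project",    # assumed project
        version="0.0.1",
    )
    pipe.add_function_step(
        name="transform",
        function=transform_step,
        function_kwargs={"values": [1, 2, 3], "factor": 2},
        function_return=["scaled"],
        # Helpers listed here are packaged together with the step function.
        helper_functions=[scale],
    )
    return pipe
```

The step function and its helper stay plain Python, so they can be checked in a notebook before any pipeline is built, e.g. `transform_step([1, 2, 3])`.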
No worries @<1537605940121964544:profile|EnthusiasticShrimp49> ! I made some headway by using `Task.create`, writing a temporary Python script, and using `task.update` in a similar way to how pipeline steps are created.
I'll try and create an MVC to reproduce the issue, though I may have strayed from your original suggestion because I need to be able to use classes and not just functions.
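Roughly, the temporary-script idea could look like the sketch below. This is not the exact code I ended up with: `write_step_script` and `create_and_enqueue` are hypothetical helpers, the class is assumed to expose a no-argument `run()` method, and the `task.update` part is left out. How `Task.create` picks up the script may also depend on the clearml version and on whether the file lives in a git repository.

```python
import textwrap
from pathlib import Path


def write_step_script(class_source, class_name, directory):
    """Write a standalone script that defines the step class and runs it.

    Assumes the class can be constructed without arguments and has run().
    """
    body = textwrap.dedent(class_source).strip()
    script = (
        f"{body}\n\n\n"
        'if __name__ == "__main__":\n'
        f"    {class_name}().run()\n"
    )
    path = Path(directory) / f"run_{class_name.lower()}.py"
    path.write_text(script)
    return path


def create_and_enqueue(script_path, project_name, task_name, queue_name):
    """Create a task from the script and enqueue it. Requires clearml and a server."""
    from clearml import Task  # lazy import so the script writer works without clearml

    task = Task.create(
        project_name=project_name,
        task_name=task_name,
        script=str(script_path),
    )
    Task.enqueue(task, queue_name=queue_name)
    return task
```

Passing the class source as text (rather than pickling the object) sidesteps the "cannot find the definition for the pickled object" problem, at the cost of having to keep the script self-contained.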
Yeah, and just thinking out loud what I like about the numpy/pandas documentation
Yes, thanks AgitatedDove14 ! It's just that the `configuration` object passed onwards was a bit confusing.
Is there a planned documentation overhaul? 🤔
I guess following the example https://github.com/allegroai/clearml/blob/master/examples/advanced/execute_remotely_example.py , it's not clear to me how the server has access to the data loaders' location when it hits `execute_remotely`
I'll kill the agent and try again but with the detached mode 🤔
That could work, given that:
- Could we add a preview section? One reason I don't like using the configuration section is that it makes debugging much, much harder.
- Will the clearml-agent download and unzip the files, placing them into the same local folder as needed for execution?
- What if we want to include non-configuration objects? (i.e. the model case I listed)
IIRC, `get_local_copy()` downloads a local copy and returns the path to the downloaded file. So you might be interested in e.g. `local_csv = pd.read_csv(a_task.artifacts['train_data'].get_local_copy())`
With the models, you're looking for `get_weights()`. It acts the same as `get_local_copy()`, so it returns a path.
EDIT: I think `get_local_copy()` should also work for a model 👍
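A slightly fuller sketch of the artifact case, in case it helps. `load_csv_artifact` and `fetch_train_data` are made-up helper names, and the artifact name `train_data` is just the one from the snippet above. I read the file with the standard `csv` module here to keep the sketch dependency-free; `pd.read_csv` on the same path should work the same way.

```python
import csv


def load_csv_artifact(task, artifact_name):
    """Download an artifact and return its rows as a list of dicts.

    `task` is anything exposing `artifacts[name].get_local_copy()`,
    which is how a ClearML task object exposes its artifacts.
    """
    local_path = task.artifacts[artifact_name].get_local_copy()
    with open(local_path, newline="") as handle:
        return list(csv.DictReader(handle))


def fetch_train_data(task_id):
    """Look up a task by ID and load its 'train_data' artifact.

    Requires clearml and a configured server.
    """
    from clearml import Task  # lazy import so the loader can be tested offline

    a_task = Task.get_task(task_id=task_id)
    return load_csv_artifact(a_task, "train_data")
```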
Hey AgitatedDove14 🙂
Finally managed; you keep saying "all projects" but you meant the "All Experiments" project instead. That's a good start 👍 Thanks!
Couple of thoughts from this experience:
- Could we add a comparison feature directly from the search results (Dashboard view -> search -> highlight some experiments for comparison)?
- Could we add a filter on the project name in the "All Experiments" project?
- Could we add the project for each of the search results? (see above pictur...
I just used this to create the `dual_gpu` queue: `clearml-agent daemon --queue dual_gpu --create-queue --gpus 0,1 --detached`
Thanks for your help SuccessfulKoala55 ! Appreciate the patience 🙏
That will come at a later stage
I was thinking of using the `--volume` settings in `clearml.conf` to mount the relevant directories for each user (so it's somewhat customizable). Would that work?
It would be amazing if one could specify local dependencies for remote execution, and those would be uploaded to the file server and downloaded before the code starts executing.
Hah. Now it worked.
Is there a preferred way to stop the agent?
Okay trying again without detached
Seemed to work fine again in detached mode, what went wrong there? 🤯
SuccessfulKoala55 That string was autogenerated by pyhocon and matches their documentation too - https://github.com/lightbend/config/blob/master/HOCON.md#substitutions
The first example won't work (it will treat `${...}` as a string literal and won't replace it). The second does work, but as mentioned anyway, these were not hand-typed, but rather generated by pyhocon, so I don't think that's the issue 🤔
Parquet file in this instance (used to be CSV, but that was even larger as everything is stored as a string...)
Yeah that works too. So one can override the queue ID but not the worker 🤔
But since this has come up a lot recently, any updates on #340? 😍
Never mind! Found and answered (solution in the issue linked above)
I’ll give the `create_function_task` one more try 🤔
I guess it does not do so for all settings, but only those that come from Session()