Sure, CostlyOstrich36.
I have something like the following:
```
@PipelineDecorator.component(....)
def my_task(...):
    from my_module1 import my_func1
    from my_module2 import ....
```
my_module1 and my_module2 are modules that are part of the same project source; they don’t come as a separate package.
Now when I run this in clearml, these imports don’t work.
These functions may require transitive imports of course, so the following doesn’t work:
```
PipelineDecorator.component(helper_function=[my_fu...
```
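A minimal sketch of the kind of packing I mean, assuming the kwarg is helper_functions (plural) and that it serializes the listed functions into the component’s standalone script; my_func1 here is a stand-in for the real helper:

```
from clearml.automation.controller import PipelineDecorator

def my_func1(x):
    # stand-in for the helper that normally lives in my_module1
    return x * 2

@PipelineDecorator.component(helper_functions=[my_func1])
def my_task(x):
    # my_func1 is packed into the component's standalone script,
    # so no runtime import from my_module1 is needed
    return my_func1(x)
```

But this doesn’t cover the transitive imports, which is exactly my problem: my_func1 may itself import from my_module2.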
AgitatedDove14 thanks, good idea.
My main issue with this approach is that it breaks the workflow into an “a-sync” set of tasks:
- One task sends a list of images for labeling and terminates.
- An external webhook calls ClearML and creates a dataset from the labels returned from the labeling task.
- A trigger wakes up the label post-processing/splitting logic.
It will be hard to understand where things are standing from looking at the UI.
I was wondering if the “waiting” operator can actua...
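For context, the trigger part would look roughly like this, assuming TriggerScheduler’s dataset trigger fires when a new dataset appears in a project (the project, queue, and task ID here are placeholders):

```
from clearml.automation import TriggerScheduler

trigger = TriggerScheduler(pooling_frequency_minutes=3)
trigger.add_dataset_trigger(
    name="labels-ready",
    trigger_project="labeling",                 # fire when a dataset appears here
    schedule_task_id="<post-process-task-id>",  # clone this task...
    schedule_queue="default",                   # ...and enqueue it for an agent
)
trigger.start()
```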
Re. “which task did I clone from”: to my understanding, the “parent” field is used for the “runtime parent”, i.e. the task that started me.
This is not the same as “which task was I cloned from”.
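To illustrate what I am looking at (the task ID is a placeholder, and .data.parent is my reading of where the runtime parent is stored):

```
from clearml import Task

task = Task.get_task(task_id="<some-task-id>")
# to my understanding this holds the runtime parent (the task that started me),
# not the task this one was cloned from
print(task.data.parent)
```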
I tested it again with much smaller data and it seems to work.
I am not sure what the difference between the use-cases is. It seems like something specific about this particular (big) parent doesn’t agree with ClearML…
Engineer B is in charge of running Engineer A’s pipeline with different parameters and investigating the results.
I think that, in principle, if you “intercept” the calls to Model.get() or Dataset.get() from within a task, you can collect the IDs and do various stuff with them. You can store and visualize them for lineage, or expose them as another hyperparameter, I suppose.
You’ll just need the user to name them as part of loading them in the code (in case they are loading multiple datasets/models).
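A minimal sketch of the kind of wrapper I mean, assuming Task.set_parameter is a reasonable place to stash the IDs (the helper name is hypothetical, and the user supplies the names):

```
from clearml import Task, Dataset

def get_dataset_tracked(name, dataset_id):
    # hypothetical wrapper around Dataset.get() that records the ID on the
    # current task, so lineage can later be queried via the API or the UI
    task = Task.current_task()
    if task is not None:
        task.set_parameter("Datasets/%s" % name, dataset_id)
    return Dataset.get(dataset_id=dataset_id)

# the user names each dataset as they load it
train_ds = get_dataset_tracked("train", "<dataset-id>")
```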
I want to have a CI/CD pipeline that, upon Engineer A’s commit, ensures that the pipeline is re-deployed, so that when Engineer B uses it as a template, it is definitely the latest version of the code and process.
is that because you couldn’t find a good way to have a “manual approval/selection” step in ClearML?
Apart from that, it seems that a pipeline task could have worked?
I don’t think so.
In most cases I would have multiple agents pulling from the same queue. I can’t have a queue per pipeline execution.
So if I submit A and B to the same queue, it still doesn’t guarantee that they will be pulled by the same agent…
AgitatedDove14 yes, I am passing this flag to the agent: CLEARML_AGENT_SKIP_PYTHON_ENV_INSTALL=1 clearml-agent…
running inside docker
and it still tries to install the requirements.txt
Using 1.3.1
It seems to work fine when the parent is on clear.ml storage (tried with toy example of data)
However I see I should really have made my question clearer.
My workflow is as follows:
Engineer A develops a pipeline with a number of steps. She experiments with this pipeline until she is happy with the flow and her code
AgitatedDove14 I see the continue_pipeline flag.
I want to resume the same instance of the pipeline.
When I want to resume the pipeline, I can only re-enqueue it - I cannot reset parameters (right?)
So it seems that for the pipeline to resume with the “continue pipeline” mode,
I need to pass “continue_pipeline” the first time I submit the pipeline.
Hopefully it will be ignored during the first run and just behave like a new run, and only really kick in when the pipeline is resumed....
not sure I follow.
how can a cronjob solve this for me?
I want to manage the dataset creation task(s) in ClearML.
This flow is triggered, say, manually, whenever I want to create a train/test set for my model.
It just so happens that somewhere in this flow, the code needs to “wait” for days/weeks for the assignment to be ready.
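What I imagine a cron-style workaround would look like, assuming TaskScheduler can periodically re-run an “are the labels ready yet?” check task (the task ID and queue are placeholders):

```
from clearml.automation import TaskScheduler

scheduler = TaskScheduler()
scheduler.add_task(
    schedule_task_id="<check-labels-task-id>",  # re-run this check task...
    queue="default",                            # ...on an agent from this queue
    hour=6, minute=0,                           # every day at 06:00
)
scheduler.start()
```

But that still leaves the days/weeks of waiting invisible in the UI, which is my point.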
But you already have all the entries defined here:
yes, but it’s missing a field that is actually found and parsed from my local autoscaler.yaml…
Trust me, I had to add this field to this default dict just so that ClearML doesn’t delete it for me.
It does appear on the task in the UI; it is just somehow not repopulated in the remote run if it’s not part of the default empty dict…
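A minimal repro of what I mean, assuming the usual Task.connect() behavior where only keys already present in the connected dict get values back on the remote run (project, task, and field names are placeholders):

```
from clearml import Task

task = Task.init(project_name="examples", task_name="autoscaler")
defaults = {
    "max_idle_time_min": 15,
    "my_extra_field": "",  # without this placeholder key, the value parsed from
                           # my local autoscaler.yaml is dropped on the remote run
}
config = task.connect(defaults)
```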
AgitatedDove14 the emphasis is that the imports I am doing are not from external/pip packages; they are just neighbouring modules of the function I am importing. Imports that rely on pip-installed packages work well.
CostlyOstrich36
```
pipe.add_step(
    name='train',
    parents=['data_pipeline', ],
    base_task_project='xxx',
    base_task_name='yyy',
    parameter_override={'OmegaConf': cfg.trainer},
)
```
So “The” pipeline Engineer A creates, once updated with the latest code, and perhaps ran once as test by CI CD, should be “tainted” as “The production” version of that pipeline, so that Engineer B’s code always uses the latest released pipeline code
I think it has something to do with ClearML, since I can run this code as pure Python without ClearML; when I activate ClearML, I see that torch.load() hits import_bind.__patched_import3 when trying to deserialize the saved model.
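For what it’s worth, the experiment I plan to try next, assuming that disabling the PyTorch binding bypasses the patched import (the project/task names and model path are placeholders):

```
from clearml import Task
import torch

# if the patched import is the culprit, disabling the PyTorch binding
# should make torch.load() behave like the pure-Python run
task = Task.init(
    project_name="examples",
    task_name="debug-torch-load",
    auto_connect_frameworks={"pytorch": False},
)
model = torch.load("model.pt")
```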
AgitatedDove14 Not sure the pipeline decorator is what I need.
Here’s a very simplified example to my question.
Say I want to train my model on some data.
Before adding ClearML, the code looks something like:
```
def train(data_dir, ...):
    ...
```
Now I want to leverage the data versioning capability in ClearML.
So now, the code needs to fetch dataset by ID, save it locally, and let the model train on it as before:
```
from clearml import Dataset

def train_clearml(dataset_id...
```
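Roughly, what I have in mind (a sketch; the original train() is reused as-is):

```
from clearml import Dataset

def train_clearml(dataset_id, **train_kwargs):
    # fetch the versioned dataset and hand a local copy to the original train()
    data_dir = Dataset.get(dataset_id=dataset_id).get_local_copy()
    return train(data_dir, **train_kwargs)
```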
Sure, but I was wondering if it has more of a “first-class citizen” status for tracking… e.g. something you can visualize in the UI or query via the API.
I will try and get back to this area of the code soon
AgitatedDove14 can you share if there is a plan to put the gcp autoscaler in the open source?
CostlyOstrich36 yes, for the cache.
AgitatedDove14 I am not sure a queue will be sufficient. It would require a queue per execution of the pipeline.
Really what I need is for A and B to be separate tasks, but guarantee they will be assigned to the same machine so that the clearml dataset cache on that machine will be warm.
Is there a way to group A and B into a sub-pipeline, have the pipeline be queued and executed remotely, but have the tasks A and B inside it be treated like local tasks? or s...
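Something like this is what I am after, assuming PipelineDecorator.run_locally() makes the components run on the machine executing the pipeline logic, so A and B would share that machine’s dataset cache (the component bodies and names are placeholders):

```
from clearml.automation.controller import PipelineDecorator

@PipelineDecorator.component()
def task_a(dataset_id):
    # placeholder body: warms the local clearml dataset cache
    ...

@PipelineDecorator.component()
def task_b(dataset_id):
    # placeholder body: reuses the warm cache on the same machine
    ...

@PipelineDecorator.pipeline(name="sub-pipeline", project="examples", version="0.1")
def run(dataset_id):
    task_a(dataset_id)
    task_b(dataset_id)

# run the components on this machine instead of enqueuing each to a remote agent
PipelineDecorator.run_locally()
run(dataset_id="<dataset-id>")
```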