CostlyOstrich36 I confirm this was the case.
So:

```
# module_a.py
@PipelineDecorator.pipeline(...)
def my_pipeline():
    from module_b import my_func
    x = my_func()
```

```
# module_b.py
@PipelineDecorator.component(...)
def my_func():
    pass
```
Under these circumstances, the pipeline is created and runs correctly.
But when I clone it (or click “Run” and submit), it fails with the error above.
Moving my_func from module_b into module_a solves this.
To me this looks like a bug, or at least unreasonable and undocumented behavior...
CostlyOstrich36 from what I gather the UI creates a task in the background, in status “hidden”, and it has about 10 fields of JSON configuration…
CostlyOstrich36 not that I am aware of deleting etc.
I didn’t set up the env though…
AgitatedDove14 the emphasis is that the imports I am doing are not from external/pip packages; they are just neighbouring modules of the function I am importing. Imports that rely on pip-installed packages work well.
and of course this solution forces me to do a git push for all the other dependent modules when creating the task…
AgitatedDove14
the root git path should be part of your PYTHONPATH automatically
That’s true, but it doesn’t respect the sources root (the root package).
I.e., if all my packages are under /path/to/git/root/src/, that src directory isn’t on the path.
So I had to add it explicitly via a docker init script…
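A minimal workaround sketch (not the author’s actual init script; paths are illustrative): prepend the sources root to sys.path at the top of the entry script, as an alternative to a docker init script:

```python
# Hedged sketch: make /path/to/git/root/src importable when the agent only
# adds the git root itself to PYTHONPATH.
import os
import sys

GIT_ROOT = os.path.dirname(os.path.abspath(__file__))  # assumes this file sits at the repo root
sys.path.insert(0, os.path.join(GIT_ROOT, "src"))
```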
may I also add that PyYAML is the worst thing in the history of Python dependency hell?
JitteryCoyote63 how do you detect that a spot interruption is coming, from within the ClearML task, in time to mark it as “resume”?
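One way to do this on AWS, as a hedged sketch: poll the EC2 instance metadata endpoint, which returns 404 until a spot termination notice is issued. The mark-as-stopped handling is an assumption about how the resume logic could hook in:

```python
# Hedged sketch: watch for a pending spot interruption and stop the current
# ClearML task cleanly so it can be re-enqueued/resumed instead of failing.
import time

import requests
from clearml import Task

METADATA_URL = "http://169.254.169.254/latest/meta-data/spot/instance-action"

def watch_for_spot_interruption(poll_seconds: int = 5) -> None:
    while True:
        try:
            if requests.get(METADATA_URL, timeout=2).status_code == 200:
                task = Task.current_task()
                if task is not None:
                    task.mark_stopped()  # assumption: resume logic re-enqueues it
                return
        except requests.RequestException:
            pass  # metadata service unreachable; keep polling
        time.sleep(poll_seconds)
```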
that’s the thing. I want it to appear like one long pipeline, vs. triggering a new set of steps after the approval. So “wait” is a better metaphor for me
I don’t think so.
In most cases I would have multiple agents pulling from the same queue. I can’t have a queue per pipeline execution.
So if I submit A and B to the same queue, it still doesn’t guarantee that they will be pulled by the same agent…
CostlyOstrich36 yes, for the cache.
AgitatedDove14 I am not sure a queue will be sufficient; it would require a queue per execution of the pipeline.
Really what I need is for A and B to be separate tasks, but guarantee they will be assigned to the same machine so that the clearml dataset cache on that machine will be warm.
Is there a way to group A and B into a sub-pipeline, have the pipeline be queued and executed remotely, but the tasks A and B inside it be treated like local tasks? or s...
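For illustration, one hedged way to get the cache-locality effect today (an assumption, not a built-in grouping feature): fuse A and B into a single component so both run in the same process on the same machine:

```python
from clearml.automation.controller import PipelineDecorator

# Sketch: one component runs step A then step B, so the clearml dataset cache
# warmed by A is guaranteed to be local when B runs. Names are illustrative.
@PipelineDecorator.component(cache=False)
def a_then_b(dataset_id: str):
    from clearml import Dataset
    local_path = Dataset.get(dataset_id=dataset_id).get_local_copy()  # step A: warms the cache
    # ... step B: reuses local_path on the same machine, cache already warm ...
    return local_path
```

The trade-off is that A and B are no longer tracked as separate tasks.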
which configuration are you passing? are you using any framework for configuration?
Sure, but I was wondering if it has more of a “first-class citizen” status for tracking… e.g. something you can visualize in the UI or query via API
I mean, if it’s not tracked, I think it would be a good feature!
Trust me, I had to add this field to this default dict just so that clearml doesn’t delete it for me
it does appear on the task in the UI, just somehow not repopulated in the remote run if it’s not a part of the default empty dict…
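A minimal sketch of the behavior being described (project, task, and key names are placeholders): keys missing from the default dict passed to task.connect() are not repopulated on the remote run, hence the placeholder:

```python
from clearml import Task

task = Task.init(project_name="demo", task_name="connect-demo")
# Include the key up front, even if empty, so a cloned/remote run repopulates
# it from the UI instead of silently dropping it.
config = {"my_field": ""}
config = task.connect(config)
print(config["my_field"])  # filled in from the UI values on the remote run
```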
DeliciousBluewhale87 what solution did you land on for this?
could work! is there a way to visualize the pipeline such that this step is “stuck” in executing?
is that because you couldn’t find a good way to have a “manual approval/selection” step in ClearML?
Apart from that, it seems that a pipeline task could have worked?
I think it has something to do with clearml, since I can run this code as pure Python without clearml; when I activate clearml, I see that torch.load() hits the import_bind.__patched_import3 when trying to deserialize the saved model
CostlyOstrich36 Lineage information for datasets - oversimplifying, but bear with me:
Tasks should have a section called “input datasets”
Each time I do a Dataset.get() inside a current_task, add the dataset ID to this section
The same can work with InputModel()
This way you can have a full lineage graph (also queryable/visualizable)
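A hedged sketch of what this could look like as a manual workaround with existing ClearML APIs (the wrapper name and parameter section are assumptions, not a built-in feature):

```python
from clearml import Dataset, Task

def get_dataset_tracked(dataset_id: str) -> Dataset:
    """Dataset.get() wrapper that records the dataset ID on the current task."""
    ds = Dataset.get(dataset_id=dataset_id)
    task = Task.current_task()
    if task is not None:
        # store under a dedicated "input_datasets" parameter section so the
        # lineage is queryable later
        task.set_parameter("input_datasets/{}".format(dataset_id), ds.name)
    return ds
```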
sure CostlyOstrich36
I have something like the following:
```
@PipelineDecorator.component(...)
def my_task(...):
    from my_module1 import my_func1
    from my_module2 import ....
```
my_module1 and my_module2 are modules that are part of the same project source; they don’t come as a separate package.
Now when I run this in clearml, these imports don’t work.
These functions may require transitive imports of course, so the following doesn’t work either:
```
PipelineDecorator.component(helper_function=[my_fu...
```
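For context, a minimal sketch of the helper-function mechanism being referred to (assuming ClearML’s documented helper_functions parameter): the listed functions are inlined into the component’s standalone script, which is why their own neighbouring-module imports still break:

```python
from clearml.automation.controller import PipelineDecorator

def my_func1():
    # serialized alongside the component; works if self-contained, but breaks
    # if it imports from neighbouring modules like my_module1
    return 42

@PipelineDecorator.component(helper_functions=[my_func1])
def my_task():
    return my_func1()
```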
However I see I should really have made my question clearer.
My workflow is as follows:
1. Engineer A develops a pipeline with a number of steps. She experiments with this pipeline until she is happy with the flow and her code.
2. The training pipeline that is considered “best of breed” is committed to Git and deployed by CI/CD; it is clearly tagged in ClearML.
3. Users of this pipeline know it’s the “official” training flow that they can now play with using configuration.
The goal is to ensure that “official” pipelines are source controlled.
makes sense?
I want to have a CI/CD pipeline that, upon Engineer A’s commit, ensures that the pipeline is re-deployed, so that when Engineer B uses it as a template, it’s definitely the latest version of the code and process
I suppose so, yes; and I want this task to be labeled such that it’s clear it’s the “production” task.
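A minimal sketch of what that labeling step could look like in the CI job (project/task names and the “production” tag are assumptions):

```python
from clearml import Task

# After CI re-registers the pipeline from the committed code, tag the
# resulting task so users can find the official template.
pipeline_task = Task.get_task(project_name="my_project", task_name="training_pipeline")
pipeline_task.add_tags(["production"])
```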
SweetBadger76 thanks for your reply.
One quirk I found was that even with this flag on, the agent decides to install whatever is in the requirements.txt.