Try to spin up an auto-scaler app or any other Pro-only features
Hey, did you check that out?
Does your current running environment have all the required packages? Your pipeline controller has the `run_locally()` option, and I'm not sure the pipeline orchestrator will follow the same logic of installing all your component's imports as dependencies on remote workers if you execute locally with that option instead of on a distant agent.
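If that's the concern, one way to take import auto-detection out of the equation, sketched below (queue name and package pins are placeholders, not from this thread), is to declare the component's requirements explicitly with the `packages` argument:

```
# Sketch only: pin a component's dependencies explicitly so remote workers
# install them regardless of how the run was debugged locally.
from clearml.automation.controller import PipelineDecorator

@PipelineDecorator.component(
    return_values=['dataset_id'],
    execution_queue='default',            # placeholder queue
    packages=['pandas>=1.5', 'boto3'],    # explicit pins installed on the remote worker
)
def generate_dataset(start_date: str, end_date: str):
    # imports inside the component body are what auto-detection normally scans
    import pandas as pd
    return 'dataset-id-placeholder'
```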
ClearML package version used: 1.9.1
ClearML Server: SaaS - Pro Tier
Okay, thanks for the pointer ❤
We're successfully using Ray for hyperparameter search on a non-CV model with ClearML
Yup, we too had to implement a lot of little things for ClearML in our tooling library, due to it being pretty bare-bones in some areas
Nope, I tried several uploads AFTER putting the policy in place, and all my upload attempts were met with progress stuck at 0%
Sure, as mentioned above: "I had to revert the change, a simple policy".
The upload worked after the rollback, confirming my suspicion that the lifecycle policy was causing the issue.
Btw AgitatedDove14, is there a way to define parallel tasks and use the pipeline as an acyclic compute graph instead of simply sequential tasks?
Nice, that's a great feature! I'm also trying to have a component execute Giskard QA test suites on models and data. Is there a planned feature where I can suspend execution of the pipeline, and display in the UI that this pipeline step requires a human confirmation to go on or stop, while displaying arbitrary text/plot information?
Ah, apparently the reason was that the `squash()` method defaults its output URL to the `file_server` instead of the project's default storage; it might be nice to validate the storage before spawning sub-processes.
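For reference, a sketch of the workaround (dataset ids and bucket are placeholders): pass `output_url` explicitly so the squashed dataset lands on the project's storage:

```
# Sketch only: override the file_server default when squashing datasets.
from clearml import Dataset

squashed = Dataset.squash(
    dataset_name='squashed-dataset',
    dataset_ids=['<id-1>', '<id-2>'],
    output_url='s3://my-bucket/datasets',  # project storage instead of file_server
)
```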
Okay, `force_store_standalone_script()` works.
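For anyone searching later, roughly how it's invoked, as a sketch (call placement is an assumption, before `Task.init()` / the pipeline start):

```
# Sketch: ask ClearML to store the script itself rather than a git diff,
# called before the task/pipeline is created (per clearml 1.9.x).
from clearml import Task

Task.force_store_standalone_script()
```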
Well, aside from the obvious removal of the `PipelineDecorator.run_locally()` line on both our sides, the decorator arguments seem to be the same:

```
@PipelineDecorator.component(
    return_values=['dataset_id'],
    cache=True,
    task_type=TaskTypes.data_processing,
    execution_queue='Quad_VCPU_16GB',
    repo=False
)
```
And my pipeline controller:

```
@PipelineDecorator.pipeline(
    name="VINZ Auto-Retrain",
    project="VINZ",
    version="0.0.1",
    pipeline_execution_queue="Quad_VCPU_16GB"
)
```
Do I need to instantiate a task inside my component? Seems a bit redundant...
The `train.py` is the default YOLOv5 training file; I initiated the task outside the call. Should I go edit their training command-line file?
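For context, a hypothetical sketch of the reuse pattern in question: I believe a step's auto-created task can be fetched with `Task.current_task()`, so an inner `Task.init()` should attach to it rather than spawn a second task (names below are made up):

```
# Hypothetical sketch: inside a pipeline step, reuse the task the
# controller already created instead of initializing a new one.
from clearml import Task

def train_step():
    task = Task.current_task()  # None when running outside ClearML
    if task is not None:
        print(f"Running inside existing task: {task.id}")
```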
The worker docker image was running Python 3.8 and we are on a Pro tier SaaS deployment. This failed run is from a few weeks ago, and we have not run any pipeline since then.
Well I uploaded datasets in the previous steps with the same credentials
But the task appeared with the correct name and outputs in the pipeline and the experiment manager
Is there an example of this somewhere? Cause I'm training a YOLOv5 model which already has ClearML integration built in, but it seems to be hardcoded to attach its task to a `Yolov5` project and upload the `.pt` file as an artifact, while I want to upload converted `.onnx` weights with custom tags to my custom project.
Okay! Though I only see a param to specify a weights URL, while I'm looking to upload local weights.
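For reference, a sketch of uploading a local file instead (project, path, and tags are placeholders): `OutputModel.update_weights()` takes a local `weights_filename`:

```
# Sketch only: register a local .onnx file as an output model with custom tags.
from clearml import Task, OutputModel

task = Task.init(project_name='VINZ', task_name='export-onnx')  # placeholder names
model = OutputModel(task=task, framework='ONNX', tags=['converted', 'onnx'])
model.update_weights(weights_filename='weights/best.onnx')  # uploads the local file
```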
Well, if you have:

```
ret_obj = None
for i in range(5):
    ret_obj = step_x(ret_obj)
```

Since the orchestration automatically determines the order of execution from the returned objects, the controller will execute them sequentially.
However, if your steps don't have dependencies, like this:

```
for i in range(5):
    step_x(...)
```

it will try to execute them concurrently.
Why not define your pipeline using `PipelineDecorator` instead? Then you'll be able to call each of your pipeline components in a very pythonic way.
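Something like this sketch (queue name and step bodies are placeholders): independent calls fan out concurrently, while chaining return values forces sequential order:

```
# Sketch: the controller builds the DAG from how return values flow
# between component calls.
from clearml.automation.controller import PipelineDecorator

@PipelineDecorator.component(return_values=['out'], execution_queue='default')
def step_x(inp=None):
    # placeholder work
    return (inp or 0) + 1

@PipelineDecorator.pipeline(name='dag-example', project='examples', version='0.0.1')
def pipeline_logic():
    branches = [step_x() for _ in range(5)]  # no data dependency -> can run concurrently
    ret = None
    for _ in range(3):
        ret = step_x(ret)                    # each call waits on the previous return

if __name__ == '__main__':
    PipelineDecorator.run_locally()  # drop this line to launch on remote agents
    pipeline_logic()
```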
I was launching a pipeline run, but I don't remember having set the autoscaler to use spot instances (I believe the GCP terminology for a spot instance is "preemptible", and I set it to false)
So it seems to be an issue with the component parameter called in:

```
@PipelineDecorator.pipeline(
    name="VINZ Auto-Retrain",
    project="VINZ",
    version="0.0.1",
    pipeline_execution_queue="Quad_VCPU_16GB"
)
def executing_pipeline(start_date, end_date):
    print("Starting VINZ Auto-Retrain pipeline...")
    print(f"Start date: {start_date}")
    print(f"End date: {end_date}")
    window_dataset_id = generate_dataset(start_date, end_date)

if __name__ == '__main__':
    PipelineDec...
```
The component's prototype seems fine:

```
@PipelineDecorator.component(
    return_values=['dataset_id'],
    cache=False,
    task_type=TaskTypes.data_processing,
    execution_queue='Quad_VCPU_16GB',
)
def generate_dataset(start_date: str, end_date: str, input_aws_credentials_profile: str = 'default'):
```
I have a pipeline with a single component:

```
@PipelineDecorator.component(
    return_values=['dataset_id'],
    cache=True,
    task_type=TaskTypes.data_processing,
    execution_queue='Quad_VCPU_16GB'
)
def generate_dataset(start_date: str, end_date: str, input_aws_credentials_profile: str = 'default'):
    """
    Convert autocut logs from a specified time window into usable dataset in generic format.
    """
    print('[STEP 1/4] Generating dataset from autocut logs...')
    import os
    ...
```
When running with `PipelineDecorator.run_locally()` I get the legitimate pandas error, which I fixed by specifying the `freq` param in the `pd.date_range(...` line in the component:

```
Launching step [generate_dataset]
ClearML results page:
[STEP 1/4] Generating dataset from autocut logs...
Traceback (most recent call last):
  File "/tmp/tmp2jgq29nl.py", line 137, in <module>
    results = generate_dataset(**kwargs)
  File "/tmp/tmp2jgq29nl.py", line 18, in generate_dataset
...
```
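For reference, the kind of fix meant above, as a sketch (dates and frequency are placeholders):

```
# pd.date_range() requires exactly three of start/end/periods/freq;
# with only start and end supplied it raises, hence the explicit freq.
import pandas as pd

window = pd.date_range(start='2023-01-01', end='2023-01-31', freq='D')
```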