Hey Currently Trying To Run A Pipeline Locally To Test A Pipeline Component With

Answered

Hey, I'm currently trying to run a pipeline locally to test a pipeline component with PipelineDecorator.run_locally(). The first try returned a pandas error, which I fixed, but the component execution still returns the same backtrace, as if the code fix was not applied. When using PipelineDecorator.debug_pipeline(), the fix is applied and the component runs properly. I tried:
- rebooting
- setting cache=False in my component's decorator
- updating the pipeline version in my @PipelineDecorator.pipeline() decorator
- deleting .clearml/cache/
- deleting the pipeline from the GUI

But no effect.

I understand the main difference is that debug_pipeline() runs components as plain functions instead of as sub-process ClearML Tasks. Is it possible that the component was cached somewhere by my local clearml-agent and that I missed it?
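To make the function-versus-sub-process distinction concrete, here is a toy sketch using only the standard library. It is not ClearML's actual mechanism: the traceback further down the thread only shows that the step runs from a generated temporary script calling generate_dataset(**kwargs), and everything else here (the STEP_SOURCE template, run_step_in_subprocess, passing kwargs as JSON on the command line) is a hypothetical stand-in.

```python
import json
import os
import subprocess
import sys
import tempfile

# Hypothetical standalone script, loosely modelled on the generated
# /tmp/tmpXXXX.py file seen in the traceback. The None defaults are an
# assumption made for this demo only.
STEP_SOURCE = '''
import json
import sys


def generate_dataset(start_date=None, end_date=None):
    return f"start_date: {start_date} end_date: {end_date}"


if __name__ == "__main__":
    kwargs = json.loads(sys.argv[1])
    print(generate_dataset(**kwargs))
'''


def run_step_in_subprocess(kwargs):
    """Write the step to a temp file and run it in a fresh interpreter."""
    with tempfile.NamedTemporaryFile("w", suffix=".py", delete=False) as handle:
        handle.write(STEP_SOURCE)
        script_path = handle.name
    try:
        completed = subprocess.run(
            [sys.executable, script_path, json.dumps(kwargs)],
            capture_output=True,
            text=True,
            check=True,
        )
    finally:
        os.unlink(script_path)  # remove the temporary script
    return completed.stdout.strip()
```

In this toy setup the step only sees what the launcher forwards: calling run_step_in_subprocess({}) makes the step report None for both dates, whereas a direct in-process call would receive whatever the caller passed. Whether something similar happens inside run_locally() is not established in this thread.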

  				
Posted 3 years ago by FierceHamster54


Answers 20

CostlyOstrich36 Should I start a new issue now that I've pinpointed the exact problem, given that the beginning of this one was clearly confusing for both of us?

  				
Posted 3 years ago by FierceHamster54

Component's prototype seems fine:
@PipelineDecorator.component(
    return_values=['dataset_id'],
    cache=False,
    task_type=TaskTypes.data_processing,
    execution_queue='Quad_VCPU_16GB',
)
def generate_dataset(start_date: str, end_date: str, input_aws_credentials_profile: str = 'default'):

  				
Posted 3 years ago by FierceHamster54

So it seems to be an issue with the component parameters passed in:
` @PipelineDecorator.pipeline(
    name="VINZ Auto-Retrain",
    project="VINZ",
    version="0.0.1",
    pipeline_execution_queue="Quad_VCPU_16GB"
)
def executing_pipeline(start_date, end_date):
    print("Starting VINZ Auto-Retrain pipeline...")
    print(f"Start date: {start_date}")
    print(f"End date: {end_date}")

    # Call the component; its return value is the dataset ID
    window_dataset_id = generate_dataset(start_date, end_date)

if __name__ == '__main__':
    # Run the pipeline logic and its steps on this machine
    PipelineDecorator.run_locally()

    executing_pipeline(
        start_date="2022-01-01",
        end_date="2022-03-02"
    ) `

I also tried passing named parameters, i.e. generate_dataset(start_date=start_date, end_date=end_date), but it had no effect.

  				
Posted 3 years ago by FierceHamster54

The values of start_date and end_date seem to be None.

  				
Posted 3 years ago by FierceHamster54

print(f"start_date: {start_date} end_date: {end_date}")
time_range = pd.date_range(start=start_date, end=end_date, freq='D').to_pydatetime().tolist()
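If start_date and end_date really do arrive as None, that would explain why the traceback looks unchanged after the fix: pd.date_range requires exactly three of start, end, periods and freq, and with both dates set to None only freq is specified, so the same ValueError is raised even with freq='D' in place. Below is a minimal sketch of a guard that fails with a clearer message. The helper name daily_range is hypothetical, and it uses a standard-library stand-in for the pandas call so that it runs without extra dependencies.

```python
from datetime import datetime, timedelta


def daily_range(start_date, end_date):
    """Return one datetime per day from start_date to end_date, inclusive.

    Hypothetical stand-in for
    pd.date_range(start, end, freq='D').to_pydatetime().tolist().
    """
    # Fail early and name the real problem instead of letting a None
    # slip through to the date-range call.
    for name, value in (("start_date", start_date), ("end_date", end_date)):
        if value is None:
            raise ValueError(f"{name} is None; was it passed to the component?")

    start = datetime.fromisoformat(start_date)
    end = datetime.fromisoformat(end_date)
    if end < start:
        raise ValueError("end_date is before start_date")

    days = (end - start).days
    return [start + timedelta(days=offset) for offset in range(days + 1)]
```

With the dates from the pipeline call, daily_range("2022-01-01", "2022-03-02") covers both endpoints, while daily_range(None, None) raises immediately and points at the missing argument rather than at the date-range call.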

  				
Posted 3 years ago by FierceHamster54

The pipeline log indicates that the same version of pandas (1.5.0) is installed. I really don't know what is happening.

  				
Posted 3 years ago by FierceHamster54

It's funny because the line in the backtrace is the correct one, so I don't think it has anything to do with strange caching behavior.

  				
Posted 3 years ago by FierceHamster54

CostlyOstrich36 I'm having the same issue running on a remote worker, even though the line works correctly in a Python interpreter and the component runs correctly in local debug mode (but not in standard local mode):
  File "/root/.clearml/venvs-builds/3.10/code/generate_dataset.py", line 18, in generate_dataset
    time_range = pd.date_range(start=start_date, end=end_date, freq='D').to_pydatetime().tolist()
  File "/root/.clearml/venvs-builds/3.10/lib/python3.10/site-packages/pandas/core/indexes/datetimes.py", line 1128, in date_range
    dtarr = DatetimeArray._generate_range(
  File "/root/.clearml/venvs-builds/3.10/lib/python3.10/site-packages/pandas/core/arrays/datetimes.py", line 355, in _generate_range
    raise ValueError(
ValueError: Of the four parameters: start, end, periods, and freq, exactly three must be specified

  				
Posted 3 years ago by FierceHamster54

I would gladly try to run it on a remote instance to test the hypothesis that some local cache is acting up, but unfortunately I also ran into an issue with the GCP autoscaler: https://clearml.slack.com/archives/CTK20V944/p1665664690293529

  				
Posted 3 years ago by FierceHamster54

Didn't have a chance to try and reproduce it, will try soon 🙂

  				
Posted 3 years ago by CostlyOstrich36

I suppose you cannot reproduce the issue on your side?
Maybe it has to do with the fact that the faulty code was initially defined as a cached component.

  				
Posted 3 years ago by FierceHamster54

Nope, same result after deleting .clearml

  				
Posted 3 years ago by FierceHamster54

I already deleted ~/.clearml/cache, but I'll try deleting the entire folder.

  				
Posted 3 years ago by FierceHamster54

What happens if you delete ~/.clearml? It's ClearML's cache folder.

  				
Posted 3 years ago by CostlyOstrich36

So basically, CostlyOstrich36, I feel like debug_pipeline() uses the latest version of my code as it is defined on my filesystem, but run_locally() uses a previous version that it cached somehow.
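One way to test the stale-code hypothesis directly is to print a fingerprint of the function that is actually executing and compare it between the two modes. The sketch below is an assumption-laden illustration, not a ClearML feature: code_fingerprint is a hypothetical helper, and since components appear to be packaged as standalone scripts, it would probably have to be defined inside the component body. Fingerprints are only comparable under the same Python version, and nested functions in the component would make the constants unstable.

```python
import hashlib


def code_fingerprint(func):
    """Return a short hash of a function's compiled body.

    Hypothetical debugging helper: hashes the bytecode, constants and
    referenced names, so editing the body changes the fingerprint.
    """
    code = func.__code__
    digest = hashlib.sha256()
    digest.update(code.co_code)
    digest.update(repr(code.co_consts).encode("utf-8"))
    digest.update(repr(code.co_names).encode("utf-8"))
    return digest.hexdigest()[:12]


# Two toy versions of the same step, before and after adding freq.
def step_without_freq(start, end):
    return dict(start=start, end=end)


def step_with_freq(start, end):
    return dict(start=start, end=end, freq="D")
```

If the fingerprint printed under run_locally() matches the one printed under debug_pipeline(), the same code is running in both modes and the difference has to come from somewhere else, such as the arguments the step receives.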

  				
Posted 3 years ago by FierceHamster54

When running with PipelineDecorator.run_locally(), I get the legitimate pandas error that I had fixed by specifying the freq parameter in the pd.date_range(...) line of the component:
Launching step [generate_dataset]
ClearML results page:
[STEP 1/4] Generating dataset from autocut logs...
Traceback (most recent call last):
  File "/tmp/tmp2jgq29nl.py", line 137, in <module>
    results = generate_dataset(**kwargs)
  File "/tmp/tmp2jgq29nl.py", line 18, in generate_dataset
    time_range = pd.date_range(start=start_date, end=end_date, freq='D').to_pydatetime().tolist()
  File "/home/jean-adrien/.local/lib/python3.10/site-packages/pandas/core/indexes/datetimes.py", line 1128, in date_range
    dtarr = DatetimeArray._generate_range(
  File "/home/jean-adrien/.local/lib/python3.10/site-packages/pandas/core/arrays/datetimes.py", line 355, in _generate_range
    raise ValueError(
ValueError: Of the four parameters: start, end, periods, and freq, exactly three must be specified
Setting pipeline controller Task as failed (due to failed steps) !
Traceback (most recent call last):
  File "/home/jean-adrien/Projects/xhr/vinz/v2/clearml/pipelines/retraining/vinz_retraining_pipeline.py", line 236, in <module>
    executing_pipeline(
  File "/home/jean-adrien/.local/lib/python3.10/site-packages/clearml/automation/controller.py", line 3510, in internal_decorator
    raise triggered_exception
  File "/home/jean-adrien/.local/lib/python3.10/site-packages/clearml/automation/controller.py", line 3486, in internal_decorator
    LazyEvalWrapper.trigger_all_remote_references()
  File "/home/jean-adrien/.local/lib/python3.10/site-packages/clearml/utilities/proxy_object.py", line 361, in trigger_all_remote_references
    func()
  File "/home/jean-adrien/.local/lib/python3.10/site-packages/clearml/automation/controller.py", line 3230, in results_reference
    raise ValueError(
ValueError: Pipeline step "generate_dataset", Task ID=c6e3f272a7e044009d587e2d60e46d65 failed
Whereas when running with PipelineDecorator.debug_pipeline(), the code runs normally, as it should, since I fixed the error that caused the ValueError: Of the four parameters: start, end, periods, and freq, exactly three must be specified exception.

  				
Posted 3 years ago by FierceHamster54

I'm just not sure what error you're getting

  				
Posted 3 years ago by CostlyOstrich36

Thus the main difference in behavior must come from the _debug_execute_step_function property in the Controller class. I'm currently skimming through it to try to identify a cause. Did I provide you with enough info, btw, CostlyOstrich36?

  				
Posted 3 years ago by FierceHamster54

I have a pipeline with a single component:
` from clearml import TaskTypes
from clearml.automation.controller import PipelineDecorator

@PipelineDecorator.component(
    return_values=['dataset_id'],
    cache=True,
    task_type=TaskTypes.data_processing,
    execution_queue='Quad_VCPU_16GB'
)
def generate_dataset(start_date: str, end_date: str, input_aws_credentials_profile: str = 'default'):
    """
    Convert autocut logs from a specified time window into usable dataset in generic format.
    """
    print('[STEP 1/4] Generating dataset from autocut logs...')
    # Imports live inside the component because it runs as a standalone step
    import os
    import cv2
    import sys
    import srsly
    import boto3
    import shutil
    import numpy as np
    import pandas as pd
    from clearml import Dataset
    from zipfile import ZipFile

    # One datetime per day in the requested window
    time_range = pd.date_range(start=start_date, end=end_date, freq='D').to_pydatetime().tolist()

    ... `

That I execute from there:
` @PipelineDecorator.pipeline(
    name="VINZ Auto-Retrain",
    project="VINZ",
    version="0.0.1"
)
def executing_pipeline(start_date, end_date):
    print("Starting VINZ Auto-Retrain pipeline...")
    print(f"Start date: {start_date}")
    print(f"End date: {end_date}")

    window_dataset_id = generate_dataset(start_date, end_date)

if __name__ == '__main__':
    # Run the pipeline logic and its steps on this machine
    PipelineDecorator.run_locally()

    executing_pipeline(
        start_date="2022-01-01",
        end_date="2022-03-02"
    ) `

During my first try I got a legitimate error, since the freq parameter of pd.date_range() was missing, so I fixed it. But on further re-executions of the pipeline, the same backtrace is still returned, as if the code had not been changed.

But when replacing the line PipelineDecorator.run_locally() with PipelineDecorator.debug_pipeline(), the component code works properly.

  				
Posted 3 years ago by FierceHamster54

Can you please elaborate on what you're trying to do and what is failing?

  				
Posted 3 years ago by CostlyOstrich36
