I have a pipeline with a single component:
```
from clearml import TaskTypes
from clearml.automation.controller import PipelineDecorator


@PipelineDecorator.component(
    return_values=['dataset_id'],
    cache=True,
    task_type=TaskTypes.data_processing,
    execution_queue='Quad_VCPU_16GB'
)
def generate_dataset(start_date: str, end_date: str, input_aws_credentials_profile: str = 'default'):
    """
    Convert autocut logs from a specified time window into a usable dataset in generic format.
    """
    print('[STEP 1/4] Generating dataset from autocut logs...')
    # Imports live inside the function body, as in the original component.
    import os
    import cv2
    import sys
    import srsly
    import boto3
    import shutil
    import numpy as np
    import pandas as pd
    from clearml import Dataset
    from zipfile import ZipFile

    # One entry per day in the requested window (this is the line I fixed by adding freq).
    time_range = pd.date_range(start=start_date, end=end_date, freq='D').to_pydatetime().tolist()
    ...
```

That I execute from there:

```
@PipelineDecorator.pipeline(
    name="VINZ Auto-Retrain",
    project="VINZ",
    version="0.0.1"
)
def executing_pipeline(start_date, end_date):
    print("Starting VINZ Auto-Retrain pipeline...")
    print(f"Start date: {start_date}")
    print(f"End date: {end_date}")
    window_dataset_id = generate_dataset(start_date, end_date)


if __name__ == '__main__':
    PipelineDecorator.run_locally()
    executing_pipeline(
        start_date="2022-01-01",
        end_date="2022-03-02"
    )
```
On my first run I got a legitimate error because the `freq` parameter of `pd.date_range()` was missing, so I fixed it. However, when I re-execute the pipeline, the same traceback is still returned, as if the code had not been changed.
When I replace the line `PipelineDecorator.run_locally()` with `PipelineDecorator.debug_pipeline()`, the component code works properly.
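To rule out the fix itself, the corrected `pd.date_range()` call can be checked in isolation, outside of ClearML. This is only a minimal sketch: the helper name `build_time_range` is hypothetical and not part of my pipeline, and it assumes nothing beyond pandas being installed. It uses the same dates I pass to the pipeline.

```python
import pandas as pd


def build_time_range(start_date: str, end_date: str) -> list:
    """Hypothetical helper: same expression as in the component, run standalone."""
    # freq='D' yields one timestamp per calendar day, both endpoints included.
    index = pd.date_range(start=start_date, end=end_date, freq='D')
    # Convert pandas Timestamps to plain datetime.datetime objects.
    return index.to_pydatetime().tolist()


if __name__ == '__main__':
    time_range = build_time_range("2022-01-01", "2022-03-02")
    print(len(time_range), time_range[0], time_range[-1])
```

If this runs cleanly on its own while `run_locally()` still reports the old traceback, that would suggest the component is being executed from a stale copy of the code rather than from the edited file.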