I Am Studying Using The「Pipeline From Tasks」Pages.

Answered

I am studying using the「Pipeline from Tasks」pages.

I am facing two problems.

Q1: I want to load a local CSV in the step1 task and split it in step2, but I get an error that the CSV cannot be found. Can't local data be read when running in a queue?

Q2: When I register a queue or try to run it locally, it looks for an unknown queue name. The error is [ValueError: Could not find queue named "services"]. I have a queue called local and I run it, but I have never run a queue called services.

Thank you!

C:\USERS\USER\DESKTOP\TEST
    data.csv
    test1.py
    test2.py
    test_all.py

step1(test1.py)

from clearml import Task
import pandas as pd

task = Task.init(project_name="test_test", task_name="step1")
args = {
    'csv_file': 'data.csv',
    'target_column': 'target'
}
task.connect(args)
task.execute_remotely()
data = pd.read_csv(args['csv_file'])
task.upload_artifact('dataset', artifact_object=data)
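
On Q1: the log line "No repository found, storing script code instead" suggests that only the script text is stored with the task, so data.csv would not travel with it, and a relative path is resolved against whatever working directory the agent happens to use. The following is a minimal sketch of a fail-fast check that makes this visible in the error message. The helper name `resolve_csv` is hypothetical, and the check does not by itself make the file reachable from the agent; the data would still need to live somewhere the agent can read.

```python
import tempfile
from pathlib import Path


def resolve_csv(csv_file: str) -> Path:
    """Return the absolute path of csv_file, or raise a descriptive error."""
    path = Path(csv_file).expanduser()
    if not path.is_absolute():
        # Relative paths depend on the current working directory, which on an
        # agent is generally not the folder the script was launched from.
        path = Path.cwd() / path
    if not path.is_file():
        raise FileNotFoundError(
            "CSV not found at {} (cwd={})".format(path, Path.cwd())
        )
    return path


# Small local demonstration with a throwaway file.
with tempfile.TemporaryDirectory() as tmp:
    sample = Path(tmp) / "data.csv"
    sample.write_text("a,target\n1,0\n")
    print(resolve_csv(str(sample)))
```

In step1 this could be used as `pd.read_csv(resolve_csv(args['csv_file']))`, so a missing file reports both the resolved path and the working directory.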

step2(test2.py)

from clearml import Task
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import LabelEncoder

task = Task.init(project_name="test_test", task_name="step2")

args = {
    'dataset_task_id': '', 
    'test_size': 0.2,
    'random_state': 42,
    'target_column': 'target'
}
task.connect(args)
task.execute_remotely()

dataset_task = Task.get_task(task_id=args['dataset_task_id'])
data = dataset_task.artifacts['dataset'].get()

y = data[args['target_column']]
X = data.drop(columns=[args['target_column']])

categorical_columns = X.select_dtypes(include=['object']).columns
for col in categorical_columns:
    le = LabelEncoder()
    X[col] = le.fit_transform(X[col].astype(str))

X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=args['test_size'], random_state=args['random_state']
)

task.upload_artifact('X_train', X_train)
task.upload_artifact('X_test', X_test)
task.upload_artifact('y_train', y_train)
task.upload_artifact('y_test', y_test)

Pipeline binding step1 and step2 (test_all.py)

from clearml.automation import PipelineController

pipe = PipelineController(
    name="from task",
    project="test_test",
    version="0.0.1",
    add_pipeline_tags=False
)

pipe.add_parameter("csv_file", "data.csv")
pipe.add_parameter("target_column", "target")
pipe.add_parameter("test_size", 0.2)
pipe.add_parameter("random_state", 42)

pipe.add_step(
    name="load_data",
    base_task_project="test_test",
    base_task_name="step1",
    parameter_override={
        "General/csv_file": "${pipeline.csv_file}",
        "General/target_column": "${pipeline.target_column}"
    }
)

pipe.add_step(
    name="process_data",
    parents=["load_data"],
    base_task_project="test_test",
    base_task_name="step2",
    parameter_override={
        "General/dataset_task_id": "${load_data.id}",
        "General/test_size": "${pipeline.test_size}",
        "General/random_state": "${pipeline.random_state}",
        "General/target_column": "${pipeline.target_column}"
    }
)

pipe.set_default_execution_queue(default_execution_queue = "local")
pipe.start()

#pipe.start_locally()

result

C:\Users\[username]\Desktop\[project]>python test_all.py
ClearML Task: created new task id=[task_id]
[timestamp] - clearml.Task - INFO - No repository found, storing script code instead
ClearML results page:


ClearML pipeline page:


Traceback (most recent call last):
  File "C:\Users\[username]\Desktop\[project]\test_all.py", line 39, in <module>
    pipe.start()
  File "[path]\Lib\site-packages\clearml\automation\controller.py", line 1035, in start
    self._task.execute_remotely(queue_name=queue, exit_process=True, clone=False)
  File "[path]\Lib\site-packages\clearml\task.py", line 3163, in execute_remotely
    Task.enqueue(task, queue_name=queue_name)
  File "[path]\Lib\site-packages\clearml\task.py", line 1542, in enqueue
    raise ValueError('Could not find queue named "{}"'.format(queue_name))
ValueError: Could not find queue named "services"

  				
Posted one year ago by SarcasticHare65


Answers 9

Everything is configurable. I'd suggest reviewing the documentation for pipelines.

  				
Posted 12 months ago by CostlyOstrich36

Thank you very much.
I am aware of the execution_queue as it is configured in the official GitHub examples. However, even in the official GitHub examples, the process remains in a pending state and won't proceed unless the service queue is started.
pipe.set_default_execution_queue("default")

Thank you for providing the URL link.
Wouldn't this mean that the service queue is actually mandatory? Also, I would like to learn about the controller rather than the decorator.

  				
Posted 12 months ago by SarcasticHare65

Also review these pages: [links not preserved]

  				
Posted 12 months ago by CostlyOstrich36

Hi @SarcasticHare65, it looks like the failure results from it not finding a queue called services. Try creating one in the web UI.

  				
Posted one year ago by CostlyOstrich36

[links not preserved]

  				
Posted 12 months ago by CostlyOstrich36

You don't need to have the services queue, but you need to enqueue the controller into some queue if not running locally. I think this is what you're looking for: [link not preserved]

  				
Posted 12 months ago by CostlyOstrich36

Is it a self-hosted server?

  				
Posted one year ago by CostlyOstrich36

Thank you very much. I have read the page you mentioned. However, I was unable to find information about the necessity of service queues. I have also reviewed the function documentation, service mode, and executed examples (from tasks). Based on that, I am trying to run my own code with local CSV files.
I was asking about issues with reading CSV files and the necessity of service queues.
Do you have any helpful information regarding these matters?

  				
Posted 12 months ago by SarcasticHare65

Hi @CostlyOstrich36
Not self-hosted.
We use ClearML's free plan server.
Can I use the pipeline without creating a queue named "services"?
I would like to know what procedure is required.
Can't a queue named "default" be used for the pipeline?

  				
Posted one year ago by SarcasticHare65
