I mean that I have a script for a data preprocessing task where I need the following dependencies:
```python
import sys
from pathlib import Path
from contextlib import contextmanager

import numpy as np
from clearml import Task

# Make the custom helpers module importable from its local directory.
with add_temporary_module_search_path("/home/user/myclearML/"):
    from helpers import (
        read_netcdf_dataset,
        write_records,
    )
```
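(For context, `add_temporary_module_search_path` is a small custom context manager; a minimal sketch, assuming it just prepends the directory to `sys.path` for the duration of the block:)

```python
@contextmanager
def add_temporary_module_search_path(path: str):
    """Temporarily add a directory to the front of sys.path."""
    sys.path.insert(0, path)
    try:
        yield
    finally:
        sys.path.remove(path)
```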
However, the `xarray` package is a dependency of the `helpers` module (it is required by the `read_netcdf_dataset` function). Since `helpers` is a custom module that is imported into the preprocessing task script, ClearML is unable to detect `xarray` as a dependency and, therefore, does not install it in the environment it creates for the preprocessing task.
That is the reason why I add the `Task.add_requirements` call, to indicate to the agent that I will need those dependencies.
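In the preprocessing script that part looks something like this (a minimal sketch; the project/task names and the pinned version are placeholders):

```python
from clearml import Task

# Must be called before Task.init() so the agent picks it up
# when it builds the task's environment.
Task.add_requirements("xarray")
# Or pin a specific version (version shown is illustrative):
# Task.add_requirements("xarray", "2024.1.0")

task = Task.init(project_name="my_project", task_name="preprocessing")
```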
The problem is that the agent reinstalls all the requirements for the next task (the training task), even though both tasks share the same environment. So my question is whether there is a way to tell the `PipelineController` to generate the package environment only once and reuse it for the training task as well.
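For reference, the pipeline is wired up roughly like this (a sketch; project, task, and step names are placeholders):

```python
from clearml import PipelineController

pipe = PipelineController(
    name="preprocess-and-train",
    project="my_project",
    version="1.0",
)
# First step: the preprocessing task whose environment is built from add_requirements.
pipe.add_step(
    name="preprocess",
    base_task_project="my_project",
    base_task_name="preprocessing",
)
# Second step: the training task, which currently triggers a full reinstall
# even though it needs the same environment as the preprocessing step.
pipe.add_step(
    name="train",
    parents=["preprocess"],
    base_task_project="my_project",
    base_task_name="training",
)
pipe.start()
```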