Hey @JitteryOwl13, just to make sure I understand: you want to put your imports inside the pipeline step function, and you're asking whether this will work correctly?
If so, then yes, it will work fine if you move the imports inside the pipeline step function.
No, I want to put them in the pipeline.py file where I configure all the steps, like this:
from clearml import PipelineDecorator
from train_helpers.common import params
from dataset import DataModule
from train import Trainer


@PipelineDecorator.component(return_values=['_args'], cache=True)
def init_experiment():
    _args = params.parse_args()
    return _args


@PipelineDecorator.component(return_values=['data'], cache=False)
def data_preparation(args):
    data = DataModule(args)
    return data


@PipelineDecorator.component(cache=False)
def train_model(args, data):
    Trainer(args).train()


@PipelineDecorator.pipeline(name='Pipeline_decorator', project='Pipeline_decorator', version='0.1', pipeline_execution_queue=None)
def main():
    # init_experiment declares a single return value, so bind a single name
    args = init_experiment()
    data = data_preparation(args)
    train_model(args, data)


if __name__ == '__main__':
    # PipelineDecorator.debug_pipeline()
    PipelineDecorator.run_locally()
    main()
Ah, I see now. There are a couple of ways to achieve this.
- You can enforce that the pipeline steps execute within a predefined docker image that has all these submodules - this is not very flexible, but doesn't require your clearml-agents to have access to your Git repository
- You can enforce that the pipeline steps execute within a predefined Git repository, where you have all the code for these submodules - this is more flexible than the first option, but will require clearml-agents to have access to your Git repository
For agents to be able to access your Git repository, you must either specify agent.git_user and agent.git_pass in the clearml.conf files on the worker machines, or register SSH keys for those machines in your Git hosting server (like Bitbucket or GitHub) and add agent.force_git_ssh_protocol=true to the clearml.conf files I mentioned previously.
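To make the two options above a bit more concrete, here is a rough sketch of how they could be passed to the step decorator. As far as I know, PipelineDecorator.component accepts docker, repo and repo_branch arguments, but please check this against your installed clearml version. The repository URL, branch name, image name and the as_component helper are all made-up placeholders for illustration, not part of clearml or of your project.

```python
# Hypothetical placeholders: swap in your own repository URL, branch and image.
GIT_REPO = "https://example.com/your-org/your-project.git"
GIT_BRANCH = "main"
DOCKER_IMAGE = "your-registry/your-image:latest"

# Option 1: run the step inside a prebuilt image that already contains the submodules.
DOCKER_STEP = {"docker": DOCKER_IMAGE}

# Option 2: have the agent clone the repository that contains the submodules.
REPO_STEP = {"repo": GIT_REPO, "repo_branch": GIT_BRANCH}


def train_model(args, data):
    # Imported inside the step so the import is resolved where the step runs.
    from train import Trainer

    Trainer(args).train()


def as_component(func, options):
    """Wrap func as a pipeline step (hypothetical helper, not part of clearml)."""
    from clearml import PipelineDecorator  # imported lazily; requires clearml

    return PipelineDecorator.component(cache=False, **options)(func)
```

Calling as_component(train_model, REPO_STEP) should be equivalent to decorating train_model with @PipelineDecorator.component(cache=False, repo=GIT_REPO, repo_branch=GIT_BRANCH); passing DOCKER_STEP instead selects the Docker image route.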