Answered
Hey,
I'm trying to build a ClearML pipeline using decorators. My training script imports two modules (.py files) that are stored locally. This causes an error in ClearML, I suppose because it creates a separate environment for each function. Is there any way to include local .py files (apart from explicitly creating a module)?

  
  
Posted one year ago

Answers 31


Super sorry for being a bit late!

  
  
Posted one year ago

Have you tried adding the requirements with Task.add_requirements(local_packages) in your main file?

  
  
Posted one year ago

Can you share the logs, please?

  
  
Posted one year ago

Sure, in a moment

  
  
Posted one year ago

Nope

  
  
Posted one year ago

However, I use this to create an instance of a (torch) dataloader, which is fed into the next stage of my pipeline. Even though I import the local modules and add the folders to the path, that stage is unable to unpickle the artifact.

  
  
Posted one year ago

How would you structure PyTorch pipelines in ClearML, especially when dealing with image data?

  
  
Posted one year ago

Hey WickedElephant66, TenderCoyote78,
I'm working on a solution. Just hold on, I'll update you ASAP.

  
  
Posted one year ago

Not local .py files

  
  
Posted one year ago

Though as per your docs, add_requirements is for a requirements.txt

  
  
Posted one year ago

Umm, I suppose that won't work: this package consists of .py scripts that I use for a set of configs and utils for my model.

  
  
Posted one year ago

Hey, so I was able to get the local .py files imported by adding the folder to sys.path
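
For anyone following along, here is a minimal sketch of that approach using only the standard library. The folder and the module name `my_local_utils` are hypothetical stand-ins created in a temporary directory, not the real project files. Judging by the pipeline code posted elsewhere in this thread, the `sys.path.insert` call would go inside the body of each decorated step, since each step seems to run as its own standalone script.

```python
import importlib
import sys
import tempfile
from pathlib import Path

# Hypothetical local folder holding a helper module
# (a stand-in for the real project folder).
local_dir = Path(tempfile.mkdtemp())
(local_dir / "my_local_utils.py").write_text("MY_GLOBAL_VAR = 32\n")

# Put the folder at the front of the import search path,
# then import the module as usual.
sys.path.insert(0, str(local_dir))
importlib.invalidate_caches()
my_local_utils = importlib.import_module("my_local_utils")

print(my_local_utils.MY_GLOBAL_VAR)
```

Inserting at position 0 means the local folder takes priority over installed packages with the same name, which may or may not be what you want.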

  
  
Posted one year ago


btw here is the content of the imported file:

import os

import torch
from torchvision import datasets, transforms

MY_GLOBAL_VAR = 32


def my_dataloder():
    # Build a shuffled MNIST training dataloader (downloads to ./data if missing)
    return torch.utils.data.DataLoader(
        datasets.MNIST(os.path.join('./', 'data'), train=True, download=True,
                       transform=transforms.Compose([
                           transforms.ToTensor()
                       ])),
        batch_size=32, shuffle=True)

  
  
Posted one year ago

I managed to import a custom package the same way you did: I added the current directory path to my system path.
I have a two-step pipeline:

  1. Run a function from a custom package. This function returns a DataLoader (built from torchvision's MNIST dataset).
  2. This step receives the dataloader built in the first step as a parameter and shows random samples from it.

There was no error returning the dataloader at the end of step 1 or importing it in step 2. Here is my code:

`from clearml import PipelineDecorator, Task


@PipelineDecorator.component(return_values=['dl'], cache=True,
                             repo='/home/xxxxxxxx/ClearML/Slack',
                             packages=['clearml==1.4.1'])
def step_one_IMPORT():
    # Each step runs standalone, so the local folder is added to the path here
    import sys
    sys.path.insert(0, '/home/xxxxxxx/ClearML/Slack')
    import omamitesh_import
    print('==> STEP1: Import the custom import file')
    print(f'Imported variable: {omamitesh_import.MY_GLOBAL_VAR}')
    dl = omamitesh_import.my_dataloder()
    print('Dataloader imported with success')
    return dl


@PipelineDecorator.component(return_values=[], cache=True, parents=['step_one_IMPORT'])
def step_two_TEST_IMPORT(dl):
    import numpy as np
    import PIL.Image as pil
    print(f'==> STEP2: Showing DL samples ({dl})')
    for (i, sample) in enumerate(dl):
        # Pick one random image from the batch of 32 and display it
        r = np.random.randint(32)
        img = sample[0][r].view(28, 28).numpy()
        img = pil.fromarray((img * 255).astype(np.uint8))
        img.show()
        if i > 4:
            break


@PipelineDecorator.pipeline(name='220620', project='Issues Repro Pipeline', version='0.0.1',
                            default_queue='queue-1', pipeline_execution_queue='queue-2')
def pipeline():
    # building the pipeline
    dl = step_one_IMPORT()
    step_two_TEST_IMPORT(dl)


if __name__ == "__main__":
    project_name = 'Issues Repro'
    task_name = '220620'
    task = Task.init(project_name=project_name, task_name=task_name)

    # Run the pipeline and all of its steps on the local machine
    PipelineDecorator.run_locally()
    pipeline()
    print('pipeline completed')`
  
  
Posted one year ago

I tried it. It works for a library that you can install, but not for something local, I suppose.

  
  
Posted one year ago

With pleasure! Hope that helps.

  
  
Posted one year ago

Thank you so much for being active!

  
  
Posted one year ago

Thanks a lot David!

  
  
Posted one year ago

Here's the code. We're trying to make a pipeline using PyTorch, so the first step has the dataset that's created using 'stuff', a local folder that serves as a package for my code. The issue seems to be in the unpickling stage in the train function.

  
  
Posted one year ago

No, it is supposed to have its status updated automatically. We may have a bug. Can you share some example code with me, so that I can try to figure out what is happening here?

  
  
Posted one year ago

stuff is a package that contains my local modules. I've added it to my path with sys.path.insert, though here it still isn't able to unpickle.

  
  
Posted one year ago

Can you share an example or part of your code with me? I might be missing something in what you intend to achieve.

  
  
Posted one year ago

Is there a way to store the return values after each pipeline stage in a format other than pickle?
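
I can't say from this thread whether ClearML offers a non-pickle format for step return values. One way to sidestep the question, sketched below under that assumption, is to return only plain, JSON-serialisable settings from a step and rebuild the heavy object (such as a dataloader) inside the consuming step. The function names and setting keys here are hypothetical.

```python
import json


def step_one():
    # Return only plain settings (hypothetical keys) instead of a live
    # dataloader object, so nothing depends on local classes being importable.
    return json.dumps({"data_dir": "./data", "batch_size": 32, "shuffle": True})


def step_two(loader_settings_json):
    settings = json.loads(loader_settings_json)
    # The real dataloader would be rebuilt here from `settings`,
    # for example by calling a helper from the local package.
    return settings


settings = step_two(step_one())
print(settings["batch_size"])
```

The trade-off is that each consuming step pays the cost of rebuilding the object, but the data passed between steps stays readable and independent of local class definitions.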

  
  
Posted one year ago

Hey, we figured out a temporary solution: importing the modules and then reloading the contents of the artifact with pickle. It still gives us a warning, though training works now. Do send an update if you find a better solution.
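
To illustrate why the import has to come first, here is a small standard-library sketch. Pickle stores a reference to the class (module name plus class name) rather than the class code, so the module must be importable in the process that unpickles. The module `local_stuff` and its `Config` class are hypothetical stand-ins for the local package, written to a temporary folder.

```python
import importlib
import pickle
import sys
import tempfile
from pathlib import Path

# Hypothetical local module with one class (a stand-in for the local package).
pkg_dir = Path(tempfile.mkdtemp())
(pkg_dir / "local_stuff.py").write_text(
    "class Config:\n"
    "    def __init__(self, batch_size):\n"
    "        self.batch_size = batch_size\n"
)

# "Step one": make the module importable and pickle an instance of its class.
sys.path.insert(0, str(pkg_dir))
importlib.invalidate_caches()
local_stuff = importlib.import_module("local_stuff")
payload = pickle.dumps(local_stuff.Config(batch_size=32))

# Simulate a fresh process for "step two":
# the module is neither imported nor on the path.
sys.path.remove(str(pkg_dir))
del sys.modules["local_stuff"]
importlib.invalidate_caches()
try:
    pickle.loads(payload)
    failed_without_module = False
except ModuleNotFoundError:
    failed_without_module = True

# Workaround: make the module importable again, then unpickle.
sys.path.insert(0, str(pkg_dir))
importlib.invalidate_caches()
restored = pickle.loads(payload)

print(failed_without_module, restored.batch_size)
```

In a real pipeline step the equivalent would be to add the local folder to sys.path before the artifact is loaded, which matches the workaround described above.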

  
  
Posted one year ago

I'm facing the same issue, is there any solution to this?

  
  
Posted one year ago

Thanks a lot David!

  
  
Posted one year ago

TenderCoyote78
The status should normally be updated automatically. Do all the steps finish successfully? And the pipeline too?

  
  
Posted one year ago

You can also specify a package, with or without its version:
https://clear.ml/docs/latest/docs/references/sdk/task#taskadd_requirements

  
  
Posted one year ago

Yep, the pipeline finishes but the status is still "running". Do we need to close a logger that we use for scalars, or anything like that?

  
  
Posted one year ago