
Avoiding ClearML glue code spaghetti - community best practices?

Say I have a training pipeline:
Task 1 - data preprocessing -> creates a dataset artifact
Task 2 - model training: loads the dataset artifact from (1), trains, and outputs the model
There are multiple ways of achieving that, but just as an example, here is a snippet from the docs (written for models, but never mind, it’s just an example):
```python
from clearml import Task

prev_task = Task.get_task(task_id='the_training_task')
last_snapshot = prev_task.models['output'][-1]
local_weights_path = last_snapshot.get_local_copy()
```
If I want to use ClearML, I need to run these lines somewhere between Task 1 and Task 2.

The ClearML documentation is rife with code examples that mix ClearML boilerplate with the ML code in one happy party.

However, this type of “third-party glue code” is a well-documented source of technical debt in ML (e.g. https://proceedings.neurips.cc/paper/2015/file/86df7dcfd896fcaf2674f757a2463eba-Paper.pdf ).

I would like to design the code such that it can run independently of ClearML, and such that the core tasks stay clean of ClearML boilerplate.

I came up with a bunch of idioms for this. For example, I use an interface I created for datasets and wire different implementations via Hydra configurations, depending on whether I am running with or without ClearML (see the sketch below).
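
A minimal sketch of that idiom, assuming a hypothetical DatasetProvider interface (the class and method names here are illustrative, not from the original post; hydra.utils.instantiate and the _target_ convention are standard Hydra):

```python
from abc import ABC, abstractmethod

class DatasetProvider(ABC):
    @abstractmethod
    def get_local_dir(self) -> str:
        """Return a local directory holding the dataset."""

class LocalDatasetProvider(DatasetProvider):
    # Used when running without ClearML; no clearml import on this path.
    def __init__(self, data_dir: str):
        self.data_dir = data_dir

    def get_local_dir(self) -> str:
        return self.data_dir

class ClearmlDatasetProvider(DatasetProvider):
    # The only place that touches ClearML; core code never sees it.
    def __init__(self, dataset_id: str):
        self.dataset_id = dataset_id

    def get_local_dir(self) -> str:
        from clearml import Dataset
        return Dataset.get(dataset_id=self.dataset_id).get_local_copy()

# Hydra then selects the implementation from config, e.g. a "dataset"
# config group whose _target_ points at one of the classes above:
#   provider = hydra.utils.instantiate(cfg.dataset)
#   train(provider.get_local_dir())
```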

I was wondering: has the community created any coding best practices for achieving this separation of ClearML glue code for various common tasks/flows?
Does anyone care to share what they do around this?

  
  
Posted 3 years ago

Answers 5


Hi RoughTiger69

How about using the pipeline decorator as a way to run this logic?
https://github.com/allegroai/clearml/blob/master/examples/pipeline/pipeline_from_decorator.py
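
For instance, a rough sketch modeled on that example (the step bodies here are placeholders, not your actual code):

```python
from clearml.automation.controller import PipelineDecorator

@PipelineDecorator.component(return_values=['dataset_path'], cache=True)
def preprocess(source_dir):
    # Task 1: data preprocessing -> create a dataset artifact (placeholder body)
    dataset_path = source_dir
    return dataset_path

@PipelineDecorator.component(return_values=['model_path'])
def train(dataset_path):
    # Task 2: load the dataset from step 1, train, output the model (placeholder body)
    model_path = dataset_path + '/model.pt'
    return model_path

@PipelineDecorator.pipeline(name='training pipeline', project='examples', version='0.1')
def run_pipeline(source_dir):
    # the glue between the steps lives here, not inside the steps themselves
    dataset_path = preprocess(source_dir)
    train(dataset_path)

if __name__ == '__main__':
    PipelineDecorator.run_locally()  # debug the whole pipeline locally
    run_pipeline(source_dir='./data')
```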

I think I'm missing the context of where the code is executed....

btw: you can now set the configuration_objects directly when calling add_step 🙂
https://clearml.slack.com/archives/CTK20V944/p1633355990256600?thread_ts=1633344527.224300&cid=CTK20V944

  
  
Posted 3 years ago

AgitatedDove14 I’m not sure the pipeline decorator is what I need.

Here’s a very simplified example of my question.

Say I want to train my model on some data.
Before adding ClearML, the code looks something like:
```python
def train(data_dir, ...):
    ...
```
Now I want to leverage the data versioning capability in ClearML.
So now the code needs to fetch the dataset by ID, save it locally, and let the model train on it as before:
```python
from clearml import Dataset

def train_clearml(dataset_id):
    ds = Dataset.get(dataset_id)
    base_dir = ds.get_local_copy()
    train(base_dir)
```
Another example would be to report artifacts like the output model from my training code.
Here, the ClearML wrapper decorates the call to train() and needs to receive and report the output model.

I want to achieve the following:
1. Be able to trigger the “pure” function (e.g. train()) locally, without any ClearML code running, while driving it from a configuration, e.g. a path to the data.
2. Be able to trigger the “ClearML decorator” (e.g. train_clearml()) while driving it from a configuration, e.g. a dataset_id.
3. Keep the code of (2) completely separate from the code of (1): write them in separate modules, where (2) delegates to (1) but never the other way around.
While this is a simple example, it shows the following dilemmas:
1. How to maintain a configuration variant with and without the ClearML code
2. How to globally select whether I am running with or without ClearML
3. How to design the interface between the ClearML wrapper and the pure ML code when it needs to report artifacts
This gets more complex if the workflow includes inputs, outputs, scalars, and other goodies being reported to and fetched from ClearML all along the workflow…

I was wondering if the community has encountered these concerns, and whether there are recommended best practices for achieving the above separation? Or, alternatively, do people feel that this is an anti-requirement, and that it’s better to continue writing ClearML code mixed in with their pure ML code?
Hope I managed to clarify my question 🙂

  
  
Posted 3 years ago

1. Be able to trigger the “pure” function (e.g. train()) locally, without any ClearML code running, while driving it from a configuration, e.g. a path to the data.

When you say “without any ClearML code”, do you mean without the agent, or without using clearml.Dataset?

2. Be able to trigger the “ClearML decorator” (e.g. train_clearml()) while driving it from a configuration, e.g. a dataset_id.

Hmm I can think of:
```python
from clearml import Task, Dataset

def train_clearml(local_folder=None, dataset_id=None):
    if Task.current_task():
        # connect the parameters so they can be overridden from the UI / remote run
        params = dict(local_folder=local_folder, dataset_id=dataset_id)
        Task.current_task().connect(params, name='train section')
        local_folder, dataset_id = params['local_folder'], params['dataset_id']

    if dataset_id:
        ds = Dataset.get(dataset_id)
        base_dir = ds.get_local_copy()
    else:
        base_dir = local_folder
    train(base_dir)
```
Actually, it would be nice if we could have used locals() instead of creating the dict and updating it back... anyhow...

This is just a start, but is this the direction you are after?

  
  
Posted 3 years ago

I mean that there will be no task created, and no invocation of any ClearML API whatsoever, including no imports, in the “core ML task”.
This is the direction: add very small wrappers of ClearML code around the core ML task. The ClearML wrapper is “aware” of the core ML code, and never the other way around.
For cases where the wrapper runs only “before” and “after” the core ML task, it’s somewhat easier to achieve. For reporting artifacts etc., which happens “mid-flow”, it’s more tricky.
Another success criterion is that I would be able to switch the full code base between running “with” or “without” ClearML from one flag, along the lines of the sketch below.
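
A minimal sketch of that one-flag switch (the USE_CLEARML environment variable and the module names wrappers.train_clearml / core.train are hypothetical):

```python
import os

# single global switch; could equally come from a Hydra config
USE_CLEARML = os.environ.get('USE_CLEARML', '0') == '1'

def main(cfg: dict):
    if USE_CLEARML:
        # thin wrapper module; the only code path that imports clearml
        from wrappers.train_clearml import train_clearml
        train_clearml(dataset_id=cfg['dataset_id'])
    else:
        # pure ML code; clearml is never imported on this path
        from core.train import train
        train(data_dir=cfg['data_dir'])

if __name__ == '__main__':
    main({'data_dir': './data', 'dataset_id': None})
```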
Anyhow, from your response, is it safe to assume that mixing ClearML code in with the core ML task code has not occurred to you as something problematic to start with?

  
  
Posted 3 years ago

Anyhow, from your response, is it safe to assume that mixing ClearML code in with the core ML task code has not occurred to you as something problematic to start with?

Correct 🙂 Actually, we believe it makes things easier, as in the worst-case scenario you can always run ClearML in “offline” mode without needing the backend, and later, if needed, you can import that run.
That said, regarding (3), the “mid” interaction is always the challenge. ClearML will do the auto-tracking/upload of the models/checkpoints created, and of metrics, but anything else (aka artifacts) is custom, so there is no real standard interface to connect to (I think). My suggestion would be to either provide callback functionality in the wrapper (i.e. call a function to store artifacts, so the wrapper can either use ClearML or store locally; a sketch of what that could look like follows), or decide on a standard output folder and just upload the entire folder (which, I have to admit, I’m not a fan of, because you lose some context information on artifacts when you only know the file names).
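
A possible shape for that callback idea (ArtifactStore and do_training are made-up names for illustration; upload_artifact is the real ClearML API):

```python
from abc import ABC, abstractmethod
import os
import shutil

class ArtifactStore(ABC):
    # callback handed to the core ML code; it never knows about ClearML
    @abstractmethod
    def store(self, name: str, path: str): ...

class LocalArtifactStore(ArtifactStore):
    def __init__(self, out_dir: str):
        self.out_dir = out_dir

    def store(self, name: str, path: str):
        os.makedirs(self.out_dir, exist_ok=True)
        shutil.copy(path, os.path.join(self.out_dir, name))

class ClearmlArtifactStore(ArtifactStore):
    def store(self, name: str, path: str):
        from clearml import Task
        Task.current_task().upload_artifact(name=name, artifact_object=path)

def do_training(data_dir: str) -> str:
    # stand-in for the real training loop; writes a dummy model file
    model_path = os.path.join(data_dir, 'model.pt')
    open(model_path, 'w').close()
    return model_path

def train(data_dir: str, artifacts: ArtifactStore):
    model_path = do_training(data_dir)
    artifacts.store('output_model', model_path)  # mid-flow reporting via callback
```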
wdyt?

  
  
Posted 3 years ago