I Have Some Code That Launches ML Tasks And It Accepts A YAML File,

Answered

I have some code that launches ML tasks, and it accepts a YAML file, a .env file, and various CSVs. What would be the best way to upload these to a ClearML task so that it runs natively (with execute_remotely)?
So far I've used StorageManager to upload and download these from a local S3 bucket, but I've now noticed that tasks expose a cache_dir property. Should that be used instead? Or should these files be registered as artifacts?
EDIT: Additionally, it's not immediately clear to me how one augments the arguments captured by argparse for the new task.

  				
Posted 4 years ago by UnevenDolphin73


Answers 26

Maybe this is part of the paid version, but it would be cool if each user (in the web UI) could define their own secrets, and a task could then be assigned to some user and use those secrets during boot.

  				
Posted 4 years ago by UnevenDolphin73

I should maybe mention that security is a low concern here, since this is all behind a private VPN server anyway. I'm mostly just interested in having the credentials used for backtracking purposes.

  				
Posted 4 years ago by UnevenDolphin73

"Each user creates a .env file for their needs or exports them in the shell running the python code. Currently I copy the environment variables to an S3 bucket and download it from there"

That is a great hack, but who carries the credentials for the S3 bucket? The reason I'm asking is that I'm thinking maybe the code could do that directly (meaning download the .env file and apply it).

  				
Posted 4 years ago by AgitatedDove14

Maybe. When the container spins up, are any identifiers for the task available? I create a folder on the bucket per python train.py run so that the environment variable files don't get overwritten if two users execute almost simultaneously.

  				
Posted 4 years ago by UnevenDolphin73

"Maybe this is part of the paid version, but would be cool if each user (in the web UI) could define their own secrets,"

Very cool (and actually how it works), but in the end someone needs to pay for salaries 😉

"The S3 bucket credentials are defined on the agent, as the bucket is also running locally on the same machine - but I would love for the code to download and apply the file automatically!"

I have an idea here: why not use the "docker bash script" argument for that?
It could be a script that always runs at the beginning of each execution, or a script each user can add on top.
https://github.com/allegroai/clearml-agent/blob/3c8e0ae5dbf819a8125d0a8b8e70739e1e8b37fe/docs/clearml.conf#L145
wdyt?

  				
Posted 4 years ago by AgitatedDove14

Hi UnevenDolphin73

"Maybe. When the container spins, are there any identifiers regarding the task etc available?"

Do you mean at the container level or in ClearML?

"I create a folder on the bucket per python train.py so that the environment variables files doesn't get overwritten if two users execute almost-simultaneously"

Nice 🙂 I have an idea: how about per user ID? Then they can access their "secrets" based on the owner of the Task:
task.data.user
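
A minimal sketch of the per-user idea above, assuming each user's .env file lives under a folder named after their ClearML user ID. The bucket layout and the helper names user_env_url and fetch_user_env are hypothetical; Task.current_task(), task.data.user, and StorageManager.get_local_copy() are the ClearML pieces it relies on. It only works once the Task exists, so it would run inside the training code rather than in the container startup.

```python
def user_env_url(bucket_url: str, user_id: str, filename: str = ".env") -> str:
    """Build the per-user object URL: <bucket_url>/<user_id>/<filename>."""
    return f"{bucket_url.rstrip('/')}/{user_id}/{filename}"


def fetch_user_env(bucket_url: str) -> str:
    """Download the current task owner's .env file and return its local path."""
    # Imported lazily so the URL helper above works without clearml installed.
    from clearml import StorageManager, Task

    task = Task.current_task()
    if task is None:
        raise RuntimeError("no ClearML task is running in this process")
    # task.data.user is the ID of the user who owns the Task.
    url = user_env_url(bucket_url, task.data.user)
    return StorageManager.get_local_copy(remote_url=url)
```

The agent still needs credentials for the bucket in its own clearml.conf, and anyone with those credentials can read every user's folder, so this separates the files by owner without restricting access to them.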

  				
Posted 4 years ago by AgitatedDove14

Hi UnevenDolphin73, are those per-user/project/system environment variables?
If these are secrets (that you do not want to expose), maybe it is best to just have them on the agent's machine?

BTW, I think there is some "vault" support in the paid tiers for these kinds of secrets; I'm not sure at which level (i.e. user/system/project).

  				
Posted 4 years ago by AgitatedDove14

"You mean at the container level or at clearml?"

Yes, the container level (when these docker shell scripts run).
The per-user ID would be nice, except I upload the .env file before the Task is created (it's only available really early in the code).

  				
Posted 4 years ago by UnevenDolphin73

Thanks for the reply, CostlyOstrich36!
Does the task read/use the cache_dir directly? It's fine for it to be a cache and then removed from the fileserver; if users want the data to stay they will use the ClearML Dataset 🙂

The S3 solution is bad for us since we have to create a folder for each task (before the task is created), and hope it doesn't get overwritten by the time it executes.

Argument augmentation - say I run my code with python train.py my_config.yaml -e admin.env, then ClearML logs my_config.yaml and -e admin.env as arguments for execute_remotely. I'd like to perhaps augment these to be config.yaml and remove the -e admin.env flag.
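
One possible way to express that augmentation, as a sketch rather than a confirmed recipe. It assumes ClearML stores argparse values under an "Args" section keyed by argument name, and that Task.set_parameter() on the task before execute_remotely() changes what the remote run sees. Both assumptions, and whether an empty value is read back as "flag not given", should be checked against your ClearML version. The parser and the helper names remote_overrides and apply_overrides are hypothetical.

```python
import argparse


def build_parser() -> argparse.ArgumentParser:
    """Stand-in for the parser used by train.py."""
    parser = argparse.ArgumentParser()
    parser.add_argument("config")
    parser.add_argument("-e", "--env", default=None)
    return parser


def remote_overrides(args: argparse.Namespace) -> dict:
    """Return the Args/* values the remote run should see."""
    overrides = {f"Args/{key}": value for key, value in vars(args).items()}
    overrides["Args/config"] = "config.yaml"  # fixed name used on the remote side
    overrides["Args/env"] = ""  # assumption: empty means the flag was not given
    return overrides


def apply_overrides(task, overrides: dict) -> None:
    """Write each override onto a clearml Task before execute_remotely()."""
    for name, value in overrides.items():
        task.set_parameter(name, value)
```

Usage would look like apply_overrides(task, remote_overrides(args)) followed by task.execute_remotely(...).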

  				
Posted 4 years ago by UnevenDolphin73

That makes sense...
Basically, in the open-source version the approach is that everyone sees everything, for maximum transparency (and also ease of use). I know there are access roles in the paid tier, and a vault for exactly these types of things...

Where do you currently save them? And how do you pass them to the remote machine?

  				
Posted 4 years ago by AgitatedDove14

Hi UnevenDolphin73!

I would avoid using cache_dir since it's only a cache. I think using S3 or the fileserver with Task.upload_artifact() is a nice solution.

Also what do you mean by 'augment' arguments?
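
A rough sketch of the upload_artifact() suggestion for the CSV files. It assumes the task object comes from Task.init() and that naming each artifact after the file stem is acceptable; the helper names are hypothetical. Whether artifacts uploaded before execute_remotely() are still attached when the agent picks the task up is worth verifying on your server version.

```python
from pathlib import Path


def artifact_name(path: str) -> str:
    """Use the file stem as the artifact name, e.g. data/train.csv -> train."""
    return Path(path).stem


def upload_inputs(task, csv_paths) -> None:
    """Register each CSV file as an artifact on the given clearml Task."""
    for path in csv_paths:
        task.upload_artifact(name=artifact_name(path), artifact_object=path)


def fetch_input(task, name: str) -> str:
    """Return a local path to a previously uploaded artifact."""
    return task.artifacts[name].get_local_copy()
```

On the remote side, fetch_input(task, "train") would then give the code a local copy of train.csv without it needing to know the bucket layout.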

  				
Posted 4 years ago by CostlyOstrich36

UnevenDolphin73, I think this might be right up your alley:
https://clear.ml/docs/latest/docs/references/sdk/task/#connect_configuration
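
For the YAML file, connect_configuration() could be used roughly like this. The sketch assumes the task comes from Task.init(); the configuration name "config" and the helper name load_config_text are arbitrary choices. When given a file path, connect_configuration() returns a path to a local copy, and under an agent that copy should hold the content stored on the server.

```python
from pathlib import Path


def load_config_text(task, yaml_path: str) -> str:
    """Register the YAML file on the task and read back the effective content."""
    local_copy = task.connect_configuration(
        configuration=Path(yaml_path), name="config"
    )
    # Read from the returned path, not from yaml_path, so that a remote run
    # picks up the content stored with the task.
    return Path(local_copy).read_text(encoding="utf-8")
```

The returned text can then be passed to whatever YAML loader the code already uses. Note that the content is visible in the web UI, so this suits the configuration file but not secrets.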

  				
Posted 4 years ago by CostlyOstrich36

I guess the big question is how I can transfer local environment variables to a new Task.

  				
Posted 4 years ago by UnevenDolphin73

Each user creates a .env file for their needs or exports the variables in the shell running the Python code. Currently I copy the environment variables to an S3 bucket and download them from there.

  				
Posted 4 years ago by UnevenDolphin73

"Yes, the container level (when these docker shell scripts run)."

I think this is the tricky part: in code you can access the user ID of the Task, download the .env, and apply it, but before the process starts I can't really think of a way to do that...
That said, I think that in the paid version they have "vault" support, which allows you to store the .env file on the clearml-server, and then the agent automatically applies it at the beginning of the container execution.

  				
Posted 4 years ago by AgitatedDove14

I mean, I know I could connect_configuration({k: os.environ.get(k) for k in [...]}), but then those environment variables would be exposed in the ClearML UI, which is not ideal (the environment variables in question hold usernames and passwords required for DB access).

  				
Posted 4 years ago by UnevenDolphin73

You can mix and match various buckets in your ~/clearml.conf

  				
Posted 4 years ago by CostlyOstrich36

The S3 bucket credentials are defined on the agent, as the bucket is also running locally on the same machine - but I would love for the code to download and apply the file automatically!

  				
Posted 4 years ago by UnevenDolphin73

Also, I appreciate the time you're taking to answer, AgitatedDove14 and CostlyOstrich36. I know Fridays are not working days in Israel, so thank you 🙂

  				
Posted 4 years ago by UnevenDolphin73

Great, thanks! Any idea about environment variables and/or other files (CSV)? I suppose I could use task.upload_artifact for the CSVs, but I'm still unsure about the environment variables.

  				
Posted 4 years ago by UnevenDolphin73

Thanks AgitatedDove14, I'll first have to prove viability with the free version :)

  				
Posted 4 years ago by UnevenDolphin73

I'm not sure I follow. What would that solution look like?

  				
Posted 4 years ago by UnevenDolphin73

No worries, it's always good to know what can be built later.
I would start with a static .env file (i.e. the same for everyone), or start with hacking the Python code to load the .env at the beginning 🤞
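
A small sketch of what "load the .env at the beginning" could look like in plain Python, assuming simple KEY=VALUE lines with optional quotes and an optional "export " prefix. The function names are hypothetical, and a library such as python-dotenv could be used instead of this hand-rolled parser.

```python
import os


def parse_env(text: str) -> dict:
    """Parse KEY=VALUE lines; skip blanks, comments, and lines without '='."""
    values = {}
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        if line.startswith("export "):
            line = line[len("export "):]
        key, _, value = line.partition("=")
        # Drop surrounding whitespace and one layer of matching-style quotes.
        values[key.strip()] = value.strip().strip("'\"")
    return values


def apply_env(path: str, override: bool = False) -> dict:
    """Load a .env file into os.environ and return the parsed mapping."""
    with open(path, "r", encoding="utf-8") as handle:
        values = parse_env(handle.read())
    for key, value in values.items():
        # By default, variables already set in the environment win.
        if override or key not in os.environ:
            os.environ[key] = value
    return values
```

Calling apply_env(".env") as the first statement of train.py keeps the secrets out of the ClearML UI, since nothing is connected to the task.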

  				
Posted 4 years ago by AgitatedDove14

These are per-user. Essentially we log user DB access as well (for various backtracking afterwards), so it's beneficial for us to pass the user DB secrets to the task and not have it configured once on the agent.

  				
Posted 4 years ago by UnevenDolphin73

So the way it works is that anything in the "extra_docker_shell_script" section is executed inside the container every time the container spins up. I'm thinking that the extra_docker_shell_script would pull the environment file from an S3 bucket and apply all the "secrets" (or the secrets are embedded in the startup bash script, like "export AWS_SECRET=abcdef"). That said, this will not be on a per-user basis 😞
Does that help?

  				
Posted 4 years ago by AgitatedDove14

The overall flow I currently have is e.g.:
1. Start an internal task (not a ClearML Task; MLOps not initialized yet)
2. Call some pre_init function with args so I can upload the environment file via StorageManager to S3
3. Call some start_run function with the configuration dictionary loaded, so I can upload the relevant CSV files and configuration file
4. Finally initialize the MLOps (ClearML), start a task, execute remotely
I can play around with 3/4 (so e.g. upload CSVs and configuration directly with the ClearML task), but (2) is hidden and won't have access to the ClearML task directly.

  				
Posted 4 years ago by UnevenDolphin73
