Feels like we've been over this 😄 Have there been any new developments, perhaps?
Essentially, this approach - https://clear.ml/docs/latest/docs/guides/advanced/multiple_tasks_single_process - cannot work in a remote execution.

  				
Posted 3 years ago by UnevenDolphin73

It's okay 🙂 I was originally hoping to delete my "initializer" task, but I'll just archive it in case someone is interested in the worker data etc. Setting the queue is quite nice.

I think this should get my team excited enough 😄

  				
Posted 3 years ago by UnevenDolphin73

The new task is not running inside a new subprocess. Our platform trains several models, and we'd like each of them to be tracked in its own Task. When running locally, this works "out of the box", as we can init and close before and after each model.
When running remotely, one cannot close the main task (since it is what orchestrates everything), and so this workaround was needed.

  				
Posted 3 years ago by UnevenDolphin73

Feels like we've been over this

LOL, I think I can't wrap my head around the use case 🙂

When running locally, this is "out of the box", as we can init and close before and after each model.

I finally got it! Task.init should be dubbed "init Main task"; automagic kicks in only when it is the only one existing. Your remote execution is "linear", Task after Task, which in theory makes it a good candidate for a pipeline.

Basically option (2): the main task is being "replaced" (which locally would be task.close), right?
Your use case is a good example of where Task.create fits (which you use), as it doesn't mean it is the "main" Task, but as a result, all reports have to be done manually.

How would you improve the current state? (Take into account that when the agent spins up the remote Task, it will continue to report console outputs and monitor the status of the original Task.)

  				
Posted 3 years ago by AgitatedDove14

Are they ephemeral, or are they later used by other Tasks, executions, etc.?
For example: configuration files are specific to an execution, and someone will edit them.
Initial weights files are something that multiple executions might need, and they will be used to restore an execution. Data, even if changing, is usually used by multiple executions, tasks, etc.
It seems like you treat these files as "configurations", is that right?

  				
Posted 3 years ago by AgitatedDove14

UnevenDolphin73

we'd like the remote task to be able to spawn new tasks,

Why is this an issue? This should work out of the box?

  				
Posted 3 years ago by AgitatedDove14

Honestly, this is all related to issue #340. The only reason we have this to begin with is because we need one separate "initializer" task that downloads the remote cache and prepares the agent environment for execution (downloading the configuration files, etc).
Otherwise it fits perfectly with pipelines, but we're not there yet.

In the local execution we don't have this initializer task, so we use Task.init() before starting to work on a model, and task.close() when we're done.

I'd suggest some task.detach() method for remote execution maybe? We still find use-cases for this initializer task (it holds the original user and as such we can easily filter and report it to Slack, etc).

  				
Posted 3 years ago by UnevenDolphin73

UnevenDolphin73 I have a suspicion we have a few terms mixed up:

hyperparameters:
These are essentially key/value pairs.
When you call Task.connect(dict_with_params), clearml will flatten the dict and you end up with key/value pairs.

configuration objects:
These are actually blobs of text; the UI will show them as is.
When you call my_local_file = Task.connect_configuration(name, "path/to/config/file"), the entire content of the config file is stored on the Task object itself.
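
To make the "flatten the dict" point concrete, here is a minimal standalone sketch of what flattening a nested dict into key/value pairs can look like. It does not call ClearML; the helper name `flatten_params` and the "/" separator are assumptions made for illustration, not a description of the library's internals.

```python
from typing import Any, Dict


def flatten_params(params: Dict[str, Any], prefix: str = "", sep: str = "/") -> Dict[str, Any]:
    """Flatten a nested dict into a single level of key/value pairs.

    Hypothetical helper: the separator is an assumption for this sketch.
    """
    flat: Dict[str, Any] = {}
    for key, value in params.items():
        full_key = f"{prefix}{sep}{key}" if prefix else str(key)
        if isinstance(value, dict):
            # Recurse into nested sections, carrying the accumulated key prefix.
            flat.update(flatten_params(value, full_key, sep))
        else:
            flat[full_key] = value
    return flat


nested = {"optimizer": {"name": "adam", "lr": 0.001}, "epochs": 10}
flat = flatten_params(nested)
print(flat)
```

A configuration object, by contrast, would keep the file as one blob of text, so nesting and file-level structure are preserved rather than flattened.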

Back to the use case, instead of:
python train.py --config_file path/to/local/file.yaml
then inside the code:
```
my_param_file = args.config_file
with open(my_param_file, 'rt') as f:
    config_text = f.read()  # read, parse, etc.
```
You could bake both into the same line:
```
my_param_file = task.connect_configuration("config_file", "path/to/local/file.yaml")
with open(my_param_file, 'rt') as f:
    config_text = f.read()  # read, parse, etc.
```
When "connect_configuration" is used, it actually combines the need to have both an argument pointing to the config file, and the content of the config file. It was designed to solve this exact use case. Am I making sense?

EDIT:

"then our yaml file contains !include"

Is this the point where connect_configuration breaks?

  				
Posted 3 years ago by AgitatedDove14

Honestly, this is all related to issue #340.

Makes total sense.
But actually this is different from #340. The feature is to store the data on the Task, which means each Task in your "pipeline" will upload a new copy of the data. No?

I'd suggest some task.detach() method for remote execution maybe

That is a good idea; in theory it can also be used in local execution.

  				
Posted 3 years ago by AgitatedDove14

It does not 🙂
We started discussing it here - https://clearml.slack.com/archives/CTK20V944/p1640955599257500?thread_ts=1640867211.238900&cid=CTK20V944
You suggested this solution - https://clearml.slack.com/archives/CTK20V944/p1640973263261400?thread_ts=1640867211.238900&cid=CTK20V944
And I eventually found this solution to work - https://clearml.slack.com/archives/CTK20V944/p1641034236266500?thread_ts=1640867211.238900&cid=CTK20V944

  				
Posted 3 years ago by UnevenDolphin73

Am I making sense ?

No, not really. I don't see how task.connect_configuration interacts with our existing CLI. Additionally, the documentation for task.connect_configuration says the second argument is the name of a file, not the path to it, so something is off.

  				
Posted 3 years ago by UnevenDolphin73

So one can override the queue ID but not the worker

apparently ... I can't think of a good reason for that actually ...

  				
Posted 3 years ago by AgitatedDove14

But since this has come up a lot recently, any updates on #340? 😍

  				
Posted 3 years ago by UnevenDolphin73

Hmm, yes, but then this is kind of a hacky solution... The original #340 was about packaging source code that was not in git... Now we want to add "data" (even if ephemeral) onto it, no?
My thinking is to somehow make sure a Task can reference a "Dataset" to be downloaded by the agent before it starts?!

  				
Posted 3 years ago by AgitatedDove14

AgitatedDove14 the issue was that we'd like the remote task to be able to spawn new tasks, which it cannot do if I use Task.init before override_current_task_id(None).

When would this callback be called? I'm not sure I understand the use case.

  				
Posted 3 years ago by UnevenDolphin73

1. Debugging. It's very useful for us to be able to see the contents of the configuration and understand what is going on and what is meant to be going on. Without a preview (which in our case is the entire content of the configuration file), one has to take the annoying route of downloading the files etc.
2. The configurations are uploaded to a single task and then linked across all tasks to conserve storage space (so the S3 storage point is identical across tasks).
3. Sure, sounds good. I think it's a bit odd to enforce this only for the configurations section though? 🤔 It would be nice to be able to say "this artifact/configuration is needed prior to execution" for each artifact/configuration separately.

  				
Posted 3 years ago by UnevenDolphin73

As the meme goes, well yes but actually no, since the input path is provided via argparse? I'm also not sure how this would help debug from the WebUI: you can't really see the contents of a zipped file, and the configuration tab is too messy for a configuration as nested as ours. It's best suited as an artifact.

EDIT: Or am I missing something? Point being, when the remote execution begins, the entry point tries to run e.g. python train.py --config_file path/to/local/file.yaml which then fails, since that file does not exist.
Even if we override argparse with some arguments from ClearML (I suppose this is the idea behind the autoconnect feature?), our yaml file contains !include instructions that refer to other yaml files using relative paths. On top of that, our config file may or may not refer to additional files (again by relative path).
As of now, we internally analyze the configuration file, use StorageManager to upload everything, check if we're running as a remote execution before doing anything, and if so, download using StorageManager and clean up using boto3.
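
The "analyze the configuration file" step could look roughly like the sketch below, which walks `!include` references recursively and collects every file that would need uploading. It is only an illustration under stated assumptions: the unquoted `!include relative/path.yaml` syntax, the regex, and the helper name `collect_config_files` are hypothetical, and the actual upload (e.g. via StorageManager) is deliberately left out.

```python
import re
import tempfile
from pathlib import Path
from typing import Optional, Set

# Assumed syntax: "!include relative/path.yaml" with an unquoted path.
INCLUDE_PATTERN = re.compile(r"!include\s+(\S+)")


def collect_config_files(entry: Path, seen: Optional[Set[Path]] = None) -> Set[Path]:
    """Return the entry file plus every file reachable through !include."""
    seen = set() if seen is None else seen
    entry = entry.resolve()
    if entry in seen:
        return seen  # already visited, which also guards against include cycles
    seen.add(entry)
    for match in INCLUDE_PATTERN.finditer(entry.read_text()):
        # Relative paths are resolved against the including file's folder.
        collect_config_files(entry.parent / match.group(1), seen)
    return seen


with tempfile.TemporaryDirectory() as tmp:
    root = Path(tmp).resolve()
    (root / "models").mkdir()
    (root / "main.yaml").write_text("model: !include models/net.yaml\n")
    (root / "models" / "net.yaml").write_text("optimizer: !include ../optim.yaml\n")
    (root / "optim.yaml").write_text("lr: 0.001\n")
    found = sorted(p.relative_to(root).as_posix() for p in collect_config_files(root / "main.yaml"))

print(found)
```

Keeping the paths relative to a common root is what would let the same folder layout be recreated on the agent side, so the relative `!include` references keep resolving.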

  				
Posted 3 years ago by UnevenDolphin73

That could work, given that:
1. Could we add a preview section? One reason I don't like using the configuration section is that it makes debugging much much harder.
2. Will the clearml-agent download and unzip the files, placing them into the same local folder as needed for execution?
3. What if we want to include non-configuration objects? (i.e. the model case I listed)

  				
Posted 3 years ago by UnevenDolphin73

Regarding the artifact, yes, that makes sense. I guess this is why there is an "input" type for an artifact; the actual use case was never found (I guess until now?! What's your point there?)
Regarding the configuration:

It's very useful for us to be able to see the contents of the configuration and understand

Wouldn't this do exactly what you are looking for:
```
local_config_file_that_i_can_always_open = task.connect_configuration("important", "/path/to/config/I/only/have/on/my/machine")
with open(local_config_file_that_i_can_always_open, 'rt') as f:
    config_text = f.read()  # do something
```
This means that when running locally:
local_config_file_that_i_can_always_open == "/path/to/config/I/only/have/on/my/machine"
And when running with an agent:
local_config_file_that_i_can_always_open == "/tmp/config/file/from_my_machine.stuff"
wdyt?
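
The local-versus-agent behaviour described above can be mimicked with a toy stand-in that does not use ClearML at all. Everything here is an assumption for illustration: `connect_configuration_stub`, the in-memory `task_store` dict, and the `running_remotely` flag are hypothetical and only imitate the idea that the file content travels with the task while the returned path differs per machine.

```python
import tempfile
from pathlib import Path
from typing import Dict


def connect_configuration_stub(store: Dict[str, str], name: str, local_path: str,
                               running_remotely: bool) -> str:
    """Toy stand-in (not ClearML code) for the behaviour described in the thread."""
    if not running_remotely:
        # Local run: remember the file content and hand back the original path.
        store[name] = Path(local_path).read_text()
        return local_path
    # Remote run: the original path is gone, so materialize the stored content.
    handle = tempfile.NamedTemporaryFile("w", suffix=".yaml", delete=False)
    with handle:
        handle.write(store[name])
    return handle.name


task_store: Dict[str, str] = {}
with tempfile.TemporaryDirectory() as tmp:
    original = Path(tmp) / "file.yaml"
    original.write_text("lr: 0.001\n")
    local_path = connect_configuration_stub(
        task_store, "important", str(original), running_remotely=False)

# The original file no longer exists here, much like on an agent machine.
remote_path = connect_configuration_stub(
    task_store, "important", str(original), running_remotely=True)
remote_text = Path(remote_path).read_text()
Path(remote_path).unlink()  # clean up the temporary copy
```

Note that this only covers a single file; it says nothing about `!include` references to sibling files, which is the gap raised elsewhere in the thread.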

  				
Posted 3 years ago by AgitatedDove14

Most of these are configurations (specific for an execution, but one such configuration defines multiple tasks). Some models might be uploaded if the user does not use our built-in link to ClearML model fetching 😄

  				
Posted 3 years ago by UnevenDolphin73

Yeah that works too. So one can override the queue ID but not the worker 🤔

  				
Posted 3 years ago by UnevenDolphin73

I guess it's mixed. If #340 is resolved, then this initializer task will be a no-op: detach, and init-close new tasks as needed.

  				
Posted 3 years ago by UnevenDolphin73

Since this is a single process, most of these are only needed once when our "initializer" task starts and loads.

  				
Posted 3 years ago by UnevenDolphin73

1.

One reason I don't like using the configuration section is that it makes debugging much much harder.

Debugging? Please explain how it relates to the configuration and its presentation (i.e. preview).
2.
Yes in theory, but in your case it will not change things, unless these "configurations" are copied on any Task (which is just storage, otherwise no real harm).
3.
I was thinking of a "zip" file that the Task creates and uploads, and a new configuration type, say "external/zip", and in the config section have something like:
url:
target: ./
wdyt?

  				
Posted 3 years ago by AgitatedDove14

And yes, our flow would break anyway with the internal references within the yaml file. It would be much simpler if we could specify the additional files.

  				
Posted 3 years ago by UnevenDolphin73

UnevenDolphin73 following the discussion https://clearml.slack.com/archives/CTK20V944/p1643731949324449, I suggest this change in the pseudo code:
```
# task code
from clearml import Task, StorageManager

task = Task.init(...)

if not task.running_locally() and task.is_main_task():
    # pre-init stage
    StorageManager.download_folder(...)  # Prepare local files for execution
else:
    StorageManager.upload_file(...)  # Repeated for many files needed
    task.execute_remotely(...)
```
Now that I look at it, it kind of makes sense to have two callbacks added to execute_remotely (basically covering the exact if statement I have above).
wdyt?
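
As a rough picture of the two-callback idea, here is a standalone sketch that mirrors the if/else above with plain callables. It is not an existing ClearML API: the function `execute_with_stage_callbacks` and its parameters are hypothetical names, and the lambdas merely record which stage ran in place of real download, upload, and enqueue calls.

```python
from typing import Callable, List


def execute_with_stage_callbacks(
    running_remotely: bool,
    on_remote_start: Callable[[], None],
    before_enqueue: Callable[[], None],
    enqueue: Callable[[], None],
) -> str:
    """Run exactly one of the two stages, mirroring the if/else in the pseudo code."""
    if running_remotely:
        # Agent side: prepare local files, then let the main code continue.
        on_remote_start()
        return "remote"
    # Local side: publish the needed files, then hand the task to a queue.
    before_enqueue()
    enqueue()
    return "enqueued"


events: List[str] = []
local_result = execute_with_stage_callbacks(
    False,
    on_remote_start=lambda: events.append("download"),
    before_enqueue=lambda: events.append("upload"),
    enqueue=lambda: events.append("enqueue"),
)
remote_result = execute_with_stage_callbacks(
    True,
    on_remote_start=lambda: events.append("download"),
    before_enqueue=lambda: events.append("upload"),
    enqueue=lambda: events.append("enqueue"),
)
print(events)
```

The appeal of this shape is that the user code no longer needs its own "am I remote?" check; the same script simply supplies both callbacks.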

  				
Posted 3 years ago by AgitatedDove14

It's always the details... Is the new Task running inside a new subprocess?
Basically there is a difference between:
1. a remote task spawning new tasks (as subprocesses, or as jobs on a remote machine), with the remote task still running
2. a remote task being replaced by a spawned task (same process?!)
UnevenDolphin73 am I missing a 3rd option? Which of these is your case?
P.S. I have a suspicion that there might be a misuse of "Task" here?! What are you considering a Task? (From the clearml perspective a Task is not a function; a Task is an entire standalone process, usually not a very short one.)

  				
Posted 3 years ago by AgitatedDove14

I didn't mention code in #340 nor did I mention data here 😄 The idea was to package non git-specific files for remote execution

  				
Posted 3 years ago by UnevenDolphin73

Hmm, so what I'm thinking is "extending" the capabilities of the "configuration" section (as it seems this is the right context): allow uploading a bunch of files (with the same mechanism as artifacts) as zip files, and in the configuration "editable" section store the URL of the zip together with the target folder. wdyt?
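
A minimal sketch of that zip-plus-descriptor idea, using only the standard library, might look like the following. Nothing here exists in ClearML: `pack_folder`, `restore_folder`, and the `url`/`target` descriptor keys are hypothetical, and the "url" is just a local file path standing in for wherever the zip would really be stored.

```python
import tempfile
import zipfile
from pathlib import Path
from typing import Dict


def pack_folder(folder: Path, zip_path: Path, target: str) -> Dict[str, str]:
    """Zip a folder and return the descriptor an editable config entry could hold."""
    with zipfile.ZipFile(zip_path, "w", zipfile.ZIP_DEFLATED) as archive:
        for file in sorted(folder.rglob("*")):
            if file.is_file():
                # Store paths relative to the folder so the layout survives.
                archive.write(file, file.relative_to(folder).as_posix())
    return {"url": str(zip_path), "target": target}


def restore_folder(descriptor: Dict[str, str], workdir: Path) -> Path:
    """Unpack the zip into workdir/target, as an agent-side step might."""
    destination = workdir / descriptor["target"]
    with zipfile.ZipFile(descriptor["url"]) as archive:
        archive.extractall(destination)
    return destination


with tempfile.TemporaryDirectory() as tmp:
    base = Path(tmp)
    configs = base / "configs"
    (configs / "sub").mkdir(parents=True)
    (configs / "main.yaml").write_text("model: !include sub/a.yaml\n")
    (configs / "sub" / "a.yaml").write_text("layers: 3\n")

    descriptor = pack_folder(configs, base / "bundle.zip", target="./")
    workdir = base / "work"
    workdir.mkdir()
    restored_root = restore_folder(descriptor, workdir)
    restored = sorted(
        p.relative_to(restored_root).as_posix()
        for p in restored_root.rglob("*") if p.is_file()
    )

print(restored)
```

Because the descriptor is plain text, it could be edited before a rerun to point at a different zip or a different target folder, which seems to be the point of placing it in the "editable" section.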

  				
Posted 3 years ago by AgitatedDove14

task.upload_artifact(..., is_requirement=True), task.connect_configuration(..., is_requirement=True)
This would just imply that these artifacts/configurations must be downloaded prior to running the code itself; then you also don't have to worry about zipping? 🤔

  				
Posted 3 years ago by UnevenDolphin73
