Is There a Reason Why All clearml.Task Methods Regarding Requirements (e.g. pip Requirements) Are Class Methods? Are Requirements Not Stored in a Task?

Answered

Is there a reason why all clearml.Task methods regarding requirements (e.g. pip requirements) are class methods? Are requirements not stored in a task?

  				
Posted 4 years ago by ReassuredTiger98


Answers 18

Long story short, the Task requirements are collected asynchronously, so if you set them after creating the Task object, it might (at least in theory) be too late.
Does that make sense?

  				
Posted 4 years ago by AgitatedDove14

Too late for what?

  				
Posted 4 years ago by ReassuredTiger98

Maybe related question: Will there be some documentation about clearml internals with the new documentation? ClearML seems to store stuff that's relevant to script execution outside of clearml.Task if I am not mistaken. I would like to learn a little bit about what the code structure / internal mechanism is.

  				
Posted 4 years ago by ReassuredTiger98

> Too late for what?

To update the task.requirements before it actually creates it (the requirements are created in a background thread)
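
A toy model of that ordering constraint, using only the standard library. `ToyTask`, `_overrides`, and `_analyze` are hypothetical names for illustration, not ClearML internals. The sketch assumes the background worker only sees overrides registered before `init()` is called, which makes the outcome deterministic; in a real race a late call might or might not land.

```python
import threading
import time


class ToyTask:
    """Toy stand-in for a task whose requirements are detected in the background."""

    # Class-level overrides: they can be registered before any instance exists.
    _overrides = {}

    def __init__(self):
        self.requirements = None
        self._done = threading.Event()

    @classmethod
    def add_requirement(cls, name, spec=""):
        cls._overrides[name] = spec

    @classmethod
    def init(cls):
        task = cls()
        # Assumption: the worker sees only the overrides registered so far.
        snapshot = dict(cls._overrides)
        threading.Thread(target=task._analyze, args=(snapshot,), daemon=True).start()
        return task  # returns immediately; analysis continues in the background

    def _analyze(self, overrides):
        time.sleep(0.05)  # pretend to scan the repository
        detected = {"numpy": "==1.26.0"}
        detected.update(overrides)
        self.requirements = sorted(f"{k}{v}" for k, v in detected.items())
        self._done.set()

    def wait(self, timeout=5.0):
        """Block until the background analysis has finished."""
        self._done.wait(timeout)
        return self.requirements


ToyTask.add_requirement("torch", ">=1.8.1")  # before init: picked up
task = ToyTask.init()
ToyTask.add_requirement("pandas", ">=2.0")   # after init: too late for this task
print(task.wait())  # prints ['numpy==1.26.0', 'torch>=1.8.1']
```

Because the override lives on the class rather than on an instance, it can be registered before any task object exists, which is one plausible reason for a class-method API.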

  				
Posted 4 years ago by AgitatedDove14

> ClearML seems to store stuff that's relevant to script execution outside of clearml.Task

Outside of the clearml.Task?

  				
Posted 4 years ago by AgitatedDove14

> Outside of the clearml.Task?

Ah, never mind. I was mistaken here.

  				
Posted 4 years ago by ReassuredTiger98

> To update the task.requirements before it actually creates it (the requirements are created in a background thread)

Why can't it be updated after creation?

  				
Posted 4 years ago by ReassuredTiger98

> Why can't it be updated after creation?

You can, but then you have to rerun it. I mean, technically this is obviously solvable, but the idea was to make it simple to use, and since we "assume" that in most cases there is a single Task per execution, it made sense. wdyt?

  				
Posted 4 years ago by AgitatedDove14

Mhhm, then maybe it is not clear 😂 to me how clearml.Task is meant to be used. I thought of it as being a container for all the information regarding a single experiment that is reflected on the server-side and by this in the WebUI. Now I init() a Task and it will show in the WebUI. I thought after initialization I can still update the task to my liking, i.e. it being a documentation of my experiment.

  				
Posted 4 years ago by ReassuredTiger98

I am still not getting why it is a problem to just update the requirements at any time... 😕

  				
Posted 4 years ago by ReassuredTiger98

> Long story short, the Task requirements are async, so if one puts it after creating the object (at least in theory), it might be too late.

AgitatedDove14 Is there no await/synchronize method to wait for task update?

  				
Posted 4 years ago by ReassuredTiger98

> Is there no await/synchronize method to wait for task update?

Yes, but then we would have to relaunch it (not unthinkable). I'm still looking for the immediate value of doing all that work, wdyt?

  				
Posted 4 years ago by AgitatedDove14

I think I still don't get how clearml is supposed to work/be used. Why wouldn't the following work currently?
Example:
task = Task.init(...)
if not running_remotely:
    task_dict = task.export_task()
    requirements = task_dict["script"]["requirements"]["pip"].splitlines()
    requirement_torch = [r for r in requirements if r.startswith("torch==")]
    requirements.remove(requirement_torch[0])
    requirements.append("torch >= 1.8.1")
    task_dict["script"]["requirements"]["pip"] = "\n".join(requirements)
    task.update_task(task_dict)
    task.execute_remotely(...)
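
The string manipulation in that snippet can be isolated into a small helper that does not raise when no pinned `torch` line is present. This is only a sketch on plain strings: `replace_requirement` is a hypothetical helper, not part of clearml, and it assumes one requirement per line in pip format.

```python
import re


def replace_requirement(pip_text, package, new_line):
    """Drop every line that pins `package` and append `new_line` once."""
    # The lookahead rejects longer names, so "torch" does not match "torchvision".
    pattern = re.compile(rf"^{re.escape(package)}(?![A-Za-z0-9._-])", re.IGNORECASE)
    kept = [line for line in pip_text.splitlines() if not pattern.match(line.strip())]
    kept.append(new_line)
    return "\n".join(kept)


original = "numpy==1.26.0\ntorch==1.7.0\ntorchvision==0.8.1"
print(replace_requirement(original, "torch", "torch >= 1.8.1"))
```

Whether the updated text is still honoured once the background analysis has finished is exactly the timing question discussed in this thread, so the helper only covers the text-editing part.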

  				
Posted 4 years ago by ReassuredTiger98

I think doing all that work is not worth it right now. I am just trying to understand why clearml does not seem to be designed something like this:

` task_name = args.task_name

task = Task()
task = task.load_statedict(await Task.load_or_create(task_name))

task.requirements.add(...)
await task.synchronize()

task.execute_remotely(queue_name, exit=True) `

  				
Posted 4 years ago by ReassuredTiger98

If you think the explanation takes too much time, no worries! I do not want to waste your time on my confusion 😄

  				
Posted 4 years ago by ReassuredTiger98

> If you think the explanation takes too much time, no worries! I do not want to waste your time on my confusion

LOL no worries 🙂
Basically, the git and Python analysis can take some time (up to a minute on a large repository).
We also wanted to make sure Task.init returns quickly (it already has to authenticate with the server, which slows it down, plus a few more things).
The easiest way is to have the code analysis run in the background since usually there is no interaction with this process (on the user end).
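
The pattern described here, returning from init immediately while a slow analysis runs in the background, can be sketched with `concurrent.futures`. `slow_analysis` and `quick_init` are hypothetical names, and the 0.2 s sleep stands in for the git and Python scan; none of this is ClearML's actual code.

```python
import time
from concurrent.futures import ThreadPoolExecutor

_executor = ThreadPoolExecutor(max_workers=1)


def slow_analysis():
    """Stand-in for the repository scan; sleeps instead of scanning."""
    time.sleep(0.2)
    return ["numpy==1.26.0"]


def quick_init():
    """Start the analysis in the background and return a handle right away."""
    return _executor.submit(slow_analysis)


start = time.perf_counter()
future = quick_init()
init_seconds = time.perf_counter() - start  # measured before the scan finishes

requirements = future.result()              # blocks until the scan finishes
total_seconds = time.perf_counter() - start
print(f"init returned after {init_seconds:.3f}s, scan done after {total_seconds:.3f}s")
_executor.shutdown()
```

The call to `future.result()` plays the role of an explicit wait: the caller only pays for the scan when it actually needs the result.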

Back to your pseudo-code suggestion: is this how a user would use it, or how you suggest clearml should implement the feature? (I'm genuinely interested; we are always looking for ideas on improving the user interface.)

  				
Posted 4 years ago by AgitatedDove14

Both, actually. So what I personally would find intuitive is something like this:
` class Task:
    def load_statedict(self, state_dict):
        pass

    async def synchronize(self):
        ...

    async def task_execute_remotely(self):
        await self.synchronize()
        ...

    def add_requirement(self, requirement):
        ...

    @classmethod
    async def init(cls, task_name):
        task = cls()
        task.load_statedict(await cls.load_or_create(task_name))
        await task.synchronize()
        # Pseudocode: start the analysis in the background, then synchronize again
        asyncio.create_task(run_code_analysis().then_synchronize())
        return task `

I can either use the existing "easy" way or can build my init(). This moves customization from a lot of arguments for Task.init() to something like a custom "builder"-method. I.e. I can use clearml either as a library and build my own workflow or just use the existing predefined workflow.

  				
Posted 4 years ago by ReassuredTiger98

Then I could also do this:
# My custom very special use case
task = Task()
task = task.load_statedict(await Task.load_or_create(task_name))
await task.synchronize()
await run_code_analysis()
task.add_requirement("myreq")
await task.synchronize()
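
For comparison, a runnable toy version of that workflow with `asyncio`. Everything here (`ToyTask`, `FAKE_SERVER`, `run_code_analysis`) is hypothetical and in-memory; it only illustrates the proposed call order and is not an existing clearml API.

```python
import asyncio

FAKE_SERVER = {}  # in-memory stand-in for the backend


async def run_code_analysis():
    await asyncio.sleep(0.01)  # pretend to scan the repository
    return ["numpy==1.26.0"]


class ToyTask:
    def __init__(self, name):
        self.name = name
        self.requirements = []

    def add_requirement(self, requirement):
        self.requirements.append(requirement)

    async def synchronize(self):
        # Push a copy of the local state to the fake server.
        await asyncio.sleep(0)
        FAKE_SERVER[self.name] = list(self.requirements)


async def main():
    task = ToyTask("my-experiment")
    await task.synchronize()                       # task exists, no requirements yet
    task.requirements = await run_code_analysis()  # analysis is awaited explicitly
    task.add_requirement("myreq")                  # nothing is running in the background
    await task.synchronize()
    return FAKE_SERVER[task.name]


result = asyncio.run(main())
print(result)  # prints ['numpy==1.26.0', 'myreq']
```

In this model the caller decides when the analysis runs and when state is pushed, at the cost of more boilerplate than a single init call.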

  				
Posted 4 years ago by ReassuredTiger98

