Hello All, I Want To Clarify Something. In The

Answered

Hello all,

I want to clarify something. In the ClearML Task Scheduler's .add_task() method there's a schedule_function parameter. Assumptions carried over from other task schedulers biased how I thought this one worked: I assumed schedule_function was there to return a task_id at runtime that the task scheduler would then use. Is that not what it's for? Is it just a function that's run instead of a task?

  				
Posted one year ago by EnthusiasticCow4

Answers 12

I think we should just have a new parameter

  				
Posted one year ago by SmugDolphin23

None

  				
Posted one year ago by EnthusiasticCow4

Hi EnthusiasticCow4! That's correct. The job function will run in a separate thread on the machine you are running the scheduler from. That's it. You can create tasks from functions, though, using backend_interface.task.populate.CreateFromFunction.create_task_from_function
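
To make the behaviour concrete, here is a minimal sketch, assuming the TaskScheduler class from clearml.automation and an add_task() call that accepts schedule_function together with the minute/hour/day timing arguments. The function nightly_cleanup, the job name and the schedule values are made up for illustration, so check the ClearML reference for the exact parameters in your version.

```python
import threading

calls = []  # records which thread ran the job, for illustration only


def nightly_cleanup():
    """Plain Python callable. Per the answer above, the scheduler just runs it
    on the scheduler machine; no task is picked from its return value."""
    calls.append(threading.current_thread().name)


def build_scheduler():
    # Imported lazily so nightly_cleanup can be tried without a ClearML setup.
    from clearml.automation import TaskScheduler

    scheduler = TaskScheduler()
    # Assumed timing values; the add_task docstring examples suggest this
    # combination means "every day at 22:30".
    scheduler.add_task(
        name="nightly cleanup",
        schedule_function=nightly_cleanup,
        minute=30,
        hour=22,
        day=1,
    )
    return scheduler


# To actually run it (blocks, and needs a configured ClearML server):
#     build_scheduler().start()
```

Calling nightly_cleanup() directly, or from a threading.Thread, is a quick way to check the function before handing it to the scheduler.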

  				
Posted one year ago by SmugDolphin23

Thanks Eugen for the quick reply. If I can add a suggestion/comment from my perspective: why is schedule_function included in the .add_task() method? As far as I can tell, using schedule_function changes the very nature of the method: it's no longer adding a task but adding a function. It seems like it would make more sense if this was broken out into something like an .add_function() method. Also, if you use schedule_function, many of the other parameters in .add_task() don't make sense. What is task_overrides overriding if you use schedule_function? A separate method would also fit with the other places in ClearML where a distinction is made between running a task and running a function.

Ok, grandpa's rant is over. 👴

With that said, can I run another thing by you related to this? What do you think about a PR that adds the functionality I originally assumed schedule_function was for? By this I mean: adding a new parameter (this wouldn't change anything about schedule_function or how .add_task() currently behaves) that also takes a function, but the function is expected to return a task_id when called. The function is run at runtime (when the task scheduler would normally execute the scheduled task), and the scheduler uses the task_id it returns, plus the other parameters from .add_task(), as the scheduled task.

Why is this useful? There's a host of reasons, but the biggest one is that it gives users much more control over the tasks that are run by the task scheduler. Currently, as far as I can tell, if I wanted to run the most recent task (at runtime) from a given project with a specific tag, that's not possible with the task scheduler. I can use the schedule_function parameter and create a function that finds and runs the task, but then I lose one of the core advantages of .add_task(): there is no way to specify the queue, task_parameters and task_overrides. Naturally, I could wrap all of that into the function passed as schedule_function, but then I'm basically just writing my own scheduler at that point.
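
The proposal can be sketched in plain Python, without ClearML, to show the intended semantics. Everything here is hypothetical: ScheduledJob is a toy stand-in for a scheduler entry, task_id_function is an invented name for the proposed parameter, and enqueue is an injected callable rather than a real ClearML call.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, Optional


@dataclass
class ScheduledJob:
    """Toy stand-in for one scheduler entry; not ClearML's real job class."""
    queue: str
    task_id: Optional[str] = None
    # Hypothetical parameter from the proposal: called at trigger time,
    # expected to return the id of the task to launch.
    task_id_function: Optional[Callable[[], str]] = None
    task_parameters: Dict[str, object] = field(default_factory=dict)

    def resolve_task_id(self) -> str:
        """Pick the task id at trigger time, preferring the callable."""
        if self.task_id_function is not None:
            resolved = self.task_id_function()
            if not isinstance(resolved, str) or not resolved:
                raise ValueError("task_id_function must return a non-empty task id")
            return resolved
        if self.task_id is None:
            raise ValueError("either task_id or task_id_function is required")
        return self.task_id


def trigger(job, enqueue):
    """Resolve the id now, then hand it to enqueue with the usual settings."""
    task_id = job.resolve_task_id()
    enqueue(task_id, job.queue, job.task_parameters)
    return task_id


# Example: the "most recent task" changes between two triggers.
latest = {"id": "task-001"}
launched = []
job = ScheduledJob(
    queue="default",
    task_id_function=lambda: latest["id"],
    task_parameters={"Args/epochs": 5},
)
trigger(job, lambda *args: launched.append(args))
latest["id"] = "task-002"
trigger(job, lambda *args: launched.append(args))
```

The point of the design is that the queue and task_parameters stay on the scheduler entry, so only the choice of task is deferred to runtime.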

  				
Posted one year ago by EnthusiasticCow4

I figured you'd say that so I went ahead with that PR. I got it working but I'm going to test it a bit further.

  				
Posted one year ago by EnthusiasticCow4

With that said, can I run another thing by you related to this? What do you think about a PR that adds the functionality I originally assumed schedule_function was for? By this I mean: adding a new parameter (this wouldn't change anything about schedule_function or how .add_task() currently behaves) that also takes a function, but the function is expected to return a task_id when called. The function is run at runtime (when the task scheduler would normally execute the scheduled task), and the scheduler uses the task_id it returns, plus the other parameters from .add_task(), as the scheduled task.

That is a great idea actually. Are you going to write a PR for this?

  				
Posted one year ago by SmugDolphin23

Sure, thank you!

  				
Posted one year ago by SmugDolphin23

No need, I think I will review it on Monday

  				
Posted one year ago by SmugDolphin23

Sounds good. Let me know if any changes are required.

  				
Posted one year ago by EnthusiasticCow4

Should I post this in dev?

  				
Posted one year ago by EnthusiasticCow4

Great, thank you!

  				
Posted one year ago by SmugDolphin23

SmugDolphin23 Yeah, I just wanted to validate it was worth spending the time. Since there is already a parameter that takes a callable (i.e. schedule_function), it might make sense to reuse that parameter. If the callable returns a str, we check whether it is the id of a task, and if it is, we run that task as if it had originally been passed as the task_id in .add_task(). This would only be a breaking change if the callable that was passed happened to return a task_id. Or do you think it would be better just to add a new parameter?
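
The reuse option could look roughly like the sketch below. It is plain Python under stated assumptions: run_scheduled_function and task_exists are hypothetical names, and task_exists stands in for whatever lookup would confirm that a string is the id of a real task.

```python
def run_scheduled_function(func, task_exists):
    """Call func and decide what the scheduler should do next.

    Returns a task id to launch when func returned a string that
    task_exists (a hypothetical, injected check) recognises as a task.
    Returns None when func was just a plain function.
    """
    result = func()
    if isinstance(result, str) and task_exists(result):
        return result
    return None


# Illustration with a fake set of known task ids.
known_tasks = {"abc123"}
is_task = known_tasks.__contains__

launch_a = run_scheduled_function(lambda: "abc123", is_task)  # valid id
launch_b = run_scheduled_function(lambda: "done", is_task)    # unrelated string
launch_c = run_scheduled_function(lambda: None, is_task)      # no return value
```

The first case is also where the breaking change would show up: an existing callable that already happens to return a valid task id would start launching that task.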

  				
Posted one year ago by EnthusiasticCow4
