
Hello all,

I want to clarify something. In the ClearML Task Scheduler .add_task() method there's a parameter for schedule_function. I think I had some assumptions based on other task schedulers that biased how I thought this scheduler worked. I assumed schedule_function was there to return a task_id at runtime that the task scheduler would then use. But is that not what it's for? Is this just some function that's run instead of running a task?

  
  
Posted 8 months ago

Answers 12


Great, thank you!

  
  
Posted 8 months ago

Sounds good. Let me know if any changes are required.

  
  
Posted 8 months ago

No need, I think I will review it on Monday

  
  
Posted 8 months ago

Should I post this in dev?

  
  
Posted 8 months ago

With that said, can I run another thing by you related to this? What do you think about a PR that adds the functionality I originally assumed schedule_function was for? By this I mean: adding a new parameter (this wouldn't change anything about schedule_function or how .add_task() currently behaves) that also takes a function, but the function is expected to return a task_id when called. This function is run at runtime (when the task scheduler would normally execute the scheduled task), and the task_id it returns, plus the other parameters from .add_task(), is used as the scheduled task.

That is a great idea actually. Are you going to write a PR for this?

  
  
Posted 8 months ago

Hi @<1545216070686609408:profile|EnthusiasticCow4> ! That's correct. The job function will run in a separate thread on the machine you are running the scheduler from. That's it. You can create tasks from functions though, using backend_interface.task.populate.CreateFromFunction.create_task_from_function.
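For illustration, a minimal sketch of what that looks like in practice. The function name nightly_cleanup and the helper build_scheduler are made up for this example, and the "every 15 minutes" reading of minute=15 follows the add_task docstring as I understand it; treat it as a sketch, not a reference.

```python
import threading


def nightly_cleanup():
    # Hypothetical job body. Per the answer above, the scheduler runs this
    # in a separate thread on the machine hosting the scheduler; it is a
    # plain function call, not a ClearML task sent to an agent.
    print("running in thread:", threading.current_thread().name)
    return "done"


def build_scheduler():
    # Imported lazily so nightly_cleanup can be tried without clearml installed.
    from clearml.automation import TaskScheduler

    scheduler = TaskScheduler()
    scheduler.add_task(
        name="nightly cleanup",             # display name for this schedule entry
        schedule_function=nightly_cleanup,  # a function instead of a task ID
        minute=15,                          # assumed: every 15 minutes
    )
    return scheduler
```

Calling build_scheduler().start() would then run the schedule on the local machine. Since no task ID is involved here, parameters such as task_overrides have nothing to act on.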

  
  
Posted 8 months ago

I think we should just have a new parameter
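To make the separate-parameter option concrete, here is a toy model of it in plain Python. Nothing in it is ClearML code: ScheduledEntry, task_id_function and trigger are hypothetical names chosen for the sketch, and launch stands in for whatever the scheduler does to clone and enqueue a task.

```python
from dataclasses import dataclass, field
from typing import Callable, Optional


@dataclass
class ScheduledEntry:
    # Toy stand-in for one schedule entry; not ClearML's internal job class.
    queue: str
    task_id: Optional[str] = None
    task_id_function: Optional[Callable[[], str]] = None  # hypothetical new parameter
    task_parameters: dict = field(default_factory=dict)

    def resolve_task_id(self) -> str:
        # Called at trigger time, so the function sees the project as it is then.
        if self.task_id_function is not None:
            return self.task_id_function()
        if self.task_id is None:
            raise ValueError("either task_id or task_id_function is required")
        return self.task_id


def trigger(entry: ScheduledEntry, launch: Callable[[str, str, dict], None]) -> str:
    # Resolve the task ID, then launch with the entry's other settings unchanged.
    task_id = entry.resolve_task_id()
    launch(task_id, entry.queue, entry.task_parameters)
    return task_id
```

The point of keeping it a separate field is that schedule_function keeps its current meaning, while queue and task_parameters still apply to whatever task the new function returns.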

  
  
Posted 8 months ago

None

  
  
Posted 8 months ago

Sure, thank you!

  
  
Posted 8 months ago

@<1523701435869433856:profile|SmugDolphin23> Yeah, I just wanted to validate it was worth spending the time. Since there is already a parameter that takes a callable (i.e. schedule_function), it might make sense to reuse that parameter. If it returns a str, we validate that it's a task, and if it is, we run the task as if it had originally been passed as the task_id in .add_task(). This would only be a breaking change if the callable that was passed happened to return a task_id. Or do you think it would be better just to add a new parameter?
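A rough sketch of the reuse-the-parameter option, written in plain Python so it runs without ClearML. The names run_scheduled_callable, is_existing_task and launch_task are hypothetical; the last two stand in for the task lookup and the normal launch path.

```python
from typing import Callable, Optional


def run_scheduled_callable(
    schedule_function: Callable[[], object],
    is_existing_task: Callable[[str], bool],
    launch_task: Callable[[str], None],
) -> Optional[str]:
    """Call the function; if it returns the ID of an existing task, launch it."""
    result = schedule_function()
    # Only a str that validates as a task changes behavior. Any other return
    # value is ignored, which is how existing callables would keep working.
    if isinstance(result, str) and is_existing_task(result):
        launch_task(result)
        return result
    return None
```

The breaking case mentioned above is visible here: an existing callable that happens to return a valid task ID would start launching that task.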

  
  
Posted 8 months ago

Thanks Eugen for the quick reply. If I can add a suggestion/comment from my perspective: why is schedule_function included in the .add_task() method? As far as I can tell, if you use schedule_function it changes the very nature of the method: it's no longer adding a task but adding a function. It seems like it would make more sense if this was broken out into something like an .add_function() method. Also, if you pass schedule_function, many of the other parameters in .add_task() don't make sense. What is task_overrides overriding if you use schedule_function? I think this would also make sense given the other places in ClearML where distinctions are made between running a task and running a function.

Ok, grandpa's rant is over. 👴

With that said, can I run another thing by you related to this? What do you think about a PR that adds the functionality I originally assumed schedule_function was for? By this I mean: adding a new parameter (this wouldn't change anything about schedule_function or how .add_task() currently behaves) that also takes a function, but the function is expected to return a task_id when called. This function is run at runtime (when the task scheduler would normally execute the scheduled task), and the task_id it returns, plus the other parameters from .add_task(), is used as the scheduled task.

Why is this useful? There's a host of reasons, but the biggest one is that it gives users much more control over the tasks that are run by the task scheduler. Currently, as far as I can tell, if I wanted to run the most recent task (at runtime) from a given project with a specific tag, that's not possible with the task scheduler. I can use the schedule_function parameter and create a function that finds and runs the task, but then I lose one of the core advantages of .add_task(): there's no way to specify queues, task_parameters and task_overrides. Naturally, I could wrap all of that into the function passed as schedule_function, but then I'm basically just writing my own scheduler at that point.
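For reference, the schedule_function workaround could look roughly like this. The function name run_latest_tagged_task and the task_cls argument are made up for the sketch (task_cls only exists so the logic can be exercised without a server), and the order_by filter assumes last_update is the right field for "most recent".

```python
def run_latest_tagged_task(project_name, tag, queue_name, task_cls=None):
    """Find the most recently updated task with `tag`, clone it and enqueue the clone."""
    if task_cls is None:
        # Lazy import so the function can be tested with a stand-in class.
        from clearml import Task as task_cls

    candidates = task_cls.get_tasks(
        project_name=project_name,
        tags=[tag],
        task_filter={"order_by": ["-last_update"]},  # assumed: newest first
    )
    if not candidates:
        return None

    # Everything .add_task() would normally handle has to be redone by hand here:
    # cloning, queue selection, and (not shown) any parameter overrides.
    cloned = task_cls.clone(source_task=candidates[0])
    task_cls.enqueue(cloned, queue_name=queue_name)
    return cloned.id
```

Since schedule_function is called without arguments, this would be passed as something like functools.partial(run_latest_tagged_task, "my project", "nightly", "default"). The queue ends up hard-coded inside the callable rather than in .add_task(), which is the loss of control described above.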

  
  
Posted 8 months ago

I figured you'd say that so I went ahead with that PR. I got it working but I'm going to test it a bit further.

  
  
Posted 8 months ago