Answered

Hi! I've been trying out the TaskScheduler functionality. This is a great feature! However I ran into a couple of problems, and wanted to clarify some things 🙂
1. I created a scheduler and tried to start it remotely, but I got the error "Failed deserializing configuration". Here is the code I used:

    scheduler = TaskScheduler(force_create_task_project="some_project")
    scheduler.add_task(schedule_task_id=some_task_id, queue="some_queue", name="scheduler_test", target_project="some_project", minute=1)
    scheduler.start_remotely(queue="some_queue")

It seems that when running start_remotely, it tries to deserialize the tasks even though it has never serialized them before, in https://github.com/allegroai/clearml/blob/master/clearml/automation/scheduler.py#L287 . I managed to get my code to run, but only after I made the following change:

    if Task.running_locally():
        self._serialize_state()
        self._serialize()
    else:
        self._serialize()
        self._deserialize()

I am not sure if I am doing something wrong.
2. When adding a task with scheduler.add_task(), there is no option to specify the start_time of the schedule execution; it is set to datetime.now() by default. Do you plan to add this in the future, or is there a way to specify start_time somewhere else?
3. When starting the scheduler for the first time, the last execution time is assumed to be start_time, the next_run time is calculated from that, and the scheduler does not launch the task until the first scheduling interval has elapsed (this is what I observed when setting the interval to 5 min). To me it seems more intuitive to set next_run to start_time when we launch the scheduler, and then update it according to the scheduler interval parameters. Otherwise I can see a situation where we have a large interval (e.g. 7 days) and have to wait a week for the first task run. Do I understand the code correctly?
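
To make the timing question in point 3 concrete, here is a minimal standard-library sketch of the two behaviours. The function next_run and its run_immediately flag are hypothetical names chosen for illustration; they are not part of the ClearML API, the dates are arbitrary, and this is not how TaskScheduler is implemented internally.

```python
from datetime import datetime, timedelta


def next_run(start_time, interval, last_run=None, run_immediately=False):
    """Return when a recurring job should fire next (illustrative only)."""
    if last_run is not None:
        # After the first execution, always step forward by one interval.
        return last_run + interval
    if run_immediately:
        # Behaviour proposed in point 3: the first run happens at start_time.
        return start_time
    # Behaviour observed in point 3: start_time is treated as the last execution.
    return start_time + interval


start = datetime(2021, 6, 1, 12, 0)  # arbitrary example date
week = timedelta(days=7)

observed = next_run(start, week)
proposed = next_run(start, week, run_immediately=True)
print("observed first run:", observed)
print("proposed first run:", proposed)
```

With a 7-day interval, the first variant waits a full week before the first run, while the second fires at start_time and only then falls back to the regular interval.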

  
  
Posted 2 years ago

Answers 10


Yes, a boolean flag for (3) would be a good option!
For (1), the code snippet from my original question is the code that I execute on my local machine. It is submitted to a queue with an agent that runs with docker. Please let me know if this is enough to reproduce it.

  
  
Posted 2 years ago

Thank you for your reply, AgitatedDove14!
For point 1, here is what is printed when the task starts execution for the first time:

    Syncing scheduler
    Failed deserializing configuration: the JSON object must be str, bytes or bytearray, not NoneType

However, I now see that this doesn't actually break the code; the exception is ignored. The task is not started in the first run, but I assume this is for the reasons discussed in point 3. Sorry for the false alarm here.
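
For reference, the wording of that message matches what Python's json.loads raises when it is handed None instead of a string. The sketch below reproduces it with the standard library only and shows one way a loader could tolerate a missing state. The helper load_state is a hypothetical name for illustration and is not taken from the ClearML code base.

```python
import json


def load_state(raw):
    """Parse a serialized state string; treat a missing state as empty."""
    if raw is None:
        # Nothing has been serialized yet (e.g. on the very first run).
        return {}
    return json.loads(raw)


try:
    json.loads(None)
except TypeError as err:
    message = str(err)
    print(message)

print(load_state(None))
print(load_state('{"jobs": []}'))
```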
For point 2, I was thinking of a use case where we have a heavy job to run, which we would like to schedule for night time on a Saturday, for example, when the server load is low. Or can this be achieved by e.g. scheduling a job for weekday=saturday, hour=3 ?
For point 3, yes, I think we would expect the execution to happen immediately. For example, if we want a job to run every 15 min, we don't want to wait 15 minutes for the first execution; we want to start straight away, and then repeat every 15 min 🙂

  
  
Posted 2 years ago

(2) Yes, weekdays with a specific hour should do exactly that :)
(3) Yes, I see your point. Maybe we should add a boolean allowing you to run immediately?
Back to (1), let me see if I can reproduce it. Is there anything specific I need to add to the schedule call?
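
As a rough illustration of what "a weekday with a specific hour" means for (2), the sketch below computes the next matching time with the standard library only. The names next_weekly_run and WEEKDAYS are hypothetical and do not come from the ClearML scheduler; the example date is arbitrary.

```python
from datetime import datetime, timedelta

WEEKDAYS = ["monday", "tuesday", "wednesday", "thursday",
            "friday", "saturday", "sunday"]


def next_weekly_run(now, weekday, hour, minute=0):
    """Return the first time strictly after `now` that falls on `weekday` at hour:minute."""
    target = WEEKDAYS.index(weekday.lower())
    # Days until the target weekday (0 if it is today).
    days_ahead = (target - now.weekday()) % 7
    candidate = (now + timedelta(days=days_ahead)).replace(
        hour=hour, minute=minute, second=0, microsecond=0
    )
    if candidate <= now:
        # Today's slot has already passed, so move to next week.
        candidate += timedelta(days=7)
    return candidate


now = datetime(2021, 6, 2, 10, 0)  # arbitrary example: a Wednesday morning
run_at = next_weekly_run(now, "saturday", 3)
print(run_at)
```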

  
  
Posted 2 years ago

I'm getting the same error when using TaskScheduler. Are there any updates on this?

  
  
Posted 2 years ago

First, that is awesome to hear, PanickyFish98!
1. Can you send the full exception? You might be on to something...
2. Actually, we thought of it but could not find a use case. Can you expand?
3. I'm not sure I follow. Do you mean you expect the first execution to happen immediately?

  
  
Posted 2 years ago

AgitatedDove14 - any doc yet for scheduler? Is it essentially for just time based scheduling?

  
  
Posted 2 years ago

OK, the code suggests so. I'm looking for more powerful pipeline scheduling, e.g. triggering on dataset publish, actions on model publish, etc.

  
  
Posted 2 years ago

🙏

  
  
Posted 2 years ago

You're getting

    Syncing scheduler
    Failed deserializing configuration: the JSON object must be str, bytes or bytearray, not NoneType

like before? Are all the symptoms the same as above?

  
  
Posted 2 years ago