Hi! I've Been Trying Out The

Answered

Hi! I've been trying out the TaskScheduler functionality. This is a great feature! However, I ran into a couple of problems and wanted to clarify some things 🙂
1. I created a scheduler and tried to start it remotely, but I got the error "Failed deserializing configuration". Here is the code I used:

    scheduler = TaskScheduler(force_create_task_project="some_project")
    scheduler.add_task(schedule_task_id=some_task_id, queue="some_queue", name="scheduler_test", target_project="some_project", minute=1)
    scheduler.start_remotely(queue="some_queue")

It seems that when running start_remotely, it tries to deserialize the tasks (at https://github.com/allegroai/clearml/blob/master/clearml/automation/scheduler.py#L287 ), even though it has never serialized them before. I managed to get my code to run, but only after I made the following change:

    if Task.running_locally():
        self._serialize_state()
        self._serialize()
    else:
        self._serialize()
        self._deserialize()

I am not sure if I am doing something wrong.
2. When adding a task with scheduler.add_task(), there is no option to specify the start_time of the schedule execution; it is set to datetime.now() by default. Do you plan to add this in the future, or is there a way to specify start_time somewhere else?
3. When starting the scheduler for the first time, the last execution time is assumed to be start_time , next_run time is calculated based on that, and the scheduler does not launch the task until the first scheduling interval is finished (this is what I observed when setting the interval to 5 min). To me it seems more intuitive if next_run time were set to start_time , when we launch the scheduler; and then update it according to the scheduler interval parameters. Otherwise I can see a situation where we have a large interval (e.g. 7 days), and we have to wait for a week till we have the first task run. Do I understand the code correctly?
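To make point 3 concrete, here is a minimal sketch in plain Python comparing the two policies. It is not ClearML's implementation; the function name first_run and the run_immediately flag are hypothetical and exist only for this illustration.

```python
from datetime import datetime, timedelta


def first_run(start_time: datetime, interval: timedelta, run_immediately: bool) -> datetime:
    """Return when the first execution would happen under each policy.

    Hypothetical helper for illustration only; not part of the clearml API.
    """
    if run_immediately:
        # Proposed behaviour: the first run happens at start_time itself.
        return start_time
    # Observed behaviour: start_time is treated as the last execution,
    # so the first run waits for one full interval.
    return start_time + interval


start = datetime(2021, 6, 1, 12, 0)
week = timedelta(days=7)
print(first_run(start, week, run_immediately=False))  # 2021-06-08 12:00:00
print(first_run(start, week, run_immediately=True))   # 2021-06-01 12:00:00
```

With a 7-day interval, the first policy delays the first run by a full week, which is the situation described in point 3.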

  				
Posted 4 years ago by PanickyFish98


Answers 10

Thank you for your reply, AgitatedDove14!
For point 1, here is what is printed when the task starts execution for the first time:

    Syncing scheduler
    Failed deserializing configuration: the JSON object must be str, bytes or bytearray, not NoneType

However, I now see that this doesn't actually break the code; the exception is ignored. The task is not started in the first run, but I assume this is for the reasons discussed in point 3. Sorry for the false alarm here.
For point 2, I was thinking of a use case where we have a heavy job to run, which we would like to schedule for a Saturday night, for example, when the server load is low. Or can this be achieved by, e.g., scheduling a job for weekday=saturday, hour=3?
For point 3, yes, I think we would expect the execution to happen immediately. For example, if we want a job to run every 15 minutes, we don't want to wait 15 minutes for the first execution; we want to start straight away and then repeat every 15 minutes 🙂
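As a side note on point 1, the text after "Failed deserializing configuration:" is the message Python's json.loads raises when it is handed None instead of a string. The sketch below reproduces it in isolation and shows one way to treat "nothing stored yet" as an empty state. It assumes the stored configuration was simply None on the first run; load_state is a hypothetical helper, not a ClearML function.

```python
import json
from typing import Optional


def load_state(raw: Optional[str]) -> dict:
    """Hypothetical helper: parse a stored JSON state, treating None as 'nothing saved yet'."""
    if raw is None:
        return {}
    return json.loads(raw)


try:
    json.loads(None)  # simulates reading a configuration that was never stored
except TypeError as exc:
    message = str(exc)
    print(message)  # the JSON object must be str, bytes or bytearray, not NoneType

print(load_state(None))            # {}
print(load_state('{"jobs": []}'))  # {'jobs': []}
```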

  				
Posted 4 years ago by PanickyFish98

(2) Yes, weekdays with a specific hour should do exactly that :)
(3) Yes, I see your point. Maybe we should add a boolean allowing you to run immediately?
Back to (1), let me see if I can reproduce it. Is there anything specific I need to add to the schedule call?
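For (2), a weekday-plus-hour schedule boils down to "the next occurrence of that weekday at that hour". A rough sketch of that calculation in plain Python follows; next_weekly_run is a hypothetical helper written for illustration and is not how the scheduler itself is implemented.

```python
from datetime import datetime, timedelta

WEEKDAYS = ["monday", "tuesday", "wednesday", "thursday", "friday", "saturday", "sunday"]


def next_weekly_run(now: datetime, weekday: str, hour: int, minute: int = 0) -> datetime:
    """Hypothetical helper: next occurrence of `weekday` at hour:minute strictly after `now`."""
    target = WEEKDAYS.index(weekday)  # same numbering as datetime.weekday(): Monday == 0
    days_ahead = (target - now.weekday()) % 7
    candidate = (now + timedelta(days=days_ahead)).replace(
        hour=hour, minute=minute, second=0, microsecond=0
    )
    if candidate <= now:
        # The slot for this week has already passed; move to next week.
        candidate += timedelta(days=7)
    return candidate


now = datetime(2021, 6, 1, 12, 0)  # a Tuesday
print(next_weekly_run(now, "saturday", hour=3))  # 2021-06-05 03:00:00
```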

  				
Posted 4 years ago by AgitatedDove14

You're getting

    Syncing scheduler
    Failed deserializing configuration: the JSON object must be str, bytes or bytearray, not NoneType

like before? Are all the symptoms the same as above?

  				
Posted 3 years ago by CostlyOstrich36

AgitatedDove14 - any doc yet for scheduler? Is it essentially for just time based scheduling?

  				
Posted 4 years ago by TrickySheep9

I'm getting the same error when using TaskScheduler. Are there any updates on this?

  				
Posted 3 years ago by PanickyAnt52

First, that is awesome to hear, PanickyFish98!
1. Can you send the full exception? You might be on to something...
2. Actually, we thought of it but could not find a use case. Can you expand?
3. I'm not sure I follow. Do you mean you expect the first execution to happen immediately?

  				
Posted 4 years ago by AgitatedDove14

https://github.com/allegroai/clearml/blob/master/clearml/automation/trigger.py
Example coming soon, with docs :)

  				
Posted 4 years ago by AgitatedDove14

🙏

  				
Posted 4 years ago by TrickySheep9

Yes, a boolean flag for (3) would be a good option!
In (1), the code snippet from my original question is the code that I execute on my local machine. It is submitted to a queue served by an agent that runs in Docker mode. Please let me know if this is enough to reproduce it.

  				
Posted 4 years ago by PanickyFish98

OK, the code suggests so. I'm looking for more powerful pipeline scheduling, such as triggers on dataset publish, actions on model publish, etc.

  				
Posted 4 years ago by TrickySheep9
