I will actually write here what I found. trigger_on_tags and trigger_required_tags are effectively the same and get concatenated with OR. You need to make sure you put "__$all" before the tags if AND is the behavior you want.
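For example (just a sketch of the tag-list syntax; the tag values are the ones from the config further down):
# OR semantics (the default): any single one of these tags is enough to match.
or_tags = ["dm=true", "in=true", "processed=false"]

# AND semantics: prepend the "__$all" marker so the dataset must carry all of them.
and_tags = ["__$all", "dm=true", "in=true", "processed=false"]

# e.g. pass trigger_required_tags=and_tags to add_dataset_trigger()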
There is, in my opinion, a bug in the deserialization process: the triggers get de-duplicated by trigger name, yet when using trigger_project dozens of triggers are created with the same name (one per dataset in the project). This leads to random behavior depending on which project id survived the deserialization process.
The way I solved it is by subclassing TriggerScheduler and overriding add_dataset_trigger to create a unique name for the created triggers when using trigger_project (I use a combination of name and the project_id).
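A minimal sketch of that workaround (the class name and the name/project_id suffix format here are illustrative, not my exact code, and Task.get_project_id() is just one way to resolve the project name, so check it exists in your clearml version):
from clearml import Task
from clearml.automation import TriggerScheduler


class UniqueNameTriggerScheduler(TriggerScheduler):
    """Make trigger names unique per project so deserialization does not de-dup them."""

    def add_dataset_trigger(self, *args, **kwargs):
        trigger_project = kwargs.get("trigger_project")
        if trigger_project:
            # Resolve the project name to its id and append it to the names,
            # so every trigger created for this project keeps a unique name.
            project_id = Task.get_project_id(project_name=trigger_project) or trigger_project
            if kwargs.get("name"):
                kwargs["name"] = f"{kwargs['name']} - {project_id}"
            if kwargs.get("trigger_name"):
                kwargs["trigger_name"] = f"{kwargs['trigger_name']} - {project_id}"
        return super().add_dataset_trigger(*args, **kwargs)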
			
Answered
Hi Guys,
I have a new question related to TriggerScheduler. I am seeing very erratic behavior with dataset triggers. I have a cron scheduler that, after a file gets dropped on S3, creates a dataset in a project with some tags, in particular "processed=false".
I have a TriggerScheduler with an add_dataset_trigger that triggers a task id. The cron scheduler works great, but the TriggerScheduler fires maybe 10% of the time.
Here is my config:
from clearml.automation import TriggerScheduler

trigger = TriggerScheduler(
    pooling_frequency_minutes=0.1, sync_frequency_minutes=0.1
)
for client in batch_transfer_params:
    trigger.add_dataset_trigger(
        name=f"batch processing - {client['client_name']}",
        # schedule_function=trigger_dataset_func,
        schedule_task_id=schedule_task_id,
        schedule_queue="high-mem",
        trigger_project=client["client_name"],
        trigger_name="batch_processing - incoming",
        target_project=client["client_name"],
        task_parameters={
            "client_name": client["client_name"],
            "dataset_id": "${dataset.id}",
        },
        trigger_required_tags=["dm=true", "in=true", "processed=false"],
        single_instance=True,
    )
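(After the loop the scheduler is started; something like the following, assuming the standard start()/start_remotely() calls:)
# Run the polling loop locally (blocks), or push it to an agent queue instead:
trigger.start()
# trigger.start_remotely(queue="services")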
I have tried so many scenarios that at this point I don't know what to think anymore; I cannot get it to work reliably.
- sometimes removing the processed=false tag and putting it back on will trigger, but sometimes it won't
- I checked the triggers after adding them via trigger.get_triggers() and it looks fine: it creates one trigger per subfolder.
Any pointers are appreciated.
Thanks a lot for all the work.