I will write up what I found here. trigger_on_tags and trigger_required_tags are actually treated the same and are concatenated with OR. If you want every tag to be required, you need to make sure you put "__$all" before the tags.
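To make the OR vs. "__$all" difference concrete, here is a toy model of the matching as I understand it. This is not ClearML's code; the function tags_match and the sample tags are made up for illustration, and the only thing taken from above is the "__$all" marker.

```python
from typing import Iterable, List

ALL_MARKER = "__$all"  # marker mentioned in the answer above


def tags_match(filter_tags: List[str], dataset_tags: Iterable[str]) -> bool:
    """Toy model of the behavior described above, not ClearML's implementation."""
    present = set(dataset_tags)
    if filter_tags and filter_tags[0] == ALL_MARKER:
        # Marker first: every remaining tag must be on the dataset (AND).
        return all(tag in present for tag in filter_tags[1:])
    # No marker: a single matching tag is enough (OR).
    return any(tag in present for tag in filter_tags)


required = ["dm=true", "in=true", "processed=false"]
dataset_tags = ["dm=true", "processed=true"]  # hypothetical dataset tags

print(tags_match(required, dataset_tags))                 # OR semantics
print(tags_match([ALL_MARKER] + required, dataset_tags))  # AND semantics
```

With OR semantics, a dataset that only carries "dm=true" already matches, which would explain triggers firing (or not) in ways that look unrelated to "processed=false".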
In my opinion there is a bug in the deserialization process: triggers get de-duplicated by trigger name, but when using trigger_project there are dozens of triggers created with the same name (one per dataset in the project). This leads to random behavior, depending on which project id survived the deserialization process.
The way I solved it was by subclassing TriggerScheduler and overriding add_dataset_trigger to create a unique name for each created trigger when using trigger_project (I use a combination of name and project_id).
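A small sketch of why the naming matters, using plain Python rather than the ClearML classes. The trigger names, project ids and helper functions below are hypothetical; the point is only that de-duplicating by name collapses same-named triggers, while a name built from name and project_id keeps one entry per project.

```python
from typing import Dict, List, Tuple

# Hypothetical data: one trigger per dataset project, all sharing one name.
triggers: List[Tuple[str, str]] = [
    ("batch processing - acme", "project-id-1"),
    ("batch processing - acme", "project-id-2"),
    ("batch processing - acme", "project-id-3"),
]


def dedupe_by_name(items: List[Tuple[str, str]]) -> Dict[str, str]:
    """Keep one project id per name; later entries overwrite earlier ones."""
    by_name: Dict[str, str] = {}
    for name, project_id in items:
        by_name[name] = project_id
    return by_name


def unique_name(name: str, project_id: str) -> str:
    """Combine name and project id so every trigger gets its own key."""
    return f"{name} [{project_id}]"


collapsed = dedupe_by_name(triggers)
renamed = dedupe_by_name([(unique_name(n, p), p) for n, p in triggers])

print(len(collapsed), len(renamed))
```

In this toy version the last project id always wins; in the scheduler, which one survives is what looked random to me.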
Answered
Hi guys,
I have a new question related to TriggerScheduler. I am seeing very erratic behavior on dataset triggers. I have a cron scheduler that creates a dataset after a file gets dropped on S3, placing it into a project with some tags, in particular "processed=false".
I have a TriggerScheduler with an add_dataset_trigger that triggers a task id. The cron scheduler works great, but the TriggerScheduler fires maybe 10% of the time.
Here is my config:
from clearml.automation import TriggerScheduler

# batch_transfer_params and schedule_task_id are defined elsewhere in my code
trigger = TriggerScheduler(
    pooling_frequency_minutes=0.1, sync_frequency_minutes=0.1
)
for client in batch_transfer_params:
    # one dataset trigger per client project
    trigger.add_dataset_trigger(
        name=f"batch processing - {client['client_name']}",
        # schedule_function=trigger_dataset_func,
        schedule_task_id=schedule_task_id,
        schedule_queue="high-mem",
        trigger_project=client["client_name"],
        trigger_name="batch_processing - incoming",
        target_project=client["client_name"],
        task_parameters={
            "client_name": client["client_name"],
            "dataset_id": "${dataset.id}",
        },
        trigger_required_tags=["dm=true", "in=true", "processed=false"],
        single_instance=True,
    )
I have tried so many scenarios by now that I don't know what to think anymore; I cannot get it to work reliably.
- Sometimes removing the tag processed=false and putting it back on will trigger, but sometimes it won't.
- I checked the triggers after adding them via trigger.get_triggers() and it looks fine: it creates one trigger per subfolder.
Any pointers are appreciated.
Thanks a lot for all the work.