Answered
I'm looking at how triggers work in ClearML. Is there an example, maybe with clearml data and a dataset being uploaded or some other example?

  
  
Posted 2 years ago

Answers 23


I'd like to add an update to this: when I use a schedule function instead of a schedule task with the dataset trigger scheduler, it works as intended. It runs the desired function when triggered, then goes back to sleep until the next poll, since no other trigger was fired.

  
  
Posted 2 years ago

Also, could you explain the difference between trigger.start() and trigger.start_remotely()?

  
  
Posted 2 years ago

Thank you for the help with that.

  
  
Posted 2 years ago

Thank you, I'll take a look

  
  
Posted 2 years ago

So in my head, every time I publish a dataset, it should get triggered and run that task.

  
  
Posted 2 years ago

But what's happening is that I only publish a dataset once, yet every time it polls, it gets triggered and enqueues a task.

  
  
Posted 2 years ago

So it won't work without clearml-agent? Sorry for the barrage of questions. I'm just very confused right now.

  
  
Posted 2 years ago

This here shows my situation. You can see the code on the left and the tasks called 'Cassava Training' on the right. They keep getting enqueued even though I only sent a trigger once. By that I mean I only published a dataset once.

  
  
Posted 2 years ago

Okay, so they run once I start a clearml-agent listening to that queue.

  
  
Posted 2 years ago

So I took dataset trigger from this and added it to my own test code, which needs to run a task every time this trigger is activated.

  
  
Posted 2 years ago

Okay so when I add trigger_on_tags, the repetition issue is resolved.

  
  
Posted 2 years ago

It works, however it shows the task is enqueued and pending. Note I am using .start() and not .start_remotely() for now

  
  
Posted 2 years ago

To be clearer, here is an example use case: I'm trying to make a pipeline which, every time a new dataset/batch is published using clearml-data, does the following:

Get the data
Train on it
Save the model and publish it

I want to start this process with a trigger when a dataset is published to the server. Is there any example I can look at for accomplishing something like this?
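
For reference, a minimal sketch of this use case, assuming the TriggerScheduler class from clearml.automation and its add_dataset_trigger method. The trigger name, the project name "datasets", and the body of the callback are hypothetical placeholders, not something taken from this thread. The callback receives the ID of the dataset that fired the trigger.

```python
def on_dataset_published(dataset_id):
    """Called by the trigger with the ID of the dataset that fired it."""
    # Placeholder for the real steps: get the data, train on it,
    # then save and publish the model.
    message = "dataset {} published, starting training".format(dataset_id)
    print(message)
    return message


def build_trigger():
    """Create a scheduler that polls the server for newly published datasets."""
    # Imported lazily so the callback above can be tried without clearml installed.
    from clearml.automation import TriggerScheduler

    trigger = TriggerScheduler(pooling_frequency_minutes=3.0)
    trigger.add_dataset_trigger(
        name="retrain on new dataset",           # hypothetical trigger name
        schedule_function=on_dataset_published,  # run a function instead of enqueuing a task
        trigger_project="datasets",              # hypothetical project to watch
        trigger_on_publish=True,                 # fire when a dataset is published
    )
    return trigger
```

Calling build_trigger().start() would then run the polling loop in the current process. To enqueue an existing task instead of calling a function, schedule_task_id and schedule_queue can be passed in place of schedule_function, in which case an agent has to be listening on that queue.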

  
  
Posted 2 years ago

Also, the task just prints a small string on the console.

  
  
Posted 2 years ago

This problem occurs when I'm scheduling a task. Copies of the task keep being put on the queue even though the trigger only fired once.

  
  
Posted 2 years ago

they're also enqueued

  
  
Posted 2 years ago

Also, could you explain the difference between trigger.start() and trigger.start_remotely()?

start() starts the trigger process (the one "watching the changes") locally (this makes sense for debugging etc.)
start_remotely() launches the trigger process on the "services" queue, where it should live forever 🙂
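
A small sketch of that choice, assuming a scheduler object that exposes start() and start_remotely(queue=...) as described above. The launch helper and its remote flag are hypothetical names used only for illustration.

```python
def launch(trigger, remote=False, queue="services"):
    """Start a trigger scheduler locally or hand it off to an agent-served queue."""
    if remote:
        # The scheduler is enqueued and keeps running on the agent
        # that serves the given queue.
        trigger.start_remotely(queue=queue)
        return "remote"
    # The polling loop runs in this process, which is handy for debugging.
    trigger.start()
    return "local"
```

With remote=False the process has to stay alive for the trigger to keep watching. With remote=True an agent must be serving the chosen queue, otherwise the scheduler itself stays pending.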

Okay so when I add trigger_on_tags, the repetition issue is resolved.

Nice!

This problem occurs when I'm scheduling a task. Copies of the task keep being put on the queue even though the trigger only fired once.

Hmm, I think I'm a bit lost here (and I have a feeling there is some hidden bug somewhere that I'd like us to fix)
How exactly do I make it trigger twice on the same Dataset?

  
  
Posted 2 years ago

Yes, for an enqueued task to run, you need an agent listening on the queue it was enqueued to 🙂

  
  
Posted 2 years ago

VexedCat68, what do you mean by trigger? Do you want some indication that a dataset was published so you can move to the next step in your pipeline?

  
  
Posted 2 years ago

I however have another problem. I have a dataset trigger that has a schedule task.

  
  
Posted 2 years ago

VexedCat68

But what's happening is, that I only publish a dataset once but every time it polls,

This seems wrong (i.e. a bug?!). How do you set up the trigger? Is the Trigger Task constantly running, or are you re-launching it?

  
  
Posted 2 years ago

So I just published a dataset once, but it keeps scheduling the task.

  
  
Posted 2 years ago