
Hi everyone,
I'm having an issue with ClearML Datasets and would like to know if this is possible.
I have a task that is executed repeatedly. This task may require data to be loaded from and added to a dataset. My question is: is there a way to append to a ClearML dataset without creating new versions? If this task is created several times a day, this would otherwise lead to an ever-increasing number of dataset versions. Is there a way to avoid this? I tried not finalizing the dataset, but then I could not download a mutable copy. If I finalize it, I need to create a new version of the dataset. Is that right? If not, is there another way to share data between two tasks without using an external database? The tasks will be created and triggered from an external UI or a TaskScheduler.

  
  
Posted 2 months ago

Answers 4


Hi @SplendidFlamingo62, you can also use artifacts on the task itself in order to pass data between tasks.
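
For reference, passing data through task artifacts could look roughly like the sketch below. It is only an illustration, not taken from this thread: the project and task names are made up, `merge_records` is a hypothetical helper, and it assumes the data is a plain Python list that ClearML can serialize.

```python
from typing import Any, List


def merge_records(existing: List[Any], new: List[Any]) -> List[Any]:
    """Return a new list with `new` appended to `existing` (inputs are not modified)."""
    return list(existing) + list(new)


def publish_records(records: List[Any], artifact_name: str = "records") -> str:
    """Upload `records` as an artifact of a new task and return that task's ID."""
    # Imported inside the function so merge_records stays usable without clearml.
    from clearml import Task

    # "examples" / "producer" are placeholder names, not from the thread.
    task = Task.init(project_name="examples", task_name="producer")
    task.upload_artifact(
        name=artifact_name,
        artifact_object=records,
        wait_on_upload=True,  # block until the artifact is actually stored
    )
    return task.id


def load_records(producer_task_id: str, artifact_name: str = "records") -> List[Any]:
    """Fetch the artifact that an earlier task uploaded."""
    from clearml import Task

    producer = Task.get_task(task_id=producer_task_id)
    # .get() downloads the artifact and returns the deserialized object.
    return producer.artifacts[artifact_name].get()
```

A later run would call `load_records(previous_task_id)`, combine the result with its own data via `merge_records`, and publish the merged list again with `publish_records`.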

  
  
Posted 2 months ago

@CostlyOstrich36 Thanks for your answer. But then the data would still not be stored “efficiently”, would it? Since you would download the artifact, append the new data, and upload the whole data set again with the new task every time, right?

  
  
Posted 2 months ago

In that case you are correct. If you want to have a 'central' source of data then Datasets would be the suggested approach. Regarding your question on adding data, you would always have to create a new child version and append new data to the child.
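
A minimal sketch of that child-version flow, assuming the latest version can be looked up by project and name; the function name and its parameters are hypothetical, and the child is expected to store only the files added on top of its parent:

```python
def append_as_child_version(dataset_project: str, dataset_name: str, new_files_dir: str) -> str:
    """Create a child of the current dataset version, add new files, and return the child ID."""
    # Imported inside the function so this snippet can be loaded without clearml installed.
    from clearml import Dataset

    # Look up the existing (finalized) version to use as the parent.
    parent = Dataset.get(dataset_project=dataset_project, dataset_name=dataset_name)

    # The child inherits all of the parent's files.
    child = Dataset.create(
        dataset_name=dataset_name,
        dataset_project=dataset_project,
        parent_datasets=[parent.id],
    )
    child.add_files(path=new_files_dir)  # only the new data is added here
    child.upload()
    child.finalize()  # a finalized version can serve as the next parent
    return child.id
```

Each run of the repeated task would call this once with the folder holding its new data, which is what produces one new version per run.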

Also, squashing the dataset might be relevant to you.
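
As a rough idea of what squashing could look like from the SDK, here is a sketch built on `Dataset.squash`; the wrapper function and its argument names are assumptions for illustration, and the version IDs would have to come from your own project:

```python
from typing import Iterable


def squash_versions(squashed_name: str, version_ids: Iterable[str]) -> str:
    """Merge several dataset versions into one new dataset and return its ID."""
    # Imported inside the function so this snippet can be loaded without clearml installed.
    from clearml import Dataset

    squashed = Dataset.squash(
        dataset_name=squashed_name,
        dataset_ids=list(version_ids),  # IDs of the versions to merge
    )
    return squashed.id
```

Running something like this periodically (for example once a day) could keep the number of versions that consumers have to deal with small, at the cost of re-uploading the merged content.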

  
  
Posted 2 months ago

Okay, thank you very much 🙂

  
  
Posted 2 months ago