Two Questions About Datasets: Question 1: Are Parallel Writes To A Dataset With The Same Version Possible? Is The Way To Go, To Have A Task, Which Creates A Dataset Object, Which In Turn Is Passed As Artifact To The Subsequent Ingestion Tasks? After The P


Hi @<1661542579272945664:profile|SaltySpider22>

question 1: are parallel writes to a dataset with the same version possible?

When you say parallel, what do you mean? From multiple machines?

What's the recommended way to append to the dataset in a future version?

Once a dataset has been finalized, the only way to add files is to create another version that inherits from the previous one (i.e. the finalized version becomes the parent of the new version).
If you are worried about ending up with many versions, you can squash them, just like in git 🙂
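
A minimal sketch of the "new version with the finalized one as parent" flow, assuming the standard `clearml` `Dataset` API. The helper name `add_files_as_child_version` and all of its arguments are hypothetical and only serve as an illustration:

```python
def add_files_as_child_version(parent_id, files_path, dataset_name, dataset_project):
    """Create a new dataset version whose parent is an already finalized version."""
    # Imported inside the function so that defining this helper has no side effects.
    from clearml import Dataset

    child = Dataset.create(
        dataset_name=dataset_name,
        dataset_project=dataset_project,
        parent_datasets=[parent_id],  # the finalized version becomes the parent
    )
    child.add_files(path=files_path)  # register the new files on top of the parent
    child.upload()                    # upload the newly added files to storage
    child.finalize()                  # close this version; further changes need another child
    return child.id
```

The returned ID identifies the new version, so it can be handed to whichever task needs the data next.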

passing Dataset artifacts between tasks seems to be not possible,

The correct way would be to pass the Dataset ID; the other task would then simply get it with Dataset.get.
No need to worry about re-downloading, everything is cached automatically.
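
As a rough sketch of the consuming side, assuming the ID arrives as a plain string (for example as a task parameter). The helper name `load_dataset_files` is hypothetical:

```python
def load_dataset_files(dataset_id):
    """Resolve a dataset by ID and return the path of a local copy of its files."""
    # Imported inside the function so that defining this helper has no side effects.
    from clearml import Dataset

    dataset = Dataset.get(dataset_id=dataset_id)  # look the dataset up by its ID
    # get_local_copy() returns a folder path; the copy is served from the
    # local cache when the files were already downloaded on this machine.
    return dataset.get_local_copy()
```

Each ingestion task can call this with the same ID and work on the returned folder.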
Does that make sense?

  
  
Posted 3 months ago