Unanswered
Hi. I'm Running This Little Pipeline:


Hi there,

PanickyMoth78
I am having the same issue.
Some steps of the pipeline create huge datasets (several GB) that I don’t want to upload or save.
Wrapping the return values in a dict could be a solution, but honestly, I don’t like it.

AgitatedDove14 Is there a better way to avoid uploading some of the artifacts produced by pipeline steps?

The image above shows an example of the first step of a training pipeline, which queries data from a feature store.
The step gets the DataFrame, then zips and uploads it (this one is very small, but in practice they are really big).
How can I avoid this?
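
To make the idea concrete, here is a rough sketch of the kind of thing I have in mind: the step writes the large result to storage and returns only the path, so the return value itself stays tiny. This is plain Python, not ClearML API. `query_features`, `train`, `run_pipeline` and the CSV file are hypothetical names for illustration, and it assumes both steps can reach the same filesystem.

```python
import csv
import tempfile
from pathlib import Path


def query_features(out_dir: Path) -> str:
    """Hypothetical first step: write the dataset to disk, return only its path."""
    # Stand-in for the (potentially huge) data pulled from a feature store.
    rows = [{"id": i, "value": i * 2} for i in range(5)]
    out_path = out_dir / "features.csv"
    with out_path.open("w", newline="") as f:
        writer = csv.DictWriter(f, fieldnames=["id", "value"])
        writer.writeheader()
        writer.writerows(rows)
    # Only this short string is the step's return value, not the data itself.
    return str(out_path)


def train(features_path: str) -> int:
    """Hypothetical second step: load the data from the path it was handed."""
    with open(features_path, newline="") as f:
        rows = list(csv.DictReader(f))
    # Placeholder for real training: just aggregate one column.
    return sum(int(r["value"]) for r in rows)


def run_pipeline() -> int:
    # Assumption: both steps share this directory; on separate machines the
    # path would have to point at storage that both can reach.
    with tempfile.TemporaryDirectory() as tmp:
        path = query_features(Path(tmp))
        return train(path)
```

This only helps when the steps share storage, which is why I am still asking whether there is a built-in way to skip the upload for specific return values.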

  
  
Posted 2 years ago
175 Views
0 Answers