Examples: query, "exact match", wildcard*, wild?ard, wild*rd
Fuzzy search: cake~ (finds cakes, bake)
Term boost: "red velvet"^4, chocolate^2
Field grouping: tags:(+work -"fun-stuff")
Escaping: Escape characters +-&|!(){}[]^"~*?:\ with \, e.g. \+
Range search: properties.timestamp:[1587729413488 TO *] (inclusive), properties.title:{A TO Z}(excluding A and Z)
Combinations: chocolate AND vanilla, chocolate OR vanilla, (chocolate OR vanilla) NOT "vanilla pudding"
Field search: properties.title:"The Title" AND text
Unanswered
Hi. I'M Running This Little Pipeline:


Hi there,

PanickyMoth78
I am having the same issue.
Some steps of the pipeline create huge datasets (some GBs) that I don’t want to upload or save.
Wrap the returns in a dict could be a solution, but honestly, I don’t like it.

AgitatedDove14 Is there any better way to avoid the upload of some artifacts of pipeline steps?

The image above shows an example of the first step of a training pipeline, that queries data from a feature store.
It gets the DataFrame, zip and upload it (this one is very small, but in practice they are really big)
How to avoid this?

  
  
Posted one year ago
113 Views
0 Answers
one year ago
one year ago