Unanswered
Hi folks, TL;DR: Dataset.remove_files() is very slow. How can I speed it up? I'm working with a large raw dataset that we are trying to use a small subset of. The data is thousands of images and a metadata JSON file for each image. To create this subset


The way I wrote it is a bit of a quick fix with a lot of code duplication. I'm sure it could be implemented more cleanly, e.g. with a single remove_files method that accepts either a single path or a list of paths.
It's one of those things I intended to clean up at some point but never had the time for. (I made a similar modification for adding lists of files, which has exactly the same issue when you want to add specific files rather than something you can describe with a wildcard.)
If you're up for it, feel free. I'm sure plenty of people would appreciate it.
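
For anyone picking this up, here is a rough sketch of the "one method, single path or list" idea. It is not ClearML's implementation: `entries` is a hypothetical stand-in for a dataset's file index (relative path to metadata), and `normalize_paths` and `remove_entries` are names made up for illustration. The point is only that the input is normalized once and all matches are removed in a single pass over the index, rather than with one call per file.

```python
from fnmatch import fnmatchcase
from typing import Dict, Iterable, List, Union

PathSpec = Union[str, Iterable[str]]


def normalize_paths(paths: PathSpec) -> List[str]:
    """Accept a single path/wildcard or an iterable of them; return a list."""
    if isinstance(paths, str):
        return [paths]
    return list(paths)


def remove_entries(entries: Dict[str, object], paths: PathSpec) -> int:
    """Remove every entry matching any of the given paths in one pass.

    `entries` is a hypothetical file index (relative path -> metadata),
    not ClearML's real data structure. Returns the number removed.
    """
    patterns = normalize_paths(paths)
    # Exact paths become set lookups; only real wildcards go through fnmatch.
    exact = {p for p in patterns if not any(c in p for c in "*?[")}
    wild = [p for p in patterns if p not in exact]
    to_remove = [
        key for key in entries
        if key in exact or any(fnmatchcase(key, w) for w in wild)
    ]
    for key in to_remove:
        del entries[key]
    return len(to_remove)
```

With this shape, `remove_entries(index, "a/1.png")` and `remove_entries(index, ["a/1.json", "b/*"])` go through the same code path, so there is nothing to duplicate between the single-file and list variants.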

  
  
Posted 7 months ago
86 Views
0 Answers