Unanswered
Hi everyone, I wanted to inquire if it's possible to have some type of model unloading. I know there was a discussion here about it, but after reviewing it, I didn't find an answer. So, I am curious: is it possible to explicitly unload a model (by calling


Suppose that I have three models and these models can't be loaded simultaneously into GPU memory (

Oh!!!

For now, this is the behavior I observe: Suppose I have two models, A and B. ....

Correct

Yes, this is a current limitation of the Triton backend, BUT!
we are working on a new version that does exactly what you mentioned (it is a common case that some models are not used very frequently).
The main caveat is the loading time: re-loading models from disk takes way too long at the moment (meaning you might get a timeout on the request), and we are trying to accelerate the process (for example, by caching the model in RAM instead of GPU memory). But we have made good progress, and I'm sure the next version will be able to address that.
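
To make the idea above concrete, here is a rough sketch of the general pattern (unload the least recently used model when GPU slots run out, and keep the evicted copy in RAM so a later reload skips the disk). This is not the actual implementation in the upcoming version, which has not been shown in this thread. The class `ModelSlotCache`, the `gpu_slots` limit, and the `load_from_disk` callback are all hypothetical names, and "GPU" and "RAM" are plain Python containers standing in for the real memory spaces.

```python
from collections import OrderedDict
from typing import Any, Callable, Dict, List


class ModelSlotCache:
    """Hypothetical sketch: at most `gpu_slots` models are 'on GPU' at once.

    Evicted models are parked in a RAM-side dict so that bringing them back
    does not require the slow load from disk.
    """

    def __init__(self, gpu_slots: int, load_from_disk: Callable[[str], Any]) -> None:
        if gpu_slots < 1:
            raise ValueError("gpu_slots must be at least 1")
        self.gpu_slots = gpu_slots
        self.load_from_disk = load_from_disk  # assumed slow loader, supplied by the caller
        self.on_gpu: "OrderedDict[str, Any]" = OrderedDict()  # least recently used first
        self.in_ram: Dict[str, Any] = {}  # evicted models kept for a fast reload
        self.events: List[str] = []  # simple log so the behavior can be inspected

    def get(self, name: str) -> Any:
        # Fast path: the model is already resident, just mark it as recently used.
        if name in self.on_gpu:
            self.on_gpu.move_to_end(name)
            self.events.append(f"hit:{name}")
            return self.on_gpu[name]

        # Make room by unloading the least recently used model(s) into RAM.
        while len(self.on_gpu) >= self.gpu_slots:
            victim, model = self.on_gpu.popitem(last=False)
            self.in_ram[victim] = model
            self.events.append(f"unload:{victim}")

        # Prefer the RAM copy; fall back to the slow disk load only if needed.
        if name in self.in_ram:
            model = self.in_ram.pop(name)
            self.events.append(f"reload_from_ram:{name}")
        else:
            model = self.load_from_disk(name)
            self.events.append(f"load_from_disk:{name}")

        self.on_gpu[name] = model
        return model


if __name__ == "__main__":
    cache = ModelSlotCache(gpu_slots=2, load_from_disk=lambda n: f"weights-of-{n}")
    for requested in ["A", "B", "A", "C", "B"]:
        cache.get(requested)
    print(cache.events)
```

With two slots and the request order A, B, A, C, B, model B is unloaded to make room for C and later comes back from RAM rather than from disk, which is the part that would help with the request timeouts mentioned above. A real version would also need to cap the RAM cache and handle concurrent requests, both of which this sketch leaves out.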

  
  
Posted 10 months ago
112 Views
0 Answers