Unanswered
Hi everyone, I wanted to inquire if it's possible to have some type of model unloading. I know there was a discussion here about it, but after reviewing it, I didn't find an answer. So, I am curious: is it possible to explicitly unload a model (by calling
@<1523701205467926528:profile|AgitatedDove14> No, I didn't do that, but if I'm not mistaken, about a month ago I saw some users on Reddit comparing the two. They observed that TRT-LLM outperforms the other leading backends, including vLLM. I will try to find the post and paste it here.
111 Views
0 Answers
10 months ago