Answered
CPU bottleneck while performing inference with ClearML Serving

Hi everyone,
I’m encountering a CPU bottleneck while performing inference with ClearML Serving and am hoping to get some assistance.
Setup: I have successfully deployed a ClearML Server and configured ClearML Serving following the instructions provided here: ClearML Serving Setup. I’m specifically using the docker-compose-triton.yml file, as I’m working with ONNX models.
Issue: During inference with ClearML Serving, I’ve noticed that only a single CPU core is being utilized while the rest sit idle, which causes inference to time out every time. Is there a way to distribute the workload across multiple CPU cores to improve performance?
Thanks in advance for any help or suggestions!

Posted 15 days ago

Answers


Hi @ReassuredFrog10, do you have a GPU available? If not, maybe try the other docker-compose file, the one without Triton, since the Triton one is built specifically for GPU inference.
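If the deployment has to stay on CPU, one Triton-side option worth trying (a sketch based on Triton's general model configuration, not something confirmed in this thread) is to run several CPU instances of the model and let the ONNX Runtime backend use more threads. The model name my_onnx_model and all counts below are placeholder assumptions:

# Hypothetical config.pbtxt for the model's entry in the Triton model repository.
# The name, max_batch_size, and counts are illustrative, not taken from this thread.
name: "my_onnx_model"
platform: "onnxruntime_onnx"
max_batch_size: 8

# Run several copies of the model on CPU so concurrent requests
# can be handled in parallel instead of queuing on a single instance.
instance_group [
  {
    count: 4
    kind: KIND_CPU
  }
]

# Thread settings for the ONNX Runtime backend; a value of 0 lets
# ONNX Runtime pick its own defaults (typically scaling with available cores).
parameters { key: "intra_op_thread_count" value: { string_value: "0" } }
parameters { key: "inter_op_thread_count" value: { string_value: "0" } }

Input/output specs are omitted here because Triton can auto-complete them for ONNX models. How this configuration gets passed through ClearML Serving may vary by version, so treat it as the Triton-level target rather than exact ClearML CLI usage.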

Posted 14 days ago