Answered
I'm new to ClearML and I'd like to deploy an inference service based on my trained model, something like what BentoML does wrapping Flask API... is there a way to do it within ClearML?

  
  
Posted 3 years ago

Answers 4


ContemplativeCockroach39 unfortunately not directly as part of ClearML 😞
I can recommend NVIDIA Triton Inference Server (I'm hoping we will have out-of-the-box integration soon).
Meanwhile you can run it manually; see the docs:
https://developer.nvidia.com/nvidia-triton-inference-server
The Docker image is here:
https://ngc.nvidia.com/catalog/containers/nvidia:tritonserver

  
  
Posted 3 years ago

It depends on what you mean by deployment, and what kind of inference you plan to do (i.e. real-time vs. batched, etc.).
But overall, serving itself is currently not handled by the open source offering, mainly because there are so many variables and frameworks to consider.
Can you share some more details about the capabilities you are looking for? Some essentials like staging and model versioning are handled very well...

  
  
Posted 3 years ago

GrumpyPenguin23, AgitatedDove14 thanks for replying! Basically I'm looking for a real-time inference endpoint exposing a prediction API method, something like:
curl -i \
  --header "Content-Type: application/json" \
  --request POST \
  --data '[[5.1, 3.5, 1.4, 0.2]]' \

  
  
Posted 3 years ago

Hi ContemplativeCockroach39
Assuming you wrap your model with a Flask app (or use any other serving solution), you usually need to:
- Get the model
- Add some metrics on runtime performance
- Package it in a Docker image

Getting a pretrained model is straightforward once you know either the creating Task or the Model ID:
` from clearml import Task, Model
model_file_from_task = Task.get_task(task_id).models['output'][-1].get_local_copy() `

or

` model_file_from_model = Model(model_id=<model_id>).get_local_copy() `

Adding performance metrics:
` from clearml import Task
task = Task.init(project_name='inference', task_name='runtime')
task.get_logger().report_scalar(title='performance', series='latency', value=0.123, iteration=some_counter_here) `

Once you have run it once, you have a Task of the inference code in the system, and you can either enqueue it to a clearml-agent or package it as a standalone Docker image.

Packaging to a Docker image:
` clearml-agent build --id <task_id_here> --docker --target docker_image_name `

  
  
Posted 3 years ago