Can Someone Help Me With Deploying This Example Model (From Triton Inference Server) Deployed In Clearml-Serving? Too Many Random Errors For Me To Figure It Out


https://github.com/triton-inference-server/server/tree/main/qa/python_models/add_sub
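
For reference, that example is a Triton Python-backend model: a model.py defining a TritonPythonModel class, next to a config.pbtxt. Abridged and slightly simplified from the linked example, the model.py looks roughly like this:

import json

import triton_python_backend_utils as pb_utils  # provided by Triton's python backend


class TritonPythonModel:
    def initialize(self, args):
        # Triton passes the parsed config.pbtxt in here
        self.model_config = json.loads(args["model_config"])

    def execute(self, requests):
        # Each request carries INPUT0/INPUT1; return their sum and difference
        responses = []
        for request in requests:
            in_0 = pb_utils.get_input_tensor_by_name(request, "INPUT0").as_numpy()
            in_1 = pb_utils.get_input_tensor_by_name(request, "INPUT1").as_numpy()
            out_0 = pb_utils.Tensor("OUTPUT0", in_0 + in_1)
            out_1 = pb_utils.Tensor("OUTPUT1", in_0 - in_1)
            responses.append(pb_utils.InferenceResponse(output_tensors=[out_0, out_1]))
        return responses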

For now I'm seeing this error in the Triton serving engine task:

File "/opt/tritonserver/backends/python/startup.py", line 382, in <module> python_host = PythonHost(module_path=FLAGS.model_path) File "/opt/tritonserver/backends/python/startup.py", line 172, in __init__ spec.loader.exec_module(module) File "<frozen importlib._bootstrap_external>", line 779, in exec_module File "<frozen importlib._bootstrap_external>", line 915, in get_code File "<frozen importlib._bootstrap_external>", line 972, in get_data FileNotFoundError: [Errno 2] No such file or directory: '/models/test-project-nbdev/16/model.py'
I created the model as follows:

I have a model folder with the config.pbtxt and model.py from the example, then:

task.connect_configuration(configuration=Path('model/config.pbtxt'), name='config.pbtxt')
output_model.save("model")
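
(For context, a fuller version of that snippet with the imports and objects it assumes might look like the sketch below; the Task.init arguments are guesses based on the project name in the traceback, and the save() call is kept exactly as posted:)

from pathlib import Path
from clearml import Task, OutputModel

# Hypothetical setup; project name guessed from the traceback path above
task = Task.init(project_name='test-project-nbdev', task_name='add_sub model')
output_model = OutputModel(task=task)

# Attach the Triton config to the Task so the serving service can pick it up
task.connect_configuration(configuration=Path('model/config.pbtxt'), name='config.pbtxt')
# Upload the model folder (as posted; see update_weights_package later in the thread)
output_model.save("model")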

  
  
Posted 3 years ago

Answers 30


Initially it was complaining about it, but then when I did the connect_configuration it started working

  
  
Posted 3 years ago

Was wondering how it can handle 10s, 100s of models.

Yes, it supports dynamically loading/unloading models based on requests.
(Load balancing across multiple nodes is separate from this, but assuming they are under different endpoints, the load balancer can be configured to route accordingly.)
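
(For illustration only: when Triton runs with --model-control-mode=explicit, models can be loaded and unloaded on demand through its repository API. A sketch, with a placeholder host:)

import requests

TRITON = "http://<serving-engine-ip>:8000"  # placeholder host, as used later in the thread

# Ask Triton to load a model from its repository into memory...
requests.post(f"{TRITON}/v2/repository/models/keras_mnist/load").raise_for_status()

# ...and to unload it again when it no longer gets traffic
requests.post(f"{TRITON}/v2/repository/models/keras_mnist/unload").raise_for_status()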

  
  
Posted 3 years ago

Do we launch multiple groups of these in different projects?

Actually Triton can serve multiple models, and the endpoints/models are controlled from clearml-serving.
The only issue is adding a load balancer in front of multiple nodes to balance the requests between them. wdyt?

  
  
Posted 3 years ago

I think you are correct, it seems like the boto/azure/google requirements are missing (I will make sure they are added). In the meantime, you can stop the "triton serving engine" Task, reset it, add boto3 to the installed packages, and relaunch.
That said, your main issue might be packaging the Python model. Basically you need to create a model from the entire folder (with whatever is inside it); then Triton should be able to run it (if the config.pbtxt is correct).
m = OutputModel()
m.update_weights_package(weights_path='path/goes/here/')
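
(Filling that in with imports and the folder name from earlier in the thread, a minimal sketch might be:)

from clearml import Task, OutputModel

# Hypothetical setup: attach the packaged model to a Task
task = Task.init(project_name='test-project-nbdev', task_name='register add_sub')

# Package the whole directory (model.py + config.pbtxt + any weight files)
# so Triton receives the folder structure it expects.
m = OutputModel(task=task)
m.update_weights_package(weights_path='model/')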

  
  
Posted 3 years ago

Also, btw, is this supposed to be a screenshot from the community version?

Hmm, seems like a screenshot from an enterprise version, I'll ask them to update 🙂

I am also not understanding how clearml-serving is doing the versioning for models in Triton.

Basically you have two Tasks: one is the "controller", checking for model changes and updating itself.
The other is the engine; it checks on the "controller" Task to see which models it needs to download/configure, and replaces them.
This way you can have multiple engines controlled from the same "controller" Task.
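
(Not the actual implementation, just a sketch of that idea: each engine periodically reads whatever endpoint configuration the controller Task publishes, and syncs its local Triton model repository. All names below are illustrative:)

import time
from clearml import Task

CONTROLLER_TASK_ID = "<controller-task-id>"  # placeholder


def sync_local_triton_repository(endpoint_config):
    # Placeholder: parse the config, then download/replace changed models
    print("would sync models for:", endpoint_config)


while True:
    controller = Task.get_task(task_id=CONTROLLER_TASK_ID)
    # The controller Task holds the endpoint -> model mapping; the name of
    # the configuration object here is purely illustrative.
    endpoint_config = controller.get_configuration_object("endpoints")
    sync_local_triton_repository(endpoint_config)
    time.sleep(60)  # poll for changes once a minute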

  
  
Posted 3 years ago

Model says PACKAGE, that means it’s fine, right?

  
  
Posted 3 years ago

I used .update_weights(path) with path being the model dir containing the model.py and the config.pbtxt. Should I use update_weights_package?

  
  
Posted 3 years ago

The agent IP? Generally, what’s the expected pattern to deploy and scale this for multiple models?

Yes, the agent's IP. With multiple agents, one would probably use k8s for the nodes, then configure ingress. This is the next step for clearml-serving: adding support for KFServing, or manually configuring the ingress. wdyt?

  
  
Posted 3 years ago

On my to-do list, but it will have to wait for later this week (feel free to ping on this thread to remind me).
Regarding the issue at hand, let me check the requirements it is using.

  
  
Posted 3 years ago

Forking and using the latest code fixes the boto issue, at least.

  
  
Posted 3 years ago

Think I will have to fork and play around with it 🙂

  
  
Posted 3 years ago

Yes. I have no experience with Triton, does it do lazy loading? I was wondering how it can handle 10s, 100s of models. If we load balance across a set of these engine containers with, say, 100 models, and all of these models get traffic but the distribution is not even, will each of those engine containers load all 100 models?

  
  
Posted 3 years ago

I am also not understanding how clearml-serving is doing the versioning for models in Triton.

  
  
Posted 3 years ago

yes I'm with you, we need to fix it asap

  
  
Posted 3 years ago

The agent IP? Generally, what’s the expected pattern to deploy and scale this for multiple models?

  
  
Posted 3 years ago

Think I will have to fork and play around with it 

NICE! (BTW: if you manage to get it working I'll be more than happy to help push the PR.)
Maybe the quickest win is to store just the .py as the model?

  
  
Posted 3 years ago

Got the engine running.

curl <serving-engine-ip>:8000/v2/models/keras_mnist/versions/1

What’s the serving-engine-ip supposed to be?
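
(That curl hits Triton's KServe-v2 model metadata endpoint; the Python equivalent, keeping the placeholder host, would be roughly:)

import requests

# Fetch metadata for version 1 of the "keras_mnist" model from Triton's HTTP API
resp = requests.get("http://<serving-engine-ip>:8000/v2/models/keras_mnist/versions/1")
resp.raise_for_status()
print(resp.json())  # model name, platform, declared inputs/outputs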

  
  
Posted 3 years ago

For now that's a quick thing, but for actual use I will need a proper model (pkl) and the .py

  
  
Posted 3 years ago

Yeah

  
  
Posted 3 years ago

Did it pick it up from the task?

  
  
Posted 3 years ago

Without some sort of automation on top feels a bit fragile

  
  
Posted 3 years ago

Should I use update_weights_package?

Yes.
BTW, the config.pbtxt should be passed when "registering" the endpoint with the CLI.

  
  
Posted 3 years ago

That makes sense. One part I am confused about: the Triton engine container hosts all the models, right? Do we launch multiple groups of these in different projects?

  
  
Posted 3 years ago

AgitatedDove14 - looks like the serving is doing the savemodel stuff?

https://github.com/allegroai/clearml-serving/blob/main/clearml_serving/serving_service.py#L554

  
  
Posted 3 years ago

AgitatedDove14 - it does have boto, but the clearml-serving installation and code refer to an older commit hash, and hence the task was not using them - https://github.com/allegroai/clearml-serving/blob/main/clearml_serving/serving_service.py#L217

  
  
Posted 3 years ago

Progress with boto3 added, but fails:

  
  
Posted 3 years ago

Also btw, is this supposed to be a screenshot from the community version? https://github.com/manojlds/clearml-serving/blob/main/docs/webapp_screenshots.gif

  
  
Posted 3 years ago

don’t know what’s happening there

  
  
Posted 3 years ago

model.savemodel?

  
  
Posted 3 years ago