Answered
Hi There, Another Triton-Related Question: Are We Able To Deploy Python_backend Models Within clearml-serving?

Hi there, another triton-related question:

Are we able to deploy python_backend models, i.e. something like a TritonPythonModel, within clearml-serving?

I'm trying to set up an ensemble of DL models, but the only example is about a scikit-learn ensemble, which is not quite related.
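
For context, by a python_backend model I mean a model.py implementing Triton's TritonPythonModel interface, roughly like this (a minimal sketch, tensor names are placeholders):

import json
import numpy as np
import triton_python_backend_utils as pb_utils

class TritonPythonModel:
    def initialize(self, args):
        # args["model_config"] is the JSON-serialized config.pbtxt of this model
        self.model_config = json.loads(args["model_config"])

    def execute(self, requests):
        # one response per request; "INPUT0"/"OUTPUT0" are placeholder tensor names
        responses = []
        for request in requests:
            input0 = pb_utils.get_input_tensor_by_name(request, "INPUT0").as_numpy()
            output0 = pb_utils.Tensor("OUTPUT0", input0.astype(np.float32))
            responses.append(pb_utils.InferenceResponse(output_tensors=[output0]))
        return responses

    def finalize(self):
        pass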

  
  
Posted one year ago

Answers 14


Hi @<1523701118159294464:profile|ExasperatedCrab78>, so I've started looking into setting up the Triton backends now, as we first discussed.

I was able to structure the folders correctly and deploy the endpoints. However, when I spin up the containers, I get the following error:

clearml-serving-triton        | | detection_preprocess | 1       | UNAVAILABLE: Internal: Unable to initialize shared memory key 'triton_python_backend_shm_region_1' to requested size (67108864 bytes). If you are running Triton inside docker, use '--shm-size' flag to control the shared memory region size. Each Python backend model instance requires at least 64MBs of shared memory. Error: No such file or directory

I then wanted to debug this a little further to see if this is the issue, so I passed --t-log-verbose=2 in CLEARML_TRITON_HELPER_ARGS to get more logs, but Triton didn't like it:

tritonserver: unrecognized option '--t_log_verbose=2'
Usage: tritonserver [options]
...

So I'm wondering, is there any way to increase the shared memory size as well? I believe we have to do this when running/starting the container, but I couldn't figure out how the container is brought up, compared to when running it directly:

docker run --name triton --gpus=all -it --shm-size=512m -p8000:8000 -p8001:8001 -p8002:8002 -v $(pwd)/model_repository:/models image_path
  
  
Posted one year ago

@<1523701118159294464:profile|ExasperatedCrab78> So this is the kind of thing I mean. If you think it'd be okay, I can properly implement this:

None

  
  
Posted one year ago

Oh, actually it seems like this is already possible from the code!
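
If the CLI indeed accepts a file path there, I'd expect the call to look roughly like this (a sketch, endpoint name is a placeholder):

clearml-serving --id <serving-service-id> model add --engine triton --endpoint "my_ensemble" --aux-config path/to/config.pbtxt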

  
  
Posted one year ago

I can see Pipelines, but I'm not sure if it applies to Triton directly; it seems to be more of a DAG approach?

None

  
  
Posted one year ago

@<1547028116780617728:profile|TimelyRabbit96>
Pipelines has little to do with serving, so let's not focus on that for now.

Instead, if you need an ensemble_scheduling block, you can use the CLI's --aux-config option to add any extra configuration that needs to go into the config.pbtxt

For example, here under the Setup section, step 2, we use the --aux-config flag to add a dynamic batching block: None
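
For reference, the call is roughly of this shape (a hedged sketch; endpoint/model names and the exact key=value pairs are placeholders, check clearml-serving model add --help for the full syntax):

clearml-serving --id <serving-service-id> model add \
    --engine triton \
    --endpoint "my_model" \
    --name "my model" --project "serving examples" \
    --aux-config max_batch_size=8 dynamic_batching.max_queue_delay_microseconds=100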

  
  
Posted one year ago

Hi @<1547028116780617728:profile|TimelyRabbit96> Awesome that you managed to get it working!

  
  
Posted one year ago

@<1523701118159294464:profile|ExasperatedCrab78>, would you have any idea about the above? Triton itself supports ensembling, so I was wondering if we can somehow support this as well?

  
  
Posted one year ago

I see, yep, --aux-config seems useful for sure. Would it be possible to pass a file, perhaps, to replace config.pbtxt completely? Formatting all the input/output shapes, and now the ensemble stuff, is starting to get quite complicated 🙂

  
  
Posted one year ago

Hi there!

Technically there should be nothing stopping you from deploying a python backend model. I just checked the source code and ClearML basically just downloads the model artifact and renames it based on the inferred type of model.

None

As far as I'm aware (could def be wrong here!), the Triton Python backend essentially requires a folder containing e.g. a model.py file. I propose the following steps:

  • Given the code above, if you package the model.py file as a folder in ClearML, clearml-serving will detect this and simply extract the folder in the right place for you. Then you have to adjust the config.pbtxt using the command-line arguments to properly load the Python file (see the layout sketch after this list)
  • If this does not work, an extra if/else check should be added in the code above, also checking for "python" in the framework, similar to e.g. pytorch or onnx
  • However it is done, once the Python file is in the right position and the config.pbtxt is properly set up, Triton should just take it from there and everything should work as expected
    Could you try this approach? If this works, it would be an interesting example to add to the repo! Thanks 😄
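
For reference, the layout Triton itself expects is roughly the following (a sketch with placeholder names, not something I have tested end-to-end with clearml-serving):

models/
  my_python_model/
    config.pbtxt
    1/
      model.py        # the TritonPythonModel implementation

# config.pbtxt (minimal, illustrative)
name: "my_python_model"
backend: "python"
max_batch_size: 8
input [
  { name: "INPUT0" data_type: TYPE_FP32 dims: [ -1 ] }
]
output [
  { name: "OUTPUT0" data_type: TYPE_FP32 dims: [ -1 ] }
]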
  
  
Posted one year ago

Thanks for your response! I see, yep from an initial view it could work. Will certainly give it a try 🙂

However, to give you more context: in order to set up an ensemble within Triton, you also need to add an ensemble_scheduling block to the config.pbtxt file, which would be something like this:

None
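
Something along these lines (a rough sketch with placeholder model/tensor names, trimmed down from the real config):

name: "detection_ensemble"
platform: "ensemble"
max_batch_size: 8
input [
  { name: "RAW_IMAGE" data_type: TYPE_UINT8 dims: [ -1 ] }
]
output [
  { name: "DETECTIONS" data_type: TYPE_FP32 dims: [ -1, 6 ] }
]
ensemble_scheduling {
  step [
    {
      model_name: "detection_preprocess"
      model_version: -1
      input_map { key: "INPUT0" value: "RAW_IMAGE" }
      output_map { key: "OUTPUT0" value: "PREPROCESSED" }
    },
    {
      model_name: "detection_model"
      model_version: -1
      input_map { key: "images" value: "PREPROCESSED" }
      output_map { key: "output" value: "DETECTIONS" }
    }
  ]
}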

I’m guessing this’ll be difficult given the current functionality of the CLI?

  
  
Posted one year ago

Yes, you will indeed need to add all ensemble endpoints separately 🙂

  
  
Posted one year ago

Hey! Thanks for all the work you're putting in and the awesome feedback 😄

So, it's weird that you get the shm error; this is most likely our fault for not configuring the containers correctly 😞 The containers are brought up using the docker-compose file, so you'll have to add it in there. The service you want is called clearml-serving-triton; you can find it here.

Check the Docker docs here for the right key to add in the docker-compose file. It looks like it's called shm_size, and you can set it to something higher. On the other hand, if I'm not mistaken, setting ipc: host instead should also work and is probably better for performance! Would you mind adding that? That is, add ipc: host to the clearml-serving-triton service, on the same level as image or ports, for example.
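
Something like this in the compose file (a sketch of just the relevant keys; the rest of the clearml-serving-triton service definition stays as it is):

clearml-serving-triton:
  image: allegroai/clearml-serving-triton:latest
  ipc: host            # share the host IPC namespace instead of the 64MB default /dev/shm
  # or, alternatively:
  # shm_size: "512mb"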

  
  
Posted one year ago

Okay, sorry for spamming here, but I feel like other people would find this useful: I was able to deploy the ensemble model, and I guess to complete this I would need to add all the other "endpoints" individually, right?

As in, to reach something like the image below within Triton:
[image]

  
  
Posted one year ago

Thank you for all the answers! Yep, that worked, though is it usually safe to add this option instead of --shm-size?

Also, I now managed to send an image through curl using a local file (@img.png in curl), and it seems to work! I'm hitting the same gRPC size limit, but it seems like there's a new commit that addressed it! 🎉
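
For anyone finding this later, the curl call was along these lines (a sketch; URL, endpoint name and content type depend on your setup and preprocessing code):

curl -X POST "http://localhost:8080/serve/detection_ensemble" \
     -H "Content-Type: application/octet-stream" \
     --data-binary @img.png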

  
  
Posted one year ago