@<1547028116780617728:profile|TimelyRabbit96>
Pipelines has little to do with serving, so let's not focus on that for now.
Instead, if you need an ensemble_scheduling block, you can use the CLI's --aux-config flag to add any extra stuff that needs to be in the config.pbtxt. For example here, under the Setup section step 2, we use the --aux-config flag to add a dynamic batching block.
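For reference, a rough sketch of what such a call could look like. This is an assumption based on the clearml-serving examples rather than this thread: the endpoint name, model id, preprocess file and the exact key=value syntax for --aux-config are illustrative, so double-check them against your version of the CLI:
clearml-serving --id <serving-service-id> model add \
    --engine triton \
    --endpoint "my_model" \
    --model-id <model-id> \
    --preprocess "preprocess.py" \
    --aux-config "dynamic_batching.max_queue_delay_microseconds=100" \
                 "dynamic_batching.preferred_batch_size=[4, 8]"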
I see, yep, aux-config seems useful for sure. Would it be possible to pass a file, perhaps, to replace config.pbtxt completely? Formatting all the input/output shapes, and now the ensemble stuff, is starting to get quite complicated 🙂
@<1523701118159294464:profile|ExasperatedCrab78> So this is something like what I mean. If you think it’d be okay, I can properly implement this:
@<1523701118159294464:profile|ExasperatedCrab78>, would you have any idea about the above? Triton itself supports ensembling; I was wondering if we can somehow support this as well?
Hi @<1547028116780617728:profile|TimelyRabbit96> Awesome that you managed to get it working!
Yes, you will indeed need to add all ensemble endpoints separately 🙂
Hi @<1523701118159294464:profile|ExasperatedCrab78> , so I’ve started looking into setting up the TritonBackends now, as we first discussed.
I was able to structure the folders correctly and deploy the endpoints. However, when I spin up the containers, I get the following error:
clearml-serving-triton | | detection_preprocess | 1 | UNAVAILABLE: Internal: Unable to initialize shared memory key 'triton_python_backend_shm_region_1' to requested size (67108864 bytes). If you are running Triton inside docker, use '--shm-size' flag to control the shared memory region size. Each Python backend model instance requires at least 64MBs of shared memory. Error: No such file or directory
I then wanted to debug this a little further, to see if this is the issue. I passed --t-log-verbose=2 in CLEARML_TRITON_HELPER_ARGS to get more logs, but Triton didn’t like it:
tritonserver: unrecognized option '--t_log_verbose=2'
Usage: tritonserver [options]
...
So wondering, is there any way to increase the shared memory size as well? I believe we have to do this when running/starting the container? But I couldn’t figure out where the container is brought up, as opposed to running it directly like this:
docker run --name triton --gpus=all -it --shm-size=512m -p8000:8000 -p8001:8001 -p8002:8002 -v $(pwd)/model_repository:/models image_path
Thanks for your response! I see, yep from an initial view it could work. Will certainly give it a try 🙂
However, to give you more context, in order to set up an ensemble within Triton, you also need to add an ensemble_scheduling block to the config.pbtxt file, which would be something like this:
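(Illustrative sketch only; the model and tensor names here are placeholders, loosely borrowing the detection_preprocess name from the error above, not the actual config from this setup.)
name: "detection_ensemble"
platform: "ensemble"
input [
  { name: "raw_image"  data_type: TYPE_UINT8  dims: [ -1 ] }
]
output [
  { name: "detections"  data_type: TYPE_FP32  dims: [ -1, 6 ] }
]
ensemble_scheduling {
  step [
    {
      model_name: "detection_preprocess"
      model_version: -1
      input_map { key: "raw_image"  value: "raw_image" }
      output_map { key: "preprocessed_image"  value: "preprocessed_image" }
    },
    {
      model_name: "detection_model"
      model_version: -1
      input_map { key: "preprocessed_image"  value: "preprocessed_image" }
      output_map { key: "output"  value: "detections" }
    }
  ]
}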
I’m guessing this’ll be difficult given the current functionality of the CLI?
oh actually it seems like this is possible already from the code!
Hi there!
Technically there should be nothing stopping you from deploying a python backend model. I just checked the source code and ClearML basically just downloads the model artifact and renames it based on the inferred type of model.
As far as I'm aware (could def be wrong here!), the Triton Python backend essentially requires a folder containing e.g. a model.py file. I propose the following steps:
- Given the code above, if you package the model.py file as a folder in ClearML, clearml-serving will detect this and simply extract the folder in the right place for you. Then you have to adjust the config.pbtxt using the command line arguments to properly load the python file.
- If this does not work, an extra if/else check should be added in the code above, also checking for "python" in the framework, similar to e.g. pytorch or onnx.
- However it is done, once the python file is in the right position and the config.pbtxt is properly set up, Triton should just take it from there and everything should work as expected.
Could you try this approach? If this works, it would be an interesting example to add to the repo! Thanks 😄
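For reference, a minimal sketch of what such a model.py could look like. This assumes Triton's standard python-backend interface (nothing ClearML-specific), and the tensor names and preprocessing logic are placeholders:
# model_repository/detection_preprocess/1/model.py (path illustrative)
import numpy as np
import triton_python_backend_utils as pb_utils

class TritonPythonModel:
    def initialize(self, args):
        # args["model_config"] holds this model's config.pbtxt as a JSON string
        self.model_config = args["model_config"]

    def execute(self, requests):
        responses = []
        for request in requests:
            # Tensor names must match the input/output blocks in config.pbtxt
            raw = pb_utils.get_input_tensor_by_name(request, "raw_image").as_numpy()
            preprocessed = raw.astype(np.float32) / 255.0  # placeholder preprocessing
            out = pb_utils.Tensor("preprocessed_image", preprocessed)
            responses.append(pb_utils.InferenceResponse(output_tensors=[out]))
        return responses

    def finalize(self):
        pass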
Hey! Thanks for all the work you're putting in and the awesome feedback 😄
So, it's weird you get the shm error, this is most likely our fault for not configuring the containers correctly 😞 The containers are brought up using the docker-compose file, so you'll have to add it in there. The service you want is called clearml-serving-triton, you can find it here. Check the docker docs here for the right key to add in the docker compose. It looks like it's called shm_size; set it to something higher. On the other hand, if I'm not mistaken, setting ipc: host instead should also work and is probably better for performance! Would you mind adding that? So adding ipc: host to the clearml-serving-triton service, on the same level as image or ports, for example.
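Something along these lines, as a sketch (the image name and surrounding keys are from memory and may not match the compose file exactly):
services:
  clearml-serving-triton:
    image: allegroai/clearml-serving-triton:latest
    # Option 1: share the host IPC namespace
    ipc: host
    # Option 2 (alternative to ipc: host): raise the shared memory size instead
    # shm_size: '512mb'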
I can see Pipelines, but I'm not sure if it applies to Triton directly; it seems like more of a DAG approach?
Thank you for all the answers! Yep, that worked, though is it usually safe to add this option instead of --shm-size?
Also, now I managed to send an image through curl using a local image (@img.png in curl). Seems to work through this! I was hitting the same gRPC message size limit, but it seems like there’s a new commit that addressed it! 🎉
Okay, sorry for spamming here, but I feel like other people would find this useful: I was able to deploy the ensemble model, and I guess to complete this, I would need to individually add all the other “endpoints” independently, right?
As in, to reach something like below within Triton: