Hi, I Wanted To Try Model Versioning, Suppose That I'Ve A Model And Want To Have Multiple Versions Of The Same Model And To Be Able To Have Inference On These Models(For Example

Answered

Hi, I wanted to try model versioning. Suppose I have a model and want to keep multiple versions of it and run inference on each of them (for example yolo/v1, yolo/v2, and so on). Based on the ClearML documentation, I used --version $VERSION when adding a new version of a specific model. I noticed that this created a new directory named <model>_$VERSION in the corresponding Triton container. I also changed the name in the config file to <model>_$VERSION. I saw the following success message:

I0222 19:13:16.847970 70 model_repository_manager.cc:1352] successfully loaded 'yolo_1' version 1

However, when I tried to test it by hitting its endpoint with a curl request, I encountered the following error:

curl -X POST "" -H "accept: application/json" -H "accept: application/json" -H "Content-Type: application/json" -d '{"s3_url": "hamid_beta/U2K230420001/2023/12/20/23/2023-12-20_23-29-08-front.mp4", "file_type": "video"}'
{"detail":"Error processing request: <AioRpcError of RPC that terminated with:\n\tstatus = StatusCode.UNAVAILABLE\n\tdetails = \"Request for unknown model: 'yolo/1' is not found\"\n\tdebug_error_string = \"{\"created\":\"@1708630278.616586876\",\"description\":\"Error received from peer ipv4:192.168.64.5:8001\",\"file\":\"src/core/lib/surface/call.cc\",\"file_line\":1069,\"grpc_message\":\"Request for unknown model: 'yolo/1' is not found\",\"grpc_status\":14}\"\n>"}

How can I solve this issue?

  				
Posted one year ago by FranticWhale40


Answers 17

I'm using the latest version of clearml-serving

Name: clearml-serving
Version: 1.3.0

  				
Posted one year ago by FranticWhale40

Yes, I'm sure that the Triton container finished syncing.
Here are the Triton logs:

I0223 15:58:32.515979 71 model_repository_manager.cc:1352] successfully loaded 'yolo_2' version 1
I0223 15:58:32.842511 71 model_repository_manager.cc:1352] successfully loaded 'yolo_1' version 1
I0223 15:58:32.842579 71 server.cc:559] 
+------------------+------+
| Repository Agent | Path |
+------------------+------+
+------------------+------+

I0223 15:58:32.842606 71 server.cc:586] 
+-------------+-----------------------------------------------------------------+---------------------------------------------------------------------------------------------------------------------------------------------------------------+
| Backend     | Path                                                            | Config                                                                                                                                                        |
+-------------+-----------------------------------------------------------------+---------------------------------------------------------------------------------------------------------------------------------------------------------------+
| onnxruntime | /opt/tritonserver/backends/onnxruntime/libtriton_onnxruntime.so | {"cmdline":{"auto-complete-config":"true","min-compute-capability":"6.000000","backend-directory":"/opt/tritonserver/backends","default-max-batch-size":"4"}} |
+-------------+-----------------------------------------------------------------+---------------------------------------------------------------------------------------------------------------------------------------------------------------+

I0223 15:58:32.842629 71 server.cc:629] 
+---------+---------+--------+
| Model   | Version | Status |
+---------+---------+--------+
| yolo_1 | 1       | READY  |
| yolo_2 | 1       | READY  |
+---------+---------+--------+

I0223 15:58:32.869848 71 metrics.cc:650] Collecting metrics for GPU 0: NVIDIA GeForce RTX 3080
I0223 15:58:32.871196 71 tritonserver.cc:2176] 
+----------------------------------+----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
| Option                           | Value                                                                                                                                                                                        |
+----------------------------------+----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
| server_id                        | triton                                                                                                                                                                                       |
| server_version                   | 2.24.0                                                                                                                                                                                       |
| server_extensions                | classification sequence model_repository model_repository(unload_dependents) schedule_policy model_configuration system_shared_memory cuda_shared_memory binary_tensor_data statistics trace |
| model_repository_path[0]         | /models                                                                                                                                                                                      |
| model_control_mode               | MODE_POLL                                                                                                                                                                                    |
| strict_model_config              | 0                                                                                                                                                                                            |
| rate_limit                       | OFF                                                                                                                                                                                          |
| pinned_memory_pool_byte_size     | 268435456                                                                                                                                                                                    |
| cuda_memory_pool_byte_size{0}    | 67108864                                                                                                                                                                                     |
| response_cache_byte_size         | 0                                                                                                                                                                                            |
| min_supported_compute_capability | 6.0                                                                                                                                                                                          |
| strict_readiness                 | 1                                                                                                                                                                                            |
| exit_timeout                     | 30                                                                                                                                                                                           |

  				
Posted one year ago by FranticWhale40

Also I can’t call the “preprocess” function since there is no valid endpoint to be hitting

Wait, now I'm confused. When you call " None ", you are actually calling the preprocess function running on the inference container, which in turn (automatically) calls the Triton container.

Are you calling Triton manually?
Could you share your preprocess.py and the command line you used to register the two model versions?
(based on your logs everything seems to be loaded correctly, hence my question)

  				
Posted one year ago by AgitatedDove14

Hi FranticWhale40
Are you positive the Triton container finished syncing?
Could you provide the docker logs (both the serving and the Triton containers)?
Which clearml-serving version are you using?
Could you add a print in the "preprocess" function, just to validate that you are reaching the correct model version?
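
One way to check the first question without guessing is to scan the Triton log for its "successfully loaded" lines and compare the reported names with the name being requested. This is a minimal sketch that assumes log lines in the format shown elsewhere in this thread; `loaded_models` is a hypothetical helper and not part of clearml-serving or Triton.

```python
import re

# Matches lines such as:
# I0223 15:58:32.515979 71 model_repository_manager.cc:1352] successfully loaded 'yolo_2' version 1
LOADED_RE = re.compile(r"successfully loaded '([^']+)' version (\d+)")


def loaded_models(log_text):
    """Return the set of (model_name, triton_version) pairs found in a Triton log."""
    return {(name, int(ver)) for name, ver in LOADED_RE.findall(log_text)}


sample_log = """\
I0223 15:58:32.515979 71 model_repository_manager.cc:1352] successfully loaded 'yolo_2' version 1
I0223 15:58:32.842511 71 model_repository_manager.cc:1352] successfully loaded 'yolo_1' version 1
"""

models = loaded_models(sample_log)
names = {name for name, _ in models}
print(sorted(models))
# The request in the error message asks for 'yolo/1', while the log reports 'yolo_1'.
print("yolo/1" in names, "yolo_1" in names)
```

If the requested name is missing from the set while a similarly named model is present, the mismatch is in the naming rather than in the sync.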

  				
Posted one year ago by AgitatedDove14

Also I can’t call the “preprocess” function since there is no valid endpoint to be hitting

  				
Posted one year ago by FranticWhale40

FranticWhale40 I might have found something, let me see if I can reproduce it

  				
Posted one year ago by AgitatedDove14

FranticWhale40 could you test the fix? just pull & run

allegroai/clearml-serving-triton:1.3.1
allegroai/clearml-serving-inference:1.3.1

  				
Posted one year ago by AgitatedDove14

AgitatedDove14 Also, could you please share the commit that fixes the issue? It would help us address it on our end.
Thanks!

  				
Posted one year ago by FranticWhale40

FranticWhale40 this one: None

  				
Posted one year ago by AgitatedDove14

If you’re wondering about the case where no optional config.pbtxt is provided, I guess the logic would be pretty much the same as above:

model_name = f"{model_name}_{version}"

But after looking at create_config_pbtxt(), it seems this name is not being constructed at all, which made me realize it may be optional (name is indeed an optional property). It may be simpler to drop this value altogether.

Or to be explicit, something like this:

config_dict.put("name", f"{endpoint.name}_{endpoint.version}")
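
To make the naming rule concrete, here is a small sketch of how a versioned serving path could map to the underscore-delimited name Triton reports in its logs. Both functions are hypothetical and only mirror the `<model>_<version>` pattern discussed in this thread; they are not the clearml-serving implementation.

```python
def triton_model_name(endpoint, version=None):
    """Build the underscore-delimited name, e.g. ("yolo", "1") -> "yolo_1"."""
    if version is None or str(version) == "":
        # Unversioned endpoints keep their plain name.
        return endpoint
    return f"{endpoint}_{version}"


def from_serving_path(url_path):
    """Map a serving path such as "yolo/1" to its Triton-side model name."""
    endpoint, _, version = url_path.strip("/").partition("/")
    return triton_model_name(endpoint, version or None)


print(from_serving_path("yolo/1"))
print(from_serving_path("yolo"))
```

Keeping the mapping in one place means the request side and the config side cannot disagree about the delimiter.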

  				
Posted one year ago by TimelyRabbit96

Thanks AgitatedDove14, this seems to solve the issue. I guess the main issue was that the delimiter is a _ instead of a /. This did work; however, as you can see from the model endpoint deployment snippet, we also provide a custom aux-config file, so we also had to update the name inside config.pbtxt to keep Triton happy:

From:

name: "mmdet"

To:

name: "mmdet_VERSION" -> "mmdet_1"
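
For anyone who wants to script this rename, the following is a rough sketch that rewrites only the top-level `name` field of a config.pbtxt text. The helper `patch_model_name` and the sample config contents are assumptions made for illustration; this is not the patching logic used by clearml-serving.

```python
import re

# Matches a `name: "..."` field that starts a line. Nested fields such as the
# indented `name: "images"` inside an input block are left alone.
NAME_RE = re.compile(r'^name\s*:\s*"[^"]*"', flags=re.MULTILINE)


def patch_model_name(pbtxt_text, endpoint, version):
    """Return pbtxt_text with its top-level name set to "<endpoint>_<version>"."""
    new_line = f'name: "{endpoint}_{version}"'
    # A callable replacement avoids any backslash interpretation in new_line.
    patched, _count = NAME_RE.subn(lambda _m: new_line, pbtxt_text, count=1)
    # If there is no top-level name field, the text is returned unchanged.
    return patched


# Illustrative config only; the real file in this thread was not shared.
original = (
    'name: "mmdet"\n'
    'input [\n'
    '  {\n'
    '    name: "images"\n'
    '  }\n'
    ']\n'
)
patched = patch_model_name(original, "mmdet", 1)
print(patched)
```

The sketch leaves a config without a `name` field untouched, since the field may not be required.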

  				
Posted one year ago by TimelyRabbit96

making me realize that this may have been optional

I think it is optional, which is why it was not set in the first place.
Could you double-check by simply removing it from your manual pbtxt?

  				
Posted one year ago by AgitatedDove14

Sure,

clearml-serving --id $SERVING_ID model add \
    --name "yolo" --version 2 --project $PROJECT_NAME --engine triton --endpoint "yolov8" \
    --preprocess "./yolo/preprocess.py" \
    --input-size 3 -1 -1 --input-name "images" --input-type float32 \
    --output-size -1 -1 --output-name "output0" --output-type float32 \
    --aux-config ./yolo/config.pbtxt
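
If several versions need to be registered, the same command can be generated per version. The sketch below only builds and prints the argument list, using the flags from the command above. The function name and the placeholder ID and project values are assumptions, and the loop varies nothing but --version (how each version selects its model file is not shown in this thread).

```python
import shlex


def model_add_argv(serving_id, project, version):
    """Build the `clearml-serving model add` argument list for one version."""
    return [
        "clearml-serving", "--id", serving_id, "model", "add",
        "--name", "yolo", "--version", str(version),
        "--project", project, "--engine", "triton", "--endpoint", "yolov8",
        "--preprocess", "./yolo/preprocess.py",
        "--input-size", "3", "-1", "-1",
        "--input-name", "images", "--input-type", "float32",
        "--output-size", "-1", "-1",
        "--output-name", "output0", "--output-type", "float32",
        "--aux-config", "./yolo/config.pbtxt",
    ]


for version in (1, 2):
    # Placeholder values; substitute the real serving ID and project name.
    argv = model_add_argv("<SERVING_ID>", "<PROJECT_NAME>", version)
    # Printed rather than executed; subprocess.run(argv, check=True) would run it.
    print(shlex.join(argv))
```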

  				
Posted one year ago by FranticWhale40

we also provide a custom aux-config file. We also had to make sure to update the name inside config.pbtxt so that Triton is happy:

Good point. What logic should we use for automatically patching "config.pbtxt"?

  				
Posted one year ago by AgitatedDove14

Great! Thank you so much!

  				
Posted one year ago by FranticWhale40

Thanks FranticWhale40!
I was able to locate the issue; a fix should be released later today (or tomorrow at worst).

  				
Posted one year ago by AgitatedDove14

And here are the serving inference logs:

ffmpeg version 4.3.6-0+deb11u1 Copyright (c) 2000-2023 the FFmpeg developers
  built with gcc 10 (Debian 10.2.1-6)
  configuration: --prefix=/usr --extra-version=0+deb11u1 --toolchain=hardened --libdir=/usr/lib/x86_64-linux-gnu --incdir=/usr/include/x86_64-linux-gnu --arch=amd64 --enable-gpl --disable-stripping --enable-avresample --disable-filter=resample --enable-gnutls --enable-ladspa --enable-libaom --enable-libass --enable-libbluray --enable-libbs2b --enable-libcaca --enable-libcdio --enable-libcodec2 --enable-libdav1d --enable-libflite --enable-libfontconfig --enable-libfreetype --enable-libfribidi --enable-libgme --enable-libgsm --enable-libjack --enable-libmp3lame --enable-libmysofa --enable-libopenjpeg --enable-libopenmpt --enable-libopus --enable-libpulse --enable-librabbitmq --enable-librsvg --enable-librubberband --enable-libshine --enable-libsnappy --enable-libsoxr --enable-libspeex --enable-libsrt --enable-libssh --enable-libtheora --enable-libtwolame --enable-libvidstab --enable-libvorbis --enable-libvpx --enable-libwavpack --enable-libwebp --enable-libx265 --enable-libxml2 --enable-libxvid --enable-libzmq --enable-libzvbi --enable-lv2 --enable-omx --enable-openal --enable-opencl --enable-opengl --enable-sdl2 --enable-pocketsphinx --enable-libmfx --enable-libdc1394 --enable-libdrm --enable-libiec61883 --enable-chromaprint --enable-frei0r --enable-libx264 --enable-shared
  libavutil      56. 51.100 / 56. 51.100
  libavcodec     58. 91.100 / 58. 91.100
  libavformat    58. 45.100 / 58. 45.100
  libavdevice    58. 10.100 / 58. 10.100
  libavfilter     7. 85.100 /  7. 85.100
  libavresample   4.  0.  0 /  4.  0.  0
  libswscale      5.  7.100 /  5.  7.100
  libswresample   3.  7.100 /  3.  7.100
  libpostproc    55.  7.100 / 55.  7.100
Input #0, mov,mp4,m4a,3gp,3g2,mj2, from '2023-12-20_23-29-08-front.mp4':
  Metadata:
    major_brand     : mp42
    minor_version   : 0
    compatible_brands: mp42mp41isomiso2
    creation_time   : 2023-12-20T23:28:14.000000Z
  Duration: 00:01:00.08, start: 0.000000, bitrate: 3517 kb/s
    Stream #0:0(und): Video: h264 (High) (avc1 / 0x31637661), yuv420p, 1280x960, 3514 kb/s, SAR 1:1 DAR 4:3, 24.04 fps, 24 tbr, 10k tbn, 20k tbc (default)
    Metadata:
      creation_time   : 2023-12-20T23:28:14.000000Z
      handler_name    : VideoHandler
Stream mapping:
  Stream #0:0 -> #0:0 (h264 (native) -> mjpeg (native))
[swscaler @ 0x5592da50a780] deprecated pixel format used, make sure you did set range correctly
Output #0, image2, to '2023-12-20_23-29-08-front/%04d.jpg':
  Metadata:
    major_brand     : mp42
    minor_version   : 0
    compatible_brands: mp42mp41isomiso2
    encoder         : Lavf58.45.100
    Stream #0:0(und): Video: mjpeg, yuvj420p(pc), 1280x960 [SAR 1:1 DAR 4:3], q=2-31, 200 kb/s, 12 fps, 12 tbn, 12 tbc (default)
    Metadata:
      creation_time   : 2023-12-20T23:28:14.000000Z
      handler_name    : VideoHandler
      encoder         : Lavc58.91.100 mjpeg
    Side data:
      cpb: bitrate max/min/avg: 0/0/200000 buffer size: 0 vbv_delay: N/A
frame=  721 fps=177 q=24.8 Lsize=N/A time=00:01:00.08 bitrate=N/A speed=14.8x    
video:26746kB audio:0kB subtitle:0kB other streams:0kB global headers:0kB muxing overhead: unknown
ffmpeg version 4.3.6-0+deb11u1 Copyright (c) 2000-2023 the FFmpeg developers
  built with gcc 10 (Debian 10.2.1-6)
  configuration: --prefix=/usr --extra-version=0+deb11u1 --toolchain=hardened --libdir=/usr/lib/x86_64-linux-gnu --incdir=/usr/include/x86_64-linux-gnu --arch=amd64 --enable-gpl --disable-stripping --enable-avresample --disable-filter=resample --enable-gnutls --enable-ladspa --enable-libaom --enable-libass --enable-libbluray --enable-libbs2b --enable-libcaca --enable-libcdio --enable-libcodec2 --enable-libdav1d --enable-libflite --enable-libfontconfig --enable-libfreetype --enable-libfribidi --enable-libgme --enable-libgsm --enable-libjack --enable-libmp3lame --enable-libmysofa --enable-libopenjpeg --enable-libopenmpt --enable-libopus --enable-libpulse --enable-librabbitmq --enable-librsvg --enable-librubberband --enable-libshine --enable-libsnappy --enable-libsoxr --enable-libspeex --enable-libsrt --enable-libssh --enable-libtheora --enable-libtwolame --enable-libvidstab --enable-libvorbis --enable-libvpx --enable-libwavpack --enable-libwebp --enable-libx265 --enable-libxml2 --enable-libxvid --enable-libzmq --enable-libzvbi --enable-lv2 --enable-omx --enable-openal --enable-opencl --enable-opengl --enable-sdl2 --enable-pocketsphinx --enable-libmfx --enable-libdc1394 --enable-libdrm --enable-libiec61883 --enable-chromaprint --enable-frei0r --enable-libx264 --enable-shared
  libavutil      56. 51.100 / 56. 51.100
  libavcodec     58. 91.100 / 58. 91.100
  libavformat    58. 45.100 / 58. 45.100
  libavdevice    58. 10.100 / 58. 10.100
  libavfilter     7. 85.100 /  7. 85.100
  libavresample   4.  0.  0 /  4.  0.  0
  libswscale      5.  7.100 /  5.  7.100
  libswresample   3.  7.100 /  3.  7.100
  libpostproc    55.  7.100 / 55.  7.100
Input #0, mov,mp4,m4a,3gp,3g2,mj2, from '2023-12-20_23-29-08-front.mp4':
  Metadata:
    major_brand     : mp42
    minor_version   : 0
    compatible_brands: mp42mp41isomiso2
    creation_time   : 2023-12-20T23:28:14.000000Z
  Duration: 00:01:00.08, start: 0.000000, bitrate: 3517 kb/s
    Stream #0:0(und): Video: h264 (High) (avc1 / 0x31637661), yuv420p, 1280x960, 3514 kb/s, SAR 1:1 DAR 4:3, 24.04 fps, 24 tbr, 10k tbn, 20k tbc (default)
    Metadata:
      creation_time   : 2023-12-20T23:28:14.000000Z
      handler_name    : VideoHandler
Stream mapping:
  Stream #0:0 -> #0:0 (h264 (native) -> mjpeg (native))
[swscaler @ 0x55d076d5d740] deprecated pixel format used, make sure you did set range correctly
Output #0, image2, to '2023-12-20_23-29-08-front/%04d.jpg':
  Metadata:
    major_brand     : mp42
    minor_version   : 0
    compatible_brands: mp42mp41isomiso2
    encoder         : Lavf58.45.100
    Stream #0:0(und): Video: mjpeg, yuvj420p(pc), 1280x960 [SAR 1:1 DAR 4:3], q=2-31, 200 kb/s, 12 fps, 12 tbn, 12 tbc (default)
    Metadata:
      creation_time   : 2023-12-20T23:28:14.000000Z
      handler_name    : VideoHandler
      encoder         : Lavc58.91.100 mjpeg
    Side data:
      cpb: bitrate max/min/avg: 0/0/200000 buffer size: 0 vbv_delay: N/A
frame=  721 fps=163 q=24.8 Lsize=N/A time=00:01:00.08 bitrate=N/A speed=13.6x    
video:26746kB audio:0kB subtitle:0kB other streams:0kB global headers:0kB muxing overhead: unknown
/usr/local/lib/python3.9/site-packages/pyparsing/core.py:854: RuntimeWarning: coroutine 'Preprocess._process' was never awaited
  loc, tokens = self.parseImpl(instring, pre_loc, doActions)
RuntimeWarning: Enable tracemalloc to get the object allocation traceback

  				
Posted one year ago by FranticWhale40
