Answered
Hey, Just Trying Out Clearml-Serving And Getting The Following Error

Hey,

Just trying out clearml-serving and getting the following error in the clearml-serving-triton container: `the provided PTX was compiled with an unsupported toolchain`. My guess is that it comes from converting the PyTorch code to TorchScript. I get this error when trying examples/pytorch. SuccessfulKoala55 or JitteryCoyote63, it would be great if you could help 🙂

  
  
Posted one year ago

Answers 18


Notice that we are using the same version:
https://github.com/allegroai/clearml-serving/blob/d15bfcade54c7bdd8f3765408adc480d5ceb4b45/clearml_serving/engines/triton/Dockerfile#L2
The reason was that the previous version did not support TorchScript (it gave an error similar to the one you reported).
My question is, why don't you use the "allegroai/clearml-serving-triton:latest" container?

  
  
Posted one year ago

Hi RobustRat47

My guess is it's something from the converting PyTorch code to TorchScript. I'm getting this error when trying the

I think you are correct, see here:
https://github.com/allegroai/clearml-serving/blob/d15bfcade54c7bdd8f3765408adc480d5ceb4b45/examples/pytorch/train_pytorch_mnist.py#L136
You have to convert the model to TorchScript for Triton to serve it.

  
  
Posted one year ago

RobustRat47, which Triton container are you using?
BTW, the Triton error is:
model_repository_manager.cc:1152] failed to load 'test_model_pytorch' version 1: Internal: unable to create stream: the provided PTX was compiled with an unsupported toolchain.
https://github.com/triton-inference-server/server/issues/3877

  
  
Posted one year ago

The latest commit to the repo uses 22.02-py3 ( https://github.com/allegroai/clearml-serving/blob/d15bfcade54c7bdd8f3765408adc480d5ceb4b45/clearml_serving/engines/triton/Dockerfile#L2 ). I will have a look at versions now 🙂

  
  
Posted one year ago

Just for reference, if anyone has this issue: I had to update the NVIDIA driver on the host OS to version 510.

` docker run --gpus=0 -it nvcr.io/nvidia/tritonserver:22.02-py3

=============================
== Triton Inference Server ==

NVIDIA Release 22.02 (build 32400308)

Copyright (c) 2018-2021, NVIDIA CORPORATION & AFFILIATES. All rights reserved.

Various files include modifications (c) NVIDIA CORPORATION. All rights reserved.

This container image and its contents are governed by the NVIDIA Deep Learning Container License.
By pulling and using the container, you accept the terms and conditions of this license:

ERROR: This container was built for NVIDIA Driver Release 510.39 or later, but
version 470.103.01 was detected and compatibility mode is UNAVAILABLE.

   [[Forward compatibility was attempted on non supported HW (CUDA_ERROR_COMPAT_NOT_SUPPORTED_ON_DEVICE) cuInit()=804]]

root@36a9fc676a25:/opt/tritonserver# `

  
  
Posted one year ago

Still debugging... Updating the driver fixed the issue with the nvcr.io/nvidia/tritonserver:22.02-py3 container, which now returns:
` =============================
== Triton Inference Server ==

NVIDIA Release 22.02 (build 32400308)

Copyright (c) 2018-2021, NVIDIA CORPORATION & AFFILIATES. All rights reserved.

Various files include modifications (c) NVIDIA CORPORATION. All rights reserved.

This container image and its contents are governed by the NVIDIA Deep Learning Container License.
By pulling and using the container, you accept the terms and conditions of this license:

root@c702b766ba35:/opt/tritonserver# `
I'm now testing whether the clearml-serving repo works. I'll keep this thread updated 🙂

  
  
Posted one year ago

RobustRat47, are you saying that updating the NVIDIA drivers solved the issue?

  
  
Posted one year ago

I'll add a more detailed response once it's working

  
  
Posted one year ago

I'm using the "allegroai/clearml-serving-triton:latest" container; I was just debugging using the base image.

  
  
Posted one year ago

$ curl -X 'POST' ' ' -H 'accept: application/json' -H 'Content-Type: application/json' -d '{ "url": " " }'
{"digit":5}

  
  
Posted one year ago

Yay 🥳

  
  
Posted one year ago

Okay just for clarity...

Originally, the NVIDIA driver on my machine was an incompatible version for the Triton server container:
This container was built for NVIDIA Driver Release 510.39 or later, but version 470.103.01 was detected and compatibility mode is UNAVAILABLE.
To fix this, I updated the driver on my base OS, i.e.
sudo apt install nvidia-driver-510 -y
sudo reboot
Then it worked. The docker-compose logs from the clearml-serving-triton container (i.e. from running docker-compose -f docker/docker-compose-triton-gpu.yml logs -f ) did not make this clear. It might be good to surface this as an error in the logs 🙂
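In case it helps anyone hitting the same thing, here is a minimal Python sketch of the check described above. It is not part of clearml-serving or Triton; the function names (`parse_banner`, `driver_is_sufficient`, `host_driver_version`) are made up for illustration. It assumes the banner wording is the same as the one quoted in this thread and that `nvidia-smi` is on the PATH of the host.

```python
import re
import subprocess
from typing import Optional, Tuple

# Matches the driver requirement line from the Triton container banner,
# as quoted in this thread (assumption: the wording is stable across releases).
_BANNER_RE = re.compile(
    r"built for NVIDIA Driver Release (\d+(?:\.\d+)*) or later, but\s+"
    r"version (\d+(?:\.\d+)*) was detected"
)


def parse_version(text: str) -> Tuple[int, ...]:
    """Turn a dotted version such as '470.103.01' into (470, 103, 1)."""
    return tuple(int(part) for part in text.strip().split("."))


def parse_banner(banner: str) -> Optional[Tuple[str, str]]:
    """Return (required, detected) driver versions, or None if the line is absent."""
    match = _BANNER_RE.search(banner)
    return (match.group(1), match.group(2)) if match else None


def driver_is_sufficient(detected: str, required: str) -> bool:
    """Compare dotted versions numerically, component by component."""
    return parse_version(detected) >= parse_version(required)


def host_driver_version() -> Optional[str]:
    """Ask nvidia-smi for the host driver version; None if it cannot be queried."""
    try:
        out = subprocess.run(
            ["nvidia-smi", "--query-gpu=driver_version", "--format=csv,noheader"],
            capture_output=True, text=True, check=True,
        ).stdout
    except (OSError, subprocess.CalledProcessError):
        return None
    lines = out.strip().splitlines()
    return lines[0].strip() if lines else None


if __name__ == "__main__":
    required = "510.39"  # taken from the banner quoted in this thread
    detected = host_driver_version()
    if detected is None:
        print("Could not query the driver version (is nvidia-smi installed?)")
    elif driver_is_sufficient(detected, required):
        print(f"Driver {detected} satisfies the {required} requirement")
    else:
        print(f"Driver {detected} is older than the required {required}")
```

Running this on the host before starting the compose file would flag the mismatch without having to start the base Triton image by hand.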

AgitatedDove14 let me know if there's anything else I can provide that is useful for you.

  
  
Posted one year ago

Thanks RobustRat47 !
Should we document this requirement somewhere (i.e. the NVIDIA driver version)?
Is this really a must?

  
  
Posted one year ago

It might only be a requirement for the docker/docker-compose-triton-gpu.yml file, but I'd need to check.

  
  
Posted one year ago

Yes already tried that but it seems there's some form of mismatch with a C/C++ lib.

  
  
Posted one year ago

RobustRat47
What exactly is the error you are getting? (I remember that only the latest Triton solved some issue there.)

  
  
Posted one year ago

I can raise this as an issue on the repo if that is useful?

  
  
Posted one year ago

I can raise this as an issue on the repo if that is useful?

I think this is a good idea, at the very least for increased visibility 🙂
Please do 🙏

  
  
Posted one year ago