Answered
Hey, Just Trying Out Clearml-Serving And Getting The Following Error

Hey,

Just trying out clearml-serving and getting the following error in the clearml-serving-triton container: "the provided PTX was compiled with an unsupported toolchain". My guess is it's something from converting the PyTorch code to TorchScript. I'm getting this error when trying examples/pytorch. SuccessfulKoala55 or JitteryCoyote63, it would be great if you could help 🙂

  
  
Posted 2 years ago

Answers 18


I can raise this as an issue on the repo if that is useful?

I think this is a good idea, at least for increased visibility 🙂
Please do 🙏

  
  
Posted 2 years ago

Hi RobustRat47

My guess is it's something from the converting PyTorch code to TorchScript. I'm getting this error when trying the

I think you are correct, see here:
https://github.com/allegroai/clearml-serving/blob/d15bfcade54c7bdd8f3765408adc480d5ceb4b45/examples/pytorch/train_pytorch_mnist.py#L136
you have to convert the model to TorchScript for Triton to serve it

  
  
Posted 2 years ago

Yes, I already tried that, but it seems there's some form of mismatch with a C/C++ lib.

  
  
Posted 2 years ago

RobustRat47
What exactly is the error you are getting ? (I remember only the latest Triton solved some issue there)

  
  
Posted 2 years ago

Thanks RobustRat47 !
Should we document this requirement somewhere (i.e. the NVIDIA driver version)?
Is this really a must ?

  
  
Posted 2 years ago

RobustRat47 what's the Triton container you are using ?
BTW, the Triton error is:
model_repository_manager.cc:1152] failed to load 'test_model_pytorch' version 1: Internal: unable to create stream: the provided PTX was compiled with an unsupported toolchain.
https://github.com/triton-inference-server/server/issues/3877

  
  
Posted 2 years ago

$ curl -X 'POST' ' ' -H 'accept: application/json' -H 'Content-Type: application/json' -d '{ "url": " " }'
{"digit":5}

  
  
Posted 2 years ago

I'm using the "allegroai/clearml-serving-triton:latest" container. I was just debugging using the base image.

  
  
Posted 2 years ago

It might only be a requirement for the docker/docker-compose-triton-gpu.yml file, but I'd need to check

  
  
Posted 2 years ago

I'll add a more detailed response once it's working

  
  
Posted 2 years ago

Just for reference, if anyone has this issue: I had to update the NVIDIA drivers on my host OS to 510.

` docker run --gpus=0 -it nvcr.io/nvidia/tritonserver:22.02-py3

=============================
== Triton Inference Server ==

NVIDIA Release 22.02 (build 32400308)

Copyright (c) 2018-2021, NVIDIA CORPORATION & AFFILIATES. All rights reserved.

Various files include modifications (c) NVIDIA CORPORATION. All rights reserved.

This container image and its contents are governed by the NVIDIA Deep Learning Container License.
By pulling and using the container, you accept the terms and conditions of this license:

ERROR: This container was built for NVIDIA Driver Release 510.39 or later, but
version 470.103.01 was detected and compatibility mode is UNAVAILABLE.

   [[Forward compatibility was attempted on non supported HW (CUDA_ERROR_COMPAT_NOT_SUPPORTED_ON_DEVICE) cuInit()=804]]

root@36a9fc676a25:/opt/tritonserver# `
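
Since the mismatch only shows up in the container's startup banner, one way to catch it automatically is to scan captured output for that message. The following is a minimal sketch and not part of clearml-serving or Triton: the function name `find_driver_mismatch` and the regular expression are illustrative assumptions, written against the exact banner wording shown above (other releases may phrase it differently).

```python
import re

# Assumed pattern, based only on the banner text quoted in this thread.
# \s+ tolerates the line break in the middle of the message.
BANNER_RE = re.compile(
    r"built for NVIDIA Driver Release\s+(?P<required>\d+(?:\.\d+)*)\s+or later,"
    r"\s+but\s+version\s+(?P<detected>\d+(?:\.\d+)*)\s+was detected"
)


def find_driver_mismatch(log_text):
    """Return (required, detected) driver versions if the banner is present, else None."""
    match = BANNER_RE.search(log_text)
    if match is None:
        return None
    return match.group("required"), match.group("detected")


# The two banner lines copied from the container output above.
SAMPLE = (
    "ERROR: This container was built for NVIDIA Driver Release 510.39 or later, but\n"
    "version 470.103.01 was detected and compatibility mode is UNAVAILABLE."
)

if __name__ == "__main__":
    print(find_driver_mismatch(SAMPLE))
```

Running this on saved container output returns the required and detected versions when the banner is present, and `None` otherwise.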

  
  
Posted 2 years ago

Okay just for clarity...

Originally, my NVIDIA drivers were at a version that is incompatible with the Triton server:
This container was built for NVIDIA Driver Release 510.39 or later, but version 470.103.01 was detected and compatibility mode is UNAVAILABLE.
To fix this issue I updated the drivers on my base OS, i.e.
sudo apt install nvidia-driver-510 -y
sudo reboot
Then it worked. The docker-compose logs from the clearml-serving-triton container (i.e. from running docker-compose -f docker/docker-compose-triton-gpu.yml logs -f ) did not make this clear. It might be good to surface this as an error in the logs 🙂

AgitatedDove14 let me know if there's anything else I can provide that is useful for you.
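
For anyone who wants to check the host before starting the containers, here is a rough pre-flight sketch. It assumes `nvidia-smi` is on the PATH and uses 510.39 as the minimum only because that is what the 22.02 banner above reports; the helper names (`parse_version`, `driver_is_sufficient`, `host_driver_version`) are made up for illustration and are not part of clearml-serving.

```python
import subprocess


def parse_version(text):
    """Turn a dotted version such as '470.103.01' into a tuple of ints for comparison."""
    return tuple(int(part) for part in text.strip().split("."))


def driver_is_sufficient(detected, required="510.39"):
    """True if the detected driver version is at least the required one.

    The default of 510.39 is taken from the Triton 22.02 banner in this thread;
    other container releases will need a different value.
    """
    return parse_version(detected) >= parse_version(required)


def host_driver_version():
    """Ask nvidia-smi for the driver version (assumes nvidia-smi is installed)."""
    result = subprocess.run(
        ["nvidia-smi", "--query-gpu=driver_version", "--format=csv,noheader"],
        capture_output=True,
        text=True,
        check=True,
    )
    # One line per GPU; the driver version is the same for all of them.
    return result.stdout.splitlines()[0].strip()


if __name__ == "__main__":
    version = host_driver_version()
    if driver_is_sufficient(version):
        print(f"Driver {version} meets the assumed minimum")
    else:
        print(f"Driver {version} is older than the assumed minimum 510.39")
```

The comparison is done on integer tuples rather than strings so that, for example, 470.103.01 is correctly treated as older than 510.39.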

  
  
Posted 2 years ago

I can raise this as an issue on the repo if that is useful?

  
  
Posted 2 years ago

Yay 🥳

  
  
Posted 2 years ago

RobustRat47 are you saying updating the nvidia drivers solved the issue ?

  
  
Posted 2 years ago

Still debugging.... That fixed the issue with the
nvcr.io/nvidia/tritonserver:22.02-py3 container which now returns
` =============================
== Triton Inference Server ==

NVIDIA Release 22.02 (build 32400308)

Copyright (c) 2018-2021, NVIDIA CORPORATION & AFFILIATES. All rights reserved.

Various files include modifications (c) NVIDIA CORPORATION. All rights reserved.

This container image and its contents are governed by the NVIDIA Deep Learning Container License.
By pulling and using the container, you accept the terms and conditions of this license:

root@c702b766ba35:/opt/tritonserver# `
I'm now testing if the clearml-serving repo works. I'll keep this thread updated 🙂

  
  
Posted 2 years ago

Notice that we are using the same version:
https://github.com/allegroai/clearml-serving/blob/d15bfcade54c7bdd8f3765408adc480d5ceb4b45/clearml_serving/engines/triton/Dockerfile#L2
The reason was that the previous version did not support TorchScript (it gave an error similar to the one you reported).
My question is, why don't you use the "allegroai/clearml-serving-triton:latest" container ?

  
  
Posted 2 years ago

The latest commit to the repo uses 22.02-py3 ( https://github.com/allegroai/clearml-serving/blob/d15bfcade54c7bdd8f3765408adc480d5ceb4b45/clearml_serving/engines/triton/Dockerfile#L2 ). I will have a look at versions now 🙂

  
  
Posted 2 years ago