FileNotFoundError: [Errno 2] No such file or directory: 'tritonserver': 'tritonserver'
This is odd.
Can you retry with the latest from GitHub?
pip install git+
AstonishingWorm64 can you share the full log (In the UI under Results/Console there is a download button)?
That sounds like an internal tritonserver error.
https://forums.developer.nvidia.com/t/provided-ptx-was-compiled-with-an-unsupported-toolchain-error-using-cub/168292
The latest image seems to require drivers on the host 460+
try this one:
https://docs.nvidia.com/deeplearning/triton-inference-server/release-notes/rel_20-12.html#rel_20-12
So instead of updating the GPU drivers, can we install a lower, compatible version of CUDA inside the docker for clearml-serving?
Also, when I checked the log file I found this:
agent.default_docker.image = nvidia/cuda:10.1-cudnn7-runtime-ubuntu18.04
agent.enable_task_env = false
agent.git_user =
agent.default_python = 3.8
agent.cuda_version = 112
This might be a dumb question, but I'm confused about which CUDA version is being installed here: is it 10.1 (from the first line) or 11.2 (from the last line)?
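For context, the agent-style config encodes the CUDA version as the major and minor digits run together, so `112` means 11.2 and `101` means 10.1. A minimal sketch of that decoding (the helper name is mine, not part of ClearML):

```python
def decode_agent_cuda_version(value) -> str:
    """Decode an agent-style CUDA version: 112 -> '11.2', 101 -> '10.1'.

    The last digit is the minor version; everything before it is the major.
    (Hypothetical helper for illustration, not a ClearML API.)
    """
    digits = str(value)
    return f"{digits[:-1]}.{digits[-1]}"

print(decode_agent_cuda_version(112))   # -> 11.2
print(decode_agent_cuda_version("101"))  # -> 10.1
```

So in the config above, `agent.cuda_version = 112` refers to CUDA 11.2, while the docker image line refers to a CUDA 10.1 image.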
AstonishingWorm64 I found the issue.
clearml-serving assumes the agent is working in docker mode, as it has to have the triton docker (where the triton engine is installed).
Since you are running in venv mode, tritonserver is not installed, hence the error.
(I'll make sure we reply on the issue as well later)
I already shared the log from the UI; anyway, I'm sharing the log for a recently tried experiment, please find the attachment.
Tried installing the latest clearml-serving from git, but still no luck, the same error persists.
I have attached both the serving service and serving engine (triton) console logs from clearml-server, please have a look at them.
By default clearml-serving installs triton version 21.03; can we somehow override this to install some other version? I tried to configure it but could not find anything related to tritonserver in the clearml.conf file, so can you please guide me on this?
The server that I'm using has GPU Driver Version 455.23.05 and CUDA Version 11.1. Conda is also installed, and clearml-serving is installing CUDA version 10.1, for which the GPU drivers should be >= 418.39, so I guess a version mismatch is not the problem, and currently I can't update the GPU drivers since other processes are running.
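To double-check the driver/toolkit reasoning above, here is a minimal sketch. The minimum-driver numbers are approximate values from NVIDIA's CUDA release notes for the versions discussed in this thread (the 418.39 figure for CUDA 10.1 is quoted above); the function name is mine:

```python
# Approximate minimum Linux driver versions per CUDA toolkit, from NVIDIA's
# CUDA release notes (only the versions mentioned in this thread).
MIN_DRIVER = {
    "10.1": (418, 39),
    "11.1": (455, 23),
    "11.2": (460, 27),  # why the newer triton images want 460+ host drivers
}

def driver_supports_cuda(driver: str, cuda: str) -> bool:
    """Check whether a host driver is new enough for a CUDA toolkit version."""
    installed = tuple(int(part) for part in driver.split("."))
    required = MIN_DRIVER[cuda]
    return installed[: len(required)] >= required

# Driver 455.23.05 is fine for a CUDA 11.1 container, but too old for the
# CUDA 11.2 toolkit shipped in the triton 21.03 image:
print(driver_supports_cuda("455.23.05", "11.1"))  # -> True
print(driver_supports_cuda("455.23.05", "11.2"))  # -> False
```

This matches the "provided PTX was compiled with an unsupported toolchain" symptom: the container's CUDA toolkit is newer than what the host driver can run.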
And also I tried overriding the clearml.conf file and changed the default docker image by modifying the line below:
image: "nvidia/cuda:10.1-cudnn7-runtime-ubuntu18.04"
to this:
image: "nvidia/cuda:11.1-cudnn8-runtime-ubuntu18.04"
But still the same error, the provided PTX was compiled with an unsupported toolchain, occurred while launching the triton engine.
Hi AstonishingWorm64
I think you are correct, there is no external interface to change the docker image.
Could you open a GitHub issue so we do not forget to add an interface for that ?
As a temp hack, you can manually clone the "triton serving engine" task and edit the container image (under the Execution tab).
wdyt?
This solved the tritonserver not found issue, but now a new error is occurring: UNAVAILABLE: Internal: unable to create stream: the provided PTX was compiled with an unsupported toolchain
Please check attached log file for complete console log.
And also I am facing an issue while initializing the serving server and triton engine using the below two commands:
clearml-serving triton --project "serving" --name "serving ex1"
clearml-serving triton --endpoint "inference" --model-project "serving" --model-name "exp_v1"
So after the second command I am seeing the below error:
Error: No projects found when searching for
DevOps
But when I clubbed these two commands into a single command like below, the error disappeared, so I went on and launched the service and engine. Did this blending of commands cause the above error?
clearml-serving triton --project "serving" --name "serving ex1" --endpoint "inference" --model-project "serving" --model-name "exp_v1"
Bottom line: the driver version on the host machine does not support the CUDA version you have in the docker container.
That is a good question. Usually the CUDA version is automatically detected, unless you override it with the conf file or an OS env variable. What's the setup? Are you using conda as the package manager? (conda actually installs CUDA drivers; if the original Task was executed on a machine with conda, it will take the CUDA version automatically, the reason being to match the CUDA/Torch/TF versions.)
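For illustration, one common way to auto-detect the CUDA version on a host is to parse the `nvidia-smi` banner. This is my own minimal sketch of that idea, not the actual agent implementation:

```python
import re
from typing import Optional

def cuda_version_from_smi(smi_output: str) -> Optional[str]:
    """Extract the CUDA version reported in an `nvidia-smi` banner, if any."""
    match = re.search(r"CUDA Version:\s*([\d.]+)", smi_output)
    return match.group(1) if match else None

# Sample banner line in the format nvidia-smi prints, using the driver/CUDA
# versions reported earlier in this thread:
banner = "| NVIDIA-SMI 455.23.05   Driver Version: 455.23.05   CUDA Version: 11.1 |"
print(cuda_version_from_smi(banner))  # -> 11.1
print(cuda_version_from_smi("no gpu here"))  # -> None
```

Note that `nvidia-smi` reports the highest CUDA version the installed driver supports, which can differ from the toolkit version inside a container.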
Hi AgitatedDove14 , thanks for the reply!
It's not the same issue that you just pointed to; in fact, the issue is raised after launching inference onto the queue using the below commands:
` clearml-serving triton --project "serving" --name "serving example"
clearml-serving triton --endpoint "keras_mnist" --model-project "examples" --model-name "Keras MNIST serve example - serving_model"
clearml-serving launch --queue default `
Hi AstonishingWorm64
Is this the same ?
https://github.com/allegroai/clearml-serving/issues/1
(I think it was fixed on the later branch, we are releasing 0.3.2 later today with a fix)
Can you try:
pip install git+