Hi @<1739455989154844672:profile|SmarmyHamster62> , can you add some logs from the triton container while you try calling your endpoint?
Since receiving the 405, the Triton container only prints the following:
Info: syncing models from main serving service
reporting metrics: relative time 13998 sec
I already restarted ClearML completely. This morning the Triton container crashed while serving a different model. There I got the following errors:
Update model v1 in /models/tritontest/1
copy model into /models/tritontest/1/model.bin
Starting server: ['tritonserver', '--model-control-mode=poll', '--model-repository=/models', '--repository-poll-secs=60.0', '--metrics-port=8002', '--allow-metrics=true', '--allow-gpu-metrics=true']
1724678179082 mips-data-analytics-platform-clearml-serving-triton-5d8874kpsq2 error Traceback (most recent call last):
  File "clearml_serving/engines/triton/triton_helper.py", line 540, in <module>
    main()
  File "clearml_serving/engines/triton/triton_helper.py", line 532, in main
    helper.maintenance_daemon(
  File "clearml_serving/engines/triton/triton_helper.py", line 274, in maintenance_daemon
    raise ValueError("triton-server process ended with error code {}".format(error_code))
ValueError: triton-server process ended with error code 1
We are currently using ClearML Server 1.13.0 and ClearML Serving 1.3.0.
@<1739455989154844672:profile|SmarmyHamster62> , I suggest updating your versions. The server is a bit old.
Please also update to a newer version of the serving.
@<1523701070390366208:profile|CostlyOstrich36> Hi there, we were able to update our ClearML instance, but I'm still experiencing the same issue:
Traceback (most recent call last):
  File "clearml_serving/engines/triton/triton_helper.py", line 540, in <module>
    main()
  File "clearml_serving/engines/triton/triton_helper.py", line 532, in main
    helper.maintenance_daemon(
  File "clearml_serving/engines/triton/triton_helper.py", line 274, in maintenance_daemon
    raise ValueError("triton-server process ended with error code {}".format(error_code))
ValueError: triton-server process ended with error code 1
We updated our ClearML to 7.11.0
serving: 1.5.6
agent: 5.2.1
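The traceback in this thread only reports that the `tritonserver` process exited with code 1; the actual reason is typically printed by Triton itself shortly before it exits. A minimal sketch, assuming the same command from the "Starting server" log line can be run manually inside the container, of capturing the last lines of that output. `run_and_tail` is a hypothetical helper written for illustration and is not part of clearml-serving.

```python
import subprocess
import sys
from collections import deque

# Arguments copied from the "Starting server" log line in this thread.
TRITON_CMD = [
    "tritonserver",
    "--model-control-mode=poll",
    "--model-repository=/models",
    "--repository-poll-secs=60.0",
    "--metrics-port=8002",
    "--allow-metrics=true",
    "--allow-gpu-metrics=true",
]


def run_and_tail(cmd, tail_lines=50):
    """Run cmd, echo its output, and return (exit_code, last output lines).

    Hypothetical helper: stderr is merged into stdout so that error
    messages printed just before the process exits are kept.
    """
    tail = deque(maxlen=tail_lines)  # keeps only the most recent lines
    proc = subprocess.Popen(
        cmd,
        stdout=subprocess.PIPE,
        stderr=subprocess.STDOUT,
        text=True,
    )
    for line in proc.stdout:
        sys.stdout.write(line)
        tail.append(line.rstrip("\n"))
    exit_code = proc.wait()
    return exit_code, list(tail)


# Example usage (run inside the Triton container):
# code, tail = run_and_tail(TRITON_CMD)
# if code != 0:
#     print("tritonserver exited with code", code)
#     print("\n".join(tail))
```

The lines returned in `tail` are the ones worth posting alongside the traceback, since they usually name the model or configuration that failed to load.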