Unanswered
Hi all! I recently started working with ClearML Serving. I got this example working
OK, so I killed all Docker containers (the proposal by ChatGPT did not work for me, but your commands did). The result is one fewer warning. The warning "clearml-serving-triton | Warning: more than one valid Controller Tasks found, using Task ID=4709b0b383a04bb1a033e99fd325dcbf"
seems to be solved. All remaining errors come up in the clearml-serving-triton service, and this is the log I get:
CLEARML_SERVING_TASK_ID=9309c20af9244d919b0f063642198c57
CLEARML_TRITON_POLL_FREQ=1.0
CLEARML_TRITON_METRIC_FREQ=1.0
CLEARML_TRITON_HELPER_ARGS=
CLEARML_EXTRA_PYTHON_PACKAGES=
clearml-serving - Nvidia Triton Engine Controller
ClearML Task: created new task id=ad7bd1d205a24f3086ad4cdc9a94017d
2023-01-27 08:15:50,264 - clearml.Task - INFO - No repository found, storing script code instead
WARNING: [Torch-TensorRT] - Unable to read CUDA capable devices. Return status: 35
I0127 08:15:56.498773 34 libtorch.cc:1381] TRITONBACKEND_Initialize: pytorch
I0127 08:15:56.498849 34 libtorch.cc:1391] Triton TRITONBACKEND API version: 1.9
I0127 08:15:56.498856 34 libtorch.cc:1397] 'pytorch' TRITONBACKEND API version: 1.9
2023-01-27 08:15:56.868725: I tensorflow/stream_executor/platform/default/dso_loader.cc:49] Successfully opened dynamic library libcudart.so.11.0
I0127 08:15:56.904341 34 tensorflow.cc:2181] TRITONBACKEND_Initialize: tensorflow
I0127 08:15:56.904373 34 tensorflow.cc:2191] Triton TRITONBACKEND API version: 1.9
I0127 08:15:56.904380 34 tensorflow.cc:2197] 'tensorflow' TRITONBACKEND API version: 1.9
I0127 08:15:56.904384 34 tensorflow.cc:2221] backend configuration:
{}
I0127 08:15:56.918633 34 onnxruntime.cc:2400] TRITONBACKEND_Initialize: onnxruntime
I0127 08:15:56.918656 34 onnxruntime.cc:2410] Triton TRITONBACKEND API version: 1.9
I0127 08:15:56.918659 34 onnxruntime.cc:2416] 'onnxruntime' TRITONBACKEND API version: 1.9
I0127 08:15:56.918662 34 onnxruntime.cc:2446] backend configuration:
{}
I0127 08:15:56.935321 34 openvino.cc:1207] TRITONBACKEND_Initialize: openvino
I0127 08:15:56.935344 34 openvino.cc:1217] Triton TRITONBACKEND API version: 1.9
I0127 08:15:56.935348 34 openvino.cc:1223] 'openvino' TRITONBACKEND API version: 1.9
W0127 08:15:56.936061 34 pinned_memory_manager.cc:236] Unable to allocate pinned system memory, pinned memory pool will not be available: CUDA driver version is insufficient for CUDA runtime version
I0127 08:15:56.937483 34 cuda_memory_manager.cc:115] CUDA memory pool disabled
I0127 08:15:56.939850 34 server.cc:549]
+------------------+------+
2023-01-27T08:15:56.944545100Z | Agent | Path |
+------------------+------+
+------------------+------+
I0127 08:15:56.939906 34 server.cc:576]
+-------------+-------------------------------------------------------------------------+--------+
2023-01-27T08:15:56.944562300Z | | Path | Config |
+-------------+-------------------------------------------------------------------------+--------+
2023-01-27T08:15:56.944568100Z | | /opt/tritonserver/backends/pytorch/libtriton_pytorch.so | {} |
2023-01-27T08:15:56.944570800Z | | /opt/tritonserver/backends/tensorflow1/libtriton_tensorflow1.so | {} |
2023-01-27T08:15:56.944573400Z | | /opt/tritonserver/backends/onnxruntime/libtriton_onnxruntime.so | {} |
2023-01-27T08:15:56.944576500Z | | /opt/tritonserver/backends/openvino_2021_4/libtriton_openvino_2021_4.so | {} |
+-------------+-------------------------------------------------------------------------+--------+
I0127 08:15:56.940411 34 server.cc:619]
+-------+---------+--------+
2023-01-27T08:15:56.944589300Z | | Version | Status |
+-------+---------+--------+
+-------+---------+--------+
Error: Failed to initialize NVML
W0127 08:15:56.945560 34 metrics.cc:571] DCGM unable to start: DCGM initialization error
I0127 08:15:56.946387 34 tritonserver.cc:2123]
+----------------------------------+----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
2023-01-27T08:15:56.946463100Z | | Value |
+----------------------------------+----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
2023-01-27T08:15:56.946468000Z | | triton |
2023-01-27T08:15:56.946470200Z | | 2.21.0 |
2023-01-27T08:15:56.946478400Z | | classification sequence model_repository model_repository(unload_dependents) schedule_policy model_configuration system_shared_memory cuda_shared_memory binary_tensor_data statistics trace |
2023-01-27T08:15:56.946480900Z | | /models |
2023-01-27T08:15:56.946483100Z | | MODE_POLL |
2023-01-27T08:15:56.946499700Z | | 1 |
2023-01-27T08:15:56.946501900Z | | OFF |
2023-01-27T08:15:56.946503900Z | | 268435456 |
2023-01-27T08:15:56.946506000Z | | 0 |
2023-01-27T08:15:56.946527700Z | | 6.0 |
2023-01-27T08:15:56.946529900Z | | 1 |
2023-01-27T08:15:56.946532100Z | | 30 |
+----------------------------------+----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
I0127 08:15:56.950157 34 grpc_server.cc:4544] Started GRPCInferenceService at 0.0.0.0:8001
I0127 08:15:56.951006 34 http_server.cc:3242] Started HTTPService at 0.0.0.0:8000
I0127 08:15:56.992744 34 http_server.cc:180] Started Metrics Service at 0.0.0.0:8002
E0127 08:23:57.017606 34 model_repository_manager.cc:2064] Poll failed for model directory 'test_model_pytorch': failed to open text file for read /models/test_model_pytorch/config.pbtxt: No such file or directory
E0127 08:24:57.019191 34 model_repository_manager.cc:2064] Poll failed for model directory 'test_model_pytorch': failed to open text file for read /models/test_model_pytorch/config.pbtxt: No such file or directory
E0127 08:25:57.019860 34 model_repository_manager.cc:2064] Poll failed for model directory 'test_model_pytorch': failed to open text file for read /models/test_model_pytorch/config.pbtxt: No such file or directory
E0127 08:26:57.020321 34 model_repository_manager.cc:2064] Poll failed for model directory 'test_model_pytorch': failed to open text file for read /models/test_model_pytorch/config.pbtxt: No such file or directory
E0127 08:27:57.021140 34 model_repository_manager.cc:2064] Poll failed for model directory 'test_model_pytorch': failed to open text file for read /models/test_model_pytorch/config.pbtxt: No such file or directory
E0127 08:28:57.021939 34 model_repository_manager.cc:2064] Poll failed for model directory 'test_model_pytorch': failed to open text file for read /models/test_model_pytorch/config.pbtxt: No such file or directory
E0127 08:29:57.022943 34 model_repository_manager.cc:2064] Poll failed for model directory 'test_model_pytorch': failed to open text file for read /models/test_model_pytorch/config.pbtxt: No such file or directory
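
The repeated "Poll failed" errors say that Triton finds a test_model_pytorch directory under /models but no config.pbtxt inside it. As a minimal sketch for confirming which model directories are affected, the following could be run inside the clearml-serving-triton container, assuming Python is available there. The function name find_models_missing_config is hypothetical and not part of ClearML or Triton; only the /models path and the config.pbtxt file name come from the log.

```python
from pathlib import Path


def find_models_missing_config(repo_root):
    """Return the names of model directories that contain no config.pbtxt.

    Hypothetical helper for illustration; it only inspects the filesystem
    and does not talk to Triton or ClearML.
    """
    root = Path(repo_root)
    missing = []
    # Each direct subdirectory of the repository is treated as one model.
    for model_dir in sorted(p for p in root.iterdir() if p.is_dir()):
        if not (model_dir / "config.pbtxt").is_file():
            missing.append(model_dir.name)
    return missing


if __name__ == "__main__":
    # /models is the model repository path reported in the Triton log.
    repo = Path("/models")
    if repo.is_dir():
        for name in find_models_missing_config(repo):
            print(f"missing config.pbtxt: {name}")
    else:
        print(f"{repo} does not exist on this machine")
```

If test_model_pytorch shows up in the output, the directory was created but the configuration file was never written into it, which matches the error message. This only narrows down the symptom; it does not explain why the file was not written.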