Thanks @<1569496075083976704:profile|SweetShells3> ! let me see if I can reproduce the issue
@<1523701205467926528:profile|AgitatedDove14> config.pbtxt in triton container (inside /models/conformer_joint) - after merge:
default_model_filename: "model.bin"
max_batch_size: 16
dynamic_batching {
  max_queue_delay_microseconds: 100
}
input: [
  {
    name: "encoder_outputs"
    data_type: TYPE_FP32
    dims: [
      1,
      640
    ]
  },
  {
    name: "decoder_outputs"
    data_type: TYPE_FP32
    dims: [
      640,
      1
    ]
  }
]
output: [
  {
    name: "outputs"
    data_type: TYPE_FP32
    dims: [
      129
    ]
  }
]
platform: "onnxruntime_onnx"
@<1523701205467926528:profile|AgitatedDove14> this error appears before the postprocess part.
Today I redeployed the existing entrypoint with --aux-config "./config.pbtxt" and got the same error.
Before:
!clearml-serving --id "<>" model add --engine triton --endpoint 'conformer_joint' --model-id '<>' --preprocess 'preprocess_joint.py' --input-size '[1, 640]' '[640, 1]' --input-name 'encoder_outputs' 'decoder_outputs' --input-type float32 float32 --output-size '[129]' --output-name 'outputs' --output-type float32 --aux-config name=\"conformer_joint\" platform=\"onnxruntime_onnx\" default_model_filename=\"model.bin\" max_batch_size=16 dynamic_batching.max_queue_delay_microseconds=100
After:
!clearml-serving --id "<>" model add --engine triton --endpoint 'conformer_joint' --model-id '<>' --preprocess 'preprocess_joint.py' --aux-config "./config.pbtxt"
config.pbtxt:
default_model_filename: "model.bin"
max_batch_size: 16
dynamic_batching {
  max_queue_delay_microseconds: 100
}
input: [
  {
    name: "encoder_outputs"
    data_type: TYPE_FP32
    dims: [
      1,
      640
    ]
  },
  {
    name: "decoder_outputs"
    data_type: TYPE_FP32
    dims: [
      640,
      1
    ]
  }
]
output: [
  {
    name: "outputs"
    data_type: TYPE_FP32
    dims: [
      129
    ]
  }
]
The "Before" entrypoint worked as expected; the "After" one returns the same error: {'detail': "Error processing request: object of type 'NoneType' has no len()"}
Something is wrong with config.pbtxt.
P.S. I tried without the default_model_filename line, but still get an error.
@<1523701205467926528:profile|AgitatedDove14> I think there is no chance to pass config.pbtxt as is.
In this line, the function uses self.model_endpoint.input_name (and after that input_name, input_type and input_size), but there are no such attributes in the endpoint (see the endpoint config above); they are in the auxiliary_cfg string field.
If I understand correctly, we cannot use config.pbtxt the way I did above. We have to define inputs and outputs as in the clearml-serving examples, and use config.pbtxt only for additional parameters like max_batch_size.
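A minimal sketch of the failure mode described above, assuming the serving code calls len() on the endpoint's input_name. The Endpoint class and count_inputs function are hypothetical stand-ins for illustration, not the actual clearml-serving code:

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Endpoint:
    """Hypothetical stand-in for the endpoint object (not the real clearml-serving class)."""
    auxiliary_cfg: str = ""
    input_name: Optional[List[str]] = None  # assumed to be filled only by --input-name


def count_inputs(endpoint: Endpoint) -> int:
    # Assumed behaviour: inputs are read from the endpoint attributes,
    # never parsed out of the auxiliary_cfg text.
    return len(endpoint.input_name)


# Inputs declared only inside config.pbtxt: input_name stays None.
pbtxt_only = Endpoint(auxiliary_cfg='input: [ { name: "encoder_outputs" } ]')
try:
    count_inputs(pbtxt_only)
    error_message = ""
except TypeError as exc:
    error_message = str(exc)

# Inputs declared on the command line: the attribute is populated.
cli_inputs = Endpoint(input_name=["encoder_outputs", "decoder_outputs"])
n_inputs = count_inputs(cli_inputs)

print(error_message)
print(n_inputs)
```

If this assumption holds, it would explain why the "Before" command (inputs passed as CLI flags) works while the "After" command (inputs only in config.pbtxt) returns the NoneType error.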
Hi @<1523701087100473344:profile|SuccessfulKoala55> It turns out that if I delete the
platform: ...
line from config.pbtxt, the model deploys on tritonserver (serving v1.3.0 adds a "platform" line at the end of the config file when the ClearML model has a "framework" attribute). But when I try to check the endpoint with random data (with the right shape according to the config), I get this error:
{'detail': "Error processing request: object of type 'NoneType' has no len()"}
Do you know how to solve it?
I am getting this error in the response to this request:
import numpy as np
import requests

body = {
    "encoder_outputs": [np.random.randn(1, 640).tolist()],
    "decoder_outputs": [np.random.randn(640, 1).tolist()],
}
response = requests.post(f"<>", json=body)
response.json()
Unfortunately, I see nothing related to this problem in either the inference or the triton pods/deployments (we use Kubernetes to spin up ClearML-serving).
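Before sending a request, the payload shapes can be checked locally against the dims in config.pbtxt. This is only a sketch: check_body is a hypothetical helper, and it assumes the request body carries a leading batch axis in front of the per-sample dims (Triton treats dims as per-sample when max_batch_size is greater than 0). Whether clearml-serving expects the batch axis in the JSON body is not confirmed in this thread.

```python
import numpy as np

# Per-sample dims copied from the config.pbtxt above.
EXPECTED_DIMS = {"encoder_outputs": (1, 640), "decoder_outputs": (640, 1)}
MAX_BATCH_SIZE = 16


def check_body(body: dict) -> list:
    """Return a list of human-readable problems; empty means the shapes look right."""
    problems = []
    for name, dims in EXPECTED_DIMS.items():
        if name not in body:
            problems.append(f"missing input {name!r}")
            continue
        shape = np.asarray(body[name]).shape
        # First axis is the batch; the remaining axes must match the config dims.
        if len(shape) != len(dims) + 1 or shape[1:] != dims:
            problems.append(f"{name}: got {shape}, expected (batch, {dims})")
        elif not 1 <= shape[0] <= MAX_BATCH_SIZE:
            problems.append(f"{name}: batch {shape[0]} outside 1..{MAX_BATCH_SIZE}")
    return problems


body = {
    "encoder_outputs": [np.random.randn(1, 640).tolist()],
    "decoder_outputs": [np.random.randn(640, 1).tolist()],
}
problems = check_body(body)
print(problems)
```

An empty list here means the payload is consistent with the config, so the error is more likely to come from the serving side than from the request.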
Hi @<1523701205467926528:profile|AgitatedDove14>
https://github.com/allegroai/clearml-serving/issues/62
I have opened an issue based on that case. Could you tell me if I missed something in it?
Hi @<1523701205467926528:profile|AgitatedDove14>
My preprocess file:
from typing import Any, Union, Optional, Callable


class Preprocess(object):
    def __init__(self):
        pass

    def preprocess(
        self,
        body: Union[bytes, dict],
        state: dict,
        collect_custom_statistics_fn: Optional[Callable[[dict], None]]
    ) -> Any:
        # Pass the two request fields through unchanged
        return body["length"], body["audio_signal"]

    def postprocess(
        self,
        data: Any,
        state: dict,
        collect_custom_statistics_fn: Optional[Callable[[dict], None]]
    ) -> dict:
        return {
            "encoded_lengths": data["encoded_lengths"].tolist(),
            "output": data["output"].tolist()
        }
My request code that returns error:
import numpy as np
import requests
batch = 4
length = 3
body={
"length": [length] * batch,
"audio_signal": np.random.randn(batch, 80, length).astype(np.float16).tolist()
}
response = requests.post(f"<>", json=body)
response.json()
- I'm happy to hear you found a workaround
- Seems like there is something wrong with the way the pbtxt is being merged, but I need some more information
{'detail': "Error processing request: object of type 'NoneType' has no len()"}
Where are you seeing this error?
What are you seeing in the docker-compose log?
Hi @<1569496075083976704:profile|SweetShells3> , can you make the basic example work in your setup?
@<1569496075083976704:profile|SweetShells3> remove these from your pbtxt:
name: "conformer_encoder"
platform: "onnxruntime_onnx"
default_model_filename: "model.bin"
Second, what do you have in your preprocess_encoder.py?
And where are you getting the error? (Is it from the triton container, or from the REST request?)
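Removing those three top-level keys from a pbtxt file can be sketched as below. strip_top_level_keys is a hypothetical helper, not part of clearml-serving. It tracks nesting by counting brackets, so it assumes that no quoted string in the file contains brackets or braces.

```python
import re

# Top-level keys suggested for removal in the message above.
KEYS_TO_DROP = ("name", "platform", "default_model_filename")


def strip_top_level_keys(pbtxt: str, keys=KEYS_TO_DROP) -> str:
    """Drop `key: value` lines for the given keys, but only at nesting depth 0."""
    pattern = re.compile(r"^\s*(%s)\s*:" % "|".join(map(re.escape, keys)))
    kept, depth = [], 0
    for line in pbtxt.splitlines():
        if depth == 0 and pattern.match(line):
            continue  # skip the unwanted top-level key
        kept.append(line)
        # Track nesting so `name:` inside input/output blocks is preserved.
        depth += line.count("{") + line.count("[")
        depth -= line.count("}") + line.count("]")
    return "\n".join(kept)


sample = '''name: "conformer_encoder"
platform: "onnxruntime_onnx"
default_model_filename: "model.bin"
max_batch_size: 16
input: [
  {
    name: "encoder_outputs"
  }
]'''
cleaned = strip_top_level_keys(sample)
print(cleaned)
```

The nested name: "encoder_outputs" entry survives because it sits inside the input block, while the three top-level lines are dropped.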
Hi @<1523701205467926528:profile|AgitatedDove14>
Are there any questions or updates about the issue?
Thanks @<1569496075083976704:profile|SweetShells3> for bumping it!
Let me check where it stands, I think I remember a fix...
"After" version in logs is the same as config above. There is no "before" version in logs((
Endpoint config from ClearML triton task:
conformer_joint {
  engine_type = "triton"
  serving_url = "conformer_joint"
  model_id = "<>"
  version = ""
  preprocess_artifact = "py_code_conformer_joint"
  auxiliary_cfg = """default_model_filename: "model.bin"
max_batch_size: 16
dynamic_batching {
  max_queue_delay_microseconds: 100
}
input: [
  {
    name: "encoder_outputs"
    data_type: TYPE_FP32
    dims: [
      1,
      640
    ]
  },
  {
    name: "decoder_outputs"
    data_type: TYPE_FP32
    dims: [
      640,
      1
    ]
  }
]
output: [
  {
    name: "outputs"
    data_type: TYPE_FP32
    dims: [
      129
    ]
  }
]
"""
}
Yeah I think that for some reason the merge of the pbtxt raw file is not working.
Any chance you have an end to end example we could debug? (maybe just add a pbtxt for one of the examples?)
I see... In the triton pod, when you run it, it should print the combined pbtxt. Can you print both the before and after versions, so that we can compare?
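Once both versions of the combined pbtxt are copied out of the triton pod logs, a plain text diff is enough to compare them. A minimal sketch using the standard library; the two sample strings and the file labels are made up for illustration and are not real log output:

```python
import difflib


def diff_pbtxt(before: str, after: str) -> str:
    """Return a unified diff of two config.pbtxt texts."""
    lines = difflib.unified_diff(
        before.splitlines(),
        after.splitlines(),
        fromfile="before/config.pbtxt",
        tofile="after/config.pbtxt",
        lineterm="",
    )
    return "\n".join(lines)


# Illustrative inputs only; paste the real before/after configs here.
before = 'max_batch_size: 16\nplatform: "onnxruntime_onnx"'
after = "max_batch_size: 16"
report = diff_pbtxt(before, after)
print(report)
```

An empty report means the two configs are textually identical, which would point away from the merge step as the cause.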
data["encoded_lengths"]
This makes no sense to me, data is a numpy array, not a pandas frame...
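To illustrate the point: a string key lookup on a plain numpy array raises an IndexError, so the postprocess body above would fail if data arrives as an array. The sketch below assumes a single output delivered as one ndarray; postprocess_array is a hypothetical helper, and how multiple outputs are delivered is not covered in this thread.

```python
import numpy as np


def postprocess_array(data) -> dict:
    """Sketch of a postprocess body for a single numpy-array output."""
    # np.asarray leaves arrays untouched and also accepts nested lists.
    return {"output": np.asarray(data).tolist()}


data = np.zeros((2, 3), dtype=np.float32)

try:
    data["encoded_lengths"]  # dict-style lookup on a plain ndarray
    lookup_error = None
except IndexError as exc:
    lookup_error = exc

result = postprocess_array(data)
print(type(lookup_error).__name__)
print(result)
```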