@<1523701087100473344:profile|SuccessfulKoala55> Thank you once again. I extracted the scripts and commands that are seemingly responsible for model registration and its inference on the GPU worker server:
register_model.py
import os

from clearml import Task, OutputModel

task = Task.init(project_name="LogSentinel", task_name="Model Registration")

# "~" is not expanded automatically, so expand it before handing the path to ClearML
model_path = os.path.expanduser("~/<full_local_path_to_model>/deeplog_bestloss.pth")

# Register the trained weights as an OutputModel and publish it
output_model = OutputModel(task=task)
output_model.update_weights(weights_filename=model_path)
output_model.publish()
print(f"Model ID: {output_model.id}")
Commands:
# Bring up the Triton GPU serving stack
docker compose --env-file .env -f docker-compose-triton-gpu.yml up -d

# Create the serving service; the printed service ID is what the serving
# containers (CLEARML_SERVING_TASK_ID in .env) and later CLI calls refer to
clearml-serving create --project "LogSentinel" --name "deeplog-serving"

# Attach the registered model to a Triton endpoint
clearml-serving model add --engine triton --endpoint "deeplog" \
    --model-id 0c6a1c24067a49a0ac09c7e42c215b05 \
    --input-name "log_sequence" --input-type "int64" --input-size 1 10 \
    --output-name "predictions" --output-type "float32" --output-size 1 28
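Once everything is up, a quick way to sanity-check the endpoint is a plain HTTP request. This is only a sketch assuming the default clearml-serving inference address (port 8080, route /serve/<endpoint>) and that the endpoint's preprocessing maps a "log_sequence" JSON field onto the Triton input; both may differ in this setup:

import requests

# Hypothetical smoke test against the "deeplog" endpoint: one 1x10 int64
# sequence in, a 1x28 float32 "predictions" tensor expected back
payload = {"log_sequence": [[1, 2, 3, 4, 5, 6, 7, 8, 9, 10]]}
resp = requests.post("http://localhost:8080/serve/deeplog", json=payload, timeout=10)
resp.raise_for_status()
print(resp.json())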
Here's a simplified diagram of the architecture:
Hi @<1773158043551272960:profile|PungentRobin32>,
I'm a bit confused, do you mean clearml-serving? How did you install it?
Hi @<1523701087100473344:profile|SuccessfulKoala55> , thank you for the reply!
Yes, I am talking about clearml-serving.
I will be at my PC within the next couple of hours and will send the list of commands as well as a visual scheme of the architecture. :)