
Hello everyone!
I am relatively new to ClearML and to the *-Ops concepts in general, as I am but a regular Python dev.

I am currently trying to implement MLOps in our existing local infrastructure, so that we would be able to utilize automated data preprocessing, model training/fine-tuning, streaming inference, concept drift detection and mitigation, etc.

However, I am currently stuck on a streaming inference problem:

Our current setup is:

  • CPU server with lots of disk space and RAM (x.x.x.69), where we placed the clearml-server Docker containers on their default ports (8080 web, 8008 API, 8081 fileserver).
  • GPU worker PC (x.x.x.68) with a clearml-agent Docker container for model training/inference.
  • We plan to expand to 6 more GPU workstations with the same logic.
    The GPU worker and the CPU server can see each other on the LAN, mutual SSH works between them, and the ClearML server web UI registers the GPU clearml-agent as a valid, running worker, so there seem to be no issues there (the worker's clearml.conf points at the server roughly as sketched below).
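
For reference, the api section of clearml.conf on the worker is set to the server's default ports, roughly like this (a sketch from memory, credentials omitted):

api {
    web_server: http://x.x.x.69:8080
    api_server: http://x.x.x.69:8008
    files_server: http://x.x.x.69:8081
    # credentials { access_key = "..." secret_key = "..." }
}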

Now, I tried to upload and register our model's existing .pth file, which was located on the GPU worker (.68), to the ClearML server (.69).

The ClearML documentation and ChatGPT told me to use the clearml-serve package, and that clearml-serve should be installed on the GPU worker, not on the CPU server, to avoid port and logic conflicts.

So I:

  • Installed clearml-serve on .68 and, by trial and error, somehow registered the model file from the GPU worker host; it appeared in the list of models for my project in the ClearML server web UI.
  • Then I launched the streaming inference using the Triton inference Docker container (because it is a PyTorch model and it needs CUDA).
  • After that, I registered the endpoint via clearml-serve, and it appeared in the description of the serving task in the web UI.
  • Now the tricky part: to test model predictions, I tried to send some input data via a curl POST and via Python requests to the endpoint URL I derived from the ClearML tutorials: http://x.x.x.69:8080/serve/<my_endpoint_name> and it gave me an HTTP 405 error - Method Not Allowed. Roughly what I sent is sketched below.
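
A sketch of the Python request that returned the 405 (the endpoint name and the payload contents are placeholders for our actual values):

import requests

# Sketch of the test request that returned HTTP 405; the endpoint name
# and the payload below are placeholders for our actual values.
url = "http://x.x.x.69:8080/serve/<my_endpoint_name>"
payload = {"<input_name>": [[0]]}  # placeholder input data

response = requests.post(url, json=payload)
print(response.status_code, response.reason)  # -> 405 Method Not Allowed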

What could be the issue here? Or rather, what is the easiest and correct way to upload an existing model file and run streaming inference on it?

Thank you in advance!

If any other info might be necessary or helpful, please let me know and I will provide it. :)
[screenshots attached]

  
  
Posted one month ago

Answers 4


@<1523701087100473344:profile|SuccessfulKoala55> Thank you once again! I extracted the scripts and commands that were seemingly responsible for model registration and its inference on the GPU worker server:

register_model.py

from clearml import Task, OutputModel

task = Task.init(project_name="LogSentinel", task_name="Model Registration")
model_path = "~/<full_local_path_to_model>/deeplog_bestloss.pth"

# Register the model
output_model = OutputModel(task=task)
output_model.update_weights(model_path)
output_model.publish()
print(f"Model ID: {output_model.id}")

Commands:

docker compose --env-file .env -f docker-compose-triton-gpu.yml up -d

clearml-serving create --project "LogSentinel" --name "deeplog-serving"

clearml-serving model add \
  --engine triton \
  --endpoint "deeplog" \
  --model-id 0c6a1c24067a49a0ac09c7e42c215b05 \
  --input-name "log_sequence" --input-type "int64" --input-size 1 10 \
  --output-name "predictions" --output-type "float32" --output-size 1 28
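
For completeness, the kind of test call I am trying to get working against this endpoint looks roughly like this (a sketch: the serving host is a placeholder, the example sequence values are made up, and I am assuming the response is JSON keyed by the declared output name):

import requests
import numpy as np

# Sketch of a test call matching the input/output declared in the
# "model add" command above; host and sequence values are placeholders.
url = "http://<serving_host>:8080/serve/deeplog"
payload = {"log_sequence": [[1, 2, 3, 4, 5, 6, 7, 8, 9, 10]]}  # 1x10 int64

resp = requests.post(url, json=payload)
resp.raise_for_status()

# Assuming the response is keyed by the declared output name:
scores = np.array(resp.json()["predictions"])  # expected 1x28 float32
print("highest-scoring class index:", int(scores.argmax(axis=-1)[0]))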
  
  
Posted one month ago

Here's the simplified diagram of the architecture:
[architecture diagram image]

  
  
Posted one month ago

Hi @<1773158043551272960:profile|PungentRobin32>,
I'm a bit confused, do you mean clearml-serving? How did you install it?

  
  
Posted one month ago

Hi @<1523701087100473344:profile|SuccessfulKoala55> , thank you for the reply!

Yes, I am talking about clearml-serving.

I will be near my PC in the next couple of hours and will send the list of commands as well as a visual diagram of the architecture. :)

  
  
Posted one month ago