
@<1523701087100473344:profile|SuccessfulKoala55> Also, there's one more thing that is bugging me: I have my model files on a remote host in the same LAN (the .68 machine), and I'm trying to push them to the model storage of the ClearML server (the .69 machine).
But as far as I understand, I must provide either a URL or a local path to the model file for the ClearML SDK to send it to the server machine, so I provide the absolute local path on my .68 device.
However, when I open the model storage on .69 and...
Hi, @<1523701087100473344:profile|SuccessfulKoala55> Yeah, sure, please wait a sec - I will rerun the command. :)
Here's the command and output:
clearml-serving model add --endpoint deepl_query --engine triton --model-id 8df30222595543d3a3ac55c9e5e2fb15 --input-size 7 1 --input-type float32 --output-size 6 --output-type float32 --input-name layer_0 --output-name layer_99
clearml-serving - CLI for launching ClearML serving engine
Notice! serving service...
Ok, @<1523701087100473344:profile|SuccessfulKoala55>, I was partially able to find one of the incorrect parts of my serving setup:
- PyTorch model inference requires me to have a .env file and the clearml-serving-triton-gpu docker configured and running.
- Configuring the .env file requires me to provide the clearml-serving Service ID, which was created by clearml-serving create (see the sketch after this list).
- I have multiple services created via that command, as there is no command to remove the others, only to create add...
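For reference, a minimal sketch of the relevant part of that .env file, based on the example.env shipped with the clearml-serving repo (the x.x.x.69 addresses and the placeholder keys are assumptions for illustration, not values from this thread):
CLEARML_WEB_HOST="http://x.x.x.69:8080"
CLEARML_API_HOST="http://x.x.x.69:8008"
CLEARML_FILES_HOST="http://x.x.x.69:8081"
CLEARML_API_ACCESS_KEY="<access_key>"
CLEARML_API_SECRET_KEY="<secret_key>"
CLEARML_SERVING_TASK_ID="<Service ID printed by clearml-serving create>"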
Hi, @<1523701205467926528:profile|AgitatedDove14>, the host OS is Ubuntu; I connect there via ssh.
The docker compose is version 2 (the plugin invoked as "docker compose" rather than the older standalone "docker-compose").
I did not pass anything to or from docker manually; I only used the commands from the official clearml-serving guide:
pip install clearml-serving
clearml-serving create --name deeplog-inference-test --project LogSentinel
git clone
nano .env # here I added my...
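(For context, since the message above is truncated: the step the official clearml-serving guide lists after editing the .env file is bringing up the serving containers, roughly:
cd docker && docker compose --env-file example.env -f docker-compose-triton-gpu.yml up -d
The exact file names may differ between clearml-serving versions.)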
Hi @<1523701087100473344:profile|SuccessfulKoala55> , thank you for the reply!
Yes, I am talking about clearml-serving.
I will be near my PC in the next couple of hours and will send the list of commands as well as a visual scheme of the architecture. :)
@<1523701087100473344:profile|SuccessfulKoala55> Thank you once again. I extracted the scripts and commands that were seemingly responsible for model registration and its inference on the GPU worker server:
register_model.py
from clearml import Task, OutputModel
task = Task.init(project_name="LogSentinel", task_name="Model Registration")
model_path = "~/<full_local_path_to_model>/deeplog_bestloss.pth"
# Register the model
output_model = OutputModel(task=task)
output_model....
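For completeness, a minimal runnable sketch of what such a registration script typically looks like, assuming the goal is to upload the weights to the server (the framework argument and the expanduser call are additions for illustration; the path placeholder is kept from above):
from clearml import Task, OutputModel
import os

task = Task.init(project_name="LogSentinel", task_name="Model Registration")
# "~" is not expanded automatically, so expand it before handing the path to the SDK
model_path = os.path.expanduser("~/<full_local_path_to_model>/deeplog_bestloss.pth")
# Register the model on the task and upload the weights file
output_model = OutputModel(task=task, framework="PyTorch")
output_model.update_weights(weights_filename=model_path)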
Here's the simplified diagram of the architecture:
@<1523701205467926528:profile|AgitatedDove14> The ClearML server itself and all of its components (API server etc.) are on the x.x.x.69 machine.
The agents and serving are on the x.x.x.68 worker machine. My model files are also there, just placed in an ordinary non-shared Linux directory.
And I didn't do any specific configuration of the ClearML fileserver docker - everything is at its defaults, without a single line changed except the IP address of the ClearML server.
I tried a couple of approaches to u...
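As a side note, for the .68 worker to reach the .69 server, the api section of clearml.conf on the .68 machine would typically look like the sketch below (the addresses assume the default ports; the keys are placeholders):
api {
    web_server: http://x.x.x.69:8080
    api_server: http://x.x.x.69:8008
    files_server: http://x.x.x.69:8081
    credentials {
        access_key: "<access_key>"
        secret_key: "<secret_key>"
    }
}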
Also, @<1523701205467926528:profile|AgitatedDove14>, thank you very much for your advice regarding archiving - I did that: removed all current clearml-serving services, created a new one, attached its ID to the .env file, stopped all running serving dockers, and then restarted the clearml-serving-triton-gpu docker, adding a model file afterwards.
I don't see any docker run errors now in the ClearML WebUI task console, but now serving is not able to locate the model file itself, and that file...
Hi @<1523701205467926528:profile|AgitatedDove14>, I don't remember it well, as I initially installed ClearML about half a year ago, but as far as I remember, I didn't preconfigure any specific queue.
HOWEVER, the first thing I did in the WebUI was accidentally delete the "default" queue; later, when my ClearML agents began to fail due to its absence, I had to use the API to create another queue named "my_default_queue" with the "default" system tag - then it was fixed (a sketch of that call is below).
Here are the logs from...
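For reference, a hedged sketch of what that queue-creation call can look like with the Python APIClient (whether queues.create accepts system_tags may depend on the server version; treat it as an assumption to verify):
from clearml.backend_api.session.client import APIClient

client = APIClient()
# recreate a fallback queue; the "default" system tag is what marks it as the default queue
client.queues.create(name="my_default_queue", system_tags=["default"])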
Also, I tested the reachability of an endpoint with a curl query adapted from the example in the ClearML-serving tutorial: https://clear.ml/docs/latest/docs/clearml_serving/clearml_serving_tutorial , and it returns error 405 (Method Not Allowed):
curl -X POST -H "accept: application/json" -H "Content-Type: application/json" -d '{"log_sequence": [1, 2, 3, 4, 5, 6, 7, 8, 9, 10]}'
<htm...
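A 405 usually means the URL path or HTTP method doesn't match what the serving container exposes; per the clearml-serving tutorial, inference requests go to /serve/<endpoint-name> on the serving container's port 8080. A Python sketch of the same query (the host address is an assumption; the endpoint name is taken from the model add command above):
import requests

resp = requests.post(
    "http://x.x.x.68:8080/serve/deepl_query",  # assumed serving host
    json={"log_sequence": [1, 2, 3, 4, 5, 6, 7, 8, 9, 10]},
)
print(resp.status_code, resp.text)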
@<1523701205467926528:profile|AgitatedDove14> Please correct me if I am wrong: are you currently proposing the following sequence:
- On the device that hosts the ClearML server, I should have my model file in any directory.
- Then, I should upload it to the ClearML model repository as an OutputModel directly?
Because today I did try to upload the model using the following script:
from clearml import Task, OutputModel
# Step 1: Initialize a Task
task = Task.init(project_name="LogSentine...
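A minimal sketch of how that upload can be made explicit, assuming the intent is to push the weights to the .69 fileserver instead of registering a local path (the output_uri value and the weights path are assumptions for illustration):
from clearml import Task, OutputModel

task = Task.init(
    project_name="LogSentinel",
    task_name="Model Registration",
    output_uri="http://x.x.x.69:8081",  # assumed fileserver address; makes weight uploads go there
)
output_model = OutputModel(task=task, framework="PyTorch")
# hypothetical absolute path on the machine running this script
output_model.update_weights(weights_filename="/path/to/deeplog_bestloss.pth")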