@<1523701087100473344:profile|SuccessfulKoala55> Do you think it is possible to run docker mode in the AWS autoscaler, and to add the cloning and installation inside the init bash script of the task?
How do you explain that it works when I ssh-ed into the same AWS container instance from the autoscaler?
I will check that. Do you think we could bypass it using Task.create and passing all the needed params?
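Something roughly like this is what I had in mind (a minimal sketch; the repo URL, docker image and queue name are placeholders, and I am assuming parameters like docker_bash_setup_script exist in the clearml version we run):
```python
from clearml import Task

# Minimal sketch; repo URL, branch, script, image and queue name are hypothetical.
task = Task.create(
    project_name="debug",
    task_name="flask-train",
    repo="git@github.com:my-org/clearmldebug.git",
    branch="main",
    script="app.py",
    packages=["flask", "clearml"],          # explicit packages instead of the poetry lock file
    docker="python:3.9",                    # run the task in docker mode
    docker_bash_setup_script="pip install poetry==1.4.2",  # init commands run inside the container
)
Task.enqueue(task, queue_name="aws-autoscaler")
```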
The flask command is run inside the git project, which is the strange part. It is executed in ~/code/repo/ as flask train ...
I also did that in the following way:
- I put a sleep inside the bash script
- I SSH-ed into the fresh container and ran all the commands myself (cloning, installation), and again it worked...
Because I was SSH-ing into it before the failure. When poetry fails, it installs everything using pip.
When the task finally failed, I was kicked out of the container.
Yes indeed, but what about the possibility of doing the clone/poetry installation ourselves in the init bash script of the task?
Yes, that should be correct. Inside the bash script of the task.
And I just tried with Python 3.8 (default version of the image) and it still fails.
Poetry Enabled: Ignoring requested python packages, using repository poetry lock file!
Creating virtualenv debug in /root/.clearml/venvs-builds/3.8/task_repository/clearmldebug.git/.venv
Using virtualenv: /root/.clearml/venvs-builds/3.8/task_repository/clearmldebug.git/.venv
2023-04-18 15:03:52
Installing dependencies from lock file
Finding the necessary packages for the current system
Package operation...
@<1523701070390366208:profile|CostlyOstrich36> poetry is installed as part of the bash script of the task.
The init script of the AWS autoscaler only contains three export variables I set.
One possible solution I could see as well is moving the data storage to an S3 bucket to improve download performance, since it is the same cloud provider, so no cross-provider transfer latency.
How can I make sure that the Python version is correct?
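For what it's worth, the quickest check I know is logging the interpreter at the very start of the entry point (trivial sketch below); if I understand correctly, the agent side can also be pinned via agent.python_binary in clearml.conf:
```python
import sys

# Print the interpreter the task actually runs under, to compare with the expected version.
print(f"Running under Python {sys.version.split()[0]} at {sys.executable}")
```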
Ok. I spun up three AWS autoscalers, each with a different conf. I also fixed a submodule issue in my repo (which I believed was the cause of the git diff problem), and every run now passes this stage and fails later (a different problem). So I think store_code_diff_from_remote is of no help to me, but my problem is gone...
I tried that too. I do not get any more logs from the ClearML agent 😞
For now, I am uploading to the freely available ClearML server to store my data, but I will soon use S3 buckets instead. So the question applies to both use cases 🙂
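For the S3 case, I was thinking of something along these lines (a sketch, with hypothetical bucket and project names; dropping output_url would keep everything on the ClearML fileserver as today):
```python
from clearml import Dataset

# Sketch only: bucket, project and folder names are placeholders.
dataset = Dataset.create(dataset_name="training-data", dataset_project="debug")
dataset.add_files(path="./data")                              # local data folder
dataset.upload(output_url="s3://my-bucket/clearml-datasets")  # store the files on S3
dataset.finalize()
```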
It just allows me to have access to poetry and python installed on the container.
Sorry to come back to this! Regarding the Kubernetes Serving helm chart, I can see horizontal scaling of docker containers. What about vertical scaling? Is it implemented? More specifically, where is the SKU of the VMs in use defined?
Prerequisites, PyTorch models require Triton engine support, please use docker-compose-triton.yml / docker-compose-triton-gpu.yml or if running on Kubernetes, the matching helm chart.
I tried playing with those, but I do not manage to have any effect on the source code detection. I can modify the env variables, but nothing changes on the ClearML server, unfortunately.
These changes reflect the modifications I have in my working tree (not committed, not added to the staging area with git add). But I would like to remove this uncommitted section from ClearML and not be blocked by it.
Sure, here is the updated clearml.conf file of the AWS autoscaler instance:
agent {
    vcs_cache.enabled: false
    package_manager: {
        type: poetry,
        poetry_version: "1.4.2",
    }
}
sdk {
    development {
        store_code_diff_from_remote: false,
    }
}
I see uncommitted changes, whereas I would like to see none.
If I may also ask about another issue in this thread that is taking up a lot of my time:
Poetry Enabled: Ignoring requested python packages, using repository poetry lock file!
Creating virtualenv alfred-Rp77Shgw-py3.9 in /root/.cache/pypoetry/virtualenvs
Installing dependencies from lock file
2023-04-17 10:17:57
Package operations: 351 installs, 1 update, 1 removal
failed installing poetry requirements: Command '['poetry', 'install', '-n']' returned non-zero exit status 1.
Ignorin...
I am currently trying with a new dummy repo, iterating over the dependencies in the pyproject.toml.
Okay, thanks @<1523701205467926528:profile|AgitatedDove14>. And what would be the advantage of using clearml-server on K8s compared to the ClearML hosted one?
I have my Task.init inside a train() function inside the flask command. We basically have flask commands that allow us to trigger specific behaviors. When running it locally, everything works properly except the repository information. The use case is linked to the way our codebase works. For example, I run flask train {arguments} and it triggers the training of a model (that I want to track).
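Simplified, our setup looks something like this (names are illustrative, the real commands do much more):
```python
import click
from flask import Flask
from clearml import Task

app = Flask(__name__)

def run_training(epochs):
    # placeholder for the actual model training code
    pass

@app.cli.command("train")              # invoked as: flask train --epochs 10
@click.option("--epochs", default=10)
def train(epochs):
    # Task.init is called inside the flask command rather than in a top-level
    # script, which seems to be why the repository info is not auto-detected.
    task = Task.init(project_name="debug", task_name="flask-train")
    task.connect({"epochs": epochs})
    run_training(epochs)
```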
I stopped the autoscaler and deleted it manually. I did it because I wanted to test...
Yes, I take the export statements from the bash script of the task.