
I can raise this as an issue on the repo if that is useful?
lmk if I can expand on this more 🙂
`
2021-10-19 14:19:07
Spinning new instance type=aws4gpu
Error: Can not start new instance, An error occurred (InvalidParameterValue) when calling the RunInstances operation: Invalid availability zone: [eu-west-2]
Spinning new instance type=aws4gpu
ClearML Monitor: GPU monitoring failed getting GPU reading, switching off GPU monitoring
Error: Can not start new instance, An error occurred (InvalidParameterValue) when calling the RunInstances operation: Invalid availability zone: [eu-west-2]
S...
Still debugging.... That fixed the issue with the
nvcr.io/nvidia/tritonserver:22.02-py3
container which now returns
` =============================
== Triton Inference Server ==
NVIDIA Release 22.02 (build 32400308)
Copyright (c) 2018-2021, NVIDIA CORPORATION & AFFILIATES. All rights reserved.
Various files include modifications (c) NVIDIA CORPORATION. All rights reserved.
This container image and its contents are governed by the NVIDIA Deep Learning Co...
I make 2x in eu-west-2 on the AWS console but still no luck
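For context, that InvalidParameterValue usually means the value passed as the availability zone is actually a region name: eu-west-2 is the region, while RunInstances expects a zone such as eu-west-2a. A minimal boto3 sketch of the distinction (the AMI id and instance type here are placeholders):
`
import boto3

# the region goes on the client, a zone (region + letter suffix) goes on Placement
ec2 = boto3.client("ec2", region_name="eu-west-2")
ec2.run_instances(
    ImageId="ami-xxxxxxxx",                        # placeholder AMI
    InstanceType="g4dn.xlarge",                    # placeholder instance type
    MinCount=1,
    MaxCount=1,
    Placement={"AvailabilityZone": "eu-west-2a"},  # a valid zone; "eu-west-2" alone is rejected
)
`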
For the ClearML UI:
2021-10-19 14:24:13
ClearML results page:
Spinning new instance type=aws4gpu
ClearML Monitor: GPU monitoring failed getting GPU reading, switching off GPU monitoring
2021-10-19 14:24:18
Error: Can not start new instance, Could not connect to the endpoint URL: ""
Spinning new instance type=aws4gpu
2021-10-19 14:24:28
Error: Can not start new instance, Could not connect to the endpoint URL: ""
Spinning new instance type=aws4gpu
2021-10-19 14:24:38
Error: Can no...
Spin up an instance using the AWS auto-scaler and use the init script to:
Get key-value pairs from AWS SSM and write them to a .env file
Clone the private git repo
Build the docker image locally and use the .env file during docker-compose
Enter the container and spin up clearml-agent
`
echo -e $(aws ssm --region=eu-west-2 get-parameter --name 'my-param' --with-decryption --query "Parameter.Value") | tr -d '"' > .env
set -a
source .env
set +a
git clone https://${PAT}@github.com/myrepo/toolbox.git
mv .env toolbox/
cd toolbox/
docker-compose up -d --build
docker exec -it $(docker-compose ps -q) clearml-agent daemon --detached --gpus 0 --queue default
`
The latest commit to the repo uses 22.02-py3
( https://github.com/allegroai/clearml-serving/blob/d15bfcade54c7bdd8f3765408adc480d5ceb4b45/clearml_serving/engines/triton/Dockerfile#L2 ) I will have a look at versions now 🙂
Sure, I'll check this out later in the week and get back to you
Okay, I'm going to look into this further. We had around 70 volumes that were not deleted, but that could have been due to something else.
Hi SuccessfulKoala55 who's the best person on the team to speak with?
It doesn't help that the stacktrace isn't very verbose
I can run clearml.OutputModel(task, framework='pytorch')
to get the model from a previous task. But how can I get the PyTorch model ( torch.nn.Module ) from the output model object?
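A minimal sketch of one way to do that, assuming the model was registered on the previous task and saved with torch.save on the full module (if only a state_dict was saved, you would instantiate the architecture first and call load_state_dict):
`
import torch
from clearml import Task

task = Task.get_task(task_id="<previous-task-id>")   # placeholder id
output_model = task.models["output"][-1]             # last output model registered on the task
weights_path = output_model.get_local_copy()         # downloads the weights file locally
model = torch.load(weights_path)                     # torch.nn.Module if the full module was saved
`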
In short, we clone the repo, build the docker container, and run the agent in the container. The reason we do it this way, rather than provide a docker image to the clearml-agent, is twofold:
We actively develop our custom networks and architectures within a containerised env to make it easy for engineers to have a quick dev cycle for new models (the same repo is cloned and we build the docker container to work inside).
We use the same repo to serve models on our backend (in a slightly different contain...
Okay thanks for the update 🙂 the account manager got involved and the limit has been approved 🚀
I'd like to get what I'll call Run Time
via the task object.... I think I need to calculate it manually
i.e.
`
task = clearml.Task.get_task(id)
time = task.data.last_update - task.data.started
`
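For reference, a small sketch of that manual calculation, assuming both fields come back as datetimes so the difference is a timedelta:
`
from clearml import Task

task = Task.get_task(task_id="<task-id>")              # placeholder id
run_time = task.data.last_update - task.data.started   # datetime.timedelta, assuming both fields are set
print(run_time.total_seconds())
`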
Umm no luck
`
q = client.queues.get_all(name='default')[0]
from_date = math.floor(datetime.timestamp(datetime.now() - relativedelta(months=3)))
to_date = math.floor(datetime.timestamp(datetime.now()))
res = client.queues.get_queue_metrics(from_date=from_date, to_date=to_date, interval=1, queue_ids=[q.id])
`
Going for something like this:
` >>> queue = QueueMetrics(queue='queueid')
queue.avg_waiting_times `
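A rough sketch of what that wrapper could look like, reusing the APIClient calls above; the response field names (queues, dates, avg_waiting_times) are assumptions about the queues.get_queue_metrics payload and may need adjusting:
`
import math
from datetime import datetime
from dateutil.relativedelta import relativedelta
from clearml.backend_api.session.client import APIClient


class QueueMetrics:
    """Hypothetical convenience wrapper around queues.get_queue_metrics."""

    def __init__(self, queue, months_back=3):
        self.client = APIClient()
        self.queue_id = queue
        self.months_back = months_back

    @property
    def avg_waiting_times(self):
        from_date = math.floor(datetime.timestamp(datetime.now() - relativedelta(months=self.months_back)))
        to_date = math.floor(datetime.timestamp(datetime.now()))
        res = self.client.queues.get_queue_metrics(
            from_date=from_date, to_date=to_date, interval=1, queue_ids=[self.queue_id]
        )
        # assumed response shape: one entry per queue, each with a time series of avg waiting times
        return res.queues[0].avg_waiting_times


queue = QueueMetrics(queue='queueid')
print(queue.avg_waiting_times)
`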
I've got it... I just remembered I can call task_id
from the cloned task and check the status of that 🙂
This was the response from AWS:
"Thank you for for sharing the requested details with us. As we discussed, I'd like to share that our internal service team is currently unable to support any G type vCPU increase request for limit increase.
The issue is we are currently facing capacity scarcity to accommodate P and G instances. Our engineers are working towards fixing this issue. However, until then, we are unable to expand the capacity and process limit increase."
g4dn.xlarge (the best price for 16GB of GPU RAM). Not so surprising they would want a switch