Hi, I don't think the clearml-agent actually ran at that point in time. All I can see in the pod is:
apt install of the libpthread-stubs, libx11, libxau and libxcb1 packages, then pip install of clearml-agent. After the above succeed, the pod just hangs there.
Ok. Any idea what could be going on between setting up clearml-agent and initialising the clearml-agent itself? Does the clearml-agent try to communicate with any internet address? From another perspective, it looks like a long timeout issue. I happen to be deploying on a disconnected on-premise setup.
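In the meantime, here's the kind of sanity check I can run from inside the pod, assuming the api server exposes the usual debug.ping health endpoint (the hostname below is a placeholder for our on-prem address):
```
# minimal connectivity check from inside the agent pod;
# the api server URL is a placeholder for the on-prem address
import requests

resp = requests.get("http://apiserver.clearml.local:8008/debug.ping", timeout=5)
print(resp.status_code, resp.text)
```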
Does the glue write any error logs anywhere? I only see `CLEARML_AGENT_UPDATE_VERSION =` and nothing else.
Is there a way for the k8s glue to pass self-signed cert information on to the agent pods?
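For reference, what I'd expect to end up with inside the pod is something along these lines in the agent's clearml.conf (a sketch; turning verification off would only be for testing, otherwise I'd point python-requests at the CA via the standard REQUESTS_CA_BUNDLE env variable):
```
# sketch: agent-side clearml.conf on the pod (testing only)
api {
    verify_certificate = false
}
```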
Alright, thanks. It's important we clarify that it works before we migrate the infra.
It's hard to tell, but the agent change was a significant one. Unless Python versions have something to do with it.
Hi,
I'm running on a Dell ECS storage appliance, which offers S3 compatibility.
Yes, http://ECS.ai is the DNS name of the server.
ClearML-models is the bucket.
Let me try with ip:port.
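For reference, the sdk.aws.s3 section in clearml.conf that I'm testing looks roughly like this (the ip:port and keys below are placeholders):
```
sdk {
    aws {
        s3 {
            credentials: [
                {
                    # ECS S3-compatible endpoint as ip:port (placeholder values)
                    host: "10.0.0.1:9020"
                    bucket: "clearml-models"
                    key: "ACCESS_KEY"
                    secret: "SECRET_KEY"
                    multipart: false
                    secure: false  # http endpoint, not https
                }
            ]
        }
    }
}
```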
Hi, by agent logs I suppose you meant the logs from the ClearML server console panel?
Hi, just wondering if I did something wrong here. Would the k8s glue be the reason it's not working? I'm purchasing the enterprise version, and if vault has the same problem it'll be a big issue.
Can I dig into MongoDB or ES to pull this data?
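For example, something like this against the ES container is what I had in mind (host/port and index pattern are assumptions based on the defaults):
```
# sketch: read-only query against the ClearML Elasticsearch backend;
# host/port and index pattern are assumptions based on default deployments
import requests

es = "http://localhost:9200"
resp = requests.get(f"{es}/events-log-*/_search", json={"size": 10})
print(resp.json())
```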
I think in general, the 'published' action can be considered an 'approval'. The question is, how do we control who has the authority to 'publish'? The Web UI today does not support any uploads outside of the coding environment; it would be nice if that were supported. For now, the only workaround is to include parameters that store document URLs in the user properties.
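For context, that workaround looks roughly like this (the property names and URL are my own placeholders):
```
from clearml import Task

task = Task.get_task(task_id="<task_id>")  # placeholder task id
# attach approval metadata as user properties, visible in the web UI
task.set_user_properties(
    approval_doc="https://intranet.example/docs/approval-123",  # placeholder URL
    approved_by="reviewer_name",
)
```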
```
docker exec clearml-elastic curl
zsh: no matches found:
[root@2c7498711bef elasticsearch]# curl -XGET
yellow open events-training_stats_scalar-d1bd92a3b039400cbafc60a7a5b1e52b 4hAFNtGkRr-CHNGnUYfbTA 1 1 4724 271 660.9kb 660.9kb
yellow open events-log-d1bd92a3b039400cbafc60a7a5b1e52b M3qgFy1HRU2PibDOr1YOdw 1 1 1221 20 1013.6kb 1013.6kb
red open worker_stats_d1bd92a3b039400cbafc60a7a5b1e52b_2021-05 EQK8mnlhRxCrrKK3clcUFA 1 1
red open queue_metrics_d1bd92a3b039400cbafc60a7a5b1e52b_...
```
Some breakthrough. The problem is that we switched the web, api and files servers to https (ssl) endpoints. I switched back to http endpoints to test this theory.
Although it's not printing the error, I suspect it's not able to connect due to the lack of the self-signed cert. Previously this wasn't an issue; not sure what changed in clearml_agent=1.1.0.
There's a secondary issue resulting from this; I will put it on a new thread.
Nice, what are the names of the talks?
Hi, this is the setup.
client
```
from clearml import Task, Logger

task = Task.init(project_name='DETECTRON2', task_name='Train', task_type='training')
task.set_base_docker("quay.io/fb/detectron2:v3 --env GIT_SSL_NO_VERIFY=true --env TRAINS_AGENT_GIT_USER=testuser --env TRAINS_AGENT_GIT_PASS=testuser")
task.execute_remotely(queue_name="single_gpu", exit_process=True)
```
k8s_glue_example.py spawned a pod and the task started running.
ClearML UI -> Experiment -> Results -> Console.
At the top it will pri...
Sorry, I don't quite understand this. The task itself was submitted when I ran the code on the client. I suppose the dependency requirements would be copied over when the experiment is cloned?
Ah ok. So it will be fixed on the ClearML server web UI as well? (See screenshots).
Hi. Yup, the model was not physically uploaded with the ip:port into the bucket, although ClearML does indicate that it's there, except that I can't download it. I also verified this with another S3 client; the model was not there either.
AgitatedDove14 , would you elaborate on this resolution process?
Ok thanks, looking forward to it. Could you share details on the bug you encountered?
Hi, this is what I got. No mention of the env variables.
```
Current configuration (clearml_agent v0.17.2, location: /home/jax/clearml.conf):
api.version = 1.5
api.verify_certificate = true
api.default_version = 1.5
api.http.max_req_size = 15728640
api.http.retries.total = 240
api.http.retries.connect = 240
api.http.retries.read = 240
api.http.retries.redirect = 240
api.http.retries.status = 240
api.http.retries.backoff_factor = 1.0
api.http.retries.backoff_max = 120.0
ap...
```
Ok thanks. This would mean that increasing the disk space for my ClearML server is the only option, as we are not at liberty to delete.
Can this issue be solved with vault? It doesn't make sense to expose secrets like that.
Hi, the latest k8s_glue_example.py was last committed about 4 months ago. Are you referring to that version?
Ok. I noted this is due to the venv_update setting. It needs to be disabled, as it has a dependency on an internet URL. We can close this.
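For anyone hitting the same thing, the setting I mean lives in the agent section of clearml.conf, roughly like this (assuming the key layout of the agent's default config):
```
agent {
    venv_update {
        # disable, since updating depends on reaching an internet URL
        enabled: false
    }
}
```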