SubstantialElk6

117 Questions, 310 Answers

Active since 10 January 2023

Last activity one year ago

Reputation

Badges 1

282 × Eureka!


0 Votes

7 Answers

1K Views

Hi, I Was Using The K8S Glue And It Worked Fine On One Project But Didn't Work On Another. At The Point Just Before A Git Clone Was Executed, I Get The Error

Hi, i was using the K8S Glue and it worked fine on one project but didn't work on another. At the point just before a git clone was executed, i get the error...

clearml

3 years ago

0 Votes

1 Answers

1K Views

[Distributed Training] Hi, I Have A Clearml Setup With K8Sglue That Spins Up Pods Of 4 Gpus When Picking Tasks Off The Clearml Queue. We Would Now Want To Proceed With Multi-Node Training, And Some Of The Examples We Are Trying Are Here.

[Distributed Training] Hi, i have a ClearML setup with K8SGlue that spins up pods of 4 GPUs when picking tasks off the clearml queue. We would now want to pr...

clearml

one year ago

0 Votes

2 Answers

939 Views

Hi, We Have Been Using Clearml In Our Development Environment To Train Our Models And Benchmarking Them. I Was Wondering What Is Clearml's Role In Transition To Production. Two Specific Points, Deployment, And Automated Retraining Pipeline.

Hi, we have been using ClearML in our development environment to train our models and benchmark them. I was wondering what is ClearML's role in transition...

clearml

3 years ago

0 Votes

30 Answers

1K Views

Hi, Several Changes Occurred Recently And I Would Like To Know If There's A Way To Verbose Catch All The Printout That Happening Within A K8S Glue Spawned Pod. We Have An Issue Where All Of Our New Remote_Execution Tasks Are Stuck In The 'Pending' Stage.

Hi, several changes occurred recently and I would like to know if there's a way to verbosely capture all the printout that's happening within a k8s glue spawned po...

mlops

3 years ago

0 Votes

1 Answers

1K Views

Hi. Is This Line In The Roadmap Article Still Valid, Is It Showing Up In Clearml-Serving?

Hi. is this line in the roadmap article still valid, is it showing up in clearml-serving? https://clear.ml/blog/clearml-community-roadmap/ I believe we final...

clearml

2 years ago

0 Votes

3 Answers

1K Views

Hi, I've Multiple Tasks Setup In A Complex Pipeline. How Can I;

Hi, I've multiple tasks setup in a complex pipeline. How can I; Define prior to running the pipeline, which tasks to be running on which remote queue using w...

clearml

2 years ago

0 Votes

1 Answers

1K Views

[Security] Hi, One Of Our Teams Noted That Previews Of Clearml-Data Datasets Are Saved In The Files_Server (Indicated In Clearml.Conf) Instead Of The Indicated Output_Uri In The Dataset.Create Argument. This Results In A Security Breach. May I Ask If This

[Security] Hi, one of our teams noted that previews of clearml-data datasets are saved in the files_server (indicated in ClearML.conf) instead of the indicat...

dataset

one year ago

0 Votes

4 Answers

1K Views

Hi, I Noticed That All Other Users Can See My Experiments. Does Clearml Has The Feasibility To Only Allow Certain Groups Of People To See Each Other's Work?

Hi, I noticed that all other users can see my experiments. Does ClearML have the ability to allow only certain groups of people to see each other's work?

clearml

3 years ago

0 Votes

30 Answers

1K Views

Hi, I'm Getting This Long Error When Running

Hi, I'm getting this long error when running task.execute_remotely(queue_name="1gpu", exit_process=True) . I also noticed an error Failed to fetching activit...

clearml

3 years ago

0 Votes

29 Answers

1K Views

Hi, I Started My Agent Using. clearml-agent daemon --gpus 0 --queue gpu --docker --foreground, With The Following Parameters In clearml.conf.

Hi, I started my agent using. clearml-agent daemon --gpus 0 --queue gpu --docker --foreground, with the following parameters in clearml.conf. default_docker:...

mlops

3 years ago

0 Votes

1 Answers

1K Views

Hi, I Would Like To Understand The Dataflow When Using Clearml-Data. When I Use Clearml-Data Cli To Get Versioned Data. Does The Data Proxy Though Clearml Server Before Arriving To The Client, Or Clearml-Data Is Directly Pulling From The S3 Storage? Assum

Hi, I would like to understand the dataflow when using clearml-data. When I use clearml-data CLI to get versioned data. Does the data proxy through ClearML Se...

dataset

2 years ago

0 Votes

5 Answers

1K Views

Hi, I Had A Task Successfully Completed. Then I Cloned It And Enqueued It Again Without Any Changes. But The Task Ends Up With An Error. Here's The Logs, Not Sure What Went Wrong.

Hi, i had a task successfully completed. Then i cloned it and enqueued it again without any changes. But the task ends up with an error. Here's the logs, not...

clearml

3 years ago

0 Votes

3 Answers

1K Views

Hi, I Was Adding Data Using Clearml-Data And Get The Following Consistent Errors.

Hi, i was adding data using clearml-data and get the following consistent errors. Retrying (Retry(total=237, connect=237, read=240, redirect=240, status=240)...

dataset

2 years ago

0 Votes

4 Answers

1K Views

Hi, I'm Running Clearml Agents Via K8S Glue. I Noticed That The Agent Is Not Pulling Latest Images Even Though

Hi, I'm running clearml agents via K8s glue. I noticed that the agent is not pulling latest images even though docker_force_pull is set to true. A kubectl de...

mlops

3 years ago

0 Votes

11 Answers

1K Views

Hi, I Shifted My Clearml Setup To An On-Premise Disconnected Env, Which Has A Pip Repo Setup. I Noted This Warning,

Hi, i shifted my clearml setup to an on-premise disconnected env, which has a pip repo setup. I noted this warning, Trying pip install: /root/.clearml/venvs-...

pytorch

3 years ago

0 Votes

1 Answers

1K Views

Hi, How Is The Priority Of The Configuration Like? Which One Takes Precedence? For Example, Output_Uri

Hi, what is the priority order of the configuration? Which one takes precedence? For example, output_uri default_output_uri in clearml.conf on client files_ser...

dataset

2 years ago

0 Votes

4 Answers

1K Views

Hi, How Can I Pass An Env Variable To The Docker That's Running The Agent When I Run This? I'm Having Issues With The Agent's Git Clone Where It Requires Sslverification To Be Disabled. clearml-agent daemon --gpus 0 --queue gpu --docker --foreground

Hi, how can I pass an env variable to the docker that's running the agent when I run this? I'm having issues with the agent's git clone where it requires ssl...

mlops

3 years ago

0 Votes

1 Answers

1K Views

Clearml-Agent Didn't Seem To Take The Ca Store From The Os. Where Can I Point Clearml To The Ca Certs, In Particular For Uploading Of Models Into S3. At The Moment I Am Simply Disabling Verification.

Clearml-Agent didn't seem to take the CA store from the OS. Where can i point ClearML to the CA certs, in particular for uploading of models into S3. At the ...

clearml

3 years ago

0 Votes

2 Answers

1K Views

Hi, I Was Uploading An Image Artifact Using The Following But In The Preview I Only Get An Array Instead Of An Image. Am I Doing Something Wrong? ``` im=cv2.imread('pic.jpg') task.upload_artifact('myimage',im) ```

Hi, I was uploading an image artifact using the following but in the preview I only get an array instead of an image. Am I doing something wrong? im=cv2.imre...

clearml

3 years ago

0 Votes

8 Answers

978 Views

I Just Getting This In My Agent Run Task. Would Appreciate If Someone Can Advise Where I Externalrequirement Is Pointing At.

I'm just getting this in my agent run task. Would appreciate if someone can advise where externalrequirement is pointing at. RequirementsManager handler rais...

mlops

3 years ago

0 Votes

0 Answers

945 Views

Current Configuration (clearml_agent v0.17.2rc4, location: /root/clearml.conf): ---------------------- agent.worker_id = dgxstation-2:gpu3 agent.worker_name = dgxstation-2 agent.force_git_ssh_protocol = false agent.python_binary = agent.package_manager.t

Current configuration (clearml_agent v0.17.2rc4, location: /root/clearml.conf): ---------------------- agent.worker_id = dgxstation-2:gpu3 agent.worker_name ...

pytorch

3 years ago

0 Votes

6 Answers

1K Views

Hi, I Can't Seem To Set A Password To Clearml, Anyone Seems To Be Able To Just Enter The Username And They Can Enter That Username's Workspace.

Hi, i can't seem to set a password to clearml, anyone seems to be able to just enter the username and they can enter that username's workspace.

clearml

3 years ago

0 Votes

10 Answers

1K Views

Hi, We Are Using Gitlab And It Is A Security Requirement To Use Ssh Keys To Access The Repos For Each Individual. We Are Also Using K8S Glue. Is There Any Provisions To Do This Seamlessly?

Hi, we are using GitLab and it is a security requirement to use ssh keys to access the repos for each individual. We are also using k8s glue. Is there any pr...

clearml

3 years ago

0 Votes

7 Answers

1K Views

Hi, I Would Like To Understand More On How Clearml Deal With Codes.

Hi, I would like to understand more on how ClearML deals with code. I noticed that I am able to read the source code of the python script that I have used a...

clearml

3 years ago

0 Votes

0 Answers

1K Views

Hi, We Are Working On A Mini Project To 'Integrate' Clearml Datasets With Ckan. Wondering If The Community Could Share Some Ideas.

Hi, we are working on a mini project to 'integrate' ClearML Datasets with CKAN. Wondering if the community could share some ideas.

clearml

2 years ago

0 Votes

0 Answers

891 Views

Hi, We Are Encountering An Increasing Number Of Cases Where It Takes Quite A While Before Actual Training (Gpu Utilisation) Can Be Done. After Observing, This Is What We Discovered. The Following Are The Steps And Bottlenecks.

Hi, we are encountering an increasing number of cases where it takes quite a while before actual training (GPU utilisation) can be done. After observing, thi...

clearml

one year ago

0 Votes

1 Answers

1K Views

Hi, I Was Running My Agent And Had A Few Scripts For Agent.Extra_Docker_Shell_Script. But When I Looked Through The Logs, They Were Not Executed. Any Idea Why? Using Agent V1.01R1 In K8S Glue.

Hi, i was running my agent and had a few scripts for agent.extra_docker_shell_script. but when I looked through the logs, they were not executed. Any idea wh...

mlops

3 years ago

0 Votes

3 Answers

1K Views

Hi, Can I Default A Docker Image When Running A Pipeline? I Currently Set It As

Hi, can I default a docker image when running a pipeline? I currently set it as pipe = PipelineController(...) pipe.task.set_base_docker("ubuntu:20.04") pipe....

clearml

2 years ago

0 Votes

1 Answers

991 Views

Hi, I've Three Questions Regarding Clearml Pipelines.

Hi, I've three questions regarding clearml pipelines. - can I check when we use a clearml pipeline and data get transferred from stage to stage, do the data ...

clearml

one year ago

0 Votes

2 Answers

1K Views

Hi, In The New Datasets Ui. It Doesn't Seem To Display The Entire Lineage Of The Datasets. For Example. If A Dataset Is Create As Such Id1 (Parent)->Id2, Then Another Dataset Created As Id2(Parent)-> Id3. When You Look At Id3, It Only Shows Id2 As Parent.

Hi, in the new datasets UI. It doesn't seem to display the entire lineage of the datasets. For example. if a dataset is created as such id1 (parent)->id2, the...

clearml

2 years ago

0 Hi, Is There Any Code Examples Of How Dataops Is Being Established?

Transform feature engineering and data processing code into recurring data ingestion workflows. Start building data stores, develop, automate, and schedule complex data processing jobs.

3 years ago

0 Hi, Is There Any Code Examples Of How Dataops Is Being Established?

Yeah that'll cover the first two points, but I don't see how it'll end up as a dataset catalogue as advertised.

3 years ago

0 Hi, I Noted That If I Run My Codes On My Laptop With Remote_Execute Off A Python3.8 Venv, And When The Remote Task Starts Executing But The Image Is Installed With A Different Version Of Python, Say Python3.8, We Would Encounter Errors With Venv. At This

They don't have the same version. I do seem to notice that if the client is using version 3.8, remote execution will try to use that same version even though the docker image does not have that version installed.

3 years ago

0 Hi, I Was Using The K8S Glue And It Worked Fine On One Project But Didn't Work On Another. At The Point Just Before A Git Clone Was Executed, I Get The Error

Hi, so you mean I need to install virtualenv in my base image?

3 years ago

0 Hi, I Cant Access The Clearml Ui, Is There Something Wrong With Your Servers?

Hi, it looks like the entire http://clear.ml domain has been offline for more than 12 hours. Main pages and documentation are inaccessible as well.

2 years ago

0 Hi, I Was Using The K8S Glue And It Worked Fine On One Project But Didn't Work On Another. At The Point Just Before A Git Clone Was Executed, I Get The Error

Hi this is the log. I didn't see any attempt from the agent to install virtualenv on the base image.
` 1618369068169 clearml-gpu-id-b926b4b809f544c49e99625380a1534b:gpuGPU-4ad68290-0daf-4634-6768-16fad73d47a3 DEBUG Current configuration (clearml_agent v0.17.2, location: /tmp/.clearml_agent.wgsmv2t9.cfg):

agent.worker_id = clearml-gpu-id-b926b4b809f544c49e99625380a1534b:gpuGPU-4ad68290-0daf-4634-6768-16fad73d47a3
agent.worker_name = clearml-gpu-id-b926b4b809f544c49e99625...

3 years ago

0 The Party Continues On Reddit. I'm Sure There Will Be Questions From Non-Users, I Hope You Could All Join In And Answer!

Congrats on v1.0. 🎉

3 years ago

0 Hi, Is There Any Code Examples Of How Dataops Is Being Established?

The first is probably done using pipeline controllers, the second using Datasets or HyperDatasets. It's not very clear how the last one is achieved, especially the searchable data catalogs.

3 years ago

0 Hi, I Noticed That All Other Users Can See My Experiments. Does Clearml Has The Feasibility To Only Allow Certain Groups Of People To See Each Other's Work?

Hi, Self-hosted using docker-compose.

3 years ago

0 Hi, I'm Running Clearml Agents Via K8S Glue. I Noticed That The Agent Is Not Pulling Latest Images Even Though

Ok. Problem was resolved with latest version of clearml-agent and clearml.

3 years ago

0 Hi, Can I Default The Clearml Fileserver To A S3 Path?

In the ClearML config that's being run by the ClearML container?

3 years ago

0 Hi, If I've Clearml Agents Installed On Several Servers, Each With A Single Gpu. How Can I Train A Gpt2 Model That Would Require Multiple Gpus?

Thanks. The challenge we encountered is that we only expose our Devs to the ClearML queues, so users have no idea what's beyond the queue except that it will offer them the resources associated with the queue. In the backend, each queue is associated with more than one host.

So what we tried is as follows.
We create a train.py script much like what Tobias shared above. In this script, we use the socket library to pull the ipaddr.

import socket
hostname=socket.gethostname()
ipaddr=dock...
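
A minimal sketch of the lookup described above, assuming the goal is for a worker to discover its own hostname and IP address from inside the training script. The post's own code is cut off after `ipaddr=`, so the `socket.gethostbyname` call and the loopback fallback here are assumptions rather than the original approach, and `local_address` is a hypothetical helper name.

```python
import socket


def local_address():
    """Return (hostname, ip) for the machine running this script."""
    hostname = socket.gethostname()
    try:
        # Assumption: the hostname resolves via DNS or /etc/hosts.
        ipaddr = socket.gethostbyname(hostname)
    except OSError:
        # Fall back to loopback when the hostname cannot be resolved.
        ipaddr = "127.0.0.1"
    return hostname, ipaddr


hostname, ipaddr = local_address()
print(f"{hostname} -> {ipaddr}")
```

Inside a container this typically reports the container's own address rather than the host's, which may be part of the difficulty when users only see a queue and not the nodes behind it.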

one year ago

0 Hi, If I've Clearml Agents Installed On Several Servers, Each With A Single Gpu. How Can I Train A Gpt2 Model That Would Require Multiple Gpus?

From ClearML's perspective, how would we enable this, considering we don't have direct control over, or even the IPs of, the agents?

one year ago

0 Hi, Clearml Console Leaks Credentials Passed In As Env Vars. The Issue Remains With Clearml Version==1.1.1.135 - 1.1.1 - 2.1.4 (As Listed On The Profile Page) I Am Using K8S Glue And The Clearml.Conf Has The Following In The Agent Section.

ok, i'll wait till i get my hands on vault then. thanks.

3 years ago

0 Hi, I'm Running Clearml Agents Via K8S Glue. I Noticed That The Agent Is Not Pulling Latest Images Even Though

thanks SuccessfulKoala55 . I verified your last comment and it works.

3 years ago

0 Hi, If I've Clearml Agents Installed On Several Servers, Each With A Single Gpu. How Can I Train A Gpt2 Model That Would Require Multiple Gpus?

Yeah, the issue is that ClearML is unable to talk to the nodes, because PyTorch distributed needs to know their IPs. There is some sort of integration missing that would enable this.
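
For context, PyTorch's `env://` initialisation reads its rendezvous details from the `MASTER_ADDR`, `MASTER_PORT`, `RANK` and `WORLD_SIZE` environment variables. Below is a minimal sketch of assembling them, assuming the master node's IP has somehow been made known to every worker, which is exactly the step reported as missing. `build_dist_env` and the example address are hypothetical and not part of ClearML.

```python
def build_dist_env(master_addr, master_port, rank, world_size):
    """Build the environment variables read by torch.distributed's env:// init."""
    if not 0 <= rank < world_size:
        raise ValueError("rank must be in [0, world_size)")
    return {
        "MASTER_ADDR": master_addr,        # IP or hostname of the rank-0 node
        "MASTER_PORT": str(master_port),   # a free port on the rank-0 node
        "RANK": str(rank),                 # global rank of this process
        "WORLD_SIZE": str(world_size),     # total number of processes
    }


# Hypothetical values for the second of two workers.
env = build_dist_env("10.0.0.1", 29500, rank=1, world_size=2)
print(env)
```

Each worker would export these before calling `torch.distributed.init_process_group`; how the address reaches workers that sit behind a queue is left open here, as it is in the thread.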

one year ago

0 I Got An Interesting Question From My Devs. If They Wish To Do Distributed Training, Is Clearml K8S Glue Suitable For It? Local Multiple Gpu: Just A Matter Of Assigning More Than One Gpu In The Yaml File Sent To The K8S Glue. Question Is How To Make This

Sorry, by dev end I was referring to my developers.

I didn't think Horovod needs to be as complicated as you described. It can also work by running on multiple known nodes. How would I add a glue for multinode?

Horovod also works with other similar products such as yours (e.g. Polyaxon).

3 years ago

0 We're Working On Clearml Serving Right Now And Are Very Interested In What You All Are Searching For In A Serving Engine, So We Can Make The Best Serving Engine We Can

I think a related question is, ClearML relies heavily on Triton (good thing) but Triton only supports a few frameworks out of the box. So this 'engine' needs to make sure it can work with Triton and use all its wonderful features such as request batching, GPU reuse, etc.

2 years ago

0 Hi. Try To Use Clearml On Work. I'm Have Problem With Clearml-Agent, Because On Work We Dont Have Internet Acceses. For Install Packages We Used Mirror Pypi (Not All Packages) And Manualy Add Package On Disk With Line In Pip.Conf --Follow-Link=~/Pypi. It

I used the nvcr pytorch image and instructed clearml to inherit global dependencies. No need to install torch, and it works well.

3 years ago

0 Hi, I Have A Future Roadmap Question On Clearml-Datasets. The Current Implementation Works Well For Small Datasets But Its Rather Ineffective For Very Large Datasets. For Example, Let's Say I Have 10 Million Images Just For The Training Dataset, And My T

Yes! I definitely think this is important, and hopefully we will see something there

(or at least in the docs)

Hi AgitatedDove14 , any updates in the docs to demonstrate this yet?

3 years ago

0 Hi, I Started My Agent Using. clearml-agent daemon --gpus 0 --queue gpu --docker --foreground, With The Following Parameters In clearml.conf.

Next step is to figure out if I can do all that in the python code instead of the UI.

3 years ago

0 Hi, I Started My Agent Using. clearml-agent daemon --gpus 0 --queue gpu --docker --foreground, With The Following Parameters In clearml.conf.

Hi,
It did, nvidia/cuda:10.1-runtime-ubuntu18.04.

So if I need to set this every time, what is the following config for? And how do I pass in new env parameters?
` default_docker: {
    # default docker image to use when running in docker mode
    image: "dockerrepo/mydocker:custom"

    # optional arguments to pass to docker image
    # arguments: ["--ipc=host", ]
    arguments: ["--env GIT_SSL_NO_VERIFY=true",]
} `

3 years ago

0 Hi, I Started My Agent Using. clearml-agent daemon --gpus 0 --queue gpu --docker --foreground, With The Following Parameters In clearml.conf.

Ok thanks, that worked.

3 years ago

0 Hi, Is There A Command I Can Use To Generate A Report That Can

No issues. I know it's hard to track open threads with Slack. I wish there were a plugin for this too. 🙂

3 years ago

0 Hi, Is There A Command I Can Use To Generate A Report That Can

Any idea where I can find the relevant API calls for this?

3 years ago

0 Hi, We Noted That Using K8S Glue, There Are Some Situations Where The Task Cannot Be Registered As Error And Will Be Stuck At Pending. An Example Of One Situation Is When The Task Is Pulling A Docker Image That Doesn't Exist. Is There A Way To Catch Such

Oh, this means I have been using the latest agent, which is v1.0.0. The problems were still there.

3 years ago

0 Hi, I've A Few Questions On Clearml-Session.

Unfortunately, due to security, clients can't have direct access to the nodes. Are there any possible workarounds at the moment?

3 years ago

0 Sorry, I'm Asking Too Much Questions Today. I Gave Myself A Whole Day To Fully Evaluate Clearml...That's Why. Here Goes. Regarding Automatic Logging (Automagikal), I Took Your Example (

Thanks. Have a better understanding now.

3 years ago

0 Hi, I Am Trying To Use Clearml-Data To Upload My Data To S3, Which Is Password Protected. How Should I Indicate The Credentials After I Set --Storage S3://.... ?

Like creating multiple datasets?
create parent (all) - upload to S3
create child1 (first 100k)
create child2 (second 100k)...blah blah

Then only pull indices from the children. Technically workable, but not sure if it's the best approach since different people have different batch sizes in mind.
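
The parent/child split above can be sketched as plain index ranges. This is only an illustration with no ClearML calls: `shard_ranges` is a hypothetical helper, the 100k shard size comes from the post, and the total of 250k items is made up for the example.

```python
def shard_ranges(total, shard_size):
    """Yield half-open (start, stop) index ranges that together cover range(total)."""
    if shard_size <= 0:
        raise ValueError("shard_size must be positive")
    for start in range(0, total, shard_size):
        # The last shard is shorter when total is not a multiple of shard_size.
        yield start, min(start + shard_size, total)


# Each tuple would correspond to one child dataset of the parent.
ranges = list(shard_ranges(250_000, 100_000))
print(ranges)  # [(0, 100000), (100000, 200000), (200000, 250000)]
```

The fixed shard size is the weak point raised in the post: anyone wanting a different batch size would need a different set of children.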

3 years ago

Hi SuccessfulKoala55 , is there a channel here that posts version updates?

3 years ago
