Transform feature engineering and data processing code into recurring data ingestion workflows. Start building data stores, develop, automate, and schedule complex data processing jobs.
Yeah that'll cover the first two points, but I don't see how it'll end up as a dataset catalogue as advertised.
They don't have the same version. I do seem to notice that if the client is using version 3.8, the remote execution will try to use that same version, even though the docker image doesn't have it installed.
Hi, so you meant I need to install virtualenv in my base image?
Hi, it looks like the entire http://clear.ml domain has been offline for more than 12 hours. The main pages and documentation are inaccessible as well.
Hi, this is the log. I didn't see any attempt from the agent to install virtualenv on the base image.
` 1618369068169 clearml-gpu-id-b926b4b809f544c49e99625380a1534b:gpuGPU-4ad68290-0daf-4634-6768-16fad73d47a3 DEBUG Current configuration (clearml_agent v0.17.2, location: /tmp/.clearml_agent.wgsmv2t9.cfg):
agent.worker_id = clearml-gpu-id-b926b4b809f544c49e99625380a1534b:gpuGPU-4ad68290-0daf-4634-6768-16fad73d47a3
agent.worker_name = clearml-gpu-id-b926b4b809f544c49e99625...
Congrats on v1.0. 🎉
I see. Is there a more elaborate code example that describes the above interactions?
The first is probably done using pipeline controllers, the second using Datasets or HyperDatasets. It's not very clear how the last one is achieved, especially the searchable data catalogs.
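As a rough sketch of the Datasets side (the project and dataset names below are made up for illustration, not taken from this thread), registering data and looking it up again would go something like:
` from clearml import Dataset

# Register a local folder as a versioned dataset (illustrative names).
ds = Dataset.create(dataset_project="examples", dataset_name="my_dataset")
ds.add_files("/path/to/local/data")
ds.upload()
ds.finalize()

# Later, look the dataset up by project/name and fetch a local copy.
ds = Dataset.get(dataset_project="examples", dataset_name="my_dataset")
local_path = ds.get_local_copy() `
How that scales into a searchable catalog is exactly the part that's still unclear to me.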
Hi, Self-hosted using docker-compose.
Ok. The problem was resolved with the latest versions of clearml-agent and clearml.
In the ClearML config that's being run by the ClearML container?
Ok sure. Thanks.
ah... thanks!
Thanks. The challenge we encountered is that we only expose our Devs to the ClearML queues, so users have no idea what's beyond the queue except that it will offer them the resources associated with the queue. In the backend, each queue is associated with more than one host.
So what we tried is as follows.
We create a train.py script much like what Tobias shared above. In this script, we use the socket library to pull the ipaddr.
import socket
hostname=socket.gethostname()
ipaddr=dock...
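For completeness, here is a minimal runnable version of that idea (the original snippet is truncated, so the last assignment here is only an assumed completion using plain hostname resolution):
` import socket

# Resolve this node's hostname and an IP address for it.
# Assumption: gethostbyname() returns an address reachable by the other nodes.
hostname = socket.gethostname()
ipaddr = socket.gethostbyname(hostname)
print(hostname, ipaddr) `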
From a ClearML perspective, how would we enable this, considering we don't have direct control over the agents, or even their IPs?
ok, i'll wait till i get my hands on vault then. thanks.
thanks SuccessfulKoala55 . I verified your last comment and it works.
Yeah.. the issue is ClearML is unable to talk to the nodes because PyTorch distributed needs to know their IPs. There is some sort of integration missing that would enable this.
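To make that concrete: with the usual env:// rendezvous, every worker has to know the master node's address before initialization, e.g. (the address, port and ranks below are illustrative):
` import os
import torch.distributed as dist

# Illustrative values; in practice MASTER_ADDR must be a reachable node IP,
# which is exactly the piece of information we can't get from ClearML here.
os.environ["MASTER_ADDR"] = "192.168.50.10"
os.environ["MASTER_PORT"] = "29500"
dist.init_process_group(backend="nccl", init_method="env://", rank=0, world_size=2) `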
Sorry, by dev end I was referring to my developers.
I didn't think Horovod needs to be as complicated as you described. It can also work by running on multiple known nodes. How would I add a glue for multi-node?
Horovod does also work with other similar products such as yours (e.g. Polyaxon).
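For reference, outside ClearML, Horovod's own launcher can target known nodes directly, e.g. (host names and slot counts here are illustrative):
` # Run 8 processes across two known hosts, 4 GPU slots each.
horovodrun -np 8 -H node1:4,node2:4 python train.py `
The missing glue is getting those host addresses when the nodes only appear behind a ClearML queue.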
I think a related question is: ClearML relies heavily on Triton (a good thing), but Triton only supports a few frameworks out of the box. So this 'engine' needs to make sure it can work with Triton and use all its wonderful features such as request batching, GPU reuse, etc.
Executing task id [228caa5d25d94ac5aa10fa7e1d02f03c]:
repository = https://192.168.50.88:18443/tkahsion/pytorchmnist
branch = master
version_num = cfb833bcc70f3e10d3b6a96cfad3225ed682382b
tag =
docker_cmd = nvidia/cuda:10.1-runtime-ubuntu18.04
entry_point = pytorch_mnist.py
working_dir = .
Warning: could not locate requested Python version 3.9, reverting to version 3.6
Using base prefix '/usr'
New python executable in /root/.clearml/venvs-builds/3.6/bin/python3.6
Also creating executable i...
yeah, someone should call them out.
I used the nvcr PyTorch image and instructed clearml to inherit the global dependencies. No need to install torch, and it works well.
Yes! I definitely think this is important, and hopefully we will see something there
(or at least in the docs)
Hi AgitatedDove14, any updates in the docs to demonstrate this yet?
Next step is to figure out if I can do all that in the Python code instead of the UI.
After some churning, this is the answer. Change it in the clearml.conf generated by clearml-agent init.
` default_docker: {
# default docker image to use when running in docker mode
image: "nvidia/cuda:10.1-runtime-ubuntu18.04"
# optional arguments to pass to docker image
# arguments: ["--ipc=host", ]
arguments: ["--env GIT_SSL_NO_VERIFY=true",]
} `
Hi,
It did, nvidia/cuda:10.1-runtime-ubuntu18.04.
So if I need to set this every time, what is the following config for? And how do I pass in new env parameters?
` default_docker: {
# default docker image to use when running in docker mode
image: "dockerrepo/mydocker:custom"
# optional arguments to pass to docker image
# arguments: ["--ipc=host", ]
arguments: ["--env GIT_SSL_NO_VERIFY=true",]
} `
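If the goal is to do this from the Python code instead of clearml.conf or the UI, one option (as I understand the v1.0-era API, so take it as a sketch) is to set the docker image plus extra arguments on the task itself:
` from clearml import Task

task = Task.init(project_name="examples", task_name="docker image from code")
# One string: the image followed by extra docker arguments;
# the env flag mirrors the GIT_SSL_NO_VERIFY example above.
task.set_base_docker("dockerrepo/mydocker:custom --env GIT_SSL_NO_VERIFY=true") `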
For example, it would be useful to integrate https://github.com/whylabs/whylogs#features into ClearML as part of data and model monitoring. WhyLogs would have its own static page that would preferably be displayed as a new custom tab (besides logs, scalars and plots).
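As a rough illustration of how such a report could be attached to a task with today's API (the file name and the whylogs step are placeholders, not an existing integration):
` from clearml import Task

task = Task.init(project_name="examples", task_name="data monitoring")

# Placeholder: assume whylogs (or any profiler) already wrote an HTML report here.
report_path = "whylogs_report.html"

# Attach it as an artifact and as a debug sample so it is visible in the UI;
# a dedicated custom tab, as suggested above, would still need UI support.
task.upload_artifact(name="whylogs report", artifact_object=report_path)
task.get_logger().report_media(
    title="whylogs", series="profile", iteration=0, local_path=report_path
) `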