I'm having the same problem. Are you using the latest clearml-agent? Is your docker image a root user by default?
Hi CostlyOstrich36, that's correct.
Thanks. Which brings me to the question: how does ClearML deal with all the CVEs? What is your process for responding to them?
Hi, by deployment strategies I meant canary, blue-green, etc. I figured this should be done by clearml-serving, and maybe Seldon as well.
Thanks, it's attached.
I also noted that the status on ClearML is always 'Pending', unlike others which say 'Running'. Is this a side effect of using the k8s glue?
It's 0.17-63.
It doesn't appear in the profile page.
Hi AgitatedDove14, that's what I am trying to figure out as well. The task has nothing to do with torch, and the requirements.txt doesn't have any torch packages either.
[root@2c7498711bef elasticsearch]# curl `
{
  "index" : "events-training_stats_scalar-d1bd92a3b039400cbafc60a7a5b1e52b",
  "shard" : 0,
  "primary" : false,
  "current_state" : "unassigned",
  "unassigned_info" : {
    "reason" : "CLUSTER_RECOVERED",
    "at" : "2021-05-22T11:33:38.932Z",
    "last_allocation_status" : "no_attempt"
  },
  "can_allocate" : "no",
  "allocate_explanation" : "cannot allocate because allocation is not permitted to any of the nodes",
  "node_allocation_decisi...
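A minimal sketch of the same diagnostic done from Python (not from the original thread), assuming the clearml-server Elasticsearch is reachable on localhost:9200 without authentication:

```python
# Sketch: query Elasticsearch's allocation-explain API (the same API that
# produced the output above) and ask the cluster to retry failed allocations.
import json
import requests

ES = "http://localhost:9200"  # assumed clearml-server Elasticsearch address

# Explain why an unassigned shard is not being allocated.
explain = requests.get(f"{ES}/_cluster/allocation/explain").json()
print(json.dumps(explain, indent=2))

# If allocation previously failed and hit the retry limit, this nudges ES to
# try again; it won't help if allocation is disabled or disks are full.
retry = requests.post(f"{ES}/_cluster/reroute?retry_failed=true")
print(retry.status_code)
```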
From the ClearML perspective, how would we enable this, considering we don't have direct control over, or even the IPs of, the agents?
Thanks. We set this configuration, and the client ran and submitted the job for remote execution (agent running k8s glue). However, when the job runs and tries to save into the model repo, this error comes up:
clearml.storage - ERROR - Failed creating storage object s3://ecs.ai Reason: Missing key and secret for S3 storage access (s3://ecs.ai)
I remember being told that the clearml.conf on the client will not be used in a remote execution like the above, so I think this was the problem. I also...
They don't have the same version. I do notice that if the client is using Python 3.8, the remote execution will try to use that same version, despite the docker image not having it installed.
Ok sure. Thanks.
Yeah that'll cover the first two points, but I don't see how it'll end up as a dataset catalogue as advertised.
Ok thanks, that explains a lot. We have been doing this wrong the whole time, thinking that the clearml.conf on the client side would be acknowledged by the remote agent execution. In reality, only the api section is used.
Does the glue write any error logs anywhere? I only see CLEARML_AGENT_UPDATE_VERSION = and nothing else.
Ok. That brings me back to the spawned pod. At this point, clearml-agent and its config would be a contributing factor. Is the absence of /tmp/.clearml_agent.xxxxxx.cfg an issue?
Hi TimelyPenguin76,
If you notice in the last screenshot, it states the bucket name to be http://ecs.ai . It then tries to open http://s3.amazonaws.com/ecs.ai/clearml-models/artifact/uploading_file?X-Amz-Algorithm= ....
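A minimal sketch of how the output location might be set from code, assuming the intended endpoint is the on-prem ECS rather than AWS; the project, bucket and port below are placeholders, and as far as I understand the same host also has to appear under sdk.aws.s3.credentials in the clearml.conf used by the agent, otherwise "ecs.ai" gets treated as a bucket on s3.amazonaws.com:

```python
# Hypothetical sketch: point the task's output at a non-AWS, S3-compatible
# endpoint using the host:port/bucket form. All names are placeholders.
from clearml import Task

task = Task.init(
    project_name="my_project",                    # placeholder
    task_name="train",                            # placeholder
    output_uri="s3://ecs.ai:443/clearml-models",  # non-AWS endpoint as host:port/bucket
)
```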
Ok, I get the logic now. extra_docker_shell_script executes before clearml-agent talks to the clearml server.
I meant the dataset id.
Do you have more info on vault?
Actually it only makes sense if the entire department or organisation is saving their models in a common repo. In our case this is not possible due to client security (e.g. training data from clients could potentially be 'reverse engineered' from trained models in the future). So each department, and even each project, will need its own repo.
For context: I realise I'll need to catalogue all the dataset ids created by people separately in a spreadsheet, and for each experiment I'll need to go into the code commit to see which id is being used. On the other hand, I thought I'd seen advertised use cases where the experiment can be directly linked to the dataset id being used. My brain's a bit rusty on how that was done.
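A rough sketch of one way that linking could look, assuming the Dataset SDK is available; the project names and dataset id are placeholders:

```python
# Sketch: record the dataset id on the experiment itself so it shows up
# on the task, instead of only in an external spreadsheet.
from clearml import Task, Dataset

task = Task.init(project_name="my_project", task_name="train")  # placeholders

# Connect the id as a task parameter so it is visible (and overridable) in the UI.
params = task.connect({"dataset_id": "dataset_id_here"})

# Fetch the dataset contents by id; depending on the SDK version this usage
# may also be registered on the task automatically.
dataset = Dataset.get(dataset_id=params["dataset_id"])
data_dir = dataset.get_local_copy()
print("dataset copied to", data_dir)
```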
Can you please verify that you have all the required packages installed locally?
It's not installed on the image that runs the experiment, but it is reflected in the requirements.txt.
What is the setting of agent.package_manager.system_site_packages?
True.
Hi AgitatedDove14. I'm trying out passing the env via the code instead:
task.set_base_docker("nvcr.io/nvidia/tensorflow:19.11-tf2-py3 --env TRAINS_AGENT_GIT_USER=git_username_here --env TRAINS_AGENT_GIT_PASS=git_password_here")
The strange thing is that when my k8s glue pulls a task, this happens:
Pulling task xxxxxxxxxx launching on kubernetes cluster
Pushing task xxxxxxxxxx into temporary pending queue
Kubernetes scheduling task id=xxxxxxxxxxxx
skipping docker argument TRAINS_AGENT_GIT_USE...
Hi FriendlySquid61, AgitatedDove14, the issue and a possible fix are in this issue I raised: https://github.com/allegroai/clearml-agent/issues/51
Hi, any idea if I can achieve this? I just need a list of usernames.
Ok. Any idea what goes on between setting up the clearml-agent and initialising the clearml-agent itself? Does the clearml-agent try to communicate with any internet address? From another perspective, it looks like a long timeout issue. I happen to be deploying on a disconnected, on-premise setup.
I thought of another potential way, but I'm not sure if the SDK supports it.
We would perform a manual save and upload of the model using vanilla boto3, with credentials passed in as env vars, then use the ClearML SDK to update the Model Repo with the location of the model, without ClearML uploading it explicitly. Would the above work?
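A minimal sketch of that idea, under the assumption that registering an already-uploaded URI is acceptable; the endpoint, bucket, keys and paths are placeholders:

```python
# Sketch: upload the model with vanilla boto3 (credentials from env vars),
# then only register the resulting URI with ClearML, without ClearML uploading.
import os
import boto3
from clearml import Task, OutputModel

task = Task.init(project_name="my_project", task_name="train")  # placeholders

# 1) Manual upload with boto3 against the on-prem S3-compatible endpoint.
s3 = boto3.client(
    "s3",
    endpoint_url="https://ecs.ai",                        # placeholder endpoint
    aws_access_key_id=os.environ["AWS_ACCESS_KEY_ID"],
    aws_secret_access_key=os.environ["AWS_SECRET_ACCESS_KEY"],
    verify=False,                                          # self-signed cert; or a CA bundle path
)
s3.upload_file("model.onnx", "clearml-models", "my_project/model.onnx")

# 2) Register the already-uploaded location in the model repository.
output_model = OutputModel(task=task, framework="ONNX")
output_model.update_weights(register_uri="s3://ecs.ai/clearml-models/my_project/model.onnx")
```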
Is this an env var?
CLEARML_CONFIG_FILE
clearml=1.0.3
python=3.8.10
clearml-data upload --id 12314jhg42342j4j --storage
http://ecs.ai is an on-prem DELL EMC ECS that serves as our S3 storage, configured with a self-signed cert.