I always have my notebooks in a git repo, but suddenly it's not running them correctly.
What do you mean?
Can I switch off git diff (change detection)?
Yes, Task.init(..., auto_connect_frameworks={"detect_repository": False})
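For reference, a minimal sketch of that call (project/task names are placeholders, and the key name is the one quoted above, so double-check it against your SDK version):
from clearml import Task

# Disable repository / uncommitted-changes detection for this task
task = Task.init(
    project_name="examples",        # placeholder
    task_name="no repo detection",  # placeholder
    auto_connect_frameworks={"detect_repository": False},
)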
hmm DeliciousKoala34
what are you getting if you put this at the top of your code (the one you are running in the remote docker)?
import os
print([(k, os.environ[k]) for k in os.environ if k.startswith("CLEARML_")])
Hi JitteryCoyote63
Show running experiments
It doesn't?
Have the legend clickable, to hide/show experiments based on their status
:+1:
Have a line connecting points that are SOTA (example in https://paperswithcode.com/sota/image-generation-on-cifar-10 )
I like that, how is that selected? (I know FE are thinking of replacing this entire graph library, so maybe good timing in terms of what to look at)
Hi ShallowArcticwolf27
First of all:
If the answer to number 2 is no, I'd loveee to write a plugin.
Always appreciated ❤
Now actually answering the Q:
Any torch.save (or any other framework save) will either register or automatically upload the file (or folder) in the system. If this is a folder it will be zipped and uploaded; if a file, it is just uploaded to the assigned storage output (the clearml-server, any object storage service, or a shared folder). I'm not actually sure I...
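To make that concrete, a minimal sketch (project/task names are placeholders; set output_uri if you want the file uploaded to remote storage rather than only registered):
from clearml import Task
import torch

# With a Task initialized, framework save calls are captured automatically
task = Task.init(project_name="examples", task_name="model auto-upload")  # placeholder names

model = torch.nn.Linear(4, 2)
# This save is picked up by ClearML and the file is registered
# (or uploaded to the configured storage output)
torch.save(model.state_dict(), "model.pt")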
can i run it on an agent that doesn't have gpu?
Sure this is fully supported
when I run clearml-serving it throws me an error "please provide specific config.pbtxt definition"
Yes, this is a small file that tells the Triton server how to load the model:
Here is an example:
https://github.com/triton-inference-server/server/blob/main/docs/examples/model_repository/inception_graphdef/config.pbtxt
How come the second one is one line?
Hi GracefulDog98
As UnevenDolphin73 pointed out, you might be looking for https://clear.ml/docs/latest/docs/references/sdk/task#execute_remotely
Which will stop the current local process, and enqueue the task on the "default" queue, for the agent to execute.
Is this what you are looking for?
The idea is you can run your code once in "development" mode, so you know everything is working, then from the UI (or programmatically) you can clone the experiment, edit the configuration (or anythin...
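A minimal sketch of that flow (queue and names are placeholders):
from clearml import Task

task = Task.init(project_name="examples", task_name="remote execution")  # placeholder names

# Stops the local process and enqueues this task on the "default" queue,
# for a clearml-agent listening on that queue to pick up and execute
task.execute_remotely(queue_name="default", exit_process=True)

# Anything below this line only runs when the agent executes the task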
Hi ZippyAlligator65
You can configure it in the clearml.conf, see here:
https://github.com/allegroai/clearml-agent/blob/ebb955187dea384f574a52d059c02e16a49aeead/clearml_agent/backend_api/config/default/agent.conf#L202
(torchvision vs. cuda compatibility, will work on that),
The agent will pull the correct torch based on the cuda version that is available at runtime (or configured via the clearml.conf)
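For example, a rough sketch of the clearml.conf override (key names from memory, so verify them against the agent.conf reference linked above):
agent {
    # force the CUDA / cuDNN versions the agent resolves torch against,
    # instead of auto-detecting them at runtime
    cuda_version: 11.2
    cudnn_version: 8.0
}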
What are you seeing in the Task that was cloned (i.e. the one the HPO created, not the original training task)?
by that I mean, in the configuration section, do you have the Args there? (seems like the pic you attached, but I just want to make sure)
Also in the train.py file, do you also have Task.init ?
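For reference, a minimal sketch of the train.py side (placeholder names); with Task.init in place, the argparse arguments are what end up under the Args section:
import argparse
from clearml import Task

# Task.init is what enables the auto-logging discussed above
task = Task.init(project_name="examples", task_name="train")  # placeholder names

parser = argparse.ArgumentParser()
parser.add_argument("--lr", type=float, default=0.001)
parser.add_argument("--epochs", type=int, default=10)
args = parser.parse_args()  # picked up and shown under the "Args" section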
Interesting...
We could follow up on the .env configuration, and allow clearml-task to add configuration files from the cmd line. This will be relatively easy to add. We could expand the Environment support (that somewhat exists), and add the ability to read variables from .env and add them to a "hyperparameter" section, named Environment. wdyt?
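Roughly what that could look like if done by hand today (just a sketch; the .env file name and the "Environment" section name are the ones from the suggestion above):
from clearml import Task

task = Task.init(project_name="examples", task_name="env as hyperparameters")  # placeholder names

# Read a .env file and log its variables under an "Environment" section
env_vars = {}
with open(".env") as f:
    for line in f:
        line = line.strip()
        if line and not line.startswith("#") and "=" in line:
            key, value = line.split("=", 1)
            env_vars[key] = value

task.connect(env_vars, name="Environment")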
No they are not; they take the vscode backend and put it behind a webserver-ish layer
- I'm happy to hear you found a workaround
- Seems like there is something wrong with the way the pbtxt is being merged, but I need some more information
{'detail': "Error processing request: object of type 'NoneType' has no len()"}
Where are you seeing this error?
What are you seeing in the docker-compose log?
SoreDragonfly16 In the Hyperparameters tab, you have "parallel coordinates" (next to the "add experiment" button there is a button saying "values"; press it and there should be "parallel coordinates")
Is that it?
Thanks!
Hmm, from here: None
Could it be you do not have privileges to the resource, or that you did not provide credentials?
Did that autoscaler work before?
Seems like a Task contained an invalid artifact link.
I wouldn't sweat over it, it's basically a warning that it could not locate the actual file to delete (albeit an ugly warning 🙂 )
I think AnxiousSeal95 would know when will the new version be ready.
Regardless, is it actually deleting old Tasks?
Containers (and Pods) do not share GPUs. There's no overcommitting of GPUs.
Actually I am as well. This is Kubernetes doing the resource scheduling, and Kubernetes decided it is okay to run two pods on the same GPU, which is cool, but I was not aware Nvidia already added this feature (I know it was in beta for a long time)
https://developer.nvidia.com/blog/improving-gpu-utilization-in-kubernetes/
I also see they added dynamic slicing and Memory Protection:
Notice you can control ...
Could it be pandas was not installed on the local machine?
actually no
hmm, are those packages correct?
I think the main issue is running with python -m module.name --args
Which is a bit different when trying to "understand" what the actual repository is.
Can you try to run it from the repository folder (same command), just to see if it will have any effect on the detected packages?
BTW: how is it missing torch in the listing? Do you have "import torch" in the code?
BTW:
Error response from daemon: cannot set both Count and DeviceIDs on device request.
Googling it points to a docker issue (which makes sense considering):
https://github.com/NVIDIA/nvidia-docker/issues/1026
What is the host OS?
Okay, I'll make sure we always quote ", since it seems to work either way.
We will release an RC soon, with this fix.
Sounds good?