No worries 🙂 glad it worked
Okay, how do I reproduce it?
Hi OutrageousGiraffe8
I was not able to reproduce 🙂
Python 3.8 Ubuntu + TF 2.8
I get both metrics and model stored and uploaded
Any idea?
Thanks BoredHedgehog47 !
And yes, if the Task.init() call was only in main.py, then the TB inside the subprocess (train.py) would, as you observed, not be captured.
Did you by any chance test calling Task.init in both main.py and train.py ?
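To make the suggestion concrete, here is a minimal sketch of calling Task.init in both files; the project/task names and the subprocess call are illustrative, not taken from this thread:
```python
# --- main.py ---
import subprocess
from clearml import Task

task = Task.init(project_name="examples", task_name="main")
subprocess.run(["python", "train.py"], check=True)

# --- train.py ---
from clearml import Task

# Calling Task.init() again inside the subprocess should attach to the task
# created by main.py (clearml passes the master task through environment
# variables), so the TensorBoard output written here is captured as well.
task = Task.init(project_name="examples", task_name="main")
```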
Hi HarebrainedBear62
What's the type of data?
Hi
The Squash operation copies all the data, and the result is no longer linked to previous commits?
Yes. Basically, the idea is that if you have a data version that relies on many parents that need to be merged, the squash will create a merged copy and push it all as a single version; after that, yes, the parent versions are no longer needed.
I thought this operation was like git squash, but it seems to me ...
yeah... we did not want to actually delete the parents because, unlike git, the operation is done ...
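For reference, a minimal sketch of what that looks like from the SDK (assuming the Dataset.squash classmethod; the dataset name and IDs are illustrative):
```python
from clearml import Dataset

# Merge several parent versions into a single flat version.
merged = Dataset.squash(
    dataset_name="my_dataset_merged",
    dataset_ids=["<parent_version_id_1>", "<parent_version_id_2>"],
)
# The result is one self-contained version with the merged content;
# the parent versions are left in place, not deleted.
print(merged.id)
```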
Okay, this is more complicated but possible.
The idea is to write a glue layer (service) that pulls from the (i.e. UI) queue,
submits the SLURM job,
and puts it in a pending queue (so you know the job is waiting in the SLURM scheduler)
There is a template here:
https://github.com/allegroai/trains-agent/blob/master/trains_agent/glue/k8s.py
I would love to help set up a SLURM glue in a similar manner
what do you think?
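To make the idea concrete, a rough sketch of such a service loop; all names and API calls here are my assumption, loosely modeled on the k8s glue template linked above, and untested:
```python
import subprocess
import time

from clearml import Task
from clearml.backend_api.session.client import APIClient

client = APIClient()
# UI-facing queue that users enqueue into (queue name is hypothetical)
pull_queue_id = client.queues.get_all(name="slurm_submit")[0].id
# Queue that marks tasks as waiting inside the SLURM scheduler
PENDING_QUEUE = "slurm_pending"

while True:
    # Pull the next enqueued task (if any) from the UI-facing queue
    response = client.queues.get_next_task(queue=pull_queue_id)
    if response and getattr(response, "entry", None):
        task_id = response.entry.task
        # Hand the task to SLURM; the job itself just runs the agent
        subprocess.run(
            ["sbatch", f"--wrap=clearml-agent execute --id {task_id}"],
            check=True,
        )
        # Park it in the pending queue so the UI shows it is waiting in SLURM
        Task.enqueue(task_id, queue_name=PENDING_QUEUE)
    else:
        time.sleep(5.0)
```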
This is assuming you can just run two copies of your code, and they will become aware of one another.
It should work 🙂 as long as the versions match; if they don't, the venv will install the version you need (which is great, the only penalty is the install; download-wise it will be cached)
I had no idea it was going to do that and sent your servers over 1.4M API hits unintentionally
Yeah, that is way too much. I think it relates to the frequency at which it updates the console 🙂
Okay, this points to an issue with the k8s glue; I think it somehow failed to launch the pod. Can you send me the log of the clearml-k8s-glue?
Hi ShallowArcticwolf27
from the command line to a remote machine while loading a local .env file as a configuration object?
Where would the ".env" go to? Are we trying to pass it to the remote machine somehow?
First let's try to test if everything works as expected. Since 405 really feels odd to me here. Can I suggest following one of the examples start to end to test the setup, before adding your model?
Okay, I found the issue (I think):
If the images are reported very quickly, it will "decide" you are about to override the previous one (i.e. 101 -> overwriting 0, which makes sense; the bug was that it would disable the 101 from uploading and not the 0 🙂)
Test fix:
in /backend_interface/metrics/events.py, line 292, change:
    last_count = self._get_metric_count(self.metric, self.variant, next=False)
    if abs(self._count - last_count) > int(self._file_history_size):
        ...
but when I run the same task again it does not map the keys ...
SparklingElephant70 what do you mean by "map the keys" ?
We should probably have a section on that (i.e. running two agents on the same GPU, then explain how to use it)
Hi CrookedAlligator14
Hi, I just started using clearml, and it is amazing!
Thank you! 🙂
When I enqueue the task, the venv is set up and starts to install all the packages from the requirements.txt file, but at the end I get the following in the console:
Can you try with the latest agent? We improved the support for PyTorch (they now have a proper pypi-compatible repo). Can you see if that solves it?
pip3 install clearml-agent==1.5.0rc0
one can containerise the whole pipeline and run it pretty much anywhere.
Does that mean the entire pipeline will be running on the instance spinning the container?
From here: this is what I understand:
https://kedro.readthedocs.io/en/stable/10_deployment/06_kubeflow.html
My thinking was I can use one command and run all steps locally while still registering all "nodes/functions/inputs/outputs etc" with clearml such that I could also then later go into the interface and clone an...
This is already part of the docker-compose file,
https://github.com/allegroai/clearml-server/blob/master/docker/docker-compose.yml
(Venv mode makes sense if running inside a container; if you need docker support you will need to mount the docker socket inside)
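For reference, assuming a standard Docker setup, mounting the socket would look something like adding `-v /var/run/docker.sock:/var/run/docker.sock` to the agent container's volume mounts, so the agent can spin up sibling containers on the host.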
What exactly is the error you're getting from ClearML? And what do you have in the configuration file?
@<1587253076522176512:profile|HollowPeacock33>
Is this a commercial ad? This seems out of scope for this channel
Can you expand?
... the one for the last epoch and not the best one for that experiment,
Well, now we realized there is an option to use "min_global" on the sign, is this what we need?
Yes 🙂 (or max_global)
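For context, a minimal sketch of where that flag goes, assuming the HyperParameterOptimizer interface; everything except objective_metric_sign is illustrative:
```python
from clearml.automation import HyperParameterOptimizer, UniformParameterRange

optimizer = HyperParameterOptimizer(
    base_task_id="<base_task_id>",  # hypothetical task to optimize
    hyper_parameters=[
        UniformParameterRange("General/lr", min_value=1e-4, max_value=1e-1),
    ],
    objective_metric_title="validation",
    objective_metric_series="loss",
    # "min_global" selects the best (lowest) value reported over the whole
    # run, instead of the value at the last epoch; "max_global" is the
    # equivalent for metrics you want to maximize.
    objective_metric_sign="min_global",
)
```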
Most likely yes, but I don't see how clearml would have an impact here, I am more inclined to think it would be a pytorch dataloader issue, although I don't see why
These are most certainly dataloader processes. But clearml-agent, when killing the process, should also kill all subprocesses, and it might be that something is going on that prevents it from killing the subprocesses ...
Is this easily reproducible? Can you verify it is still the case with the latest RC of clearml-agent?
Does this require you run the pipeline locally (I see you have set default execution queue) or do any other specific set-up?
Yes, this means the pipeline logic runs manually/locally (logic means launching components, not the actual compute)
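A minimal sketch of that mode, assuming PipelineController.start_locally(); the project, pipeline, and step names are illustrative:
```python
from clearml import PipelineController

pipe = PipelineController(name="demo-pipeline", project="examples", version="1.0")
pipe.set_default_execution_queue("default")  # where the components execute
pipe.add_step(
    name="train",
    base_task_project="examples",
    base_task_name="train task",  # an existing task to clone as the step
)

# start_locally() runs only the pipeline *logic* in this process; with
# run_pipeline_steps_locally=False (the default) each step is still
# enqueued and executed by an agent serving the queue.
pipe.start_locally(run_pipeline_steps_locally=False)
```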
Please have a go at it. I'm sure some quirks in the pseudo code are missing, but it should work, and I'll gladly help set it up
AstonishingRabbit13
https://github.com/googleapis/google-cloud-python/issues/4941#issuecomment-369472576
Check the OpenSSL version and the date; this seems like a low-level SSL error (even before authentication)
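(For example, run `openssl version` and `date` on the affected machine; TLS certificate validation fails when the system clock is far off, which produces this kind of pre-authentication SSL error.)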
Hi @<1623491856241266688:profile|TenseCrab59>
Is it kind of dark, or are you asking about the graphs?