AgitatedDove14

49 Questions, 8124 Answers

Active since 10 January 2023

Last activity one year ago

Reputation

Badges 1

25 × Eureka!

Questions 49
Answers 8124

0 Votes

0 Answers

2K Views

0 Votes 0 Answers 2K Views

:confetti_ball: :champagne: Happy new year <!everyone>! :fireworks: :sparkler: We wanted to thank you all for the great feedback, contribution and general support you guys give us. It is truly fulfilling to see users enjoying the product you build, and y

🎊 🍾 Happy new year ! 🎆 🎇 We wanted to thank you all for the great feedback, contribution and general support you guys give us. It is truly fulfilling to ...

clearml

4 years ago

0 Votes

0 Answers

2K Views

0 Votes 0 Answers 2K Views

<!here> Gals/Guys/:robot_face: If you have ideas on improving the Slack Monitoring service, please add them on the dedicated Github Issue : <https://github.com/allegroai/trains/issues/161> For example: generate an alert if my experiment reaches a certain

Gals/Guys/ :robot_face: If you have ideas on improving the Slack Monitoring service, please add them on the dedicated Github Issue : https://github.com/alleg...

clearml

5 years ago

0 Votes

0 Answers

2K Views

0 Votes 0 Answers 2K Views

<https://allegro.ai/docs>

https://allegro.ai/docs

clearml

5 years ago

0 Votes

0 Answers

2K Views

0 Votes 0 Answers 2K Views

I Would Guess Connectivity Issues, The Tls Is Probably Python Inaccurate Response (I Mean In A Way, It Is Also A Tls Error, But I Would Imagine This Has More To Do With The Actual Network Connection)

I would guess connectivity issues, the TLS is probably python inaccurate response (I mean in a way, it is also a TLS error, but I would imagine this has more...

clearml

5 years ago

0 Votes

1 Answers

2K Views

0 Votes 1 Answers 2K Views

Gals, Guys &

Gals, Guys & :robot_face: , if you want to checkout the Hyper-Parameters automation (Using Bayesian Optimization Hyper-Band) We have an example on the demo s...

clearml

5 years ago

0 Votes

10 Answers

1K Views

0 Votes 10 Answers 1K Views

Happy Friday Everyone ! We Have A New Repo Release We Would Love To Get Your Feedback On

Happy Friday everyone ! We have a new repo release we would love to get your feedback on 🚀 🎉 Finally easy FRACTIONAL GPU on any NVIDIA GPU 🎊 Run our nvidi...

clearml

one year ago

0 Votes

2 Answers

1K Views

0 Votes 2 Answers 1K Views

Omg Look Who Just Joined The Pytorch Ecosystem

OMG Look who just joined the PyTorch EcoSystem None Yes! it is TRAINS 🚆 🎉 🎈

clearml

5 years ago

0 Votes

0 Answers

2K Views

0 Votes 0 Answers 2K Views

Slack Security ... Go Figure

Slack security ... Go figure 😉

clearml

5 years ago

0 Votes

0 Answers

2K Views

0 Votes 0 Answers 2K Views

Finally

clearml

5 years ago

0 Votes

9 Answers

2K Views

0 Votes 9 Answers 2K Views

Hi https://github.com/allegroai/trains/releases/tag/0.15.1 / https://github.com/allegroai/trains-server/releases/tag/0.15.1 / https://github.com/allegroai/tr...

clearml

5 years ago

0 Votes

0 Answers

2K Views

0 Votes 0 Answers 2K Views

@PunySquid88 I'm not very familiar with what they do, but it seems that although it has a backend server as an option, it will mostly target single users with what seems like an easy to use single app. From the Reddit thread it seems that it is still not

PunySquid88 I'm not very familiar with what they do, but it seems that although it has a backend server as an option, it will mostly target single users with...

clearml

5 years ago

0 Votes

0 Answers

2K Views

0 Votes 0 Answers 2K Views

Hey <!here> Just a heads up, starting *Jan 25th*, the default <http://demoapp.demo.clear.ml/|ClearML demo server> will move to a *daily* reset cycle (replacing the current weekly cycle). Anybody needing more than 24h data retention is welcome to use our <

Hey Just a heads up, starting Jan 25th , the default http://demoapp.demo.clear.ml/ will move to a daily reset cycle (replacing the current weekly cycle). Any...

clearml

4 years ago

0 Votes

6 Answers

2K Views

0 Votes 6 Answers 2K Views

Hi ! ClearML Server + SDK v1.9.0 is out! 🎉 🚀 🎊 Happy Holidays and Happy New Year! ❇️ 🎇 🎄

clearml

2 years ago

0 Votes

0 Answers

2K Views

0 Votes 0 Answers 2K Views

@YummyWhale40 you are saying the example code is not working when running with the demo server? Also I think I was able to view your experiment on the demo server, and do get the Scalars without any issues...

YummyWhale40 you are saying the example code is not working when running with the demo server? Also I think I was able to view your experiment on the demo se...

clearml

5 years ago

0 Votes

0 Answers

2K Views

0 Votes 0 Answers 2K Views

you set it :slightly_smiling_face:

you set it 🙂

clearml

5 years ago

0 Votes

3 Answers

1K Views

0 Votes 3 Answers 1K Views

These Are Xgboost Internal Metrics That Are Automatically Picked By Clearml

@<1523703325881536512:profile|ConvolutedSealion94> these are xgboost internal metrics that are automatically picked by clearml

xgboost

2 years ago

0 Votes

6 Answers

1K Views

0 Votes 6 Answers 1K Views

Hi :robot_face: , humans We have the new documentation site up and running 🎉 None 🎊 This is still a work in progress, so we keep the previous version alive...

clearml

4 years ago

0 Votes

3 Answers

1K Views

0 Votes 3 Answers 1K Views

We Recently Released A New Version Of

we recently released a new version of clearml-session with Persistent Workspace support! 🚀 🎉 Finally you can develop on remote machines with workspace fold...

remote-ssh

one year ago

0 Votes

0 Answers

2K Views

0 Votes 0 Answers 2K Views

<!everyone> Trains v0.14.2 is out (<https://github.com/allegroai/trains/releases/tag/0.14.2|Change log>) Highlights: <https://github.com/allegroai/trains/blob/master/trains/storage/manager.py#L13|trains.storage.StorageManager> - with caching for any http

Trains v0.14.2 is out ( https://github.com/allegroai/trains/releases/tag/0.14.2 ) Highlights: https://github.com/allegroai/trains/blob/master/trains/storage/...

clearml

5 years ago

Show more results

0 How Do I Create Sub Projects With The New Version 1.0?

Add '/' , like you would with a file system.
Task.init(project_name='main_project/sub_project', task_name='test')

4 years ago

0 Hey Guys, Do You Have Any Plans To Add Functionality To Export Training Config With All Hyperparameters To The Different Formats, Such As Training Command Line Command, Yaml, Etc.?

You can always access the entire experiment data from python
'Task.get_task(Id).data'
It should all be there.
What's the exact use case you had in mind?

5 years ago

0 Is It Necessary To Serve Keras Model Using Triton Engine? I'M Trying To Serve An Endpoint, And Trying To Debug, But The Error Given Not Helping Much. Is There A Flag I Can Pass To See More Logs?

Hi @<1567321739677929472:profile|StoutGorilla30>

Is it necessary to serve keras model using triton engine?

It is not, but it is the most efficient way to serve keras models, and this is why by default clearml-serving is using Nvidia Triton (we are talking 10x factors)
I would start with the keras example, see that it works and then work your way into your example (notice you always need to provide the layers form the in/out of the model)
[None](https://github.com/allegroai/clearml-s...

2 years ago

0 How Do I Create Sub Projects With The New Version 1.0?

does this work for multiple levels?

Yep 😄

4 years ago

0 Thanks For Releasing This Awesome Experiment Manager! I Was Logging A Single Training Session On Multiple Gpus (Using Detectron2), And Torch.Mp Is Called For Each Gpu. This Creates A Separate Task In Trains For Each Gpu, And Only One Of The Tasks Has The

Since this fix is all about synchronizing different processes, we wanted to be extra careful with the release. That said I think that what we have now should be quite stable. Plan is to have the RC available right after the weekend.

5 years ago

0 Hello Everyone. I Don'T Uderstand Why Is My Training Slower With Connected Tensorboard Than Without It. I Have Some Thoughts About It But I Not Sure. My Internet Traffic Looks Wierd.I Think This Is Because Tensorboard Logs Too Much Data On Each Batch And

Okay so the way it works is that it moves all the logging to background process, But if you have a Lot of data, actually pushing the data between python processes is Not very efficient. This line basically tells it to just use background thread (instead of background process), for sending the data to the server.
The idea behind using background process in the first place is to better support pytorch workers that spin a lot of subprocesses, and we do not want to add a thread per process and in...

3 years ago

0 Is There Any Examples Of Mounting An Aws Efs Mount To A Self Hosted K8 Agent Deploy?

My task starts up and checks the mounted EFS volume for x data, if x data does not exist there, it then pulls x data from S3.

BoredHedgehog47 you can just use StorageManager and configure clearml cache for the EFS, it will essentially do the same 🙂
Regrading helm chart with EFS,
you need to configure the clearml-glue pod template with the EFS mount
example :
https://github.com/kubernetes-sigs/aws-efs-csi-driver/blob/e7f647f4e6fc76f983d61522e635353005f1472f/examples/kubernetes/volu...

2 years ago

0 Hey Guys, Do You Have Any Plans To Add Functionality To Export Training Config With All Hyperparameters To The Different Formats, Such As Training Command Line Command, Yaml, Etc.?

DilapidatedDucks58 if you have so many parameters, why don't you use the
task.connect_configuration(dict)
It will put it in the artifacts, as an editable json alike string.

5 years ago

0 And One More Question. How Can I Get Loaded Model In Preporcess Class In Clearml Serving?

AbruptHedgehog21 what exactly do you store as a Mode file ? is this a python object pickled ?

3 years ago

0 Hi Abit Of A Crazy Question... But Is It Possible To Use Clearml In Rust, Without Writing A Wrapper. I Noticed The Api Doesnt Cover Dataset Operations But The Cli Can.

Hi @<1747428509627715584:profile|CumbersomeDuck6>

but is it possible to use ClearML in Rust, without writing a wrapper.

With the RestAPI you can...

noticed the API doesnt cover dataset operations but the CLI can.

Yes the CLI will fetch/create datasets for you,
wdyt?

11 months ago

0 Hi, Is There A Way To List All Agents Running In A Host, I Do Not Find Relevant One In Clearml-Agent -H.

@<1523701304709353472:profile|OddShrimp85> are you trying to shut down the one running on your machine ?

one year ago

0 Hey, I'M Looking Into The Aws Autoscaler. I Couldn'T Find The Task In My Ui, So I Ran The

That should spin up an instance, right? (it currently doesn't, and I'm not sure where to debug)

Do you see the AWS scaler Task running ?
(This is the code/process that actually spins a new EC2 instance)

4 years ago

0 Hi, Is There A Way To List All Agents Running In A Host, I Do Not Find Relevant One In Clearml-Agent -H.

In the UI you can see all the agents and their IDs
Then you can so

clearml-agent daemon --stop <agent id>

one year ago

0 Hi All, I Have An Issue With The Way Hyper Parameters Are Logged Under Configuration, The Values That Are Stored Seem To Add Unnecessary Escape Characters To The Original Values.. Is It A Known Issue? Is There A Way To Change It? Thanks

Hi DepressedChimpanzee34
How do I reproduce the issue ?
What are we expecting to get there ?
Is that a Colab issue or hyper-parameter encoding issue ?

4 years ago

0 Hi, We Saw 2 More Issues With Images Logging: 1. You Need To Use Tf.Summary.Image And Not Summary_Ops_V2.Image 2. Image Needs To Be In Range [0, 1] And Not [0, 255] (Matplotlib And Tensorboard Can Handle Either One) Is That Expected?

So it seems to get the "hint" from the type:
This will work
tf.summary.image('toy255', (ex * 255).astype(np.uint8), step=step, max_outputs=10)wdyt, should it actually check min/max and manually cast it ?

4 years ago

0 Hi All, Is It Possible To Control The Number Of Steps Of The Pipeline During Run Time. Eg. If User Wants #N Parallel Steps In The Pipeline

. but when we try to do a "New Run" from UI, it tries to follow the DAG of previous run (the run with all child nodes skipped) and the new run fails too.

This is odd, is this reproducible ? what's the clearml python package version ?

2 years ago

0 Hi All, Is It Possible To Control The Number Of Steps Of The Pipeline During Run Time. Eg. If User Wants #N Parallel Steps In The Pipeline

yes

argument saying always create from code

can be helpful

@<1523701523954012160:profile|ShallowCormorant89> any chance you can open a github issue on that, just so we do not forget ?

if we can edit the configuration objects of a pipeline, that can be beneficial too. which we're unable to do from UI

Actually you already can, after you clone the pipeline, you can press on details then go to configuration Tab, and edit the pipeline object. The format is HOCON (...

2 years ago

0 Hi New With Clearml I Create Clearml Server On Gcp With Docker Now I’M Training Yolov5 And I Want To Save All The Info (Model And Metrics ) With Clearml To My Bucket.. (So I Can Have Small Server And No Memory Issue ) Where Should I Start? Its Should Be C

(Caused by SSLError(SSLError(1, '[SSL: DECRYPTION_FAILED_OR_BAD_RECORD_MAC] decryption failed or bad record macWhere is the code running (agent) GCP instance ? your machine ?

2 years ago

0 Hello, I Am Using Clearml In Docker Mode. I Have A Simple Script That Runs Locally, Runs On The Target Machine Running The Same Tensorflow Container, But Doesn'T Run When I Deploy It Using Clearml. Here'S The Log Of The Error:

What about Calling Taskl.init Without the agent?

2 years ago

The confusion matrix shows under debug sample, but the image is empty, is that correct?

2 years ago

0 Hi, Is It Possible To Disable Some Of The System Metrics Monitored? And Also Downsample The Rate Of Logging?

Hi JitteryCoyote63 report_frequency_sec=30. controller how frequently monitoring events are sent to the server, default is every 30 seconds (you can change the UI display to wall-time to review). You can change it to 180 so it will only send an event every 3 minutes (for example).
sample_frequency_per_sec is the sampling frequency it uses internally, then it will average the results over the course of the report_frequency_sec time window, and send the averaged result on the repo...

4 years ago

the error for uploading is weird

wait, are you still getting this error?

2 years ago

0 Hi, I Would Like To Understand How I Can Set The Pip Cache Location For My Agent, I Thought That I Already Had The Right Setting With

So it should cache the venvs right?

Correct,

path: /clearml-cache/venvs-cache

Just making sure, this is the path to the host cache folder

ClumsyElephant70 I think I lost track of the current issue 😞 what's exactly not being cached (or working)?

3 years ago

0 Base_Template_Keras_Simply.Py

DeliciousBluewhale87 could you send the full log of the Task?

4 years ago

0 Hi. Question About Dataset Upload Errors: When Uploading A

Hi PanickyMoth78 an RC with a fix is out, let me know if it works (notice you can now set the max_workers from CLI or Dataset functions) pip install clearml==1.8.1rc1

2 years ago

its should logged all in the end as I understand

Hmm let me check the code for a minute

2 years ago

0 Hi! I Am Currently Using Clearml (With Remote Execution), To Train An Object Detection Model With

Thanks NonchalantDeer14 !
BTW: how do you submit the multi GPU job? Is it multi-gpu or multi node ?

3 years ago

0 Is There A Way To Automatically Plot A Metric Vs Epoch ( Example Train_Accuracy, Val_Accuracy, Learning Rate) , Or Do I Need To Store The Metric For Each Epoch And Then Plot It Manually?

Hi FriendlyKoala70 , trains will report all the tensorboard graphs, I'm assuming that's who is creating the epoch_lr graph. On top of it, you can always report manually with logger (as you pointed). Does that make sense to you?

5 years ago

0 I Have A Question About "Logging Artifacts", Lets Say I Run An Experiments On A Remote Machine, And Then I Want To Look At The "Logs" For Completed Runs In My Local Machine, Is It Possible? Can I Copy Paste Some Trains-Artifacts And Then Load Them In The

Hi RipeGoose2 all PR's are welcome, feel free to submit :)

5 years ago

0 Hello, I Am Trying To Use The Sdk Function

Hi @<1644147961996775424:profile|HurtStarfish47>

. I see

Add image.jpg

being printed for all my data items ...

I assume you forgot to call upload ? the sync "marks" files for uploaded / deletion but the upload call actually does the work,
Kind of like git add / push , if that makes sense ?

one year ago

Show more results