Hi @<1561885941545570304:profile|PunyKangaroo87>
What do you mean by storing data locally?
Like clearml-data, i.e. a Dataset?
You can always use file:///root/path/folder as the destination; this will store everything into that local folder. Is that what you mean?
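For example, something like this should do it (just a sketch; the dataset/project names and local path are placeholders):

    from clearml import Dataset

    # create a dataset and add local files (names/path are illustrative)
    dataset = Dataset.create(dataset_name="my_dataset", dataset_project="my_project")
    dataset.add_files(path="/data/raw")

    # upload to a local folder instead of a remote object store
    dataset.upload(output_url="file:///root/path/folder")
    dataset.finalize()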
BattyLizard6 to my knowledge, the main issue with fractional GPUs is that there is no real restriction on GPU memory allocation (with the exception of MIG slices, which are limited in other ways).
Basically, one process/container can consume all of the GPU RAM on the allocated card (this also applies to the http://run.ai fractional solution, at least from what I understand).
This means that developer A can allocate memory such that developer B on the same GPU will start getting out-of-memory errors
(Notice in a...
why would root cause the user to become nobody with group nogroup?
That is exactly the case: they inherit the cron service user (uid/gid), which would look like nobody/nogroup.
I pass my dataset as parameter of pipeline:
@<1523704757024198656:profile|MysteriousWalrus11> I think you were expecting the dataset_df dataframe to be automatically serialized and passed, is that correct?
If you are using add_step, all arguments are simple types (i.e. str, int, etc.).
If you want to pass complex types, your code should upload the object as an artifact, and then you can pass the artifact URL (or name) to the next step (see the sketch below).
Another option is to use pipeline from dec...
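A rough sketch of the artifact approach (the dataframe variable, artifact name, and task id lookup here are only placeholders):

    from clearml import Task

    # inside step A: upload the dataframe as an artifact
    task = Task.current_task()
    task.upload_artifact(name="dataset_df", artifact_object=dataset_df)

    # inside step B: fetch the artifact from step A by its task id (or a name lookup)
    step_a_task = Task.get_task(task_id=step_a_task_id)
    dataset_df = step_a_task.artifacts["dataset_df"].get()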
WackyRabbit7 this section is what you need, uncomment it and fill it in:
https://github.com/allegroai/trains/blob/c9fac89bcd87550b7eb40e6be64bd19d4384b515/docs/trains.conf#L88
Woot woot!
awesome, this RC is stable so feel free to use it; the official release is probably due out next week :)
Hi TrickySheep9
You should probably check the new https://github.com/allegroai/clearml-server-helm-cloud-ready helm chart 😉
at the end of the manual execution
JitteryCoyote63 that makes total sense!!
The reporting subprocess is not being updated with the new value! Let me check how we can pass it along...
Hi GrievingTurkey78
I think the main issue is the lack of support for jsonargparse, is that correct?
(vanilla PyTorch Lightning uses argparse, which seems to work out of the box)
basically
would allow blocking the machine from being scaled-in when
Oh this is what I was missing 🙂 That makes sense to me!
So what you are saying is: when the AWS autoscaler agent launches a Task, it sets a "protection flag" inside the container, and when the Task ends, it unsets the "protection flag".
Is that correct?
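Just to make sure we mean the same thing, a sketch of the concept using the AWS Auto Scaling API (this is not necessarily what the autoscaler does internally; the group name and instance id are placeholders):

    import boto3

    autoscaling = boto3.client("autoscaling")

    # when a Task starts on the instance: protect it from scale-in
    autoscaling.set_instance_protection(
        AutoScalingGroupName="clearml-autoscaler-group",
        InstanceIds=["i-0123456789abcdef0"],
        ProtectedFromScaleIn=True,
    )

    # when the Task ends: remove the protection
    autoscaling.set_instance_protection(
        AutoScalingGroupName="clearml-autoscaler-group",
        InstanceIds=["i-0123456789abcdef0"],
        ProtectedFromScaleIn=False,
    )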
Hi @<1556812486840160256:profile|SuccessfulRaven86>
it does not when I run a flask command inside my codebase. Is this expected behavior? Do you have any workarounds for this?
Hmm where do you have your Task.init ?
(btw: what's the use case of a flask app tracking?)
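For reference, something along these lines usually works (a minimal sketch, names are placeholders), i.e. calling Task.init once at import time, before the Flask routes are served:

    from clearml import Task
    from flask import Flask

    # initialize tracking once, at import time (project/task names are illustrative)
    task = Task.init(project_name="examples", task_name="flask app")

    app = Flask(__name__)

    @app.route("/")
    def index():
        return "ok"

    if __name__ == "__main__":
        app.run(host="0.0.0.0", port=5000)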
Then I deleted those workers,
How did you delete those workers? The autoscaler is supposed to spin the EC2 instances down when they are idle; in theory there is no need to spin them down manually.
A single query will tell you whether the agent is running anything, and for how long, but I do not think you can get the idle time ...
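Something like this for the single query (a sketch assuming the APIClient workers endpoint; the exact fields may differ between server versions):

    from clearml.backend_api.session.client import APIClient

    client = APIClient()
    for worker in client.workers.get_all():
        # the task field is assumed to be set only while the agent is running something
        running = getattr(worker, "task", None)
        print(worker.id, "running:", running.id if running else "idle")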
ResponsiveCamel97
could you attach the full log?
Quick update, I found the issue, working on a fix 🙂
Thanks CynicalBee90, I appreciate the discussion! Since I'm assuming you will actually amend the misrepresentation in your table, let me follow up here.
1.
SSPL licensing may be a significant consideration for some, and so we thought it was important to point this out clearly.
SSPL is fully open-source compliant unless you intend to sell it as a service. I hardly think this is any user's consideration, just like anyone would be using MongoDB or Elasticsearch without think...
One additional question: if you import clearml after you import torch, does it work?
It will store the entire content of the file; you can then edit it in the UI, and when running remotely it will return a new local copy of the file (based on the data in the UI) for you to read.
Not sure what the "right way" is 🙂
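For example, with connect_configuration (a sketch; the project/task names and file name are placeholders):

    from clearml import Task

    task = Task.init(project_name="examples", task_name="config example")

    # locally: uploads the file content; remotely: returns a local copy
    # built from whatever is currently stored/edited in the UI
    config_path = task.connect_configuration(configuration="config.yaml", name="my config")

    with open(config_path, "rt") as f:
        print(f.read())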
But I do pkill -f "trains-agent --gpus 0"
This will kill a process that was started with "trains-agent --gpus 0". Notice it matches the command pattern, so it has to match the way you executed the agent. You can check it with ps -Af | grep trains-agent
So there is no copying of the data to the pod; it is simply referenced via the EFS
Correct
if so, are there any docs/examples about this?
Good point, passing to docs 🙂
https://github.com/allegroai/clearml/blob/51af6e833ddc5a8ba1efaaf75980f58616b25e85/examples/optimization/hyper-parameter-optimization/hyper_parameter_optimizer.py#L123
I mean it is mentioned, but we should highlight it better
@<1671689437261598720:profile|FranticWhale40> could you test the fix? just pull & run
allegroai/clearml-serving-triton:1.3.1
allegroai/clearml-serving-inference:1.3.1
making me realize that this may have been optional
I think it is optional, and this is why it was not entered in the first place.
Could you double check and just remove it from your manual pbtxt ?
Thanks @<1671689437261598720:profile|FranticWhale40> !
I was able to locate the issue; a fix should be released later today (or worst case tomorrow)
Hi @<1671689437261598720:profile|FranticWhale40>
Are you positive the Triton container finished syncing?
Could you provide the docker logs (both the serving and the triton containers)?
What is the clearml-serving version you are using?
Could you add a print in the "preprocess" function, just to validate you are getting to the correct model version?
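Something like this, assuming the usual clearml-serving Preprocess class layout (the exact method signature may differ between clearml-serving versions):

    # preprocess.py used by the serving endpoint (sketch)
    class Preprocess(object):
        def __init__(self):
            pass

        def preprocess(self, body, state, collect_custom_statistics_fn=None):
            # temporary debug print to verify the correct model version is being hit
            print("preprocess called, request body:", body)
            return body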
we also provide a custom aux-config file. We also had to make sure to update the name inside config.pbtxt so that Triton is happy:
Good point, what would be the logic of the automatic "config.pbtxt" patching we should employ?
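For the sake of discussion, one possible logic (purely a sketch, not the current implementation; the helper below is hypothetical) would be to read the existing config.pbtxt and only force the name field to match the model repository folder:

    import re
    from pathlib import Path

    def patch_config_pbtxt(config_path: str, model_dir_name: str) -> None:
        # hypothetical helper: make the "name" field match the model folder Triton expects
        path = Path(config_path)
        text = path.read_text()
        if re.search(r'^\s*name\s*:', text, flags=re.MULTILINE):
            text = re.sub(r'^\s*name\s*:\s*".*?"',
                          'name: "{}"'.format(model_dir_name),
                          text, count=1, flags=re.MULTILINE)
        else:
            text = 'name: "{}"\n'.format(model_dir_name) + text
        path.write_text(text)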