AgitatedDove14

49 Questions, 8124 Answers

Active since 10 January 2023

Last activity one year ago

Reputation

Badges 1

25 × Eureka!

Questions 49
Answers 8124

0 Votes 6 Answers 2K Views

Hi ! ClearML Server + SDK v1.9.0 is out! 🎉 🚀 🎊 Happy Holidays and Happy New Year! ❇️ 🎇 🎄

clearml

2 years ago

0 Votes 0 Answers 2K Views

New video is out 🙂 Cloud Autoscalers are awesome: https://www.youtube.com/watch?v=j4XVMAaUt3E

clearml

3 years ago

0 Votes 0 Answers 2K Views

We are at AAAI NY, come look us up :)

clearml

5 years ago

0 Votes 0 Answers 2K Views

Important notice: it seems Nvidia broke some of their PPA's security 😕, causing `apt-get update` to fail inside containers. This in turn will cause `clearml-agent` to fail with specific Nvidia containers. If you are seeing simila...

clearml

3 years ago

0 Votes 0 Answers 2K Views

Lol, I wonder what the adblock rule was ;)

clearml

5 years ago

0 Votes 0 Answers 2K Views

Gals/Guys/🤖 If you have ideas on improving the Slack Monitoring service, please add them to the dedicated GitHub issue: https://github.com/allegroai/trains/issues/161 For example: generate an alert if my experiment reaches a certain...

clearml

5 years ago

0 Votes 0 Answers 2K Views

@YummyWhale40 are you saying the example code is not working when running with the demo server? Also, I think I was able to view your experiment on the demo server, and I do get the Scalars without any issues...

clearml

5 years ago

0 Votes 1 Answers 1K Views

LSTMeow is back! Bots/Gals/Guys feel free to 👍

clearml

5 years ago

0 Votes 0 Answers 2K Views

https://allegro.ai/docs

clearml

5 years ago

0 Votes 0 Answers 2K Views

New releases: `pip install trains==0.13.3` (https://github.com/allegroai/trains/releases/tag/0.13.3) and `pip install trains-agent==0.13.2` (https://github.com/allegroai/trains-agent/releases/tag/0.13.2)

clearml

5 years ago

0 Votes 3 Answers 2K Views

We recently released a new version of clearml-session with Persistent Workspace support! 🚀 🎉 Finally you can develop on remote machines with workspace fold...

remote-ssh

one year ago

0 Votes 0 Answers 2K Views

Trains v0.14.2 is out (change log: https://github.com/allegroai/trains/releases/tag/0.14.2). Highlights: trains.storage.StorageManager (https://github.com/allegroai/trains/blob/master/trains/storage/manager.py#L13) - with caching for any http...

clearml

5 years ago

0 Votes 0 Answers 2K Views

you set it 🙂

clearml

5 years ago

0 Votes 0 Answers 2K Views

YEY!!!! Download as CSV 🤯

clearml

3 years ago

0 Votes 3 Answers 1K Views

@ConvolutedSealion94 these are XGBoost internal metrics that are automatically picked up by ClearML

xgboost

2 years ago

0 Votes 0 Answers 2K Views

apparently everyone can ...

clearml

5 years ago

0 Votes 0 Answers 2K Views

Slack security ... Go figure 😉

clearml

5 years ago

0 Votes 3 Answers 2K Views

This will close it: `Task.current_task().close()`. I think we should rename `completed()` because it just marks the Task as completed on the backend but does not ac...

clearml

4 years ago

0 Votes 0 Answers 2K Views

Hey, just a heads up: starting Jan 25th, the default ClearML demo server (http://demoapp.demo.clear.ml/) will move to a daily reset cycle (replacing the current weekly cycle). Anybody needing more than 24h data retention is welcome to use our...

clearml

4 years ago

0 If Possible, I Would Like All Together Prevent The Fileserver And Write Everything To S3 (Without Needing Every User To Change Their Config)

PricklyRaven28 did you set the iam role support in the conf?
https://github.com/allegroai/clearml/blob/0397f2b41e41325db2a191070e01b218251bc8b2/docs/clearml.conf#L86

3 years ago

0 Question About The Usage Of Trains Agents. In Our Company We Have 3 Hpc Servers, Two Of Them Have Multiple Gpus, One Is Cpu Only. I Saw In The Docs The Multiple Agents Can Be Run Separately Assigning Gpus In Whatever Manner You Want. My Questions Are 1

So I assume, trains assumes I have nvidia-docker installed on the agent machine?

docker + nvidia-docker-runtime are assumed to be installed
nvidia/cuda docker image is pulled when requested (like any other container image)

Moreover, since I'm going to use Task.execute_remotely (and not through the UI) is there any code way to specify the docker image to be used?

Sure, task.set_base_docker(docker_cmd='nvidia/cuda -v /mnt:/tmp')
Notice that you can not only pass the dock...

5 years ago

0 I Wanted To Ask, How To Run Pipeline Steps Conditionally? E.G If Step Returns A Specific Value, Exit The Pipeline Or Run Another Step Instead Of The Sequential Step

VexedCat68 both are valid. In case the step was cached (i.e. already executed) the node.job will be None, so it is probably safer to get the Task based on the "executed" field which stores the Task ID used.
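
A minimal sketch of that fallback, using hypothetical stand-in classes rather than the real ClearML objects. It assumes, as described above, that a cached step has `job` set to None while `executed` holds the Task ID that was used; `FakeJob`, `FakeNode`, and `resolve_task_id` are illustrative names only.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class FakeJob:
    """Hypothetical stand-in for the job object attached to a pipeline node."""
    id: str

    def task_id(self) -> str:
        return self.id


@dataclass
class FakeNode:
    """Hypothetical stand-in for a pipeline node with the two fields discussed."""
    name: str
    job: Optional[FakeJob] = None    # None when the step was served from cache
    executed: Optional[str] = None   # Task ID used for this step, if any


def resolve_task_id(node: FakeNode) -> Optional[str]:
    """Prefer the 'executed' field; fall back to the job only if it exists."""
    if node.executed:
        return node.executed
    if node.job is not None:
        return node.job.task_id()
    return None


# A cached step (no job) and a freshly launched step (job, nothing recorded yet).
cached = FakeNode(name="step_one", job=None, executed="abc123")
fresh = FakeNode(name="step_two", job=FakeJob("def456"), executed=None)
print(resolve_task_id(cached))
print(resolve_task_id(fresh))
```

Checking `executed` first means the same lookup works whether or not the step was cached.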

3 years ago

0 I Cannot Get The Configuration From A Task: I Run

In the documentation it warns about

.close()

"Only call Task.close if you are certain the Task is not needed."

Maybe this is not clear enough: it means you no longer need to automatically add/log/track things into the Task from the current process.
This does not mean you cannot access the Task or its artifacts.

Mark closed means to externally (i.e. not from the process that created the Task, maybe even from a different machine) close and mark the task as completed (this...

2 years ago

0 Hello! Since Today I Get

Well, in that case, just change the order and it should solve it (I'll make sure we have that as the default):

conda_channels: ["pytorch", "conda-forge", "defaults", ]

It should solve the issue 🙂

4 years ago

0 Hi, I'Ve Recently Upgraded To 0.15.1 From 0.14.2, And For Some Reason A Code That Previously Worked In Which I'M Getting The Tags Of A Model Using

BTW: how are you using them? should we have a direct interface to those ?

5 years ago

0 Hi Community

Great to hear!

2 years ago

0 Hi Again Everybody, Is There A Way To Cache The Docker Containers Used By The Agents? As Far As I Understood Every Time It Spawns A New Container With A Shared Apt Cache (And Venv Cache If Configured)

Hi TrickyFox41

is there a way to cache the docker containers used by the agents

You mean for the apt get install part? or the venv?
(the apt packages themselves are cached on the host machine)
for the venv I would recommend turning on cache here:
https://github.com/allegroai/clearml-agent/blob/76c533a2e8e8e3403bfd25c94ba8000ae98857c1/docs/clearml.conf#L131

2 years ago

0 Hi All! I'M Using Clearml With Hydra As Configuration Manager. I'M Trying To Rerun A Task By Overriding Some Of The Configurations From The Ui. I Tried To Change The Config_Name Args In The Args Section And Also The Omegaconf Configuration In Configuratio

For example, could you test if this one works:
https://github.com/allegroai/clearml/blob/master/examples/frameworks/hydra/hydra_example.py

4 years ago

0 Hello! When Trying To Use Clearml Datasets With Google Cloud Storage With The Authorized User Credentials It Will Fail And Say Some Fields Are Missing From The Json. This Isn'T An Issue If The User Is Using A Service Account Json Key, Is A Service Account

Hi ShortElephant92

This isn't an issue if the user is using a Service Account JSON Key,

Are you saying that when you are using GS python sdk directly it works?

For context, the google cloud storage SDK allows an authorized user credentials.

ClearML actually uses the Google Python SDK; the JSON is just a way to pass the credentials to the Google SDK, so I'm not sure it has to point to a "service account". Where did that requirement come from?
is it from here ` Service account info was n...

2 years ago

0 Is There Any Problem With The Website?

seems it was fixed 🙂
MagnificentWorm7 thank you for noticing ! 🙏

3 years ago

0 Hello Guys, I Read About Trains Some Days Ago And Think It Is Exectly What I Was Looking For, So I Ran The Docker Image And Started Thinking Of What I Would Like To Do And The Processing Steps I Would Like To Automize Which I Currently Run Manually Trigge

Hi WickedGoat98
This sounds like a great design (obviously you have scale in mind 😉 ) Feel free to ask "stupid" questions, based on what you already wrote I doubt they will be
A few questions that come to mind (probably a few others after):
You mentioned FS synchronization, from where? i.e. what is the single source of truth ? K8s (Rancher 2.0 is basically k8s manager) can take care of mounting volumes, so no need to sync, is this a valid solution ?

BTW : (you can drag and drop an i...

4 years ago

0 Base_Template_Keras_Simply.Py

DeliciousBluewhale87 could you send the new log?

4 years ago

0 Hi, I Have This Issue With Clearml Datasets. Do You Know Hot To Solve It?

Hi LazyFish41
Could it be some permission issue on /home/quetalasj/.clearml/cache/ ?

4 years ago

0 Good Morning Folks, I Am Setting Up Clearml On A (Self-Hosted) K8S Cluster Using The

SarcasticSquirrel56

if I configure manually the pods for the different nodes, how do I make clearml server aware that those agents exist?

Basically the agents register themselves with your clearml-server, and they register which Queue(s) they listen to. In other words, the interface for choosing between the different types of machines/GPUs is enqueuing the Task to different queues.
For example: Queue(1): "CUDA11_GPUx1", Queue(2): "CUDA10_GPUx1"
Makes sense?
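
To make the queue idea concrete, here is a toy in-memory model. It is not the ClearML server or agent API; `QueueBoard`, the agent names, and the task name are all hypothetical, and only the two queue names come from the example above.

```python
from collections import defaultdict, deque
from typing import List, Optional


class QueueBoard:
    """Toy model of queue-based routing; purely illustrative."""

    def __init__(self) -> None:
        self.listeners = defaultdict(list)   # queue name -> agent names
        self.pending = defaultdict(deque)    # queue name -> waiting task names

    def register_agent(self, agent: str, queues: List[str]) -> None:
        """An agent announces which queues it listens to."""
        for queue in queues:
            self.listeners[queue].append(agent)

    def enqueue(self, task: str, queue: str) -> None:
        """Choosing a queue is how the user chooses the machine type."""
        self.pending[queue].append(task)

    def pull(self, agent: str) -> Optional[str]:
        """Return the next task from any queue this agent listens to."""
        for queue, agents in self.listeners.items():
            if agent in agents and self.pending[queue]:
                return self.pending[queue].popleft()
        return None


board = QueueBoard()
board.register_agent("agent-cuda11", ["CUDA11_GPUx1"])
board.register_agent("agent-cuda10", ["CUDA10_GPUx1"])
board.enqueue("train-model", "CUDA11_GPUx1")

print(board.pull("agent-cuda10"))  # this agent does not listen to that queue
print(board.pull("agent-cuda11"))
```

The server never needs to be told which pod has which GPU: the mapping lives entirely in which queue each agent listens to.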

EDIT:

I guess to achieve what I w...

3 years ago

0 Can Someone Help Me With Deploying This Example Model (From Triton Inference Server) Deployed In Clearml-Serving? Too Many Random Errors For Me To Figure It Out

damn I think this is the issue:
https://github.com/allegroai/clearml-serving/blob/b5f5d72046f878bd09505606ca1147d93a5df069/clearml_serving/serving_service.py#L553

4 years ago

0 Can Someone Help Me With Deploying This Example Model (From Triton Inference Server) Deployed In Clearml-Serving? Too Many Random Errors For Me To Figure It Out

On my to do list, but will have to wait for later this week (feel free to ping on this thread to remind me).
Regarding the issue at hand, let me check the requirements it is using.

4 years ago

0 Hi Folks I Have A Problem I Can'T Understand. Plots Are Not Shown When Experiments Are Executed From The Ui. For Example, If I Run The Code On My Laptop, And I Go To The Experiment Page I Can See Correctly The Plots: But If I Then Clone The Task, And Ex

I changed them to the one exposed to the users (the same I have in my local clearml.conf) and things work.

Nice!

But I can't really figure out why that would be the case...

So the thing is, the links to the files are generated by the clients, which means the actual code generated an internal link to the file server (i.e. a link that only works inside the k8s cluster). When you wanted to see the image/plot you were accessing it from outside the cluster, and the link simply ...
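
A small illustration of the mismatch, under stated assumptions: the host names `clearml-fileserver:8081` and `files.example.com` are hypothetical, and `rewrite_host` is an illustrative helper, not something ClearML does. The fix described in this thread is to configure the client with the user-facing address so the links are generated correctly in the first place.

```python
from urllib.parse import urlsplit, urlunsplit


def rewrite_host(url: str, internal_netloc: str, external_netloc: str) -> str:
    """Swap an internal host:port for the external one, leaving the path intact."""
    parts = urlsplit(url)
    if parts.netloc != internal_netloc:
        return url  # already points somewhere else; leave untouched
    return urlunsplit(parts._replace(netloc=external_netloc))


# Link as generated by code running inside the cluster (hypothetical values).
stored_link = "http://clearml-fileserver:8081/project/task/plot.png"

# What a browser outside the cluster would need instead.
print(rewrite_host(stored_link, "clearml-fileserver:8081", "files.example.com"))
```

The path is identical in both cases; only the host part decides whether a browser outside the cluster can resolve the link.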

3 years ago

0 I'Ve Tried Setting Up A Clearml Application On Openshift Using The Helm Chart But The Pods Cannot Go Up Because They Are Trying To Write To Files And Directories That Aren'T Open To Non Root Users During Their Setup. This Is A Problem On Openshift Because

i've tried setting up a clearml application on openshift

First, my condolences 🙂 openshift ...
Second, what you need to make sure is that each container (i.e. ELK/Mongo etc.) has its own PV for persistent storage; I'm assuming this is the root cause of the error.
Make sense to you ?

3 years ago

0 Hi. I'M Using Clearml For Logging My Experiments. Can I Compare Experiments By Plotting Graphs? For Example, Every Experiment Logs The Time Per Training Iteration And The Accuracy Per Epoch. I Want To Create A Graph With "Average Time Per Iteration" As X-

SoreDragonfly16, in the Hyper Parameters tab you have "parallel coordinates": next to "add experiment" there is a button saying "values"; press it and there should be "parallel coordinates".
Is that it?

4 years ago

because comparing experiments using graphs is very useful. I think it is a nice to have feature.

So currently when you compare the graphs you can select the specific scalars to compare, and it updates in real time!
You can also bookmark the actual URL and it is fully reproducible (i.e. full state is stored)
You can also add custom columns to the experiment table (with the metrics) and sort / filter based on them, and create a summary dashboard (again like all pages in the web app, URL is...

4 years ago

0 Hey! I'M Having A Weird Issue When I Run Pip Freeze Locally It'S Showing Version "Clearml==0.17.5Rc6" But When I Initiate The Task It'S Always Starting With "Clearml==0.17.2" - This Version Isn'T Accepting Tags Through The Code Etc. (I'M Manually Fixing I

Hmmm, are you running inside pycharm, or similar ?

4 years ago

0 Colors Of Cm Reporting Are Strange... Is It Possible To Adjust The Default Ones

Could you maybe send a screenshot? This is very strange. Also, what's the trains version?

5 years ago

0 Hi All, I Have Deployed A Clearml Server With Docker To One Of Our Local Machine. I Had Set Up The Filesserver Folder As Mount Point To The Cloud. How Easy Is It To Migrate Our Existing Experiments Later On To A Clearml Server That We Deploy In The Cloud

Oh, I was assuming you are passing the entire DB backups to the cloud.
Are you saying you just want the file server on the cloud ? if this is the case, I would just use S3

2 years ago

0 I'D Been Following The Clearml Serving Example On Its Github Repo Here. It Basically Deploys A Keras Mnist Model. The Tutorial However Ends Once The Model Is Deployed However And I'Ve Tried Going Through Resources On How To Do Inference But Have Had Troub

Thank you!!!

3 years ago

0 Is It Possible To Increase The Polling Interval For K8S Glue? Currently It Is 5 Seconds I Believe. Would Adding An Argument For It Help? Can Do A Pr If So

This is the thread checking the state of the running pods (and updating the Task status, so you have visibility into the state of the pod inside the cluster before it starts running)

4 years ago

0 Hi All, I Am Having Trouble Using The

Can you print the actual values you are passing? (i.e. local_file remote_url )

3 years ago

0 Hi! I'M Currently Saving A Dataframe With Predictions Inside The Task. To Do So, I Save A Dataframe As Pickle File In

BTW: this is probably more efficient than pickling
https://pandas.pydata.org/pandas-docs/version/1.1.5/reference/api/pandas.DataFrame.to_parquet.html

4 years ago

0 Hello, A Question About Pipelines. I Have A Repository With One Pipeline Using Decorators, Defined In

Hi @WobblyFrog79

. When I execute the pipeline remotely in Kubernetes, those components

two things, one, make sure you specify the repo you need the components from in the decorator function, what will happen is the repo will be cloned into the container running on k8s, then inside the repo root your script (i.e. pipeline step) will be running.
[None](https://github.com/clearml/clearml/blob/9c93aa9e538075c848647dcd88e3e12bec051b5f/clearml/automation/con...

7 months ago

0 Is It Possible To Launch A

Hi ShallowArcticwolf27

from the command line to a remote machine while loading a local

.env

file as a configuration object?

Where would the ".env" go to ? Are we trying to pass it to the remote machine somehow ?

4 years ago
