Thanks @<1547028074090991616:profile|ShaggySwan64> !!
Passing to the backend guys to take a look
That sounds like an issue with the "working dir"; check the "Execution" section's "Working Directory" field.
'.' means the root of the git repository.
'subfolder' means run the script from that subfolder, etc. Also make sure that the script path is adjusted accordingly.
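For example, a minimal sketch (the repo layout and names here are hypothetical):

my_repo/
    subfolder/
        train.py

Here "Working Directory" would be 'subfolder' and the script path just 'train.py'.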
btw: Trains should have filled in all the correct paths... If you have time, get the latest trains (0.14.3) and run again to see if the problem persists; we should probably fix that bug 🙂
Yea I know, I reported this
LOL, apologies, these days it's a miracle I still remember my login passwords 🙂
VictoriousPenguin97 I'm assuming the exact same server version ?
Yeah, curious - are a lot of clearml use cases not geared for notebooks?
That is somewhat correct; notebooks are not actually used in a lot of deep-learning projects, as those require an entire repository to support them.
I guess generally speaking the workflow is, "test your code" (i.e. small scale with limited data), then clone and enqueue for remote execution.
That said, I think it will be great to expand the support.
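For example, a minimal sketch of that clone-and-enqueue step (project/task/queue names are placeholders):

from clearml import Task

# grab the task you already ran at small scale
template = Task.get_task(project_name="examples", task_name="small-scale-test")
# clone it and send the clone to an agent queue for remote execution
cloned = Task.clone(source_task=template, name="full-scale-run")
Task.enqueue(cloned, queue_name="default")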
TrickySheep9 I like the idea of context for Tasks, can you expand on how...
BTW: if you only need the git diff you can just copy it from the UI into a txt file and do:
git apply <copied-diff.txt>
VictoriousPenguin97 I'm not sure there is an easy solution, basically you have to edit both MongoDB (artifacts) and Elastic (think debug samples) 🙂
StickyLizard47 apologies for https://github.com/allegroai/clearml-server/issues/140 not being followed up on (it probably slipped through the cracks of the backend guys; I can see the 1.5 release happened in parallel). Let me make sure it is followed up.
SarcasticSquirrel56 specifically, did you also spin a clearml-k8s glue? or are the agents statically allocated on the helm chart?
... Would not work for huge LLM-style models.
yes I agree... but then again, if the model is small enough you can just keep it in memory ...
Could you give an example of such configurations ?
(e.g. what would differ from one to another)
BoredHedgehog47 you need to make sure "<path here>/train.py" also calls Task.init (again no need to worry about calling it twice with different project/name)
The Task.init call will make sure the auto-connect works.
BTW: if you do os.fork , then there is no need for the Task.init; the main difference is that Popen starts a whole new process, and we need to make sure the newly created process is auto-connected as well (i.e. calling Task.init)
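For example, a minimal sketch of what the launched script might contain (project/task names here are just placeholders):

from clearml import Task

# calling Task.init again in the subprocess is safe; it re-attaches
# the process to the same task, so the auto-connect works here too
task = Task.init(project_name="examples", task_name="subprocess-train")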
Hi @<1694157594333024256:profile|DisturbedParrot38>
You mean how to tell the agent to pull only some submodules of your git?
If this is the case you can actually remove them on your git branch; a submodule is basically a file with a soft link. Wdyt?
Hope you don't mind linking to that repo
LOL 🙂
The issue only arises upon sending Images. (Both numpy, mpl and PIL)
BTW: they should appear under the Debug Samples tab in the Results section
Are they expanded in the "api_server" ? (I verified on a Linux machine, same error, the env in the api_server is not being resolved)
You cannot change the user once you have mounted the shared folder with either CIFS or NFS
SpotlessFish46 unless all the code is under "uncommitted changes" section, what you have is a link to the git repo + commit id
HugeArcticwolf77 oh no, I think you are correct 🙂
Do you want to quickly PR a fix ?
Could you send the logs?
@<1595587997728772096:profile|MuddyRobin9> are you sure it was able to spin up the EC2 instance? Which ClearML autoscaler version are you running?
DilapidatedDucks58 I see ...
This might be more complicated than one would imagine. A simple solution might be to store a snapshot of the values every time we reach a new maximum; a quick hack might be to add it as text on one of the task's parameters or properties (which we can later add to the table as a custom column).
wdyt?
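Something like this, as a rough sketch (metric/property names are placeholders; user properties can be shown as custom columns in the experiment table):

from clearml import Task

task = Task.current_task()
best_metric = float("-inf")

def maybe_snapshot(value):
    # keep a running maximum and store the snapshot as a user property
    global best_metric
    if value > best_metric:
        best_metric = value
        task.set_user_properties(best_metric=str(best_metric))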
command line 🙂
cmd.exe / bash
Hi ReassuredTiger98
I do not want to share with the clearml-agent workstations.
Long story short, no 🙂
The agent is responsible for spinning all jobs, regardless of users, so basically it has to have a read-only user for all the repositories. I "think" the enterprise version has a vault feature that allows you to store this kind of secret on the User itself.
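For reference, the read-only credentials would go in the agent's clearml.conf, something like this (user/token values are placeholders):

agent {
    # read-only git user the agent uses to clone all repositories
    git_user: "readonly-ci-user"
    git_pass: "<personal-access-token>"
}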
What exactly is the use case?
You'll just need the user to name them as part of loading them in the code (in case they are loading multiple datasets/models).
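For example, a minimal sketch with explicitly named datasets (project/dataset names are hypothetical):

from clearml import Dataset

# load each dataset by an explicit name so multiple datasets stay distinguishable
train_ds = Dataset.get(dataset_project="examples", dataset_name="train-set")
eval_ds = Dataset.get(dataset_project="examples", dataset_name="eval-set")
train_path = train_ds.get_local_copy()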
Exactly! (and yes, UI visualization is coming 🙂 )
Anyway, in the docs, there is a function called task.register_artifact()
Yes, this is rather deprecated... The idea is that it will monitor an object and auto-sync it (i.e. serialize and upload it).
That said, it is just so much easier to do task.upload_artifact ,
and you can always update/overwrite by passing the same name, so I cannot really see the actual use case. Does that make sense? What are you using it for ?
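To illustrate the difference, a short sketch (names are placeholders):

from clearml import Task
import pandas as pd

task = Task.init(project_name="examples", task_name="artifacts-demo")
df = pd.DataFrame({"a": [1, 2], "b": [3, 4]})

# register_artifact: monitors the object and auto-syncs it when it changes
task.register_artifact(name="monitored_df", artifact=df)

# upload_artifact: one-shot upload; uploading again with the same name overwrites it
task.upload_artifact(name="results", artifact_object=df)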
What's the OS / Python version?
FreshParrot56 we could add this capability, but the main caveat is that if your version depends on multiple parent versions you still need to download and extract all the parent versions, which means that if you clear them you might hurt later performance. Does that make sense? What is the use-case / scenario for you?
Is it possible to make a connection to a S3 bucket via this authentication method with the open source version on EKS?
Hi BoredBluewhale23
In your setup, are we talking about agents running inside the Kubernetes cluster, or clients connecting from their own machine ?