AgitatedDove14

48 Questions, 8051 Answers

Active since 10 January 2023

Last activity 8 months ago

Answers 8051

0 Hi Everyone, Is There Something Like A Clearml Context Manager To Disable Automatic Logging? I Use Torch.Save And Torch.Load To Temporarily Cache Something On Disk. I Delete It Afterwards. I Do Not Want Clearml To Push It To The Clearml-Server As An Artif

Hi @<1523701868901961728:profile|ReassuredTiger98>

is there something like a clearml context manager to disable automatic logging?

Sure, just pass a wildcard matching the files you actually want to auto-log; the rest will be ignored:

from clearml import Task
task = Task.init(..., auto_connect_frameworks={'pytorch': '*.pt'})
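
To illustrate which file names a pattern such as '*.pt' would select, here is a minimal sketch using Python's standard fnmatch module. It assumes shell-style wildcard semantics; ClearML's internal matching logic is not shown in this thread and may differ. The helper name would_autolog and the sample file names are hypothetical.

```python
from fnmatch import fnmatch

# Assumption: shell-style wildcard semantics; ClearML's own matching may differ.
PATTERN = "*.pt"


def would_autolog(filename: str, pattern: str = PATTERN) -> bool:
    """Return True if the file name matches the wildcard pattern."""
    return fnmatch(filename, pattern)


# Hypothetical file names: a real checkpoint and a temporary cache file.
for name in ["model_final.pt", "tmp_cache.pkl"]:
    print(name, would_autolog(name))
```

Under that assumption, saving the temporary cache with an extension the pattern does not match (here .pkl) would keep it out of the auto-logged files.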

one year ago

0 Sorry Folks Too Many Questions - If I Have A Project (And I Set The Output Uri In It While Creating, To A S3 Folder) How Can I Ensure That A Experiment (Task) That I Run On My Local Outputs The Model To The Uri?

sdk.conf will add it to the default loaded values (as I think you deduced).
Can you copy-paste the sdk.conf here? (Maybe something is missing there?)

3 years ago

0 I Seem To Be Missing Something ... I'Ve Only Got One Task Running To Train A Segmentation Model On My Local Machine, And In A Few Days It'S Hit Over 1.15M Api Calls. It Looks Like It'S Sending Every Single Console Output ... Are There Settings To Control

Under your profile you should be able to see it

one year ago

You added it to the deprecated function 🙂
https://github.com/allegroai/clearml/blob/8496b99888deda45955bdd95f22dc929a88ec67b/clearml/backend_config/config.py#L400
See this line
https://github.com/allegroai/clearml/blob/8496b99888deda45955bdd95f22dc929a88ec67b/clearml/backend_config/config.py#L338

3 years ago

0 When Using Docker Mode (And Specifically K8S Glue), What Are The Options For Caching? One Option Is Definitely Having A Base Image That Has The Things Needed. Anything Else? Thanks!

Yep 🙂

3 years ago

0 Hi, I'M Trying To Set Storage Manager To Use Our Internal Miniio Installation But I Ran Into This Issue With This Testing Code:

at that point we define a queue and the agents will take care of training

This is my preferred way as well :)

4 years ago

0 Hi, I'M Trying To Set Storage Manager To Use Our Internal Miniio Installation But I Ran Into This Issue With This Testing Code:

Sounds good to me 🙂

4 years ago

0 When Using Docker Mode (And Specifically K8S Glue), What Are The Options For Caching? One Option Is Definitely Having A Base Image That Has The Things Needed. Anything Else? Thanks!

A few examples:
https://medium.com/@Sushil_Kumar/readwritemany-persistent-volumes-in-google-kubernetes-engine-a0b93e203180
https://docs.openshift.com/enterprise/3.1/install_config/storage_examples/shared_storage.html

3 years ago

0 Hi, We Have Been Using Clearml In Our Development Environment To Train Our Models And Benchmarking Them. I Was Wondering What Is Clearml'S Role In Transition To (Production. Two Specific Points, Deployment, And Automated Retraining Pipeline.

Hi SubstantialElk6

Generically, we would 'export' the preprocessing steps, setup an inference server, and then pipe data through the above to get results. How should we achieve this with ClearML?

We are working on integrating the OpenVino serving and Nvidia Triton serving engines into ClearML (both will be available soon)

Automated retraining

In cases of data drift, retraining of models would be necessary. Generically, we pass newly labelled data to fine...

3 years ago

0 Hi, I Was Running My Agent And Had A Few Scripts For Agent.Extra_Docker_Shell_Script. But When I Looked Through The Logs, They Were Not Executed. Any Idea Why? Using Agent V1.01R1 In K8S Glue.

Using agent v1.01r1 in k8s glue.

I think a fix was recently committed, let me check it

3 years ago

0 Heya, The Owner Of My Current Pro Saas Deployment Workspace Has Changed Of Google Account And The Google Account He Used To Create The Workspace Has Been Closed, Is There Any Mean He Can Retrieve/Transfer Ownership Of The Workspace To Another Google Ident

Hi FierceHamster54
I'm sure this is solvable. Get in touch with them either through the contact form on the website or by email at support@clear.ml; it should not be complicated to fix 🙂

2 years ago

Hope you don’t mind linking to that repo

LOL 😄

3 years ago

0 Hi, Some Properties Of The Task Object Are Not Listed In The Documentation (Such As Task.Parent, Which Is Not Clear Whether It Is The Parent Task Object Itself Or The Id Of The Parent Task).

NICE!

4 years ago

0 Hi, I Have A Pre-Processing Steps Not Been Implemented In Python, But Being A Shell Script Calling Wget To Synchronize Data And Creating Intermediate Sqlite Dbs By A Script Been Implemented In 'R' And Would Like To Ask, If Trains Can Be Used Just To Trigg

Hi WickedGoat98

Will I need to wrap their execution in python by system calls?

That would probably be the easiest solution 🙂

Then you can plug it into your pipeline as a preprocessing Task. You can check this example:
https://github.com/allegroai/trains/tree/master/examples/pipeline
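
A minimal sketch of wrapping an external step in Python via a system call, using only the standard subprocess module. The function name run_step and the script names in the comment are hypothetical; how the wrapper is registered as a pipeline Task is not shown here.

```python
import subprocess
import sys
from typing import List


def run_step(command: List[str]) -> str:
    """Run an external command, raise if it exits non-zero, return its stdout."""
    result = subprocess.run(command, check=True, capture_output=True, text=True)
    return result.stdout


if __name__ == "__main__":
    # Stand-in command so the sketch runs anywhere. In practice this could be
    # something like ["bash", "sync_data.sh"] or ["Rscript", "build_db.R"]
    # (hypothetical script names).
    print(run_step([sys.executable, "-c", "print('preprocessing done')"]))
```

Using check=True means a failing shell or R script raises CalledProcessError, so the wrapping Python step fails too instead of silently continuing.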

4 years ago

0 Hi, Community! For The Test I Logged My New Model To Clearml-Server File Host And Take Models For Clearml-Serving From There. And It Works With Clearml-Serving Model Add, But For Clearml-Serving Model Auto-Update I Do Not Exactly Understand What Happens.

Hi AbruptHedgehog21
Can you send the info pages of the two models (i.e. the original and the updated one)?
Do you see the two endpoints?
BTW: --version would add a version to the model (i.e. create a new endpoint with version "endpoint/{version}")

2 years ago

Yes, it's a bit confusing. The gist of it is that we wanted the ability to have different configurations for different buckets

3 years ago

0 I'M Using Tensorboard Summarywriter To Add Scalar Metrics For The Experiment. If Experiment Crashed, And I Want To Continue It From Checkpoint, For Some Reason It Plots Metrics In A Really Weird Way. Even Though I Pass Global_Step=Epoch To The Summarywrit

DilapidatedDucks58 by default, if you continue the execution, it will automatically continue reporting from the last iteration. I think this is what you are seeing

3 years ago

Many thanks!

3 years ago

sorry that I keep bothering you, I love ClearML and try to promote it whenever I can, but this thing is a real pain in the ass

No worries, I totally feel you.
As a quick hack in the actual code of the Task itself, is it reasonable to have:
task = Task.init(...)
task.set_initial_iteration(0)

3 years ago

0 How, If At All, Should We Cite Clearml In A Research Paper? Would You Like Us To? How About A Footnote/Acknowledgement?

SmallDeer34 I have to admit this reference is relatively old, maybe we should update the author to http://clearml.ml (would that make sense?)

3 years ago

0 Does Clearml Have A Good Story For Offline/Batch Inference In Production? I Worked In The Airflow World For 2 Years And These Are The General Features We Used To Accomplish This. Are These Possible With Clearml?

Hi @<1541954607595393024:profile|BattyCrocodile47>

Does clearML have a good story for offline/batch inference in production?

Not sure I follow, you mean like a case study ?

Triggering:

We'd want to be able to trigger a batch inference:

(rarely) on a schedule
(often) via a trigger in an event-based system, like maybe from AWS lambda function

(2) Yes, there is a great API for that. Check out the GitHub Actions, it is essentially the same idea (REST API also available) ...

one year ago

0 Hi All! Currently I Am Trying To Create A Tool That Can Perform Certain Operations On Dataset Ids, This Is A Skeleton Of What I Have In Mind (Based On The Examples):

Hi GrievingTurkey78
First, I would look at the CLI clearml-data as a baseline for implementing such a tool:
Docs:
https://github.com/allegroai/clearml/blob/master/docs/datasets.md
Implementation :
https://github.com/allegroai/clearml/blob/master/clearml/cli/data/main.py
Regarding your questions:
(1) No, a new dataset version will only store the diff from the parent (if files are removed, it stores metadata saying the file was removed)
(2) Yes any get operation will downl...
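
To make point (1) concrete, here is a small sketch of what "storing only the diff from the parent" can look like. This is not ClearML's implementation; the names VersionDiff and diff_version, and the file-to-hash mappings, are hypothetical and only illustrate the idea.

```python
from typing import Dict, List, NamedTuple


class VersionDiff(NamedTuple):
    """What a child version would need to store relative to its parent."""
    added_or_modified: Dict[str, str]  # path -> content hash
    removed: List[str]                 # paths recorded as removed


def diff_version(parent: Dict[str, str], child: Dict[str, str]) -> VersionDiff:
    # Keep only files that are new or whose hash changed.
    changed = {path: h for path, h in child.items() if parent.get(path) != h}
    # Files missing from the child are kept as "removed" metadata only.
    removed = sorted(path for path in parent if path not in child)
    return VersionDiff(changed, removed)


# Hypothetical example: b.csv was deleted and c.csv was added.
parent = {"a.csv": "h1", "b.csv": "h2"}
child = {"a.csv": "h1", "c.csv": "h3"}
print(diff_version(parent, child))
```

In this sketch the unchanged a.csv is not stored again in the child version; only c.csv and a removal record for b.csv are.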

3 years ago

0 I Have A Notebook Which Is Uncommited. It Is Being Run On A Remote Machine With Clearml-Agent Through Clearml-Session. Everything With Newest Versions, Server Is Community-Hosted. Under Uncommitted Changes I See

Hi FiercePenguin76
It seems it fails to detect the notebook server and thinks this is a "script running".
What exactly is your setup?
docker image?
jupyter-lab version?
clearml version?
Also, are you getting any warning when calling Task.init?

3 years ago

0 How Can I Log My Configuration Like This? I Have A Dict Params = {'Data':{'Data_Key':123}, 'Model':{'Model_Key':123}}, But It Become Data/Datakey Instead Of An Foldable Config. In Addition, I Don'T Want To Name It As "General", Where Can I Change It?

NICE!

4 years ago

FiercePenguin76
So running the Task.init from the jupyter-lab works, but running the Task.init from the VSCode notebook does not work?

3 years ago

EnviousStarfish54 generally speaking, the hyperparameters are flat key/value pairs. You can have as many sections as you like, but inside each section there are only key/value pairs. If you pass a nested dict, it will be stored as path/to/key:value (as you witnessed).
If you need to store a more complicated configuration dict (nesting, lists, etc.), use connect_configuration; it will convert your dict to text (in HOCON format) and store that.
In both cases you can edit the configuration and then when ru...
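
The flattening described above can be sketched in plain Python. This only illustrates the path/to/key naming; the function flatten is hypothetical and is not the code ClearML runs.

```python
def flatten(params: dict, prefix: str = "", sep: str = "/") -> dict:
    """Turn a nested dict into flat {"path/to/key": value} pairs."""
    flat = {}
    for key, value in params.items():
        path = f"{prefix}{sep}{key}" if prefix else str(key)
        if isinstance(value, dict):
            # Recurse into nested sections, extending the path.
            flat.update(flatten(value, path, sep))
        else:
            flat[path] = value
    return flat


params = {"data": {"data_key": 123}, "model": {"model_key": 123}}
print(flatten(params))
# {'data/data_key': 123, 'model/model_key': 123}
```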

4 years ago

Hmm, so VSCode is running locally, connected to the remote machine over SSH?
(I'm trying to figure out how to replicate the setup for testing)

3 years ago

Okay, let me check it, but I suspect the issue is running over SSH. To overcome these issues with PyCharm we have a specific plugin that passes the git info to the remote machine. Let me check what we can do here.
FiercePenguin76 BTW, you can do the following to add / update packages on the remote session:
clearml-session --packages "newpackage>x.y" "jupyterlab>6"

3 years ago

diff line by line is probably not useful for my data config

You could request a better configuration diff feature 🙂 Feel free to add to GitHub

But this also means I have to load all the configuration into a dictionary first.

Yes 😞

4 years ago

0 Hi, I'M Trying To Set Storage Manager To Use Our Internal Miniio Installation But I Ran Into This Issue With This Testing Code:

Notice that the StorageManager has default configuration here:
https://github.com/allegroai/trains/blob/f27aed767cb3aa3ea83d8f273e48460dd79a90df/docs/trains.conf#L76
Then a per-bucket credentials list, with details:
https://github.com/allegroai/trains/blob/f27aed767cb3aa3ea83d8f273e48460dd79a90df/docs/trains.conf#L81

4 years ago
