I think the main risk is that ClearML upgrades to MongoDB vX.Y, Mongo changes the API (which they did because of Amazon), and then the API calls (i.e. the Mongo driver) stop working.
Long story short, I would not recommend it 🙂
Hi MinuteWalrus85
This is a great question, and super important when training models. This is why we designed a whole system to manage datasets (including storage querying, balancing data, and caching). Unfortunately this is only available in the paid tier of Allegro... You are welcome to contact the sales guys at https://allegro.ai/enterprise/
🙂
Hmm, you are missing the entry point in the execution (script path).
Also, as I mentioned, you can either have a git repo or a script in the uncommitted changes, but not both (if you have a git repo, then the uncommitted changes are the git diff)
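Just to illustrate the two setups, a rough sketch (everything here is a placeholder, not from this thread) of creating a Task that points at a git repo plus an entry-point script:
from clearml import Task

task = Task.create(
    project_name="examples",                         # placeholder project
    task_name="remote run",                          # placeholder name
    repo="https://github.com/example/project.git",   # git repo; its diff becomes the uncommitted changes
    branch="main",
    script="train.py",                               # entry point (script path) relative to the repo root
)
A standalone script without a repo would instead end up stored as the uncommitted changes.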
DeterminedToad86
Yes, I think this is the issue: on SageMaker a specific compiled version of torchvision was installed (probably as part of the image)
Edit the Task (before enqueuing) and replace the torchvision URL entry with: torchvision==0.7.0
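If you prefer doing it from code rather than the UI, something along these lines should work (a sketch, assuming a recent clearml where Task.set_packages() is available; the task id is a placeholder):
from clearml import Task

task = Task.get_task(task_id="<cloned_task_id>")   # the cloned Task, before enqueuing
# note: set_packages() replaces the whole requirements list, so include everything you need
task.set_packages(["torchvision==0.7.0"])
Task.enqueue(task, queue_name="default")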
Let me know if it worked
Hi JitteryCoyote63
experiments logs ...
You mean the console outputs ?
In the case of scalars it is easy to see (the maximum number of iterations is a good starting point)
BTW: UnevenDolphin73 you should never actually do "task = clearml.Task.get_task(clearml.config.get_remote_task_id())"
You should just do "Task.init()"; it will automatically pick up the "get_remote_task_id" and do all sorts of internal setup, and you will end up with the same object, but in an orderly fashion
Yes, even without any arguments given to Task.init(), it has everything from the server
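In other words, the recommended pattern is simply this (project/task names below are just placeholders):
from clearml import Task

# When executed by a clearml-agent, Task.init() picks up the remote task id by
# itself and returns the existing Task, with all the internal hooks set up.
task = Task.init(project_name="examples", task_name="my training")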
Thanks JitteryCoyote63 !
Any chance you want to open a GitHub issue with the exact details, or a fix with a PR ?
(I just want to make sure we fix it as soon as we can 🙂 )
tf datasets is able to handle batch downloading quite well.
SubstantialElk6 I was not aware of that; I was under the impression tf datasets are accessed at the file level, no?
Hi @<1657918706052763648:profile|SillyRobin38>
I have included some print statements
you should see those under the Task of the inference instance.
You can also do:
import clearml
...
def preprocess(...):
    clearml.Logger.current_logger().report_text(...)
    clearml.Logger.current_logger().report_scalar(...)
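For example, a slightly fuller sketch (the function body and argument names are illustrative; report_scalar() needs a title, series, value and iteration):
import clearml

def preprocess(body: dict) -> dict:
    logger = clearml.Logger.current_logger()
    logger.report_text("preprocess called")
    logger.report_scalar(title="inference", series="requests", value=1, iteration=0)
    return body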
, specifically within the containers where the inferencing occurs.
it might be that fastapi is capturing the prints...
https://github.com/tiangolo/uvicor...
Yes, there was a bug where it was always cached; just upgrade the clearml package: pip install git+
No worries, just wanted to make sure it doesn't slip away 🙂
But this will require some code changes...
Is trains-agent using docker-mode or virtual-env ?
Hmm, I guess we should state that better; I'll pass it on 🙂
UnevenDolphin73 something like this one?
https://github.com/allegroai/clearml/pull/225
IrritableJellyfish76 point taken, suggestions on improving the interface ?
using caching where specified but the pipeline page doesn't show anything at all.
What do you mean by " the pipeline page doesn't show anything at all."? are you running the pipeline ? how ?
Notice that PipelineDecorator.component needs to be top level, not nested inside the pipeline logic, like in the original example
@PipelineDecorator.component(
    cache=True,
    name=f'append_string_{x}',
)
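Something like this (a minimal sketch with made-up names), with the component defined at module level and only called from inside the pipeline logic:
from clearml.automation.controller import PipelineDecorator

@PipelineDecorator.component(cache=True, name='append_string')
def append_string(x: str) -> str:
    # defined at the top level of the module, not inside the pipeline function
    return x + '_suffix'

@PipelineDecorator.pipeline(name='example pipeline', project='examples', version='0.1')
def pipeline_logic():
    # the pipeline logic only calls the component, it does not define it
    print(append_string('hello'))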
Hi IrritableGiraffe81
You can access the model object with task.models['output']
To set the model metadata I would recommend making sure you have the latest clearml package; I think this is a relatively new addition
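A rough sketch of what that looks like (the task id is a placeholder, and set_metadata() assumes a recent clearml version):
from clearml import Task

task = Task.get_task(task_id="<your_task_id>")
model = task.models['output'][-1]               # the last reported output model
model.set_metadata("input_size", "3x224x224")   # key/value here are just examples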
Hi SubstantialElk6
where exactly in the log do you see the credentials ?
/tmp/.clearml_agent.234234e24s.cfg
What's the exact setup ? (I mean, are you using the glue? If that's the case, I think the temp config file is only created inside the pod/docker, so upon completion it will be deleted alongside the pod.)
IrritableJellyfish76 hmm, maybe we should add an extra argument partial_name_matching=False to maintain backwards compatibility?
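Something along these lines (illustrative only; partial_name_matching is the proposed argument from this thread, not an existing parameter):
from clearml import Task

tasks = Task.get_tasks(
    project_name="examples",
    task_name="my_exact_task_name",
    # partial_name_matching=False,  # proposed: require an exact match instead of partial matching
)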
What are you seeing in the Task that was cloned (i.e. the one the HPO created not the original training task)?
By that I mean the configuration section: do you have the Args there ? (seems like you do from the pic you attached, but I just want to make sure)
Also, in the train.py file, do you also have Task.init ?
This looks exactly like the timeout you are getting.
I'm just not sure what the difference is between the Model auto-upload and the manual upload.
When exactly are you getting this error ?
CheerfulGorilla72
yes, IP-based access,
hmm, so this is the main downside of using an IP-based server: the links (debug images, models, artifacts) store the full URL (e.g. http://IP:8081/... ). This means that if you switch IPs, they will no longer work. Any chance to fix the new server to the old IP?
(the other option is to somehow edit the DB with the links; I guess doable, but quite risky)
I'm running agent inside docker.
So this means venv mode...
Unfortunately, right now I can not attach the logs, I will attach them a little later.
No worries, feel free to DM them if you feel this is too much to post here
Nicely done DeterminedToad86 🙂
Wasn't this issue resolved by torch?
Hmm, not a bad idea 🙂
Could you please open a GitHub issue, so it will not get forgotten ?
(BTW: I'm not sure how trivial it is to implement, nonetheless it's obviously possible 🙂 )