Hi again CostlyOstrich36,
I just wanted to share what ended up working for me. Basically, I worked it out both for Hydra (thanks CurvedHedgehog15) and for PytorchLightningCLI.
So, for PL-CLI, I used this construct so we don't have to modify our training scripts based on our experiment tracker:
from pytorch_lightning.utilities.cli import LightningCLI
from clearml import Task

class MyCLI(LightningCLI):
    def before_instantiate_classes(self) -> None:
        # init the task
        tas...
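Since the snippet above is truncated, here is a minimal, self-contained sketch of what such a hook can look like. The project/task names and the connect_configuration call are illustrative assumptions, not the original code:

```python
from pytorch_lightning.utilities.cli import LightningCLI
from clearml import Task


class MyCLI(LightningCLI):
    def before_instantiate_classes(self) -> None:
        # Initialize the ClearML task before Lightning instantiates the
        # model/datamodule, so tracking starts without the training script
        # needing to know which experiment tracker is in use.
        task = Task.init(
            project_name="my-project",   # illustrative placeholder
            task_name="vae-baseline",    # illustrative placeholder
        )
        # Optionally log the parsed CLI config to the ClearML UI
        # (assumes jsonargparse's Namespace.as_dict()).
        task.connect_configuration(self.config.as_dict())
```

Because the hook runs before the model and datamodule are instantiated, the same training script works unchanged whether or not ClearML is the tracker in use.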
@<1590514584836378624:profile|AmiableSeaturtle81> that's the service we are using :-)
How much RAM have you assigned to your elastic service?
Perfect! Thanks SuccessfulKoala55, that would be an acceptable workaround until setup_upload also supports Azure 🙂
@<1576381444509405184:profile|ManiacalLizard2> what happens when ES hits the limit? Does it go OOM, or does loading scalars just take a long time in the web UI? And what about tasks putting scalars in the index?
This is an example of the console output of a task aborted via the web UI:
Epoch 1/29 ━╸━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 699/16945 0:04:53 • 1:55:25 2.35it/s v_num: 0.000
2024-09-16 12:52:57,263 - clearml.Task - WARNING - ### TASK STOPPING - USER ABORTED - LAUNCHING CALLBACK (timeout 30.0 sec) ###
[2024-09-16 12:52:57,284][core.callbacks.model_checkpoint][INFO] - Marking task as `in_progress`
[2024-09-16 12:52:57,309][core.callbacks.model_checkpoint][INFO] - Saving last checkpoint...
Hi CostlyOstrich36
What I'm seeing is expected behavior:
In my toy example, I have a VAE that is defined by a YAML config file and parsed with the PyTorch Lightning CLI. Part of the config defines the latent dimension (n_latents) and the number of input channels of the decoder (in_channels). These two values need to be the same. When I just use the Lightning CLI, I can use variable interpolation with OmegaConf like this:
class_path: mymodel.VAE
init_args:
  {...}
  bottleneck:
    class_pat...
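As a side note, here is a tiny runnable sketch of the OmegaConf interpolation being described; the keys are illustrative, not the actual config:

```python
from omegaconf import OmegaConf

# decoder.in_channels must match n_latents, so instead of duplicating
# the value we reference it with OmegaConf's ${...} interpolation.
cfg = OmegaConf.create(
    """
    model:
      n_latents: 16
      decoder:
        in_channels: ${model.n_latents}
    """
)

print(cfg.model.decoder.in_channels)  # -> 16, resolved from n_latents
```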
We are running the latest version (WebApp: 1.7.0-232 • Server: 1.7.0-232 • API: 2.21).
When I run `docker logs clearml-elastic` I get lots of logs like this one:
{"type": "server", "timestamp": "2022-10-24T08:51:35,003Z", "level": "INFO", "component": "o.e.i.g.DatabaseNodeService", "cluster.name": "clearml", "node
.name": "clearml", "message": "successfully reloaded changed geoip database file [/tmp/elasticsearch-3596639242536548410/geoip-databases/cX7aMqJ4SwCxqM7s
YM-S9Q/GeoLite2-City.mmdb]...
Hi @<1523701070390366208:profile|CostlyOstrich36> , yeah we figured as much. Is there a setting in the server that limits logging - or disables it completely?
Hi @<1523701087100473344:profile|SuccessfulKoala55>, thanks for responding. I've found out that my first error came from cloning a super old version of the cleanup task in the web UI 🙂
I don't know about the other error. To me it looks like the task gets deleted before errors are handled: since an error occurred (some 404 stuff; maybe the files actually aren't there) when deleting some artifacts on the task, ClearML tries to reload the task and fails with the 400/201 or 400/101. ...
It's actually complementary - the SDK will use the clearml.conf configuration by matching that configuration with the destination you provided
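To illustrate the matching, a sketch under assumptions (the local file, the container path, and the exact Azure URL format are placeholders):

```python
from clearml import StorageManager

# No credentials in code: the azure:// destination is matched against
# the corresponding storage section in clearml.conf.
url = StorageManager.upload_file(
    local_file="model.pkl",  # placeholder local file
    remote_url="azure://account.blob.core.windows.net/container/model.pkl",  # placeholder
)
print(url)
```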
Would you recommend doing both then? :-)
Sure. Really, I'm just using the default client:
# ClearML SDK configuration file
api {
    web_server: http://server.azure.com:8080
    api_server: http://server.azure.com:8008
    files_server: http://server.azure.com:8081
    credentials {
        "access_key" = "..."
        "secret_key" = "..."
    }
}
sdk {
    # ClearML - default SDK configuration
    storage {
        cache {
            # Defaults to system temp folder / cache
            default_base_dir: "~/.clearml/c...
I've tried setting the output_uri on Task.init, but that seems to only affect model checkpoints and artifacts
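A possible explanation, as a hedged sketch: output_uri governs model checkpoints and artifacts, while debug samples (media) follow the logger's own upload destination, which can be set separately. All destinations below are placeholders:

```python
from clearml import Task

task = Task.init(
    project_name="my-project",   # placeholder
    task_name="my-experiment",   # placeholder
    # Affects model checkpoints and artifacts:
    output_uri="azure://account.blob.core.windows.net/container/models",  # placeholder
)

# Debug samples (images/audio) are uploaded by the logger, whose
# destination is configured independently:
task.get_logger().set_default_upload_destination(
    "azure://account.blob.core.windows.net/container/media"  # placeholder
)
```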
SuccessfulKoala55 Thanks for the help. I've setup my client to use my blob storage now, and it works wonderfully.
I've also added a token to my server, so now I can access the audio samples from the server.
Is there a way to add a common token server-side so the other members of the team don't have to create a token?
I also struggle a bit with report_matplotlib_figure(), where the plots do not appear in the web UI. I have implemented the following snippet in my PyTorch Lightning logger:
@...
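Since the snippet above is cut off, here is a minimal hedged example of report_matplotlib_figure() itself (titles, series, and values are illustrative placeholders):

```python
import matplotlib.pyplot as plt
from clearml import Logger

fig = plt.figure()
plt.plot([0, 1, 2], [10, 20, 15])

Logger.current_logger().report_matplotlib_figure(
    title="reconstruction",  # placeholder
    series="val",            # placeholder
    iteration=0,
    figure=fig,
    # With report_image=True the figure is uploaded as a debug sample
    # image instead of being converted to an interactive plot.
    report_image=False,
)
```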
Do you mean to the Web UI?
Yes, that's what I meant, sorry, I'm still coming to terms with ClearML terminology 🙂. Is it possible to store the web app cloud access token server-side so we don't have to input it in the Web UI? 🙂
How does it look in the Web UI?
I just had a look, and they are visible under debug samples, but not under plots, as I had expected.
I thought that by using report_matplotlib_figure it would get grouped under plots? 🙂
Hey SweetBadger76 , thanks for answering. I'll check it out! Does that correspond to filling out azure.storage in the clearml.conf file?
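For reference, the Azure section of clearml.conf looks roughly like this; the account/key/container values are placeholders, so check the reference configuration for your SDK version:

```
sdk {
    azure.storage {
        containers: [
            {
                account_name: "myaccount"       # placeholder
                account_key: "mykey"            # placeholder
                container_name: "mycontainer"   # placeholder
            }
        ]
    }
}
```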
And how do I ensure that the server can access the files from the blob storage?
The server will never access the storage - only the clients (SDK/WebApp etc.) will access it
Oh okay. So that's the reason I can access media when the client and server are running on the same machine?
On the server or the client? :)
@<1590514584836378624:profile|AmiableSeaturtle81> this was last time i tried: https://clearml.slack.com/archives/CTK20V944/p1725534932820309
None for visibility