Examples: query, "exact match", wildcard*, wild?ard, wild*rd
Fuzzy search: cake~ (finds cakes, bake)
Term boost: "red velvet"^4, chocolate^2
Field grouping: tags:(+work -"fun-stuff")
Escaping: Escape characters +-&|!(){}[]^"~*?:\ with \, e.g. \+
Range search: properties.timestamp:[1587729413488 TO *] (inclusive), properties.title:{A TO Z}(excluding A and Z)
Combinations: chocolate AND vanilla, chocolate OR vanilla, (chocolate OR vanilla) NOT "vanilla pudding"
Field search: properties.title:"The Title" AND text
Answered
I Seem To Be Missing Something ... I'Ve Only Got One Task Running To Train A Segmentation Model On My Local Machine, And In A Few Days It'S Hit Over 1.15M Api Calls. It Looks Like It'S Sending Every Single Console Output ... Are There Settings To Control

I seem to be missing something ... I've only got one task running to train a segmentation model on my local machine, and in a few days it's hit over 1.15M API calls. It looks like it's sending every single console output ... are there settings to control what gets logged? I only care about the results from each epoch. I don't need each line of the console posted up ( that's 99% of the API usage right there ). I can't find a way to prevent this and can see each line in the clearml console that's already in my terminal window ( each tick in the progress bar for each epoch seems to be an API call to post that local console output to clearml ). Any tips to stop console from getting sent?

  
  
Posted 10 months ago
Votes Newest

Answers 51


It'd be great if it just posted to clearml after each epoch is completed and the CSV with the results gets updated . I only care about using the dashboard to track completed progress . I can use my local computers terminal window to monitor current epoch training . No need to send that to clearml every second ;) Results once an hour or so is fine after each completes :)

  
  
Posted 10 months ago

Hi @<1572395184505753600:profile|GleamingSeagull15>
Try adjusting:
None
to 30 sec
It will reduce the number of log reports (i.e. API calls)

  
  
Posted 10 months ago

@<1572395184505753600:profile|GleamingSeagull15> see " Can I control what ClearML automatically logs? " in None (specifically the auto_connect_frameworks argument to Task.init() )

  
  
Posted 10 months ago

well, in my case, if I am trying to make sure I do not go over the allotted usage, it matters, as I am already hitting the ceiling and I have no idea what is pushing this volume of data

  
  
Posted 10 months ago

is number of calls performed, not what those calls were.

oh, yes this is just a measure of how many API calls are sent.
It does not really matter which ones

  
  
Posted 10 months ago

Since it's literally something we have to pay for ( which I signed up to do ) I would love to know what drives this cost

  
  
Posted 10 months ago

In case of scalars it is easy to see (maximum number of iterations is a good starting point

  
  
Posted 10 months ago

well from 2 to 30sec is a factor of 15, I think this is a good start 🙂

  
  
Posted 10 months ago

(Not sure it actually has that information)

  
  
Posted 10 months ago

might be a feature request then, as ya, having transparency into something we are charged for would be nice. At this point, I have zero idea what is driving this usage and just want to make sure the costs for training do not bloat too much. I personally am just using ClearML as a central dashboard for a few people. I don't need it to be live data, I just need a rough overview of progress. Even if it only posted updates to ClearML once an hour, that is honestly fine.

  
  
Posted 10 months ago

I appreciate your help @<1523701205467926528:profile|AgitatedDove14> 🙂

  
  
Posted 10 months ago

this one, right ? report_period_sec in ~/clearml.conf correct ?

  
  
Posted 10 months ago

FYI, found log_stdout in that same setting and default for that was true so set that to false so it would not log all stdout & stderr

  
  
Posted 10 months ago

Maybe ClearML is using tensorboard in ways that I can fine tune? I saw there was a manual way if you were not using tensorboard to send over data, but the videos I saw from your team used this solution when demoing YOLOv8 on YouTube ( there were a few collab videos your team did with theirs, so I just followed their instructions ). But my gut is telling me that might be the issue for the remaining data being sent over that I have no insight into.

  
  
Posted 10 months ago

I did notice that the last 24 hours I dropped quite a bit, so my theory that the 140K might have some spillover from previous day might have been correct. Last 24 hours went from 1.24M to 1.32M, so about half as much as the day before, with the same training running.

  
  
Posted 10 months ago

Literally all there is, ha ha
image

  
  
Posted 10 months ago

In future collab community videos and sample source for YoloV8, might be worthwhile to call that out as something folks might want to turn off unless they need it :) . Like I mentioned, I had no idea it was going to do that and sent your servers over 1.4M API hits unintentionally : (

  
  
Posted 10 months ago

Welp, it's been a day with the new settings, and stats went up 140K for API calls

... going to check again tomorrow to see if any of that was spill over from yesterday

140K calls a day, how often are you sending scalars ? how long is it running? how many experiments are running ?

  
  
Posted 10 months ago

I would love to be able to fine tune this as needed, but in my profile I only see a Billings & Usage, and it states at the top that "Usage data is updated once every day" ... and even then, all the shows under "Platform Usage" is number of calls performed, not what those calls were.

  
  
Posted 10 months ago

I'm not sure on the frequency it updates though

  
  
Posted 10 months ago

I think we're good now :) Appreciate the help !!!

  
  
Posted 10 months ago

So, might be in the minority here, but seems like capturing stdout and sending that over to clearml via API should be disabled by default. Like I get maybe capturing stderr, but stdout? In a training scenario, that's MILLIONS of API calls just in progress bar indicators, right? Like it might actually be better for the ClearML servers just in general to make the user turn that on if they want it, otherwise we're just blasting your servers. In my case, I did not even know it was sending that over until I got into digging where these API calls were coming from, and saw the CONSOLE tab in clearml that had every single line of stdout captured.
image

  
  
Posted 10 months ago

But I will try to set the reduce the number of log reports first

  
  
Posted 10 months ago

It was at 1.1M when I shut it down yesterday, and today it's at 1.24M

  
  
Posted 10 months ago

One single experiment using the code above. I have no idea how many scalars I am sending since as far as I can tell, I am not setting anything specific to define what I am sending over to ClearML, literally first time using YoloV8 or ClearML. Just using the super basic python to run.

  
  
Posted 10 months ago

I am running this on a 3090 GPU locally, just been letting it run for about two weeks now I think. Just have the one GPU, ha ha. It's at epoch 368 out of the 1,000 I have it set to cap out on ( if it does not hit the default YOLO "patience" limit of 50 before then and self terminate ).

  
  
Posted 10 months ago

Math checks out that if I was generating around 140K a day, and this had been running for 9 days, it had 1.2M when I caught it . So I think the next day after I shut it down I was seeing previous days numbers before shut down added . And another 24 hours it barely changed, so ya, it was 100% the stdout logging .

  
  
Posted 10 months ago

Ya, sorry, I meant that if you needed more info on what was being run, it was in that screenshot ( showed instances/epochs/batch size, etc ) . But yes, it's since been disabled .

  
  
Posted 10 months ago

@<1523701087100473344:profile|SuccessfulKoala55> You are my hero !!! This is EXACTLY what I needed !!!

  
  
Posted 10 months ago

My training is on roughly 50 classes as a subset of the Open Images Dataset for Segmentation

  
  
Posted 10 months ago