The math checks out: if I was generating around 140K calls a day and this had been running for 9 days (140K × 9 ≈ 1.26M), it was at about 1.2M when I caught it. So I think the day after I shut it down I was still seeing the previous day's numbers being added, and another 24 hours later it barely changed. So yeah, it was 100% the stdout logging.
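In case it helps anyone else, a minimal sketch of turning that capture off (assuming a clearml SDK recent enough to accept auto_connect_streams in Task.init; the project/task names here are made up):

```python
from clearml import Task

# Sketch: stop streaming stdout/stderr (progress bars, per-batch prints) to the
# server as log events, which is what was driving the API call count here.
task = Task.init(
    project_name="YOLOv8-OpenImages-Seg",  # hypothetical project name
    task_name="3090-local-run",            # hypothetical task name
    auto_connect_streams=False,            # or a dict such as {"stdout": False, "stderr": True, "logging": True}
)
```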
Would love to just cap API calls at a fixed amount per month.
Try the timeout configuration; I think this should solve all your issues, and it will be fairly easy to set for everyone.
I am running this on a 3090 GPU locally, just been letting it run for about two weeks now I think. Just have the one GPU, ha ha. It's at epoch 368 out of the 1,000 I have it set to cap out on (if it does not hit the default YOLO "patience" limit of 50 before then and self-terminate).
My training is on roughly 50 classes as a subset of the Open Images Dataset for Segmentation
Each epoch runs about 55 minutes, and the screenshot I posted earlier kind of shows the logs for the rest of the info being output, if you wanted to check that out: None
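For context, the run itself is more or less the stock Ultralytics setup; a minimal sketch (the model size and dataset YAML name are placeholders, not my actual files):

```python
from ultralytics import YOLO

# Sketch of the training run described above: YOLOv8 segmentation on a
# ~50-class Open Images subset, capped at 1,000 epochs with patience 50.
model = YOLO("yolov8m-seg.pt")             # placeholder model/weights
model.train(
    data="open-images-50cls-seg.yaml",     # hypothetical dataset YAML
    epochs=1000,                           # hard cap mentioned above
    patience=50,                           # default YOLO early-stopping patience
    device=0,                              # the single local 3090
)
```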
I would love to be able to fine-tune this as needed, but in my profile I only see a Billing & Usage page, and it states at the top that "Usage data is updated once every day" ... and even then, all that shows under "Platform Usage" is the number of calls performed, not what those calls were.
oh, yes this is just a measure of how many API calls are sent.
It does not really matter which ones
I'm not sure how frequently it updates, though.
I guess one last follow-up question: is there a way to cap costs?
Scale tier? (I know it is not per-usage, but it is probably more than $15 per user 🙂)
Actually, looking at the counts today, they've barely changed. So I think this actually fixed it, and it was just that the counts are only updated daily, so I needed to be 48 hours out from when I made the change to see clean results and be sure there were no spillover counts from previous days.
Under your profile you should be able to see it
Maybe ClearML is using TensorBoard in ways that I can fine-tune? I saw there was a manual way to send over data if you were not using TensorBoard, but the videos I saw from your team used this solution when demoing YOLOv8 on YouTube (there were a few collaboration videos your team did with theirs, so I just followed their instructions). But my gut is telling me that might be the source of the remaining data being sent over that I have no insight into.
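For reference, my understanding of the manual (non-TensorBoard) reporting path is roughly the sketch below; the project/task/metric names are made up, and it only covers scalars:

```python
from clearml import Task

task = Task.init(
    project_name="YOLOv8-OpenImages-Seg",  # hypothetical project name
    task_name="manual-reporting-sketch",   # hypothetical task name
)
logger = task.get_logger()

# Report only the values you care about, at whatever cadence you choose,
# instead of relying on automatic TensorBoard capture.
for epoch in range(3):
    logger.report_scalar(title="loss", series="train", value=1.0 / (epoch + 1), iteration=epoch)
```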
But I will try reducing the number of log reports first.
It was at 1.1M when I shut it down yesterday, and today it's at 1.24M
This one, right? report_period_sec
in ~/clearml.conf
correct?
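If it helps, my understanding of where that lives (an assumption about the default config layout, worth checking against your own ~/clearml.conf):

```
# ~/clearml.conf -- sketch, only the relevant nesting shown
sdk {
    development {
        worker {
            # status/report period in seconds; raising it means fewer,
            # larger reports and therefore fewer API calls
            report_period_sec: 30.0
        }
    }
}
```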
@<1572395184505753600:profile|GleamingSeagull15> see "Can I control what ClearML automatically logs?" in None (specifically the auto_connect_frameworks argument to Task.init())
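A rough sketch of what that looks like (the framework keys shown are common ones; the exact supported keys depend on the clearml version):

```python
from clearml import Task

# Sketch: pass a dict instead of the default True to limit what ClearML auto-logs.
# Anything set to False is skipped by the automatic framework bindings.
task = Task.init(
    project_name="YOLOv8-OpenImages-Seg",  # hypothetical project name
    task_name="reduced-auto-logging",      # hypothetical task name
    auto_connect_frameworks={
        "tensorboard": False,  # stop capturing TensorBoard scalars/images
        "matplotlib": False,   # stop capturing matplotlib figures
        "pytorch": True,       # keep logging model checkpoints
    },
)
```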
I did notice that in the last 24 hours it dropped quite a bit, so my theory that the 140K might have included some spillover from the previous day might have been correct. The last 24 hours went from 1.24M to 1.32M, so about half as much as the day before, with the same training running.
If you do not have a lot of workers, then my guess would be console outputs.
I appreciate your help @<1523701205467926528:profile|AgitatedDove14> 🙂