(Not sure it actually has that information)
I guess last followup question, is there a way to cap costs? Like if this is running at this scale, I am not sure I can use ClearML for my purpose if I am just going to get overage charged repeatedly ( which I am already looking like I will be doing ).
I guess last followup question, is there a way to cap costs?
Scale tier ? (I know it is not per usage, but it is probably more than 15$ per user 🙂 )
I think we're good now :) Appreciate the help !!!
Would love to just cap it at a fixed amount for a month for API calls.
Try the timeout configuration, I think this shoud solve all your issues, and will be fairly easy to set for everyone
I appreciate your help @<1523701205467926528:profile|AgitatedDove14> 🙂
Ya, sorry, I meant that if you needed more info on what was being run, it was in that screenshot ( showed instances/epochs/batch size, etc ) . But yes, it's since been disabled .
Is there a place in ClearML that shows Platform Usage? Like, what's actually taking up the API calls?
Math checks out that if I was generating around 140K a day, and this had been running for 9 days, it had 1.2M when I caught it . So I think the next day after I shut it down I was seeing previous days numbers before shut down added . And another 24 hours it barely changed, so ya, it was 100% the stdout logging .
Since it's literally something we have to pay for ( which I signed up to do ) I would love to know what drives this cost
FYI, found log_stdout
in that same setting and default for that was true
so set that to false
so it would not log all stdout & stderr
In case of scalars it is easy to see (maximum number of iterations is a good starting point
But I will try to set the reduce the number of log reports first
each epoch runs about 55 minutes, and that screenshot I posted earlier kind of show the logs for the rest of the info being output, if you wanted to check that out None
well from 2 to 30sec is a factor of 15, I think this is a good start 🙂
It was at 1.1M when I shut it down yesterday, and today it's at 1.24M
@<1523701087100473344:profile|SuccessfulKoala55> You are my hero !!! This is EXACTLY what I needed !!!
FYI, I did not even know to look into this until I logged in and saw that I was being throttled because I had hit my monthly limit with API calls ( on my very first use of your platform ), and my last dozen or so epochs were just not even logged ( also a bummer ). I only had that one model in training, and thought there was no way I sent over a million API requests, so had to figure out where those were coming from, and tracked it down to that STDOUT, and was like ... wait, what?!?! Found that console tab, which I did not even use before, and saw that screenshot I posted, and was like ... well, there's your problem, ha ha
Maybe ClearML is using tensorboard
in ways that I can fine tune? I saw there was a manual way if you were not using tensorboard
to send over data, but the videos I saw from your team used this solution when demoing YOLOv8 on YouTube ( there were a few collab videos your team did with theirs, so I just followed their instructions ). But my gut is telling me that might be the issue for the remaining data being sent over that I have no insight into.
If you do not have a lot of workers, that I would guess console outputs