
Hi ScaryBluewhale66 , I believe the new server that's about to be released soon (this / next week) will allow you to report a "single value metric". So if you want to report just a number per experiment you can, and then you can also compare between runs.
report_scalar() with a constant iteration is a hack that you can use in the meantime 🙂
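For example, a minimal sketch of that workaround (project / metric names here are just placeholders):
```
from clearml import Task

task = Task.init(project_name="examples", task_name="single value report")  # placeholder names

# Pin the "single value" to a constant iteration (e.g. 0), so each run reports
# exactly one point per metric and runs can be compared against each other
task.get_logger().report_scalar(
    title="summary", series="final_accuracy", value=0.92, iteration=0
)
```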
Not sure I follow your suggestion 🙂
This is how my code comparison looks; it's OK because I see the tags:
If you can open a git issue to help tracking and improve visibility, that'll be very awesome!
Let me circle this back to the UI folks and see if I can get some sort of date attached to this 🙂
That's how I see the scalar comparison, no idea which is the "good" and which is the "bad"
You mean add some list on top of all experiments with tags and their ID?
Yeah, that makes lots of sense!
JitteryCoyote63 Fair point 😅 , I'd be lying if I said we haven't been slow on documenting new features 🙂 That being said, since what you're looking for seems REALLY straightforward (at least to people who know how it works internally 😛 ) we can probably do something about it rather quickly 🙂
As for your question, yes, our effort was diverted into other avenues and not a lot of public progress has been made.
That said, what is your plan for integrating the tools? Automatically promote models to be served from within ClearML?
Hi Jevgeni! September is always a slow month in Israel as it's holiday season 🙂 So progress is slower than usual and we didn't have an update!
Next week will be the next community talk and publishing of the next version of the roadmap, a separate message will follow
FiercePenguin76 Thanks! That's great input! If you're around tomorrow, feel free to ask us questions in our community talk! We'd be happy to discuss 😄
AFAIK max spinup time is the max lifetime of an agent (busy or idle), and max idle is the maximum time an agent is allowed to stay idle
Hi FierceHamster54 can you try another instance type? I just tried with n1 and it works. We are looking to see if it's instance type related
Hi ZanyPig66 ,
I assume you're using torch.save() to save your model? A good place to start is David's suggestion of specifying output_uri=True in the Task.init() call.
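Something like this, assuming the defaults (project / task names are placeholders):
```
from clearml import Task

# output_uri=True uploads model snapshots (e.g. whatever torch.save() writes)
# to the files server, instead of only recording a local path
task = Task.init(
    project_name="examples",           # placeholder
    task_name="training with upload",  # placeholder
    output_uri=True,
)
```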
ZanyPig66 , the 2 agents can run from the same ubuntu account and use the same clearml.conf. If you want each to have its own configuration file, just add --config-file PATH_TO_CONF_FILE and it will use that config file instead. Makes sense?
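For example (queue names and the path are hypothetical):
```
# first agent uses the default ~/clearml.conf
clearml-agent daemon --queue queue_a

# second agent gets its own configuration file
clearml-agent daemon --queue queue_b --config-file /path/to/second_clearml.conf
```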
If these indices tend to grow large, I think it would be cool if there was a flag that periodically removed them. Probably a lot of users aren't aware that these take up so much space.
JitteryParrot8 in the new SDK we'll have dataset.add_description() which will do the same as what KindChimpanzee37 provided, but with a nicer interface 😄
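Per the above, the upcoming interface should look roughly like this sketch (method name taken from this message; dataset names are placeholders):
```
from clearml import Dataset

ds = Dataset.get(dataset_project="my_project", dataset_name="my_dataset")  # placeholders
ds.add_description("Preprocessed training images, v2")  # upcoming SDK interface, per above
```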
Hmm... My thoughts drift towards the ending of each scalar series, which at the moment is the beginning of the Task ID (which probably doesn't tell you much). What if we replace that with the tags? BTW, in your use case, do you have one tag that differs, or multiple?
Hi Jax, I'm working on a few more examples of how to use clearml-data. should be released in a few weeks (with some other documentation updates). These however don't include the use case you're talking about. Would you care to elaborate more on that? Are you looking to store the code that created the data, in the execution part of the task that saves the data itself?
Hi GentleSwallow91 let me try and answer your questions 😄
The serving service controller is basically the main Task that controls the serving functionality itself. AFAIK:
- clearml-serving-alertmanager - a container that runs the Alertmanager by Prometheus ( https://prometheus.io/docs/alerting/latest/alertmanager/ )
- clearml-serving-inference - the container that runs the inference code
- clearml-serving-statistics - I believe it runs the software that reports to the Prometheus reporting ...
Hi Mathis, actually, we fixed this in our latest SDK! You can use Task.query_tasks() and you'll get the IDs of all the tasks that match the query. The reason we don't return the task objects themselves is that the response can be quite large and can take a long time.
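A quick sketch (project / tag names are placeholders):
```
from clearml import Task

# Returns only the matching task IDs, which keeps the query fast and light
task_ids = Task.query_tasks(project_name="examples", tags=["best"])  # placeholders

# Fetch the full Task object only for the ones you actually need
for task_id in task_ids:
    task = Task.get_task(task_id=task_id)
```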
Hi, what we do is indeed override the configuration: instead of taking the info from the file, we take the input from the UI. If you're using the content of the file directly in other places, you might end up with different results.
If you can share a short code snippet showing how you're reading / tracking the file, I can better help and suggest what can be done!
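In the meantime, this is the pattern I'd expect, assuming you go through task.connect_configuration() (file name is a placeholder):
```
from clearml import Task

task = Task.init(project_name="examples", task_name="config demo")  # placeholders

# Running locally this returns the original path; when executed remotely it
# returns a path to a file holding the (possibly edited) contents from the UI
config_path = task.connect_configuration("model_config.yaml", name="model config")

with open(config_path) as f:
    config = f.read()  # always read via the returned path, not the original file
```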
To add onto what Martin wrote, you can see how it's interfaced with a torch DataLoader here: https://clear.ml/docs/latest/docs/guides/data%20management/data_man_cifar_classification
You only replace the path the files are loaded from.
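Roughly like this, following that example (dataset name / project are placeholders):
```
from clearml import Dataset
from torch.utils.data import DataLoader
from torchvision import datasets, transforms

# Fetch (or reuse a cached) local copy of the dataset
dataset_path = Dataset.get(
    dataset_project="dataset_examples", dataset_name="cifar_dataset"  # placeholders
).get_local_copy()

# Point torchvision at the local copy instead of re-downloading the data
trainset = datasets.CIFAR10(
    root=dataset_path, train=True, download=False,
    transform=transforms.ToTensor(),
)
trainloader = DataLoader(trainset, batch_size=4, shuffle=True, num_workers=2)
```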
We'll check this. I assume we either don't catch the error somehow, or the process doesn't indicate that it failed.
We can't officially confirm nor deny this but yes :sleuth_or_spy:
Hi JitteryParrot8
Do you mean a Task? If you create a dataset with ClearML Data, the Task's icon will indicate it's a dataset task. Same goes for an experiment. You are in luck 🙂 The new SDK (which is about to be released any day now) will log the dataset used every time you do Dataset.get().
Regardless, we are in the final execution phases of a major overhaul to the dataset UI, so stay tuned for our next server release, which will hopefully make your life easier 😄
If the spot instance is taken from you then yes, it will be (unless there's some drive persistence).