
Are you doing imshow or savefig? Is this the matplotlib OOP interface or the original plt.subplot style? Any warning message relevant to this?
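In the meantime, if the automatic matplotlib capture keeps missing it, here is a minimal sketch of reporting the figure explicitly (assuming a recent clearml SDK; project/task names and titles are placeholders):

```python
import matplotlib.pyplot as plt
from clearml import Task

task = Task.init(project_name="examples", task_name="matplotlib debug")  # placeholder names

fig, ax = plt.subplots()          # OOP interface
ax.imshow([[0, 1], [1, 0]])

# explicit report, in case the automatic capture of imshow/savefig misses it
task.get_logger().report_matplotlib_figure(
    title="debug", series="imshow", figure=fig, iteration=0
)
fig.savefig("debug.png")
```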
Rather unfortunate that such a vendor would risk such an inaccurate comparison...
Hi ManiacalPuppy53, glad to see that you got past the MongoDB problem by moving up to a Pi 4 🙂
It should all work. Again - is it much slower than without Trains?
HugePelican43, as AgitatedDove14 says, that's a slippery slope to out-of-memory land. If you have an Nvidia A100 you can run multiple agents in MIG mode - sort of like containerized hardware, if you haven't heard of it.
Other than that I don't recommend it. Max out utilization for each task instead.
Hi, I was just answering your previous question. Can you explain a bit what you mean by "under-utilized"? E.g. do you have 2 GPUs and are only using one of them for a task?
Or are you maxing out resources but not getting to 100% utilization (which might be a data pipeline issue)?
I guess the product offering is not so clear yet (pun intended). The self-deployed option is completely free and open source; the enterprise offering is something entirely different:
https://clear.ml/pricing/
The "feature store" you see in the free tier is what I am alluding to
Hi Elron, I think the easiest way is to print the results of !nvidia-smi, or use the framework interface to get these and log them as a ClearML artifact. For example -
https://pytorch.org/docs/stable/cuda.html
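Something along these lines, as a minimal sketch (assuming a recent clearml SDK and PyTorch as the framework; project/task names are placeholders):

```python
import subprocess

import torch
from clearml import Task

task = Task.init(project_name="examples", task_name="gpu report")  # placeholder names

# raw nvidia-smi output
smi = subprocess.run(["nvidia-smi"], capture_output=True, text=True).stdout

# the same information via the framework interface
gpus = {
    i: {
        "name": torch.cuda.get_device_name(i),
        "total_memory_gb": torch.cuda.get_device_properties(i).total_memory / 1e9,
    }
    for i in range(torch.cuda.device_count())
}

task.upload_artifact(name="nvidia_smi", artifact_object=smi)
task.upload_artifact(name="gpu_info", artifact_object=gpus)
```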
First of all I'd like to thank you for pointing out that our messaging is confusing. We'll fix that.
To the point: a nice interface and optimized DB access for feature stores are part of our paid enterprise offering.
Managing data/features as part of your pipelines and getting version-controlled features == offline feature store
The latter is doable with the current ClearML open source tools, and I intend to show it very soon. But right now you won't have a different pane for DataOps, it'll...
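For the "doable with the current open source tools" part, a minimal sketch of version-controlled features using clearml Datasets (assuming a recent clearml SDK; names and paths are placeholders):

```python
from clearml import Dataset

# register a versioned snapshot of pre-computed features
ds = Dataset.create(dataset_name="user_features", dataset_project="feature-store")
ds.add_files("features/")   # local folder with the feature files, e.g. parquet
ds.upload()
ds.finalize()

# later, any task can pull exactly that version
features_dir = Dataset.get(
    dataset_name="user_features", dataset_project="feature-store"
).get_local_copy()
```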
So basically export a web app view as CSV?
Hi SubstantialElk6, have a look at Task.execute_remotely - it's exactly for that. For instance, in the recent webinar I used pytorch-cpu on my laptop together with task.execute_remotely, and the agent automatically installed the GPU version. Example: https://github.com/abiller/events/blob/webinars/webinars/flower_detection_rnd/A1_dataset_input.py
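A minimal sketch of that flow (assuming a recent clearml SDK; project/task names and the queue name are placeholders):

```python
from clearml import Task

task = Task.init(project_name="examples", task_name="remote run")  # placeholder names

# everything above this line runs locally, e.g. with pytorch-cpu installed;
# the call below stops local execution and re-launches the task on an agent
task.execute_remotely(queue_name="default", exit_process=True)

# from here on the code runs on the agent, which resolves the requirements
# for its own environment (e.g. the GPU build of PyTorch)
import torch
print(torch.cuda.is_available())
```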
OddAlligator72 I think you got sidetracked into the wrong corner here. Let's decompose what you are asking for - please tell me if I am getting somewhere near what you mean:
- you have an experiment you already ran
- you want to change the parameters in it and run it again (see the sketch right below)
- if possible, you only want to run a single function in the file attached to that experiment
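For the first two points, a minimal sketch (assuming a recent clearml SDK; the task ID, parameter name, and queue are placeholders):

```python
from clearml import Task

# clone the experiment you already ran, override a parameter, and enqueue it
source = Task.get_task(task_id="<original_task_id>")        # placeholder ID
cloned = Task.clone(source_task=source, name="re-run with new params")
cloned.set_parameters({"General/learning_rate": 0.001})     # placeholder parameter
Task.enqueue(cloned, queue_name="default")                  # placeholder queue
```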
OddAlligator72 can you link to the wandb docs? It looks like you want a custom entry point. I'm thinking "maybe", but probably the answer is that we do it a little differently here.
Thanks @<1523701205467926528:profile|AgitatedDove14>, also I think you're missing a few pronouns there 🙂
Hi, it is under construction, but it is going to be there.
Fine. Can I open a feature request on our GitHub for you, referencing this conversation?
Then your DevOps team can delete the data and then delete the models pointing to that data.
WickedGoat98 I gave you a slight Twitter push 🙂 If I were you I would make sure that the app credentials you put in your screenshot are revoked 🙂
same name == same path, assuming no upload is taking place? *just making sure
And should it work across your workspace, i.e. regardless of whether the task ID changed - just always keep a single model with the same filename? I'm really worried this could break a lot of the reproducible/repeatable flow of your process.
if you want something that could work in either case, then maybe the second option is better
with upload I would strongly recommend against doing this
The script runs and tries to register 4 models; each one is found at exactly the same path, but the size/timestamp is different. It will then update the old 4 models with the new details and erase all the other fields.
So what I am describing is exactly this: once you try to create an output model from the same task and the name already exists, do not create a new model - just update the timestamp on the old one.
Wait, I thought this was without upload.
What a turn of events 🙂 So let's summarize again:
Upkeep script (rough sketch below):
- for each task, find out if there are several models created by it with the same name
- if so, log them somewhere so that DevOps can erase the files
- DESTRUCTIVELY delete all the models from the trains-server that are in DRAFT mode, except the last one
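A rough sketch of what that upkeep script could look like (assuming a recent clearml SDK; the project name is a placeholder, and both the ordering and the "DRAFT only" condition should be verified against your server before running the destructive part):

```python
from collections import defaultdict

from clearml import Task
from clearml.backend_api.session.client import APIClient

client = APIClient()  # direct access to the trains/clearml server REST API

for task in Task.get_tasks(project_name="my-project"):  # placeholder project name
    by_name = defaultdict(list)
    for model in task.get_models().get("output", []):
        by_name[model.name].append(model)

    for name, models in by_name.items():
        if len(models) < 2:
            continue
        # log the duplicates so devops can erase the underlying files
        print(f"task {task.id}: {len(models)} output models named '{name}'")
        for m in models:
            print(f"  {m.id} -> {m.url}")
        # DESTRUCTIVE: remove every entry except the last one registered.
        # NOTE: also verify each model is still in DRAFT (unpublished) state
        # before deleting, per the summary above.
        for m in models[:-1]:
            client.models.delete(model=m.id)
```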
https://github.com/allegroai/trains/issues/193 for future reference (I will update later)