Yeah the file system on those VMs is really slow
Cool. I found Logger.tensorboard_single_series_per_graph()
Sounds odd, I bet there is a way to make it work without an explicit logging statement. Is this with TF2 or PyTorch? Which trains version are you using?
CloudyHamster42 it will only affect new tasks created with the config file... sorry
That will be sdk.metrics.tensorboard_single_series_per_graph
same name == same path, assuming no upload is taking place? *just making sure
Should work on new tasks if you use this command in the script. If you'd rather keep the scripts as clean as possible, you can also configure it globally for all new tasks in trains.conf
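To make it concrete, here's a minimal sketch of both options, assuming the classmethod and the conf key keep the names mentioned above (double-check against your trains version):

```python
# in the script, before any TensorBoard logging happens
from trains import Logger

# assumption: the classmethod takes a single_series flag
Logger.tensorboard_single_series_per_graph(single_series=True)

# or globally, for all new tasks, in trains.conf:
#   sdk.metrics.tensorboard_single_series_per_graph: true
```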
I think there is a way, I'll have to check. BTW when you compare two tasks you do get separate graphs, right?
It's built in 🙂 and it's for... "Services"
https://github.com/allegroai/trains-server#trains-agent-services--
WackyRabbit7 It is conceptually different than actually training, etc.
The services agent is mostly one without a GPU; it runs several tasks, each in its own container, for example: the autoscaler, the orchestrators for our hyperparameter optimization and/or pipelines. I think it even uses the same hardware (by default?) as the trains-server.
Also, if I'm not mistaken some people are using it (planning to?) to push models to production.
I wonder if anyone else can share their view since this is a relati...
The log storage can be configured if you spin up your own clearml-server, but it won't have a repository structure. And it shouldn't, btw. If you need a secondary backup of everything, it is possible to set something up as well.
Sorry for being late to the party WearyLeopard29, if you want to see get_mutable_copy() in the wild you can check the last cell of this notebook:
https://github.com/abiller/events/blob/webinars/videos/the_clear_show/S02/E05/dataset_edit_00.ipynb
Or skip to 3:30 in this video:
ClearML Free or self-hosted?
isn't it better if it just updates the timestamp on the old models?
what a turn of events 🙂 so let's summarize again:
upkeep script:
- for each task, find out if there are several models created by it with the same name
- if so, make some log so that devops can erase the files
- DESTRUCTIVELY delete all the models from the trains-server that are in DRAFT mode, except the last one
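Roughly, as a hedged sketch (the APIClient filters and model fields here are my assumptions, please verify against your trains-server version before running anything destructive):

```python
from collections import defaultdict
from trains.backend_api.session.client import APIClient

client = APIClient()

# assumption: ready=False filters for models still in DRAFT mode
draft_models = client.models.get_all(ready=False)

# group the draft models by the task that created them and by model name
grouped = defaultdict(list)
for model in draft_models:
    grouped[(model.task, model.name)].append(model)

for (task_id, name), models in grouped.items():
    if len(models) < 2:
        continue
    models.sort(key=lambda m: m.created)  # oldest first, keep the last one
    for stale in models[:-1]:
        # log the file location so devops can erase the actual files
        print("stale draft model {} ({}): {}".format(stale.id, name, stale.uri))
        # DESTRUCTIVE: remove the model entry from the trains-server
        client.models.delete(model=stale.id)
```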
wait, I thought this is without upload
Hi Dan, please take a look at this answer, the webapp interface mimics this. Does this click for you?
Hmm, anything that -m will solve? https://docs.docker.com/config/containers/resource_constraints/
or is it a segfault inside the container because ulimit isn't set to -s unlimited?
then your devops can delete the data and then delete the models pointing to that data
The script runs and tries to register 4 models; each one of them is found at exactly the same path, but the size/timestamp is different. Then it will update the old 4 models with the new details and erase all the other fields.
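If we go that way, the core could look roughly like this sketch; models.get_all/edit/create are server API calls, and the exact uri filter and editable fields are assumptions on my side:

```python
from trains.backend_api.session.client import APIClient

client = APIClient()

def register_or_update(name, uri, task_id):
    # assumption: models.get_all accepts an exact-uri filter as a list
    existing = client.models.get_all(uri=[uri])
    if existing:
        # same path already registered: refresh the old entry instead of
        # creating a duplicate (size/timestamp live behind the uri)
        client.models.edit(model=existing[0].id, name=name)
    else:
        client.models.create(name=name, uri=uri, task=task_id)
```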
and should it work across your workspace, i.e. does it not matter if the task id changed? Just always keep a single model with the same filename? I'm really worried this could break a lot of the reproducible/repeatable flow for your process.
Hi, this really depends on what your organisation agrees is within MLOps control and what isn't. I think this blogpost is a must read:
https://laszlo.substack.com/p/mlops-vs-devops
and here is a list with an infinite amount of MLOps content:
https://github.com/visenger/awesome-mlops
Also, are you familiar with the wonderful MLOPS.community? The meetup and podcasts are magnificent (also look for me in their slack)
https://mlops.community/
Hi, it is under construction, but it is going to be there.
BattyLion34 we're here if you have more questions. Have you seen my recent webinar? https://youtu.be/Y5tPfUm9Ghg
SubstantialBaldeagle49
hopefully you can reuse the same code you used to render the images until now, just not inside a training loop. I would recommend against integrating with trains, but you can query the trains-server from any app; just make sure you serve it with the appropriate trains.conf and manage the security 🙂 You can even manage the visualization server from within trains using trains-agent. Open source is so much fun!
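Just to make the "query the trains-server from any app" part concrete, a hedged sketch that pulls a task's reported scalars with the regular SDK; the task id is a placeholder, and get_reported_scalars() may not exist in older trains versions:

```python
from trains import Task

# any app with the right trains.conf can do this; "<task-id>" is a placeholder
task = Task.get_task(task_id="<task-id>")

# assumption: get_reported_scalars() is available in your trains version
scalars = task.get_reported_scalars()
for graph_title, series in scalars.items():
    print(graph_title, "->", list(series.keys()))
```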
https://github.com/allegroai/trains/issues/193 for future reference (I will update later)
Hi BattyLion34 , could you clarify a little? If I understand correctly, you wish to use a code repository to store artifacts and ClearML logs?
I will dig around to see how all of this could be accomplished.
Right now I see it done in two ways:
- a function that you must remember to call each time, which would do the upkeep
- a script that you can run once in a while to do the cleanup

Which one would you prefer that I pursue?