clearml-deploy is clearml-serving but with other parts more intertwined, such as CI/CD prompts/callbacks. If you think of clearml-deploy as having had a bit more love given to it, I believe that will put you on the right track, but at its core, it's the same idea, Sir.
the hyper datasets have always been there in the enterprise offering. They allow you to query datasets and perform operations such as updating labels on an image without re-batching the whole dataset. I think we are trying to find a way to bring this to...
can you show me the complete output from 'docker-compose ps' please ? 🙂
Hey there *waves*
Not sure about plans to automate this in the future, as this is more how docker behaves and not really clearml, especially with the overlay2 filesystem. The biggest offender is usually your JSON log files. Have a look in /var/lib/docker/containers/ for *.log files
assuming this IS the case, you can tell docker to only log up to a max-size .. I have mine set to 100m or some such
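for reference, a minimal sketch of what that looks like in /etc/docker/daemon.json (100m is just the value I happen to use, adjust to taste).. note that you need to restart the docker daemon, and it only applies to containers created after the change:

```json
{
  "log-driver": "json-file",
  "log-opts": {
    "max-size": "100m",
    "max-file": "3"
  }
}
```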
Evening Geoffrey, sorry for getting to this later in the day, I have been rather swamped today. All good though, all good.
What you raise is a good question. A very good question.
One of the things that we have been thinking of around these parts is doing deep dives and interviews with users on how they came to ClearML, what setup they are using, and key technologies and languages. In short, a sort of interview series which will lead to a recipe book in the spirit of cooking (I would say CookBook but I th...
@<1687643893996195840:profile|RoundCat60> you set it once, inside the docker-compose itself.. it will affect all docker containers but, to be honest, docker tends to log everything
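as a rough sketch, the per-service version in the docker-compose.yml would look something like this (the service name here is just a placeholder):

```yaml
services:
  apiserver:
    logging:
      driver: json-file
      options:
        max-size: "100m"
        max-file: "3"
```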
there is a --docker flag for clearml-agent that will build containers :)
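something like this, assuming a queue named 'default' (the image name and task id are placeholders):

```bash
# run queued tasks inside docker containers
clearml-agent daemon --queue default --docker

# or bake a specific task into its own docker image
clearml-agent build --id <task-id> --docker --target my-task-image
```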
Ohhh... that makes sense.. use best of breed in areas where we don't overlap.
I would think that a combination of kubernetes (I believe the preferred way to support multiple users at once, but open to being wrong) and individual queues is probably the solution here.
for example, in kubernetes you could set up an agent to listen to bob-queue and another agent to listen to alice-queue. In the kubernetes dashboard you could assign a certain amount of cpu/memory and, if using taints, gpu or not.
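as a very rough sketch, one of those per-user agents could look something like this as a plain k8s Deployment.. the names, image and resource numbers here are all just assumptions on my part, not an official chart:

```yaml
apiVersion: apps/v1
kind: Deployment
metadata:
  name: clearml-agent-bob
spec:
  replicas: 1
  selector:
    matchLabels:
      app: clearml-agent-bob
  template:
    metadata:
      labels:
        app: clearml-agent-bob
    spec:
      containers:
        - name: agent
          image: allegroai/clearml-agent:latest
          # listen only to bob's queue; server credentials via
          # CLEARML_API_* env vars omitted for brevity
          args: ["daemon", "--queue", "bob-queue"]
          resources:
            limits:
              cpu: "4"
              memory: 8Gi
```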
one last tiny thing TrickySheep9 .. please do let us know how you get on, good or bad.. and if you bump into anything unexpected then please do scream and let us know 🙂
Hello.. I don't think so. Codes of ethics can obviously vary from one job to another, and of course, so can legal compliance. You obviously have something very specific in mind; if you can expand on what you are looking for specifically, we may be able to help.
usually though, generally speaking, a tool's ethics and legality are set by the business side - not really something the software would enforce on you. I hope I understand your question.
the part that I am concerned about is that in the first pair of graphs you showed, the datasets (even just from looking at them) are very different 😕
aaahhh.. I will wager good money Sir that you are then using ipython in vscode which is probably trying to do something "fancy" with the interpreter
Understood. Are you comfortable with docker? If so, I would probably suggest doing a docker run -it <identifier> bash and seeing if that folder does, indeed, exist in the docker image
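i.e. something like this (the image identifier and folder path are whatever yours are):

```bash
docker run -it --rm <identifier> bash
# then, inside the container:
ls -la /path/to/the/folder
```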
There is already a pre-built AMI fwiw 🙂
that is one of the things I am working away on, even as we speak! If you have any items that you want to see sooner rather than later, please let me know 👍
I am guessing it could be but.. I don't feel that k8s is clearml-session's main focus/push
hhrrmm.. in the initial problem, you mentioned that /var/lib/docker/overlay2 was growing large in size.. but 4GB seems "fine" for docker images.. I wonder.. does your nvme0n1p1 ever report something like 85% or 90% used, or do you think that the 4GB is a lot? When you restart the server, does the % used noticeably drop? That would suggest tmp files inside the docker image itself, which.. is possible with docker (weird, but possible)
clearml-agent does have a build flag, install-globally.. that may get you where you want to go
since this is an enterprise machine, and you don't have sudo/root, I am wondering if there are already other docker networks/compose setups running/in use
if you see it in the community server, then I believe the answer is "yes" - although don't hold me accountable on this 😄
I know the storage can be swapped out to using S3 (obviously)
do you have code that you can share ?
Hey Manoj, I am not sure how clearml-session would know how to setup kube-proxy, if that's your intent.
Personally, I would run the clearml-server and agents on the k8s cluster, and then expose the server endpoints via kube proxy or some other nicer ingress. Then you can run jupyter locally and you should be good. Jupyter session remotely running on k8s would be a logistical nightmare 🙂
The way I read that is: if you have exposed your clearml-server via a k8s ingress, then you can, from outside the k8s cluster, tell clearml-session that this is the k8s ingress/ip
Howdy and Morning @<1687643893996195840:profile|RoundCat60> .. docker, when using overlay2, doesn't have its mount points show up in a 'df' btw, they will only appear in a 'df -a', mostly because, since they are simply 'overlays', they don't (technically) consume any space (I mean, the files are still in /var/lib but not for the space-counting practices used by df)
this is why I was suggesting a find, maybe with a 'du' .. actually.. let me try that here.. 2s
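something along these lines (paths assume the default docker data root):

```bash
# total size of the overlay2 tree
sudo du -sh /var/lib/docker/overlay2

# the biggest per-container json logs
sudo find /var/lib/docker/containers -name "*.log" -exec du -h {} + | sort -rh | head
```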
Hey Federico, since you are doing this from inside python, you could always call 'get_parameters_as_dict' on the Task you have cloned, merge/update whichever ones you want to (or not), and then call 'set_parameters_as_dict' .. I believe that should get you where you want to go 🙂
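a rough sketch of that flow.. the task id, parameter section/name and queue are all made up for illustration:

```python
from clearml import Task

# clone an existing task (the id is a placeholder)
cloned = Task.clone(source_task="<source-task-id>", name="cloned task")

# pull the current parameters, tweak what you want, push them back
params = cloned.get_parameters_as_dict()
params.setdefault("General", {})["learning_rate"] = 0.001  # hypothetical parameter
cloned.set_parameters_as_dict(params)

# and off it goes
Task.enqueue(cloned, queue_name="default")
```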
hello Emanuel 👋
I assume you are going to use python, in which case, inside each ClearML Task there is a method called get_reported_scalars that should have all the data I think.
you may want to read the warning at https://allegro.ai/clearml/docs/rst/references/clearml_python_ref/task_module/task_task.html#clearml.task.Task.get_reported_scalars on this.. and cache yourself as appropriate.. actually, the docs for the API are pretty thorough.. so if this isn't the exact itch you ne...
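fwiw, a minimal sketch of pulling those scalars out (the task id is a placeholder).. the return is a nested dict of graph -> series -> x/y lists:

```python
from clearml import Task

task = Task.get_task(task_id="<task-id>")

scalars = task.get_reported_scalars()
for graph, series in scalars.items():
    for name, points in series.items():
        print(graph, name, len(points.get("x", [])))
```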