In that case, no, the helm chart does not spin up a default agent (you should, however, spin up a services-mode agent for running the pipeline logic).
Thanks PompousBeetle71
Quick question, what frameworks are you using?
Do you use the save method directly on a file stream (or any other direct storage)?
Also, is there a way to reproduce this issue of not capturing the model?
Hi DepressedFish57
In my case downloading each part takes ~5 seconds, and unzipping ~15.
We ran into that too, and the new version will use a multithreaded approach for the unzip (meaning the unzipping will happen in the background).
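The idea of overlapping download and unzip can be sketched roughly like this. It is only an illustration using the Python standard library, not the actual ClearML implementation; fetch_part is a hypothetical stand-in for the real download step.

```python
import zipfile
from concurrent.futures import ThreadPoolExecutor
from pathlib import Path


def fetch_and_extract(part_names, fetch_part, target_dir, max_workers=4):
    """Download parts one by one while finished parts are unzipped in background threads.

    fetch_part is a hypothetical callable: it takes a part name and returns
    the local path of the downloaded zip file.
    """
    target_dir = Path(target_dir)
    target_dir.mkdir(parents=True, exist_ok=True)

    def extract(zip_path):
        # Runs in a worker thread, so the next download is not blocked
        with zipfile.ZipFile(zip_path) as zf:
            zf.extractall(target_dir)
        return zip_path

    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        futures = []
        for name in part_names:
            zip_path = fetch_part(name)                      # download in the calling thread
            futures.append(pool.submit(extract, zip_path))   # unzip in the background
        # Wait for all extractions and surface any error
        return [f.result() for f in futures]
```

With ~5 seconds per download and ~15 per unzip, the unzip of one part would then overlap with the download of the following parts instead of running back to back.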
UnevenDolphin73 something like this one?
https://github.com/allegroai/clearml/pull/225
Hi RoundMosquito25
The main problem here is there is no way to know before running the Task how much memory it would need ... And without that parameter maximizing GPUs is quite challenging. wdyt?
Sure, try running the clearml-agent with: clearml-agent daemon -O
https://clear.ml/docs/latest/docs/clearml_agent/clearml_agent_daemon
Hi CheerfulGorilla72
is it ideological...
Lol, no 😀
Since some of the comparisons are done client side (in the browser, mostly the text comparisons), it is a bit heavy, so we added a limit. We want to change it so that some of it is done on the backend, but in the meantime we can actually expand the limit, and maybe only lazily compare the text areas. Hopefully in the next version 🤞
Hi Guys,
I hear you guys, and I know this is planned, but it was probably bumped down in priority.
I know the main issue is the "Execution Tab" comparison, the rest is not an issue.
Maybe a quick hack to only compare the first 10 in the Execution tab, and remove the limit on the others? (The main issue with the execution is the git-diff / installed packages comparison, which is quite taxing on the FE)
Thoughts ?
CheerfulGorilla72 as I understand there were some delays with the current release, so it is going to be out this week. The one after that includes this feature and, as far as I understand, would be mid Dec.
I could improve the cost-efficiency of my provisioned GCP A100 instances
But their pricing is linear, if you do not need an A100, get a cheaper instance?! No?
Well it should work, make sure you see the Task "holds" all the information needed (under the execution tab). repo / uncommitted changes / python packages etc.
Then configure your agent (choose pip/conda/poetry as the package manager), and spin it up (by default in venv/conda mode, or in docker mode)
Should work 🙂
Let me check, it was supposed to be automatically aborted
Hi BoredSquirrel45
as of today, my required packages aren't being recognized in cloned
Are you saying you are editing the code directly in the cloned Task, then enqueuing the Task, and the agent does not "auto recognize" the package?
Okay, wait, I'll see if I can come up with something.
Hi VivaciousBadger56
Basically you can think of MLRun as "amazon lambda service without amazon". It is designed to run a "function" at scale on multiple nodes.
ClearML on the other hand is an MLOps platform. It does the experiment tracking, it orchestrates Tasks (think jobs), it does data management, and lastly we recently released the serving. These are two different use cases.
Am I making sense here?
Is there a way to force clearml not to upload these models?
DistressedGoat23 is it uploading models or registering them? To disable both, set auto_connect_frameworks: https://clear.ml/docs/latest/docs/clearml_sdk/task_sdk#automatic-logging
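For reference, a minimal sketch of how that could look in code. The project and task names are placeholders, the helper functions are hypothetical, and the framework key in the usage note is just an example; check the linked docs for the keys that apply to your setup.

```python
def make_init_kwargs(disable=None):
    """Build keyword arguments for Task.init.

    disable=None turns off all automatic framework logging;
    otherwise pass an iterable of framework keys to turn off individually.
    """
    if disable is None:
        auto_connect = False
    else:
        auto_connect = {name: False for name in disable}
    return {
        "project_name": "examples",             # placeholder
        "task_name": "no auto model logging",   # placeholder
        "auto_connect_frameworks": auto_connect,
    }


def start_task(disable=None):
    # Imported here so the helper above can be used without clearml installed
    from clearml import Task
    return Task.init(**make_init_kwargs(disable))
```

start_task() would switch off the automatic logging for every framework, while start_task(["pytorch"]) would only switch it off for PyTorch and leave the rest at their defaults.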
Their names only contain the task name and some unique ID, so how can I know to which exact training
You mean the models or the experiments being created ?
Hi RobustHippopotamus53
The way "latest from branch" works:
On the Task you specify the branch name (e.g. "master", no need to add the origin/ prefix).
The agent then pulls the latest commit from that branch and updates the Task with the current commit ID (the latest on the branch at the time of execution).
This process ensures reproducibility and traceability, as we can always be certain of the exact commit that was executed.
Could it be that you "forced-push" a commit/squash, hence the "origina...
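The "latest from branch" flow can be pictured with a tiny sketch. This is a simplified illustration and not the agent's actual code: the task dict and the remote_heads mapping are hypothetical stand-ins for the Task record and for what a git fetch would report.

```python
def resolve_branch_to_commit(task, remote_heads):
    """Pin a Task that tracks a branch to the branch's current head commit.

    task: dict with 'branch' and 'commit' keys (hypothetical, simplified Task record).
    remote_heads: dict mapping branch name -> latest commit ID on the remote.
    """
    branch = task["branch"]
    if branch not in remote_heads:
        raise KeyError(f"branch {branch!r} not found on remote")
    pinned = dict(task)                       # leave the caller's record untouched
    pinned["commit"] = remote_heads[branch]   # record exactly what will be executed
    return pinned
```

Because the commit ID is written back at execution time, a later rerun of the same Task refers to that exact commit rather than to whatever the branch points at by then.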
The confusion matrix shows under debug sample, but the image is empty, is that correct?
Hi RoundMosquito25
This is a bit old but probably a good start:
https://clear.ml/blog/stacking-up-against-the-competition/
tl;dr
ClearML advantages (at least a few I can think of)
- Scales way better
- Enables out of the box experiment orchestration (i.e. remote execution etc)
- Data management
- Nicer UI
- Full RestAPI
- Full MLops platform
- Model serving
- Query-able model repository
Probably more 🙂
In the UI you can edit the base container image + add "SETUP SHELL SCRIPT", with any missing "apt update && apt-get install -y ..."
Yes 🙂
BTW: do you guys do remote machine development (i.e. Jupyter / vscode-server) ?
time-based, dataset creation, model publish (tag),
Anything you think is missing ?
SlipperyDove40 I just installed a fresh copy of py3.6 and plotly on Ubuntu. The entire venv dir is ~86MB