Hi @<1523703397830627328:profile|CrookedMonkey33>
If you click on "Task Information" (in the Version Info panel, right-hand side), it will open the Task details page. There you have the "hamburger" menu at the top right, where you have Publish.
(Maybe we should add that to the main right-click menu?!)
Hi NastyFox63
It seems like most of the reports are converted to PNGs (which is what the automagic does if it fails to convert the matplotlib figure into an interactive plot).
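If it helps, here is a minimal sketch of reporting a matplotlib figure explicitly instead of relying on the automagic capture (the project/task names and plot data are placeholders):

import matplotlib.pyplot as plt
from clearml import Task

task = Task.init(project_name="examples", task_name="explicit matplotlib report")

fig = plt.figure()
plt.plot([1, 2, 3], [4, 5, 6])

# explicitly report the figure; ClearML will try to convert it to an interactive plot
task.get_logger().report_matplotlib_figure(
    title="my plot", series="series A", iteration=0, figure=fig
)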
no more than 114 plots are shown in the plots tab.
Are you saying we have a 114-plot limit?
Is this true for "full screen" mode (i.e. not in the experiments table, but when switching to the full detailed view)?
I have to assume that I do not know the dataset ID
Sorry I mean:
from clearml import Dataset

datasets = Dataset.list_datasets(dataset_project="some_project")
for d in datasets:
    # look up each dataset's version by its ID
    d["version"] = Dataset.get(dataset_id=d["id"]).version
wdyt?
one can containerise the whole pipeline and run it pretty much anywhere.
Does that mean the entire pipeline will be running on the instance spinning up the container?
From here, this is what I understand:
https://kedro.readthedocs.io/en/stable/10_deployment/06_kubeflow.html
My thinking was I can use one command and run all steps locally while still registering all "nodes/functions/inputs/outputs etc" with clearml such that I could also then later go into the interface and clone an...
LOL, if this is important we could probably add some support (meaning you would be able to specify it in the "installed packages" section, per Task).
If you find an actual scenario where it is needed, I'll make sure we support it.
That might be me, let me check...
You mean to add these two to the model when deploying?
│   ├── model_NVIDIA_GeForce_RTX_3080.plan
│   ├── model_Tesla_T4.plan
Notice that preprocess.py is not running on the GPU instance; it is running on a CPU instance (technically not the same machine).
I want to run only that sub-dag on all historical data in ad-hoc manner
But wouldn't that be covered by the caching mechanism?
Can you post the toml file? Maybe the answer is there
Hi DisgustedDove53
Is redis used as permanent data storage or just cache?
Mostly cache (I think)
Would there be any problems if it is restarted and comes up clean?
Pretty sure it should be fine, why do you ask?
Hmm, let me rerun (offline mode, right?)
Hmm UnevenDolphin73 I just checked with v.1.1.6, the first time the configuration file is loaded is when calling Task.init (if not running with an agent, which is your case).
But the main point I just realized I missed: "http://"${CLEARML_ENDPOINT}":8080"
The code does not try to resolve OS environment variables there!
Which, well, would be a nice feature to add
https://github.com/allegroai/clearml/blob/d3e986393ac8d1a1ea48302224962570ab8e6f9e/clearml/backend_api/session/session.py#L576
should p...
I think you are correct, the env variable is not resolved in "time". It might be that it's resolved at import, not at Task.init.
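A hedged workaround sketch based on that guess: make sure the variable the configuration references (CLEARML_ENDPOINT here, taken from the snippet above) is already set before clearml is imported, so that if resolution happens at import time the value is available:

import os

# set (or default) the variable before clearml is imported; the value is a placeholder
os.environ.setdefault("CLEARML_ENDPOINT", "my-clearml-server.internal")

from clearml import Task  # import only after the env var is in place

task = Task.init(project_name="examples", task_name="env var resolution check")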
CloudyHamster42 FYI the warning will not be shown in the next Trains version; the issue is now fixed, thank you!
Regarding the double axes, see if adding plt.clf() helps. It seems the axes are left over from the previous figure and somehow are still there...
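A minimal sketch of what I mean (the project/task names and dummy plots are placeholders):

import matplotlib.pyplot as plt
from clearml import Task

task = Task.init(project_name="examples", task_name="matplotlib clf example")

for i in range(3):
    plt.plot(range(10), [x * (i + 1) for x in range(10)])
    plt.title(f"figure {i}")
    plt.show()  # ClearML's automagic capture hooks into plt.show()
    plt.clf()   # clear the current figure so its axes do not leak into the next one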
AFAIK that's the only way right now (see my comment here - https://clearml.slack.com/archives/CTK20V944/p1657720159903739?thread_ts=1657699287.630779&cid=CTK20V944 )
Or, if you have the ClearML paid service, I believe there is a "vaults" service, right AgitatedDove14?
Yep UnevenDolphin73 :)
Getting the last checkpoint can be done via:
Task.get_task(task_id='aabbcc').models['output'][-1]
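For example, a small sketch (the task ID is a placeholder) that also downloads the checkpoint file locally:

from clearml import Task

task = Task.get_task(task_id='aabbcc')
last_checkpoint = task.models['output'][-1]    # last registered output model
local_path = last_checkpoint.get_local_copy()  # download the checkpoint file
print(last_checkpoint.url, local_path)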
RoundMosquito25 this is a good point, I mean in theory it could be done, the question is the actual Bayesian optimization you are using.
Is it optuna (OptimizerOptuna) or OptimizerBOHB?
Not really, the OS will almost never allow for that; it is actually based on fairness and priority. We can set the entire agent to have the same low priority for all of them; then the OS will always give out CPU when needed (most of the time it won't be needed) and all the agents will split the CPUs among them, so no one gets starved. With GPUs it is a different story, since there is no actual context switching or fairness mechanism like there is for CPUs.
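To make the low-priority idea concrete, here is a rough sketch (plain Python, not a ClearML feature) of starting a worker process with a raised "nice" value so the OS scheduler de-prioritizes it:

import os
import subprocess

def launch_low_priority(cmd, niceness=15):
    # the child process inherits the raised niceness, so foreground work wins the CPU
    return subprocess.Popen(cmd, preexec_fn=lambda: os.nice(niceness))

# illustrative command only
# proc = launch_low_priority(["python", "training_job.py"])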
Hi @<1571308003204796416:profile|HollowPeacock58>
I'm assuming this is the ARM support fix (i.e. you are running on a new Mac) we released in one of the recent clearml-agent versions. Could you update to the latest clearml-agent?
pip3 install clearml-agent==1.6.0rc2
Okay, this points to an issue with the k8s glue; I think it somehow failed to launch the pod. Can you send me the log of the clearml-k8s-glue?
Hi TrickyRaccoon92
Yes, please update me once you can. I would love to be able to reproduce the issue so we can fix it for the next RC.
now I can't download either of them
would be nice if the address of the artifacts (state and zips) was assembled on the fly and not hardcoded into the DB.
The idea is this is fully federated, the server is not actually aware of it, so users can manage multiple storage locations in a transparent way.
if you have any tips on how to fix it in the Mongo DB, that would be great...
Yes, that should be similar, but the links would be in the artifact property on the Task object
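For instance, a small inspection sketch (the task ID is a placeholder) to review where each artifact link actually points before changing anything in the database:

from clearml import Task

task = Task.get_task(task_id="aabbcc")
for name, artifact in task.artifacts.items():
    print(name, artifact.url)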
not exactly...
As we can't create keys in our AWS due to infosec requirements
Hmmm
It may have been killed or evicted or something after a day or 2.
Actually, the ideal setup is to have a "services" pod running all these services on a single pod, with clearml-agent --services-mode. This pod should always be on and pull jobs from a dedicated queue.
Maybe a nice way to do that is to have the single Task serialize itself, then have a pod run the Task every X hours and spin it down.
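A rough sketch of that idea, assuming the template task ID and queue name below are placeholders (this loop could itself run inside the services pod):

import time
from clearml import Task

TEMPLATE_TASK_ID = "aabbcc"   # the serialized Task to re-run
QUEUE_NAME = "services"
INTERVAL_HOURS = 6

while True:
    cloned = Task.clone(source_task=TEMPLATE_TASK_ID, name="periodic run")
    Task.enqueue(cloned, queue_name=QUEUE_NAME)  # the agent pulls and runs it
    time.sleep(INTERVAL_HOURS * 3600)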
So I would like to know what it sends to the server to create the task/pipeline, ...
will my datasets be stored on the same machine that hosts the clearml server?
By default yes, they will be stored on the files-server (but you can change it; this is an argument for both the CLI and the Python interface)
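For example, a hedged sketch with the Python interface (the dataset names, local path, and bucket are placeholders):

from clearml import Dataset

ds = Dataset.create(dataset_name="my_dataset", dataset_project="some_project")
ds.add_files("data/")
ds.upload(output_url="s3://my-bucket/clearml-datasets")  # point away from the default files-server
ds.finalize()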
Ohh now I see, yes that makes sense, and I was able to reproduce it, thanks!
Hi BoredHedgehog47, I'm assuming the nginx on the k8s ingress is refusing the upload to the files server
JuicyFox94 wdyt?