hmm... ReassuredOwl55 a lot has changed in datasets internals since then. Please refer to the docs and videos to see how many exciting features were added.
FWIW I think it will turn out okay once you finish uploading the state file
AgitatedDove14 I tried the squash solution, however this somehow caused a download of all the datasets into my /tmp folder, filling up the instance? I have a special drive for .clearml cache, how can I tell clearml-data to only use that?
Yeah the hack would work but I'm trying to use it from the command line to put in airflow. I'll post on GH
ps, the agent is in docker mode, I wonder why it uses the host mapping for the clearml cache folder
Okay that was because it wasn't in docker mode for this reproduction
VivaciousBadger56 You're basically answering yourself :) so kedro = lean on features, strong community; ClearML = many features, small (growing) community; and mlrun has a good name
CostlyOstrich36 seeing an awful lot
DEBUG:urllib3.connectionpool:Resetting dropped connection: api.clear.ml
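If the goal is only to cut the noise from those lines, a minimal sketch with the standard `logging` module is below. It raises the level of the `urllib3.connectionpool` logger (the name shown in the log prefix above); the helper name `quiet_urllib3` is made up for illustration. This hides the messages and does nothing about why the connection is being dropped.

```python
import logging


def quiet_urllib3(level: int = logging.WARNING) -> logging.Logger:
    """Raise the level of the urllib3 connection pool logger.

    Hypothetical helper: the logger name is taken from the
    "DEBUG:urllib3.connectionpool:" prefix seen in the output.
    """
    logger = logging.getLogger("urllib3.connectionpool")
    # Messages below `level` (e.g. DEBUG, INFO) are dropped from now on.
    logger.setLevel(level)
    return logger
```

Call `quiet_urllib3()` once near the top of the script, after any `logging.basicConfig(...)` call.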
SweetBadger76 , AgitatedDove14 , creating a dataset with parents worked very well and produced great visuals on the UI!
CostlyOstrich36 I ran using the default docker, still a tunnel problem. This is what I got eventually:
` Creating config file /etc/ssh/sshd_config with new version
Creating SSH2 RSA key; this may take some time ...
2048 SHA256:TTE+YCJmi2NOpH/ykzdHiP+MgCfKkZXocwUyu58GuAA root@Merlin-dev (RSA)
Creating SSH2 ECDSA key; this may take some time ...
256 SHA256:ks6yr6FpKp5pyLU9NRLK/K96BYieuivwqw7RKAaQHIA root@Merlin-dev (ECDSA)
Creating SSH2 ED25519 key; this may take some time ...
256 SHA256:0JxV...
Used the Nvidia pytorch container 22.04 instead of the default one, also tried to add jupyterlab (opened up the default ports on the azure console). Task seems successful, still no ssh tunnel.
ok scratch that - you can override TMPDIR in the env. much better!
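For anyone landing here later, a rough sketch of the same idea from inside a Python script. It relies only on the standard `tempfile` module honouring `TMPDIR`; that clearml-data picks its temp location this way is an assumption based on the override working above, and both the helper name `use_scratch_tmp` and the `/scratch/tmp` path are hypothetical.

```python
import os
import tempfile


def use_scratch_tmp(scratch_dir: str) -> str:
    """Point Python's tempfile module at scratch_dir via TMPDIR.

    Hypothetical helper; scratch_dir is whatever large drive you have.
    """
    os.makedirs(scratch_dir, exist_ok=True)
    os.environ["TMPDIR"] = scratch_dir
    # tempfile caches the first directory it resolves, so clear the
    # cache to force it to re-read TMPDIR on the next call.
    tempfile.tempdir = None
    return tempfile.gettempdir()


# Example (path is an assumption, adjust to your instance):
# use_scratch_tmp("/scratch/tmp")
```

From the command line or an airflow task, exporting `TMPDIR` in the environment before the command runs should have the same effect, with no code change.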
super, makes sense, but can it NOT use /tmp for this? I'm merging about 100GB of files and it is quite heavy on the partition. Maybe I could set an env variable to divert it to scratch?
AgitatedDove14 CostlyOstrich36 yes! that did the trick. I added the 10022 on the azure networking pane and session is now working!!
okay I was prematurely happy. will update soon
why you are not starting threads from user issues is beyond me. Anyways, iirc it can also happen if you are using the same virtualenv for two trains-agents [mistakenly] and one of them uninstalls certifi
wow i had the same problem, this should go into the FAQ AgitatedDove14
SmugDolphin23 only set max_worker=1 and it seems to work. thanks!
will do and report back! thanks
Hi, just chiming in with a lesson learnt on my subreddit r/mlops - when shortlisting open-source MLOps infra, the bundled features are less important KPIs than stability and longevity markers:
- community adoption
- active slack channel
- good documentation
- clear monetization scheme (how much does it cost if you decide to go SaaS instead of paying for own infra) - even if you never intend to go SaaS, it helps to understand if the OSS is actually "freemium" or not.
Hope that helps!