Hi @<1697056701116583936:profile|JealousArcticwolf24>
You have ClearML Datasets.
It will version, catalog, and store the metadata of your datasets.
Each version only stores the delta from its parent version, but the delta is at file granularity, not "block" granularity: if a file changes at all, the whole file is stored again in the new version.
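To make "file granularity" concrete, here is a rough sketch in plain Python. It is not ClearML's implementation and does not use the ClearML API; the helper names (`snapshot`, `file_level_delta`, `demo`) are made up for illustration, and it assumes files are compared by content hash. The point is only that a small change inside a file marks the whole file as part of the delta.

```python
import hashlib
import tempfile
from pathlib import Path


def snapshot(folder: Path) -> dict[str, str]:
    """Map each file's relative path to the SHA-256 of its content."""
    return {
        p.relative_to(folder).as_posix(): hashlib.sha256(p.read_bytes()).hexdigest()
        for p in sorted(folder.rglob("*"))
        if p.is_file()
    }


def file_level_delta(parent: dict[str, str], child: dict[str, str]) -> dict[str, list[str]]:
    """Compare two snapshots file by file (hypothetical helper, not a ClearML call)."""
    return {
        "added": sorted(f for f in child if f not in parent),
        "modified": sorted(f for f in child if f in parent and child[f] != parent[f]),
        "removed": sorted(f for f in parent if f not in child),
    }


def demo() -> dict[str, list[str]]:
    with tempfile.TemporaryDirectory() as tmp:
        root = Path(tmp)
        (root / "a.csv").write_text("1,2,3\n")
        (root / "b.csv").write_text("4,5,6\n")
        parent = snapshot(root)

        # Child version: one line appended to a.csv, one new file, b.csv untouched.
        (root / "a.csv").write_text("1,2,3\n7\n")
        (root / "c.csv").write_text("8,9\n")
        child = snapshot(root)
        return file_level_delta(parent, child)


if __name__ == "__main__":
    print(demo())
```

In this sketch the child version would need to store `a.csv` in full (even though only one line changed) plus `c.csv`, while `b.csv` is inherited from the parent.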
Note that under the hood, of course, it uses storage solutions to store and cache the underlying immutable copy of the data. What's your use case?
Hi @<1523701205467926528:profile|AgitatedDove14>
Actually, I'm just trying to understand: is the ClearML infrastructure better than a common popular stack like DVC/lakeFS + MLflow + Kubeflow/Airflow?
You mean, is one solution better than combining, maintaining, and automating 3+ solutions (DVC/lakeFS + MLflow + Kubeflow/Airflow)?
Yes, I'd say it is. BTW, if you have Airflow running for other automations, you can very easily combine that automation with ClearML and have a single Airflow automation for everything. The main difference is that Airflow then only launches logic, never the actual compute/data (which are launched and scaled via ClearML).
Does that make sense?