Hi all, just learning the ropes of ClearML atm, and am doing a really simple ETL pipeline: raw data -> clean data. My current approach is in one script, I add...
4 months ago
Hi @CostlyOstrich36 - Cheers for your time.
I thought about that, but I think the lineage feature is really valuable.
I've opted for this as my go-to pattern now to achieve what I wanted: I literally just remove all files in the new dataset before finalizing it.
with TemporaryDirectory() as tmp:
    out = Path(tmp) / "df_clean.parquet"
    result.to_parquet(out, index=False)
    clean = Dataset.create(
        dataset_name="clean-data",
        dataset_p...
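Since the snippet above got cut off, here is a rough sketch of the whole pattern as I understand it, not necessarily what the truncated code does. It assumes the new dataset is created with `parent_datasets` pointing at the raw dataset (that is what keeps the lineage link), and that the inherited files are then removed with `remove_files` before the cleaned file is added. The helper names `replace_contents` and `publish_clean`, and the `raw_dataset_id` and `project` parameters, are made up for illustration.

```python
from pathlib import Path


def replace_contents(dataset, new_file):
    """Remove every file currently listed in `dataset`, then add `new_file`.

    `dataset` is assumed to behave like a clearml.Dataset that has been
    created but not yet finalized (list_files, remove_files, add_files).
    """
    # Files inherited from the parent version show up in list_files().
    for inherited in list(dataset.list_files()):
        dataset.remove_files(dataset_path=inherited)
    dataset.add_files(path=str(new_file))
    return dataset


def publish_clean(result, raw_dataset_id, project):
    """Hypothetical wrapper; `result` is assumed to be a pandas DataFrame."""
    from tempfile import TemporaryDirectory
    from clearml import Dataset  # lazy import: needs a configured ClearML server

    with TemporaryDirectory() as tmp:
        out = Path(tmp) / "df_clean.parquet"
        result.to_parquet(out, index=False)
        clean = Dataset.create(
            dataset_name="clean-data",
            dataset_project=project,
            parent_datasets=[raw_dataset_id],  # assumption: this is the lineage link
        )
        replace_contents(clean, out)
        clean.upload()    # must run while the temp file still exists
        clean.finalize()
    return clean.id
```

Keeping the removal step in its own small function means it can be tried against a stand-in object without touching the server. If the dataset has lots of files, a single wildcard call to `remove_files` may be simpler than looping, but I have not verified how it handles nested paths.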
I wonder, is this style of dataset handling trying to square the circle with ClearML? Is it built for this type of stuff?