The scheduler just downloads a dataset using the ID, right? So if you don't upload a new dataset, the scheduler just downloads the dataset with the last known ID. I don't really see how that could lead to a new dataset with its own ID as the parent. Would you mind explaining your setup in a little more detail? 🙂
Hello!
What is the use case here? Why would you want to do that? If they're the same dataset, you don't really need lineage, no?
I know, but I run a scheduler on the script that downloads a dataset, and I'm trying to figure out what it will do if there is no new dataset to download.