Is there anything special about the parent dataset?
The SDK treats it as an iterable, so the first element (a string is an iterable of characters) is 'a'...
In the child dataset task I see the following:
ARTIFACTS -> STATE: Dataset state
Files added/modified: 1 - total size 518.78 MB
Current dependency graph: {
"0385db....": [],
(*)"94f4....": ["0385db..."]
}
The child task is 94f4..
and the parent task is "0385db..."
But what does the (*) line mean?
AbruptWorm50 I think your issue is that the parents should be a list of strings, not a string
That would explain why it reports the task id to be 'a' in the error. It tried to index the first element in a list, but took the first character of a string instead.
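To illustrate the point above, here is a minimal plain-Python sketch of the pitfall. It does not call the SDK; the ID `a1b2c3` and the helper `normalize_parents` are made up for the example, and the assumption is only that the receiving code takes element `[0]` of whatever it is given.

```python
# Hypothetical parent ID, chosen so that it starts with 'a' like the one in the error.
parent_id = "a1b2c3"

# Indexing a list gives the whole ID; indexing a bare string gives one character.
as_list = [parent_id]
print(as_list[0])     # the full ID
print(parent_id[0])   # only the first character


def normalize_parents(parents):
    """Hypothetical helper: always return a list of ID strings."""
    if parents is None:
        return []
    if isinstance(parents, str):
        # A single ID passed as a bare string is wrapped, not iterated.
        return [parents]
    return list(parents)
```

Wrapping the ID in a list before passing it as the parents argument (or running it through a helper like this one) avoids the single-character task ID.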
I think that would defeat the purpose of lineage, no? The point is to keep track of where data came from in the real world. Rewriting that record would just turn it into... metadata?
As for the (*) line, could it be that "0385db..." does not have parents itself? So "0385db..." is the base dataset, without parents, and it has 1 child, which has "0385db..." as its parent
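A small Python sketch of that reading of the graph, assuming each key is a dataset ID and each value is the list of its parents (the truncated IDs are copied as they appear in the output above; what the (*) marker itself means is not covered here):

```python
# Dependency graph as printed in the dataset state: ID -> list of parent IDs.
graph = {
    "0385db...": [],
    "94f4...": ["0385db..."],
}

# A dataset with an empty parent list is a base (root) dataset.
roots = [ds for ds, parents in graph.items() if not parents]


def children_of(graph, dataset_id):
    """Return the IDs that list dataset_id among their parents."""
    return [ds for ds, parents in graph.items() if dataset_id in parents]


print(roots)
print(children_of(graph, "0385db..."))
```

Under that assumption, "0385db..." comes out as the only root and "94f4..." as its only child.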
Another question: is there a way to group Dataset tasks together (i.e. redefine their parent) after the tasks have been finalized? In the same context: is there a way to change the dependency graph in the ClearML dashboard after task creation and finalization?