It is deterministic. When you call Dataset.get(), ClearML downloads a state.json file, where you can see every relative file path and the number of the chunk it belongs to.
Thanks, @<1584716355783888896:profile|CornyHedgehog13>. I considered this. Is the chunk order deterministic? As in, can I rely on chunk [0] always referring to the same file object if additional files are added?
You can look up the number of the chunk that contains your file and download only that chunk.
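A minimal sketch of that lookup, in case it helps. It assumes state.json keeps its file list under a `dataset_file_entries` key and that each entry has `relative_path` and `artifact_name` (the chunk) fields; that layout is an assumption, so check it against your own state.json. The `find_chunk` helper and the sample data are hypothetical.

```python
import json


def find_chunk(state: dict, relative_path: str):
    """Return the chunk (artifact) name that holds relative_path, or None."""
    # ASSUMPTION: entries are listed under "dataset_file_entries" and each one
    # has "relative_path" and "artifact_name" keys. Verify in your state.json.
    for entry in state.get("dataset_file_entries", []):
        if entry.get("relative_path") == relative_path:
            return entry.get("artifact_name")
    return None


# Hypothetical sample standing in for the contents of a real state.json.
sample_state = json.loads("""
{
  "dataset_file_entries": [
    {"relative_path": "raw/a.csv", "artifact_name": "data"},
    {"relative_path": "raw/b.csv", "artifact_name": "data_001"}
  ]
}
""")

chunk = find_chunk(sample_state, "raw/b.csv")
print(chunk)
```

With a real dataset you would load the downloaded state.json with `json.load` instead of the inline sample, then fetch only the chunk you need.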
There is no natural way to expose single files in Datasets. However, it looks like you found an appropriate workaround 🙂
I found this... It works as long as the initially uploaded data files (e.g., Excel, .sav, .spss) are converted to CSV files.
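from clearml import Task
# 'xxx123' is a placeholder for the ID of the task that stored the CSV as its 'data' artifact
preprocess_task = Task.get_task(task_id='xxx123')
local_csv = preprocess_task.artifacts['data'].get_local_copy()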