It is deterministic. When you call Dataset.get(), ClearML downloads a state.json file, where you can see every relative file path and the chunk number each file belongs to.
There is no natural way to expose single files in Datasets. However, it looks like you found an appropriate workaround 🙂
You can get a chunk number that contains your file and download that chunk
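A rough sketch of that lookup, in case it helps. It assumes state.json holds a `dataset_file_entries` list whose items carry `relative_path` and `artifact_name` (the chunk's artifact name); those field names are an assumption about the file layout, so check them against your own state.json. The helper name `find_chunk_for_file` and the sample data are made up for illustration.

```python
import json


def find_chunk_for_file(state, relative_path):
    """Return the chunk (artifact) name recorded for relative_path, or None."""
    # Assumed layout: a list of per-file entries under "dataset_file_entries".
    for entry in state.get("dataset_file_entries", []):
        if entry.get("relative_path") == relative_path:
            return entry.get("artifact_name")
    return None


# Hypothetical state.json content, trimmed to the two fields used above.
example_state = json.loads("""
{
  "dataset_file_entries": [
    {"relative_path": "train/a.csv", "artifact_name": "data"},
    {"relative_path": "train/b.csv", "artifact_name": "data_001"}
  ]
}
""")

chunk = find_chunk_for_file(example_state, "train/b.csv")
print(chunk)
```

With a real dataset you would load the downloaded state.json with `json.load` instead of the inline sample, then download only the chunk the lookup returns.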
Thanks, @<1584716355783888896:profile|CornyHedgehog13>, I considered this. Is the chunk order deterministic? As in, can I rely on chunk [0] always referring to the same file object if additional files are added?
I found this... It works as long as the uploaded data files (e.g., Excel, .sav, .spss, etc.) are first converted to CSV files.
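from clearml import Task
# Look up the preprocessing task ('xxx123' is a placeholder ID) and download its 'data' artifact locally
preprocess_task = Task.get_task(task_id='xxx123')
local_csv = preprocess_task.artifacts['data'].get_local_copy()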