You can look up the number of the chunk that contains your file and download only that chunk.
It is deterministic. When you call Dataset.get(), ClearML downloads the state.json file, which lists every relative file path together with its chunk number.
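A minimal sketch of that lookup, in case it helps. It assumes state.json holds a `dataset_file_entries` list whose items carry `relative_path` and `artifact_name` (the chunk name) keys. Those key names, the sample data, and the `chunk_for_file` helper are assumptions for illustration, so check them against your own state.json:

```python
import json


def chunk_for_file(state: dict, relative_path: str) -> str:
    """Return the chunk (artifact) name recorded for relative_path.

    Assumes `state` is the parsed content of state.json and that it has a
    "dataset_file_entries" list of dicts with "relative_path" and
    "artifact_name" keys (an assumption, not a documented schema).
    """
    for entry in state.get("dataset_file_entries", []):
        if entry.get("relative_path") == relative_path:
            return entry["artifact_name"]
    raise KeyError(f"{relative_path} not found in dataset state")


# Made-up sample standing in for a real state.json
sample_state = json.loads("""
{
  "dataset_file_entries": [
    {"relative_path": "train/a.csv", "artifact_name": "data"},
    {"relative_path": "train/b.csv", "artifact_name": "data_001"}
  ]
}
""")

print(chunk_for_file(sample_state, "train/b.csv"))
```

With a real dataset you would load the downloaded state.json with `json.load` instead of the inline sample, then fetch only the chunk the lookup returns.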
I found this. It works as long as the uploaded data files (e.g., Excel, .sav, .spss) are first converted to CSV files.
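from clearml import Task
# Fetch the preprocessing task and download its 'data' artifact to a local path
preprocess_task = Task.get_task(task_id='xxx123')
local_csv = preprocess_task.artifacts['data'].get_local_copy()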
There is no built-in way to expose single files in Datasets. However, it looks like you found an appropriate workaround 🙂
Thanks, @<1584716355783888896:profile|CornyHedgehog13>. I considered this. Is the chunk order deterministic? As in, can I rely on chunk [0] always referring to the same file object if additional files are added?