Hi @AmiableSeaturtle81! You could get the Dataset Struct configuration object from the dataset's task and read job_size from there, which is the dataset size in bytes. By the way, a dataset's task ID is the same as the dataset's ID, so you can call all the ClearML Task-related functions on the task you get by doing Task.get_task("dataset_id").
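A minimal sketch of that idea, assuming the Dataset Struct configuration is stored as JSON on the dataset's task and that field names such as job_size match your ClearML version; "your_dataset_id" is a placeholder:

```python
import json
from clearml import Task

# A dataset's backing task shares the dataset ID, so it can be fetched directly.
task = Task.get_task(task_id="your_dataset_id")  # placeholder ID

# Read the "Dataset Struct" configuration object as raw text and parse it.
# This only talks to the ClearML server; it does not touch state.json on S3.
raw = task.get_configuration_object("Dataset Struct")
struct = json.loads(raw)

# Each entry describes one dataset version in the lineage; job_size is the
# size in bytes (exact field names are an assumption -- inspect the struct
# in the web UI to confirm for your ClearML version).
for key, info in struct.items():
    print(key, info.get("name"), info.get("job_size"))
```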
Is there any way to get the dataset size without downloading state.json?
I'm doing ds = clearml.Dataset.get(dataset_id=d_id), but it instantly tries to download state.json, which is on S3. I'm only interested in the size and file count, which I then get from calling ds.get_metadata("state").
"state" comes from Task, so another workaround would be to get Task id straight from knowing the dataset ID
I don't want to download state.json because:
- It's 500+ MB
- I need S3 creds that I don't want to store on the server
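For reference, the current flow described above looks roughly like this; a sketch assuming d_id holds the dataset ID, with the get_metadata("state") call taken from the question as-is:

```python
import clearml

d_id = "your_dataset_id"  # placeholder

# Dataset.get() immediately fetches state.json from S3,
# which is the step this question is trying to avoid.
ds = clearml.Dataset.get(dataset_id=d_id)

# Size and file count are then read from the "state" metadata.
state = ds.get_metadata("state")
```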