Hey, it seems the dataset is simply too large to upload to the fileserver... How big is the dataset?
I want to upload the dataset to S3. Is there a flag that tells it to do so?
Well, you'll need to configure the default `output_uri` to be an S3 bucket
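If you'd rather set it per dataset instead of as a global default, something like this should work — just a sketch, assuming a clearml version where `Dataset.upload` accepts `output_url` (the paths and names here are placeholders):
```python
import clearml

local_path = '/path/to/local/files'   # placeholder, replace with your own
s3_path = 's3://my-bucket/datasets'   # placeholder bucket path

dataset = clearml.Dataset.create(dataset_project='examples', dataset_name='my_dataset')
dataset.add_files(path=local_path)    # register the local files in the dataset
dataset.upload(output_url=s3_path)    # upload the file contents to the S3 bucket
dataset.finalize()                    # seal this dataset version
```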
Well, I couldn’t find where to add the `output_uri`.
I now tried the following:
```python
import clearml

local_path = '<some-local-path>'
s3_path = 's3://<some-bucket-path>'

dataset = clearml.Dataset.create(dataset_project='project_name', dataset_name='trial_01')
dataset.add_files(path=local_path, dataset_path=s3_path)
```
but I don’t see the files on the S3 bucket.
I also tried this: `dataset.sync_folder(local_path=local_path, dataset_path=s3_path)`
and still no success. It seems like it’s uploading the files to the ClearML server:
```
>>> dataset.get_default_storage()
'<the clearml fileserver url>'
```
It would be great if you could help me understand how to direct the dataset to upload the files to the `s3_path`.
Well, in your clearml.conf file, set `sdk.development.default_output_uri` to the desired value (see https://github.com/allegroai/clearml/blob/fb6fd9ac4a6820b4d1d3b8d6dcc60208a45d0718/docs/clearml.conf#L163)