I want to upload the dataset to s3. Is there a flag that tells it to do so?
Hey, it seems the dataset is simply too large to upload to the fileserver... How big is the dataset?
Well, you'll need to configure the default output_uri to be an s3 bucket.
Well, I couldn’t find where to add the output_uri.
I now tried the following:
```python
import clearml

local_path = '<some-local-path>'
s3_path = 's3://<some-bucket-path>'

dataset = clearml.Dataset.create(dataset_project='project_name', dataset_name='trial_01')
dataset.add_files(path=local_path, dataset_path=s3_path)
```
but I don’t see the files on the s3 bucket.
I also tried this: `dataset.sync_folder(local_path=local_path, dataset_path=s3_path)`
and still no success. It seems like it is uploading the files to the clearml server:
```python
>>> dataset.get_default_storage()
'<clearml-fileserver-url>'
```
It would be great if you could help me understand how to direct the dataset upload to the s3_path.
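For reference, here is a minimal sketch of the call sequence that should send the files to the bucket, assuming the standard `clearml` Dataset API; `upload(output_url=...)` sets the destination for this one dataset, and the paths are placeholders:
```python
import clearml

local_path = '<some-local-path>'
s3_path = 's3://<some-bucket-path>'

dataset = clearml.Dataset.create(dataset_project='project_name', dataset_name='trial_01')

# add_files() only registers files with the dataset; its dataset_path
# argument is the files' relative location *inside* the dataset,
# not an upload destination.
dataset.add_files(path=local_path)

# The upload destination is set via output_url (or via the
# default_output_uri configuration discussed below).
dataset.upload(output_url=s3_path)
dataset.finalize()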
Well, in your clearml.conf file, set the sdk.development.default_output_uri to the desired value (see https://github.com/allegroai/clearml/blob/fb6fd9ac4a6820b4d1d3b8d6dcc60208a45d0718/docs/clearml.conf#L163 ).
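For example, a minimal sketch of that setting (the bucket path is a placeholder, and S3 credentials are assumed to be configured under sdk.aws.s3 in the same file):
```
sdk {
    development {
        # datasets and task outputs will be uploaded here by default
        default_output_uri: "s3://<some-bucket-path>"
    }
}
```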