By default, Task.init() takes output_uri from the clearml.conf configuration file (an S3 bucket in my case). But the underlying task created with Dataset.create() ignores it and uploads files to https://files.community.clear.ml by default. Is this intended behavior?

Posted 2 years ago

Answers 5

SuccessfulKoala55 Without the commented line, files are uploaded to http://files.community.clear.ml instead of my S3 bucket.

Posted 2 years ago

Hi SuccessfulKoala55 Here is the code snippet:
` from clearml import Task, Dataset

# Main task; its output_uri comes from clearml.conf (an S3 bucket in my case)
task = Task.init(project_name=PROJECT_NAME, task_name=section)
print('params', params)

dataset = Dataset.create(dataset_name=params['dataset'], dataset_project=PROJECT_NAME)
dataset_local_dir = dataset.get_local_copy()

# Workaround: point the dataset's underlying task at the same output_uri
dataset._task.output_uri = task.output_uri

KeywordProcessor(params['es_host'], params['es_port'], True, DOCS_ROOT)

dataset.add_files(DOCS_ROOT, wildcard='*.csv')
dataset.upload() `
I add several files to a dataset and upload them.
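
A possible alternative to assigning to the private `_task` attribute is to pass the destination to the upload call explicitly. This is only a minimal sketch: it assumes the installed clearml version's `Dataset.upload()` accepts an `output_url` argument, and the helper name `upload_to_task_storage` and its `fallback_uri` parameter are hypothetical, not part of clearml.

```python
def upload_to_task_storage(dataset, task, fallback_uri=None):
    """Upload dataset files to the same storage the main task uses.

    Hypothetical helper. `dataset` is expected to behave like a
    clearml Dataset and `task` like a clearml Task.
    Assumption: dataset.upload() accepts an `output_url` keyword.
    """
    # Prefer the task's configured output_uri (e.g. the S3 bucket
    # from clearml.conf); otherwise use the caller-supplied fallback.
    destination = task.output_uri or fallback_uri

    if destination:
        # Explicit destination, so the default file server is not used.
        dataset.upload(output_url=destination)
    else:
        # Nothing configured: keep the library's default behavior.
        dataset.upload()
    return destination


# Usage with the names from the snippet above (not executed here):
#   dataset.add_files(DOCS_ROOT, wildcard='*.csv')
#   upload_to_task_storage(dataset, task)
```

The helper only reads the public `task.output_uri` property, so it does not depend on the internals of the Dataset object.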

Posted 2 years ago

Hi HelpfulHare30, which files are you referring to?

Posted 2 years ago

Yeah, I see what you mean. This is probably because the output_uri initialization is handled by Task.init(). Can you please open a GitHub issue?

Posted 2 years ago

SuccessfulKoala55 Sure. At https://github.com/allegroai/clearml/issues ?

Posted 2 years ago
