
BTW, when I run dataset = Dataset.create(dataset_name="mydataset", dataset_project="test_project"), it creates the dataset instance on the dashboard. The problem is the upload, which doesn’t happen; instead this error shows up:
Retrying (Retry(total=2, connect=2, read=5, redirect=5, status=None)) after connection broken by 'NewConnectionError('<urllib3.connection.HTTPSConnection object at 0x7febe270c340>: Failed to establish a new connection: [Errno 8] nodename nor servname provided, or not ...
I didn’t pass anything for output_uri, as I assumed the default is the ClearML data server.
Thanks Martin. So does that mean I won’t be able to see the data hosted in the S3 bucket in the ClearML dashboard, under the Datasets tab, after registering it?
I’m new to ClearML and am trying to see how it works with S3 (external buckets).
I installed ClearML 1.9 and the error no longer shows. When I run the code, it creates the dataset instance on the dashboard, but it doesn’t upload the files from my S3 bucket to the ClearML data server. Am I doing something wrong?
To expand on this, suppose I have an S3 bucket where my data is stored and I wish to transfer it to the ClearML file server. I execute the following Python script:
from clearml import Dataset
dataset = Dataset.create(dataset_name="my_dataset", dataset_project="my_project")
dataset.add_external_files(
    source_url="",
    dataset_path="/my_dataset/"
)
dataset.upload()
dataset.finalize()
And this is the aws part of my clearml.conf:
aws {
s3 {
# S3 creden...
By the way, when I run the upload command I get the following error:
Retrying (Retry(total=2, connect=2, read=5, redirect=5, status=None)) after connection broken by 'NewConnectionError('<urllib3.connection.HTTPSConnection object at 0x7fd72e900130>: Failed to establish a new connection: [Errno 8] nodename nor servname provided, or not known')': /
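A note on reading that error: "[Errno 8] nodename nor servname provided, or not known" is what the name resolver reports when a hostname cannot be looked up, so the request fails before any connection is attempted. The sketch below uses only the Python standard library to check whether the host part of a URL resolves. The helper names `host_of` and `can_resolve` and the example URLs are made up for illustration; substitute the server URLs from your own clearml.conf.

```python
import socket
from urllib.parse import urlparse


def host_of(url: str) -> str:
    """Return the hostname part of a URL (also accepts a bare 'host:port')."""
    parsed = urlparse(url if "://" in url else f"//{url}")
    return parsed.hostname or ""


def can_resolve(url: str) -> bool:
    """True if the URL's hostname resolves (or is a literal IP address)."""
    host = host_of(url)
    if not host:
        return False  # an empty or malformed URL has nothing to resolve
    try:
        socket.getaddrinfo(host, None)
        return True
    except socket.gaierror:
        # The same class of failure that urllib3 reports as "[Errno 8] ..."
        return False


if __name__ == "__main__":
    # Hypothetical placeholders, not real ClearML endpoints.
    for url in ["https://files.example.invalid", "http://127.0.0.1:8081"]:
        status = "resolves" if can_resolve(url) else "does NOT resolve"
        print(f"{url} -> {status}")
```

If one of the configured URLs does not resolve, the problem is likely in the URL itself (a typo, an empty value, or a placeholder left in place) rather than in the dataset code.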
So this feature is not available on the ClearML-hosted server?
This is what I’m running:
from clearml import Dataset
dataset = Dataset.create(dataset_name="mydataset", dataset_project="test_project")
dataset.add_external_files(
    source_url="s3://???/",
    dataset_path="/mydataset/"
)
dataset.upload()
dataset.finalize()
I also have:
api {
# Notice: 'host' is the api server (default port 8008), not the web server.
api_server:
web_server:
files_server:
# Credentials are generated using the webapp,
# Override with os environment: CLEARML_API_ACCESS_KEY / CLEARML_API_SECRET_KEY
credentials {"access_key": "***", "secret_key": "***"}
}
Let’s say I don’t have the data on my local machine, only in an S3 bucket. So to see the data in the ClearML dashboard, I need to first download it from S3 to my local machine, then add the files and upload them to the ClearML data server, which is visible under this tab:
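The download-then-upload flow described above could be sketched roughly as follows. This is a minimal sketch, not a confirmed recommendation from the thread: it assumes the S3 credentials in clearml.conf are valid, that the data fits on local disk, and that the goal is to copy the files to the ClearML data server rather than link to them. The function name `copy_s3_folder_to_clearml` and all argument values are hypothetical.

```python
import tempfile


def copy_s3_folder_to_clearml(s3_url: str, dataset_name: str, dataset_project: str) -> str:
    """Download an S3 folder locally, then upload it as a ClearML dataset.

    Returns the ID of the finalized dataset.
    """
    # Imported here so this module can be loaded without clearml installed.
    from clearml import Dataset, StorageManager

    with tempfile.TemporaryDirectory() as tmp_dir:
        # Step 1: pull the remote folder to a temporary local directory.
        local_dir = StorageManager.download_folder(remote_url=s3_url, local_folder=tmp_dir)
        if local_dir is None:
            raise RuntimeError(f"Could not download {s3_url}")

        # Step 2: register the local copies (not external links) and upload.
        dataset = Dataset.create(dataset_name=dataset_name, dataset_project=dataset_project)
        dataset.add_files(path=tmp_dir)
        dataset.upload()  # no output_url given, so the configured default is used
        dataset.finalize()
        return dataset.id


# Example call (placeholder values, left commented out on purpose):
# copy_s3_folder_to_clearml("s3://???/", "mydataset", "test_project")
```

The difference from the script in the question is `add_files` on a local path instead of `add_external_files` on the S3 URL, so `upload()` has actual file contents to send.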
I didn’t change anything in my clearml.conf. Is there something in sdk.development that I need to change? Here is that section:
development {
# Development-mode options
# dev task reuse window
task_reuse_time_window_in_hours: 72.0
# Run VCS repository detection asynchronously
vcs_repo_detect_async: true
# Store uncommitted git/hg source code diff in experiment manifest when training in development mode
# This stores "git diff" or "hg diff" into the exp...