I have a GCP instance with the official ClearML image.
from clearml import StorageManager, Dataset

dataset = Dataset.create(
    dataset_project="Project", dataset_name="Dataset_name"
)

files = [
    'file.csv',
    'file1.csv',
]

for file in files:
    csv_file = StorageManager.get_local_copy(remote_url=file)
    dataset.add_files(path=csv_file)

# Upload dataset to ClearML server (customizable)
dataset.upload()
# commit dataset changes
dataset.finalize()
I am running the ClearML server on GCP, but I didn't expose its ports; instead I SSH into the machine and forward the ports to localhost. The problem is that localhost on my machine is not the same as localhost inside the Docker container on the worker. If I check the dataset, the files are registered under localhost, but they are not actually there. I haven't found a solution yet for how to properly set the hostname for the file server. Any ideas?
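For reference, my port forwarding looks roughly like this (a sketch assuming the default ClearML server ports: 8080 for the web server, 8008 for the API server, and 8081 for the file server; the user and hostname are placeholders):

    # forward the three ClearML server ports from the GCP instance to localhost
    ssh -L 8080:localhost:8080 -L 8008:localhost:8008 -L 8081:localhost:8081 user@gcp-instance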
Hi @<1702492411105644544:profile|YummyGrasshopper29> , ClearML registers uploaded artifacts (including datasets) with the URLs used to upload them, which is why the data is registered under localhost in your case. I think the solution in your case is to use URL substitution
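For example, a minimal sketch of URL substitution in clearml.conf (the two prefixes here are assumptions; replace them with your actual registered and reachable addresses):

    sdk {
        storage {
            path_substitution = [
                {
                    # URL prefix as it was registered at upload time
                    registered_prefix = "http://localhost:8081"
                    # URL prefix that is actually reachable from the worker
                    local_prefix = "http://<server-ip>:8081"
                }
            ]
        }
    }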
@<1702492411105644544:profile|YummyGrasshopper29> , I suggest you take a look here - None
In the WebUI, when you go to the dataset, where do you see it is saved? You can click on 'Full details' in any version of a dataset and see it in the Artifacts section
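Another option is to avoid registering localhost URLs in the first place by passing an explicit output_url when uploading. A sketch (the host address is a placeholder for an address your workers can actually reach):

    # upload to an explicitly reachable file server instead of the default one
    dataset.upload(output_url="http://<server-ip>:8081")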
I solved the problem by adding the container argument:
--network host
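In case it helps others: rather than setting it per task, the same argument can be applied to every container the agent spawns via clearml.conf (a sketch, assuming the agent's extra_docker_arguments setting):

    agent {
        # extra arguments passed to every docker container the agent starts
        extra_docker_arguments: ["--network=host"]
    }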
@<1702492411105644544:profile|YummyGrasshopper29> , how did you save the dataset? Where was the data uploaded to?