@<1523701435869433856:profile|SmugDolphin23>
I rechecked on single files, creating new datasets, and everything works properly. When I tried to create a dataset from the original data, I got the following logs. Could you suggest what could be causing this?
Uploading dataset changes (1497 files compressed to 9.07 MiB) to None
2023-05-12 08:46:03,114 - clearml.storage - ERROR - Exception encountered while uploading Failed uploading object /addudkin2/.datasets/test-addudkin/test-addudkin.19ab55776fed408cab214814543699de/artifacts/data/dataset.19ab55776fed408cab214814543699de.mr7nkbq8.zip (413): <html>
<head><title>413 Request Entity Too Large</title></head>
<body>
<center><h1>413 Request Entity Too Large</h1></center>
<hr><center>nginx</center>
</body>
</html>
WARNING:root:Failed uploading artifact 'data'. Retrying... (1/3)
2023-05-12 08:46:03,602 - clearml.storage - ERROR - Exception encountered while uploading Failed uploading object /addudkin2/.datasets/test-addudkin/test-addudkin.19ab55776fed408cab214814543699de/artifacts/data/dataset.19ab55776fed408cab214814543699de.mr7nkbq8.zip (413): <html>
<head><title>413 Request Entity Too Large</title></head>
<body>
<center><h1>413 Request Entity Too Large</h1></center>
<hr><center>nginx</center>
</body>
</html>
WARNING:root:Failed uploading artifact 'data'. Retrying... (2/3)
2023-05-12 08:46:03,920 - clearml.storage - ERROR - Exception encountered while uploading Failed uploading object /addudkin2/.datasets/test-addudkin/test-addudkin.19ab55776fed408cab214814543699de/artifacts/data/dataset.19ab55776fed408cab214814543699de.mr7nkbq8.zip (413): <html>
<head><title>413 Request Entity Too Large</title></head>
<body>
<center><h1>413 Request Entity Too Large</h1></center>
<hr><center>nginx</center>
</body>
</html>
WARNING:root:Failed uploading artifact 'data'. Retrying... (3/3)
2023-05-12 08:46:04,392 - clearml.storage - ERROR - Exception encountered while uploading Failed uploading object /addudkin2/.datasets/test-addudkin/test-addudkin.19ab55776fed408cab214814543699de/artifacts/data/dataset.19ab55776fed408cab214814543699de.mr7nkbq8.zip (413): <html>
<head><title>413 Request Entity Too Large</title></head>
<body>
<center><h1>413 Request Entity Too Large</h1></center>
<hr><center>nginx</center>
</body>
</html>
File compression and upload completed: total size 9.07 MiB, 1 chunk(s) stored (average size 9.07 MiB)
@<1578193574506270720:profile|DashingAlligator28> Removed nginx limits
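For context on the 413: nginx returns it when a request body is larger than its `client_max_body_size` setting, which defaults to 1 MiB in stock nginx, so the 9.07 MiB archive in the log above would be rejected under that default. The sketch below is only a pre-flight check, not part of ClearML: `exceeds_body_limit` is a hypothetical helper, and the 1 MiB default is an assumption about the proxy in front of your server.

```python
import os

MIB = 1024 * 1024


def exceeds_body_limit(path: str, limit_mib: float = 1.0) -> bool:
    """Return True if the file at `path` is larger than `limit_mib` MiB.

    `limit_mib` is an assumed proxy limit (stock nginx defaults to 1 MiB);
    check the actual value in your own nginx configuration.
    """
    return os.path.getsize(path) > limit_mib * MIB
```

If the compressed archive is over the limit, raising or removing the limit on the proxy, as was done here, is what lets the upload through.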
Hi @<1524560082761682944:profile|MammothParrot39> ! A few thoughts:
You likely know this, but the files may be downloaded to something like /home/user/.clearml/cache/storage_manager/datasets/ds_e0833955ded140a69b4c9c9d8e84986c. The .clearml directory may be hidden, so if you are using a file explorer you may not be able to see it.
If that is not the issue: are you able to download other datasets, such as our UrbanSounds example? I'm wondering if the problem only happens for your specific dataset.
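One way to look inside the hidden cache directory without a file explorer is to list it from Python. This is a minimal sketch: `list_cached_datasets` is a hypothetical helper, and the default location is assumed from the path quoted above (it may differ if the cache is configured elsewhere).

```python
from pathlib import Path
from typing import List, Optional


def list_cached_datasets(cache_root: Optional[Path] = None) -> List[Path]:
    """Return the dataset folders (named ds_<id>) found under the cache root."""
    if cache_root is None:
        # Assumed default location, taken from the path mentioned in this thread.
        cache_root = Path.home() / ".clearml" / "cache" / "storage_manager" / "datasets"
    if not cache_root.is_dir():
        return []
    return sorted(
        p for p in cache_root.iterdir() if p.is_dir() and p.name.startswith("ds_")
    )


if __name__ == "__main__":
    for folder in list_cached_datasets():
        n_files = sum(1 for f in folder.rglob("*") if f.is_file())
        print(f"{folder}  ({n_files} files)")
```

An empty result means either nothing was downloaded or the cache lives somewhere other than the assumed path.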
Hi @<1524560082761682944:profile|MammothParrot39> , did you make sure to finalize the dataset you're trying to access?
How did you solve this problem? I'm running into it too.
@<1523701070390366208:profile|CostlyOstrich36> Yes, sure
import os

import pandas as pd
import yaml
from clearml import Dataset
from omegaconf import OmegaConf

# Load the YAML config and wrap it so fields can be read as attributes
config_path = 'configs/structured_docs.yml'
with open(config_path) as f:
    config = yaml.full_load(f)
config = OmegaConf.create(config)

path2images = config.data.images_folder


def get_data(config, split):
    """Read the annotation CSV for the given split ('train' or 'val')."""
    path2annotation = os.path.join(config.data.annotation_folder, f"sample_{split}.csv")
    data = pd.read_csv(path2annotation)
    return data


data_train = get_data(config, 'train')
data_val = get_data(config, 'val')
data = pd.concat([data_val, data_train])

# Full paths of every image listed in the annotations
files = [os.path.join(path2images, file) for file in data['filename'].values]

dataset = Dataset.create(
    dataset_name="test OCR dataset",
    dataset_project="Text Recognition"
)

# Add the files one by one, then upload and finalize the dataset
for file in files:
    dataset.add_files(path=file)

dataset.upload()
dataset.finalize()
With this script, I uploaded data to the server. You can also see the final status of the dataset in the screenshot.
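To check whether the uploaded files can be fetched back, one option is to download a local copy and compare it with the file list used for the upload. This is a minimal sketch, assuming the project and dataset names from the script above. `missing_from_copy` and `fetch_local_copy` are hypothetical helpers, and the comparison is by base name only, which assumes the file names are unique.

```python
import os
from typing import Iterable, List


def missing_from_copy(expected_files: Iterable[str], local_root: str) -> List[str]:
    """Return base names from expected_files not found anywhere under local_root."""
    found = set()
    for _dirpath, _dirnames, filenames in os.walk(local_root):
        found.update(filenames)
    return sorted({os.path.basename(f) for f in expected_files} - found)


def fetch_local_copy(project: str, name: str) -> str:
    """Download (or reuse the cached copy of) a dataset and return its folder."""
    # Imported here so missing_from_copy can be used without clearml installed.
    from clearml import Dataset

    dataset = Dataset.get(dataset_project=project, dataset_name=name)
    return dataset.get_local_copy()


if __name__ == "__main__":
    local_root = fetch_local_copy("Text Recognition", "test OCR dataset")
    print("Local copy:", local_root)
    # `files` is the list built in the upload script above:
    # print(missing_from_copy(files, local_root))
```

An empty list from `missing_from_copy` would suggest the download itself is fine and the files are simply in the cache folder.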