Hi UpsetCrow72,
Can you please explain which steps you took to make this happen? I'm not sure I understand what exactly happened.
How are you getting the data locally? Can you paste the error here?
Hi CostlyOstrich36, sorry, let me make it a bit clearer.
I simply upload a bunch of files as a new dataset using the Python API. Then, using the CLI, I get a local copy and remove a few of the files. At this step I tried both removing them from the file system and then running $ clearml-data sync, and also using $ clearml-data remove. I get an "invalid status" error, but the result is the same as I described above: the file count was updated, but when I get a new local copy of the dataset, I still get all of the files, so it seems the removal didn't happen.
Is there a step I'm missing? Is it even possible to remove files from a finalized dataset?
I'm getting the local copy with $ clearml-data get --id <id> --copy <local_path>
and this is the log output of $ clearml-data remove --id <id> --files <files>:
clearml-data - Dataset Management & Versioning CLI
Removing files/folder from dataset id 2a0eb9ab619c442abc204775f217d0b9
2022-09-19 13:44:12,964 - clearml.Task - ERROR - Action failed <400/110: tasks.add_or_update_artifacts/v2.10 (Invalid task status: expected=created, status=completed)> (task=2a0eb9ab619c442abc204775f217d0b9, artifacts=[{'key': 'state', 'type': 'dict', 'uri': '***', 'content_size': 30814, 'hash': 'beb96389b3d7a374115a1e340dff51fc6ce65c511a0a93411e0ea85fe8dbfc08', 'timestamp': 1663587852, 'type_data': {'preview': 'Dataset state\nFiles added/modified: 107 - total size 98.51 KB\nCurrent dependency graph: {\n "2a0eb9ab619c442abc204775f217d0b9": []\n}\n', 'content_type': 'application/json'}, 'display_data': [('files modified', '0'), ('files added', '108'), ('files removed', '0')]}], force=True)
2022-09-19 13:44:13,656 - clearml.Task - ERROR - Action failed <400/110: tasks.add_or_update_artifacts/v2.10 (Invalid task status: expected=created, status=completed)> (task=2a0eb9ab619c442abc204775f217d0b9, artifacts=[{'key': 'state', 'type': 'dict', 'uri': '***', 'content_size': 30529, 'hash': 'a848da39685d766ed1a5b650510ecc2b5fb1472acef13151572713668ba12161', 'timestamp': 1663587853, 'type_data': {'preview': 'Dataset state\nFiles added/modified: 106 - total size 98.33 KB\nCurrent dependency graph: {\n "2a0eb9ab619c442abc204775f217d0b9": []\n}\n', 'content_type': 'application/json'}, 'display_data': [('files modified', '0'), ('files added', '108'), ('files removed', '1')]}], force=True)
2022-09-19 13:44:14,333 - clearml.Task - ERROR - Action failed <400/110: tasks.add_or_update_artifacts/v2.10 (Invalid task status: expected=created, status=completed)> (task=2a0eb9ab619c442abc204775f217d0b9, artifacts=[{'key': 'state', 'type': 'dict', 'uri': '***', 'content_size': 30243, 'hash': 'aead122684666ae652e4bb6621e8bb9cb1a11230b21111cd0ca1e1d1851adc6a', 'timestamp': 1663587854, 'type_data': {'preview': 'Dataset state\nFiles added/modified: 105 - total size 97.94 KB\nCurrent dependency graph: {\n "2a0eb9ab619c442abc204775f217d0b9": []\n}\n', 'content_type': 'application/json'}, 'display_data': [('files modified', '0'), ('files added', '108'), ('files removed', '2')]}], force=True)
2022-09-19 13:44:14,935 - clearml.Task - ERROR - Action failed <400/110: tasks.add_or_update_artifacts/v2.10 (Invalid task status: expected=created, status=completed)> (task=2a0eb9ab619c442abc204775f217d0b9, artifacts=[{'key': 'state', 'type': 'dict', 'uri': '***', 'content_size': 29957, 'hash': '1842864e81ff882db7573e95d3c3712cdcad925433bfc30e49b5d3782cbc6d8f', 'timestamp': 1663587854, 'type_data': {'preview': 'Dataset state\nFiles added/modified: 104 - total size 97.43 KB\nCurrent dependency graph: {\n "2a0eb9ab619c442abc204775f217d0b9": []\n}\n', 'content_type': 'application/json'}, 'display_data': [('files modified', '0'), ('files added', '108'), ('files removed', '3')]}], force=True)
2022-09-19 13:44:15,556 - clearml.Task - ERROR - Action failed <400/110: tasks.add_or_update_artifacts/v2.10 (Invalid task status: expected=created, status=completed)> (task=2a0eb9ab619c442abc204775f217d0b9, artifacts=[{'key': 'state', 'type': 'dict', 'uri': '***', 'content_size': 29672, 'hash': 'df6fbf7bc6410930e80e1b8b9c4c9d9e3999b2b49fb040d1b0bf38d2c7d77d47', 'timestamp': 1663587855, 'type_data': {'preview': 'Dataset state\nFiles added/modified: 103 - total size 97.05 KB\nCurrent dependency graph: {\n "2a0eb9ab619c442abc204775f217d0b9": []\n}\n', 'content_type': 'application/json'}, 'display_data': [('files modified', '0'), ('files added', '108'), ('files removed', '4')]}], force=True)
5 files removed
tasks.add_or_update_artifacts/v2.10 (Invalid task status: expected=created, status=completed)>
Hi UpsetCrow72
How come you are trying to sync a "completed" (finalized) dataset?
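As far as I know, a finalized dataset can't be modified in place, which is what the `expected=created, status=completed` error seems to be pointing at. The usual route is to create a new dataset version that inherits from the finalized one and remove the files there. A minimal sketch, assuming the `clearml` Python package is installed and configured; the function name `remove_files_in_new_version` and its arguments are hypothetical, made up for illustration:

```python
from typing import Iterable


def remove_files_in_new_version(parent_id: str, name: str, project: str,
                                paths: Iterable[str]) -> str:
    """Create a child version of a finalized dataset with some files removed.

    Hypothetical helper for illustration; not part of clearml itself.
    """
    # Imported here so this module can be loaded without clearml installed.
    from clearml import Dataset

    # Start a new, still-editable version that inherits the parent's files.
    child = Dataset.create(
        dataset_name=name,
        dataset_project=project,
        parent_datasets=[parent_id],
    )
    for path in paths:
        # dataset_path is relative to the dataset root, e.g. "folder/file.bin".
        child.remove_files(dataset_path=path)

    child.upload()    # push any pending changes for this version
    child.finalize()  # lock the new version so it can be consumed
    return child.id
```

Calling `remove_files_in_new_version("<id>", "<name>", "<project>", ["<file>"])` should return the ID of the new version, and `clearml-data get --id <new_id>` would then be expected to fetch the dataset without the removed files. The original finalized version stays untouched.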