I'm currently working around it by prefixing the artifact name with "_" ("data_something" -> "_data_something").
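The workaround above can be wrapped in a small helper so that every artifact name goes through the same check. This is only a sketch: `safe_artifact_name` and `RESERVED_PREFIX` are hypothetical names, not part of clearml, and treating "data_" as a reserved prefix is an assumption based on the traceback in this thread.

```python
# Hypothetical helper, not part of clearml.
# Assumption: names starting with "data_" collide with the names the
# Dataset class uses internally for its own data artifacts.
RESERVED_PREFIX = "data_"


def safe_artifact_name(name: str, reserved_prefix: str = RESERVED_PREFIX) -> str:
    """Prepend "_" to names that start with the reserved prefix."""
    if name.startswith(reserved_prefix):
        return "_" + name
    return name


print(safe_artifact_name("data_something"))
print(safe_artifact_name("train_metrics"))
```

It would then be used as `task.upload_artifact(safe_artifact_name('data_train'), df)`. Note that the stored artifact name changes, so any code that reads the artifact back has to use the prefixed name.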
Hi @<1594863230964994048:profile|DangerousBee35> , I can reproduce the issue and will keep you posted.
I ran your suggestion and it worked, so let me be more precise: the issue is with datasets, not tasks. Try running this:
from clearml import Task, Dataset
import pandas as pd
task = Task.init(project_name='examples', task_name='Artifacts with data_')
ds = Dataset.create(dataset_project="examples", dataset_name="test", use_current_task=True)
df = pd.DataFrame(
    {
        'num_legs': [2, 4, 8, 0],
        'num_wings': [2, 0, 0, 0],
        'num_specimen_seen': [10, 2, 1, 8]
    },
    index=['falcon', 'dog', 'spider', 'fish']
)
# Upload the pandas object as an artifact whose name starts with "data_"
# (upload_artifact stores a one-time snapshot; it does not monitor the object for changes)
task.upload_artifact('data_train', df, metadata={'counting': 'legs', 'max legs': 69})
ds = Dataset.get(ds.id)
I'm getting this error:
Traceback (most recent call last):
  File "/home/ubuntu/.virtualenvs/ca/lib/python3.10/site-packages/IPython/core/interactiveshell.py", line 3577, in run_code
    exec(code_obj, self.user_global_ns, self.user_ns)
  File "<ipython-input-2-c73f0477dd66>", line 19, in <module>
    ds = Dataset.get(ds.id)
  File "/home/ubuntu/.virtualenvs/ca/lib/python3.10/site-packages/clearml/datasets/dataset.py", line 1779, in get
    instance = get_instance(dataset_id)
  File "/home/ubuntu/.virtualenvs/ca/lib/python3.10/site-packages/clearml/datasets/dataset.py", line 1695, in get_instance
    instance_ = cls._deserialize(local_state_file, task)
  File "/home/ubuntu/.virtualenvs/ca/lib/python3.10/site-packages/clearml/datasets/dataset.py", line 2640, in _deserialize
    instance = cls(_private=cls.__private_magic, task=task)
  File "/home/ubuntu/.virtualenvs/ca/lib/python3.10/site-packages/clearml/datasets/dataset.py", line 288, in __init__
    self._data_artifact_name = self._get_next_data_artifact_name()
  File "/home/ubuntu/.virtualenvs/ca/lib/python3.10/site-packages/clearml/datasets/dataset.py", line 2477, in _get_next_data_artifact_name
    numbers = sorted([int(a[prefix_len:]) for a in data_artifact_entries if a.startswith(prefix)])
  File "/home/ubuntu/.virtualenvs/ca/lib/python3.10/site-packages/clearml/datasets/dataset.py", line 2477, in <listcomp>
    numbers = sorted([int(a[prefix_len:]) for a in data_artifact_entries if a.startswith(prefix)])
ValueError: invalid literal for int() with base 10: 'train'
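This is caused by the artifact name starting with "data_": the dataset code treats every artifact with that prefix as one of its own numbered data artifacts and tries to parse the rest of the name ("train") as an integer.
I'm using clearml version 1.14.4.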
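To make the failure mode concrete, here is a standalone sketch of the parsing step shown in the last traceback frame, plus a tolerant variant. It is not clearml's code: the function names, the "data_" prefix value, and the "largest number plus one" return value are assumptions for illustration. Only the list-comprehension pattern is taken from the traceback.

```python
# Standalone illustration; none of these names come from clearml.
PREFIX = "data_"  # assumed prefix, inferred from 'data_train' -> 'train' in the error


def next_index_strict(artifact_names, prefix=PREFIX):
    """Same pattern as the traceback: int() on whatever follows the prefix."""
    numbers = sorted(
        int(a[len(prefix):]) for a in artifact_names if a.startswith(prefix)
    )
    # Assumed follow-up step: pick the next free index.
    return numbers[-1] + 1 if numbers else 0


def next_index_tolerant(artifact_names, prefix=PREFIX):
    """Skip names whose suffix is not a plain decimal number."""
    numbers = sorted(
        int(a[len(prefix):])
        for a in artifact_names
        if a.startswith(prefix) and a[len(prefix):].isdecimal()
    )
    return numbers[-1] + 1 if numbers else 0


names = ["state", "data", "data_001", "data_train"]

try:
    next_index_strict(names)
except ValueError as err:
    print("strict version failed:", err)

print("tolerant version:", next_index_tolerant(names))
```

The tolerant variant simply ignores user artifacts such as "data_train" when it looks for numbered entries. Whether the library should do that, or reject such names at upload time, is a design decision for the maintainers.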
Hi @<1594863230964994048:profile|DangerousBee35> ,
I ran:
from clearml import Task
import pandas as pd
task = Task.init(project_name='examples', task_name='Artifacts with data_')
df = pd.DataFrame(
    {
        'num_legs': [2, 4, 8, 0],
        'num_wings': [2, 0, 0, 0],
        'num_specimen_seen': [10, 2, 1, 8]
    },
    index=['falcon', 'dog', 'spider', 'fish']
)
# Upload the pandas object as an artifact whose name starts with "data_"
# (upload_artifact stores a one-time snapshot; it does not monitor the object for changes)
task.upload_artifact('data_train', df, metadata={'counting': 'legs', 'max legs': 69})
and it all works. Can you try running it? Do you have another script I can try? Which clearml version are you using?
Hi @<1594863230964994048:profile|DangerousBee35> , I've never heard of that - does it happen to you? If so, in which agent version, and can you share a script to reproduce?