Hi @<1594863230964994048:profile|DangerousBee35> , I've never heard of that - does it happen to you? If so, in which agent version, and can you share a script to reproduce?
Hi @<1594863230964994048:profile|DangerousBee35> ,
I ran the following:
from clearml import Task
import pandas as pd
task = Task.init(project_name='examples', task_name='Artifacts with data_')
df = pd.DataFrame(
{
'num_legs': [2, 4, 8, 0],
'num_wings': [2, 0, 0, 0],
'num_specimen_seen': [10, 2, 1, 8]
},
index=['falcon', 'dog', 'spider', 'fish']
)
# Upload the pandas DataFrame as a static artifact snapshot
# (the metadata dict is stored alongside the artifact)
task.upload_artifact('data_train', df, metadata={'counting': 'legs', 'max legs': 69})
and everything works. Can you try running it? Do you have another script I can try? Which clearml version are you using?
Your suggestion worked when I ran it, so let me be more precise: the issue is with datasets, not tasks. Try running this:
from clearml import Task, Dataset
import pandas as pd
task = Task.init(project_name='examples', task_name='Artifacts with data_')
ds = Dataset.create(dataset_project="examples", dataset_name="test", use_current_task=True)
df = pd.DataFrame(
{
'num_legs': [2, 4, 8, 0],
'num_wings': [2, 0, 0, 0],
'num_specimen_seen': [10, 2, 1, 8]
},
index=['falcon', 'dog', 'spider', 'fish']
)
# Upload the pandas DataFrame as a static artifact snapshot on the task the dataset shares
# (the artifact name starts with "data_", which triggers the error below)
task.upload_artifact('data_train', df, metadata={'counting': 'legs', 'max legs': 69})
ds = Dataset.get(ds.id)
I'm getting this error:
Traceback (most recent call last):
File "/home/ubuntu/.virtualenvs/ca/lib/python3.10/site-packages/IPython/core/interactiveshell.py", line 3577, in run_code
exec(code_obj, self.user_global_ns, self.user_ns)
File "<ipython-input-2-c73f0477dd66>", line 19, in <module>
ds = Dataset.get(ds.id)
File "/home/ubuntu/.virtualenvs/ca/lib/python3.10/site-packages/clearml/datasets/dataset.py", line 1779, in get
instance = get_instance(dataset_id)
File "/home/ubuntu/.virtualenvs/ca/lib/python3.10/site-packages/clearml/datasets/dataset.py", line 1695, in get_instance
instance_ = cls._deserialize(local_state_file, task)
File "/home/ubuntu/.virtualenvs/ca/lib/python3.10/site-packages/clearml/datasets/dataset.py", line 2640, in _deserialize
instance = cls(_private=cls.__private_magic, task=task)
File "/home/ubuntu/.virtualenvs/ca/lib/python3.10/site-packages/clearml/datasets/dataset.py", line 288, in __init__
self._data_artifact_name = self._get_next_data_artifact_name()
File "/home/ubuntu/.virtualenvs/ca/lib/python3.10/site-packages/clearml/datasets/dataset.py", line 2477, in _get_next_data_artifact_name
numbers = sorted([int(a[prefix_len:]) for a in data_artifact_entries if a.startswith(prefix)])
File "/home/ubuntu/.virtualenvs/ca/lib/python3.10/site-packages/clearml/datasets/dataset.py", line 2477, in <listcomp>
numbers = sorted([int(a[prefix_len:]) for a in data_artifact_entries if a.startswith(prefix)])
ValueError: invalid literal for int() with base 10: 'train'
which is caused by the artifact name starting with "data_".
I'm using clearml version 1.14.4.
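To make the failure easier to see without a ClearML server, here is a minimal sketch of the parsing step from the last traceback frame. The function name `parse_data_artifact_numbers` is hypothetical, and the prefix value "data_" is an assumption inferred from the error message (the leftover suffix was 'train'); it is not copied from the clearml source.

```python
# Standalone sketch of the list comprehension shown in the traceback.
# ASSUMPTION: the prefix is "data_" (inferred from the error, not from clearml itself).
PREFIX = "data_"


def parse_data_artifact_numbers(artifact_names):
    """Hypothetical stand-in: parse the numeric suffix of every name starting with PREFIX."""
    prefix_len = len(PREFIX)
    return sorted(int(name[prefix_len:]) for name in artifact_names if name.startswith(PREFIX))


# Names with numeric suffixes parse without a problem.
print(parse_data_artifact_numbers(["data_001", "data_002", "state"]))

# A user artifact that happens to start with the same prefix breaks the int() call.
try:
    parse_data_artifact_numbers(["data_001", "data_train"])
except ValueError as exc:
    print("ValueError:", exc)
```

Any artifact on the same task whose name starts with the prefix but does not end in a number would hit the same `int()` call.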
Hi @<1594863230964994048:profile|DangerousBee35> , I can reproduce the issue and will keep you posted.
I'm currently working around it by prepending a "_" to the artifact name ("data_something" -> "_data_something").
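That renaming can be wrapped in a small helper so it is applied consistently. This is only a sketch: `safe_artifact_name` is a hypothetical name, not part of clearml, and the reserved prefix "data_" is assumed from the traceback above.

```python
# ASSUMPTION: "data_" is the prefix the dataset code reserves for its own artifacts.
RESERVED_PREFIX = "data_"


def safe_artifact_name(name):
    """Hypothetical helper: prepend "_" when the name would collide with the reserved prefix."""
    if name.startswith(RESERVED_PREFIX):
        return "_" + name
    return name


print(safe_artifact_name("data_train"))  # renamed to avoid the prefix
print(safe_artifact_name("labels"))      # left unchanged
```

It would then be used as `task.upload_artifact(safe_artifact_name('data_train'), df, ...)` in the script above.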