Answered
Is it a well-known issue that once you upload an artifact with the prefix "data_" to a task, you cannot fetch the task?

Hi, is it a well-known issue that once you upload an artifact with the prefix "data_" to a task, you can no longer fetch the task, because ClearML treats it as data logging?

  
  
Posted 7 days ago

Answers 5


Hi @DangerousBee35, I've never heard of that. Does it happen to you? If so, in which agent version, and can you share a script to reproduce it?

  
  
Posted 7 days ago

Hi @DangerousBee35,

I ran:


    from clearml import Task
    import pandas as pd
    task = Task.init(project_name='examples', task_name='Artifacts with data_')

    df = pd.DataFrame(
        {
            'num_legs': [2, 4, 8, 0],
            'num_wings': [2, 0, 0, 0],
            'num_specimen_seen': [10, 2, 1, 8]
        },
        index=['falcon', 'dog', 'spider', 'fish']
    )

    # Upload the DataFrame as a static artifact named 'data_train'
    # (upload_artifact stores a one-time snapshot; it is not monitored afterwards)
    task.upload_artifact('data_train', df, metadata={'counting': 'legs', 'max legs': 69})

and everything works. Can you try running it? Do you have another script I can try? Which clearml version are you using?

  
  
Posted 6 days ago

After running your suggestion (which worked), I need to be more precise: the issue is with datasets, not tasks. Try running this:

    from clearml import Task, Dataset
    import pandas as pd

    task = Task.init(project_name='examples', task_name='Artifacts with data_')
    # Create a dataset that reuses the current task
    ds = Dataset.create(dataset_project="examples", dataset_name="test", use_current_task=True)

    df = pd.DataFrame(
        {
            'num_legs': [2, 4, 8, 0],
            'num_wings': [2, 0, 0, 0],
            'num_specimen_seen': [10, 2, 1, 8]
        },
        index=['falcon', 'dog', 'spider', 'fish']
    )

    # Upload the DataFrame as a static artifact whose name starts with "data_"
    task.upload_artifact('data_train', df, metadata={'counting': 'legs', 'max legs': 69})
    # Fetching the dataset by ID is the call that fails
    ds = Dataset.get(ds.id)

I'm getting this error:

Traceback (most recent call last):
  File "/home/ubuntu/.virtualenvs/ca/lib/python3.10/site-packages/IPython/core/interactiveshell.py", line 3577, in run_code
    exec(code_obj, self.user_global_ns, self.user_ns)
  File "<ipython-input-2-c73f0477dd66>", line 19, in <module>
    ds = Dataset.get(ds.id)
  File "/home/ubuntu/.virtualenvs/ca/lib/python3.10/site-packages/clearml/datasets/dataset.py", line 1779, in get
    instance = get_instance(dataset_id)
  File "/home/ubuntu/.virtualenvs/ca/lib/python3.10/site-packages/clearml/datasets/dataset.py", line 1695, in get_instance
    instance_ = cls._deserialize(local_state_file, task)
  File "/home/ubuntu/.virtualenvs/ca/lib/python3.10/site-packages/clearml/datasets/dataset.py", line 2640, in _deserialize
    instance = cls(_private=cls.__private_magic, task=task)
  File "/home/ubuntu/.virtualenvs/ca/lib/python3.10/site-packages/clearml/datasets/dataset.py", line 288, in __init__
    self._data_artifact_name = self._get_next_data_artifact_name()
  File "/home/ubuntu/.virtualenvs/ca/lib/python3.10/site-packages/clearml/datasets/dataset.py", line 2477, in _get_next_data_artifact_name
    numbers = sorted([int(a[prefix_len:]) for a in data_artifact_entries if a.startswith(prefix)])
  File "/home/ubuntu/.virtualenvs/ca/lib/python3.10/site-packages/clearml/datasets/dataset.py", line 2477, in <listcomp>
    numbers = sorted([int(a[prefix_len:]) for a in data_artifact_entries if a.startswith(prefix)])
ValueError: invalid literal for int() with base 10: 'train'

This is caused by the artifact name starting with "data_".
I'm using clearml version 1.14.4.
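
To illustrate why the last frame of the traceback fails, here is a minimal sketch that runs without ClearML. It is modeled on the list comprehension shown in the traceback; the prefix value "data_" and the sample artifact names are assumptions inferred from the error message, and both function names are hypothetical, not part of the ClearML API.

```python
PREFIX = "data_"  # assumed value of the prefix, inferred from the traceback


def suffix_numbers_strict(names, prefix=PREFIX):
    """Same shape as the expression in the traceback: every name that
    starts with the prefix must end in an integer, otherwise int() raises."""
    prefix_len = len(prefix)
    return sorted([int(a[prefix_len:]) for a in names if a.startswith(prefix)])


def suffix_numbers_tolerant(names, prefix=PREFIX):
    """Hypothetical tolerant variant: names whose suffix is not purely
    numeric (e.g. a user artifact called 'data_train') are skipped."""
    prefix_len = len(prefix)
    return sorted(
        int(a[prefix_len:])
        for a in names
        if a.startswith(prefix) and a[prefix_len:].isdecimal()
    )


if __name__ == "__main__":
    artifact_names = ["data_001", "data_002", "data_train"]  # illustrative names only
    print(suffix_numbers_tolerant(artifact_names))
    try:
        suffix_numbers_strict(artifact_names)
    except ValueError as err:
        print("strict parsing failed:", err)
```

The strict version cannot tell a user artifact such as "data_train" apart from a numbered entry, which matches the ValueError above. Whether skipping non-numeric suffixes is the right fix inside the library is a separate question.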

  
  
Posted 2 days ago

Hi @DangerousBee35, I can reproduce the issue. I'll keep you posted about it.

  
  
Posted 2 days ago

I'm currently working around it by prefixing the artifact name with "_" ("data_something" -> "_data_something").
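
A small helper can apply this renaming consistently. This is only a sketch of the workaround described in this post: `safe_artifact_name` is a hypothetical function, not part of ClearML, and the reserved prefix "data_" is taken from the error discussed in this thread.

```python
RESERVED_PREFIX = "data_"  # prefix that triggered the error in this thread


def safe_artifact_name(name: str, reserved_prefix: str = RESERVED_PREFIX) -> str:
    """Return the name with a leading underscore if it starts with the
    reserved prefix; otherwise return it unchanged."""
    if name.startswith(reserved_prefix):
        return "_" + name
    return name


if __name__ == "__main__":
    for original in ("data_train", "model_weights"):
        print(original, "->", safe_artifact_name(original))
```

The result would then be passed as the artifact name, for example `task.upload_artifact(safe_artifact_name('data_train'), df)`.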

  
  
Posted 2 days ago