Answered
Hi, I'm Having A Hard Time Uploading Files As Metadata To Datasets

Hi, I'm having a hard time uploading files as metadata to datasets.
I need to log a dictionary with preserved order, but ClearML sorts the saved dictionary and the user has no control over this behavior. Hence, I'm creating a JSON file and logging it to my dataset as metadata, and that is where I'm having trouble.
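For reference, json.dumps serializes a dict in insertion order (Python 3.7+ guarantees dicts keep it), which is why writing the file myself sidesteps the sorting, e.g.:

import json

d = {'step_1': 'second', 'step_0': 'first'}
print(json.dumps(d))  # -> {"step_1": "second", "step_0": "first"} (insertion order kept, not alphabetical)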

my code:

import json

from clearml import Task
from clearml import Dataset

task = Task.current_task()

pipe_dict = {'step_0': 'this is a test example', 'step_1': 'this is a test example', 'step_2': 'this is a test example'}
dataset = Dataset.create(dataset_name='test', dataset_project='test')

# write my own JSON file (insertion order preserved, not sorted)
with open('pipeline.json', 'w') as f:
    json.dump(pipe_dict, f)

# let's see the content of the JSON file
with open('pipeline.json', 'r') as f:
    print(f.read())

# attach the JSON file to the dataset as metadata
dataset.set_metadata('pipeline.json', metadata_name='pipeline')
dataset.upload()  # is this a must?

# fails here when trying to get the metadata back
my_pipe = dataset.get_metadata('pipeline')
print('success')

CLI command:

clearml-task --queue k8s_scheduler --project test --name test --script Scripts/clearml_tests/json_file.py --requirements Scripts/clearml_tests/requirements.txt

Requirements file:

clearml==1.12.2
boto3

Error (full log is attached):

Traceback (most recent call last):
  File "/root/.clearml/venvs-builds/3.10/task_repository/CorrAlgo.git/Scripts/clearml_tests/json_file.py", line 32, in <module>
    my_pipe = dataset.get_metadata('pipeline')
  File "/root/.clearml/venvs-builds/3.10/lib/python3.10/site-packages/clearml/datasets/dataset.py", line 876, in get_metadata
    return metadata.get()
  File "/root/.clearml/venvs-builds/3.10/lib/python3.10/site-packages/clearml/binding/artifacts.py", line 171, in get
    local_file = self.get_local_copy(raise_on_error=True, force_download=force_download)
  File "/root/.clearml/venvs-builds/3.10/lib/python3.10/site-packages/clearml/binding/artifacts.py", line 240, in get_local_copy
    raise ValueError(
ValueError: Could not retrieve a local copy of artifact pipeline, failed downloading 

What I have tried so far:

  • Running the script locally works as expected.
  • Tried logging f"{task.cache_dir}/pipeline.json" instead of only "pipeline.json".
  • Finalizing the dataset before getting the metadata solves the issue, but I would like to keep the dataset in uploading mode (see the sketch after this list).

Would love to get some help, I'm pretty stuck here 😞
Thanks!
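For completeness, the finalize workaround from the last bullet looks roughly like this (a sketch only; finalize closes the dataset for further changes, which is exactly what I want to avoid):

dataset.set_metadata('pipeline.json', metadata_name='pipeline')
dataset.upload()
dataset.finalize()  # closes the dataset; get_metadata('pipeline') works after this
my_pipe = dataset.get_metadata('pipeline')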
  
  
Posted one year ago

3 Answers


It worked, thanks! I spent a few hours trying to figure it out 😅

  
  
Posted one year ago

Censored AWS credentials

  
  
Posted one year ago

@DangerousBee35 this might simply be a sync issue - the upload is done in the background, so it's possible you simply read the metadata back too quickly. Try adding some sleep after setting the metadata and check again.
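Something like this (a minimal sketch of the suggested workaround; the 10-second delay is an arbitrary guess, tune it to your storage/backend):

import time

dataset.set_metadata('pipeline.json', metadata_name='pipeline')
dataset.upload()
time.sleep(10)  # assumption: enough time for the background metadata upload to finish
my_pipe = dataset.get_metadata('pipeline')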

  
  
Posted one year ago