Answered
I'm New To Using Datasets, If My Git Project Root Is

I'm new to using datasets. If my git project root is myProject and I expect file.json to be at the root level, how do I accomplish this?

  
  
Posted one year ago

Answers 18


Sure. My git repo myProject.git does not have file.json checked into VCS. I'd like to add this file at experiment runtime or equivalent.

  
  
Posted one year ago

Can you please elaborate on what you mean?

  
  
Posted one year ago

You can save it as a dataset and then fetch it at run time, or am I missing something?
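As a rough illustration of the "save it as a dataset" step, here is a minimal sketch using the ClearML `Dataset` SDK. The function name `upload_file_as_dataset` and its arguments are hypothetical, and the sketch assumes a ClearML server and credentials are already configured on the machine doing the upload.

```python
from pathlib import Path


def upload_file_as_dataset(file_path, dataset_project, dataset_name):
    """Upload a single local file as a new ClearML dataset and return its ID."""
    file_path = Path(file_path)
    # Fail early, before contacting the server, if the file is missing.
    if not file_path.is_file():
        raise FileNotFoundError(f"No such file: {file_path}")

    # Imported lazily so the check above works even without clearml installed.
    from clearml import Dataset

    dataset = Dataset.create(dataset_project=dataset_project, dataset_name=dataset_name)
    dataset.add_files(path=str(file_path))
    dataset.upload()    # send the file to the configured storage
    dataset.finalize()  # close this dataset version so it can be consumed
    return dataset.id
```

The returned ID (or the project and dataset names) is what the training code would later use to fetch the file.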

  
  
Posted one year ago

Or is there an easier way?

  
  
Posted one year ago

I assumed I would need to upload it and then reference it somehow?

  
  
Posted one year ago

I’m afraid I don’t think there is a way around this without modifying your code.

  
  
Posted one year ago

ok good to know

  
  
Posted one year ago

Do I have to fetch it via code? I was hoping not to modify my scripts.

  
  
Posted one year ago

It is not cached directly in the ~/.clearml folder. There are several directories inside it (one for storage, one for pip, another for venvs, etc.).

So in your case it would be stored in ~/.clearml/cache/storage_manager/datasets/ds_{ds_id}/my_file.json

  
  
Posted one year ago

I wouldn't be able to pass in ~/.clearml/cache/storage_manager/datasets/ds_{ds_id}/my_file.json as an argument?

  
  
Posted one year ago

you would, but I’d advise against it, since that is not the intended way

  
  
Posted one year ago

This would be a short term solution as we build a proof of concept

  
  
Posted one year ago

ok, but if you were to run it from a different machine (or a different user!) it wouldn’t work
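One way to keep the "pass it in as an argument" convenience without hardcoding a cache path is to make the argument optional and fall back to fetching the dataset. The sketch below is only an assumption about how a script might be structured: the `--file-json` flag, the helper names, and the `<project>` / `<dataset>` placeholders are all hypothetical.

```python
import argparse
import os


def resolve_file_path(cli_value, fetch_dataset_folder, filename="file.json"):
    """Return the explicit path if given, otherwise look inside the dataset copy."""
    if cli_value:
        return os.path.expanduser(cli_value)
    # No explicit path: ask for a local copy of the dataset (works on any machine).
    return os.path.join(fetch_dataset_folder(), filename)


def fetch_from_clearml():
    """Download (or reuse the cached copy of) the dataset and return its folder."""
    from clearml import Dataset  # lazy import; placeholders must be replaced
    return Dataset.get(dataset_project="<project>", dataset_name="<dataset>").get_local_copy()


def parse_args(argv=None):
    parser = argparse.ArgumentParser()
    parser.add_argument("--file-json", default=None,
                        help="optional explicit path to file.json")
    return parser.parse_args(argv)
```

A script would then call `resolve_file_path(parse_args().file_json, fetch_from_clearml)`, so a local path can still be passed during a proof of concept while other machines and users fall back to the dataset.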

  
  
Posted one year ago

So it caches any files that are under the same project name to ~/.clearml/?

  
  
Posted one year ago

Thanks!

  
  
Posted one year ago

ClearML downloads and caches datasets under the ~/.clearml/ folder, so yes, you need to modify your code:
import os
from clearml import Dataset
dataset_folder = Dataset.get(dataset_project="<project name>", dataset_name="<dataset name>", dataset_version="<version>").get_local_copy()
file_json_path = os.path.join(dataset_folder, 'file.json')
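If the rest of the code base expects file.json at the project root, one possible workaround is to copy it there once at startup, so only the entry point changes. This is a sketch under that assumption; `place_in_project_root` is a hypothetical helper, and `dataset_folder` is the path returned by `get_local_copy()`.

```python
import shutil
from pathlib import Path


def place_in_project_root(dataset_folder, filename="file.json", project_root="."):
    """Copy `filename` from a local dataset copy into the project root."""
    source = Path(dataset_folder) / filename
    target = Path(project_root) / filename
    # copy2 copies the file contents and preserves metadata; an existing
    # file at the target path is overwritten.
    shutil.copy2(source, target)
    return target
```

Calling `place_in_project_root(dataset_folder)` at the top of the training script leaves the cached dataset copy untouched and gives the remaining code the relative path it already expects.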

  
  
Posted one year ago

Could I simply reference the files by name and pass in a string such as ~/.clearml/my_file.json?

  
  
Posted one year ago

After proving we can run our training, I would then advise we update our code base

  
  
Posted one year ago