Answered
Is there any way to add a git-like ignore file for versioning ClearML data?

Hi!
Is there any way to add a git-like ignore file for versioning ClearML data? I saw in the docs a wildcard argument that applies when files are added to a dataset. How can I specify that some file types should be ignored? For example, I want to ignore ipynb checkpoints. How can I do this?

  
  
Posted one year ago

Answers 5


@EnthusiasticShrimp49, @SmugDolphin23, thank you for the answer!

  
  
Posted one year ago

Hi @BlushingCrocodile88! We will soon try to merge a PR submitted via GitHub that will allow you to specify a list of files to be added to the dataset. You will then be able to do something like add_files(list(set(glob.glob("*")) - set(glob.glob("*.ipynb"))))
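To make the idea concrete, here is a minimal sketch of building such a file list with only the standard library. The helper name select_files and the exclude patterns are hypothetical, chosen for illustration; whether add_files accepts a list directly depends on the PR mentioned above, so the ClearML call itself is left out.

```python
import fnmatch
from pathlib import Path


def select_files(root, exclude_patterns):
    """Return sorted relative paths of files under root that match no exclude pattern.

    Hypothetical helper for illustration; it is not part of ClearML.
    """
    root = Path(root)
    selected = []
    for path in root.rglob("*"):
        if not path.is_file():
            continue
        rel = path.relative_to(root).as_posix()
        parts = rel.split("/")
        # A file is excluded if its whole relative path, or any single
        # path component (file or folder name), matches a pattern.
        excluded = any(
            fnmatch.fnmatch(rel, pattern)
            or any(fnmatch.fnmatch(part, pattern) for part in parts)
            for pattern in exclude_patterns
        )
        if not excluded:
            selected.append(rel)
    return sorted(selected)
```

Calling select_files(".", [".ipynb_checkpoints", "*.ipynb"]) would return every file except notebooks and anything inside a checkpoint folder. The resulting paths could then be handed to add_files, either as a list once that is supported or one file at a time.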

  
  
Posted one year ago

clearml-data also supports glob patterns, so if you have your dataset files in the same directory as the experiment code, you can do something like clearml-data add --files *.csv and only add the CSV files.

There's no .gitignore-like functionality because clearml-data is not meant to track everything, and you need to be deliberate in what exactly you're adding. Hope this clarifies things.

  
  
Posted one year ago

That makes sense. Yes, it would be nice to have a way to exclude some files when calling sync_folder.

  
  
Posted one year ago

One more question has come up. Here is my situation: I make a mutable copy using the .get_mutable_local_copy() method and edit or add some files in the local folder. Ipynb checkpoints are created after this.
Then I want to synchronise the dataset in my storage, so I call .sync_folder(). The ipynb checkpoints will also be uploaded because this method has no wildcard argument. Could you look into this? :) I know I can use the add_files() method, but it seems to me that sync_folder is more convenient in this scenario. It would be nice if you added an option for excluding some files to the sync_folder method.
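Until such an option exists, one possible workaround is to delete the checkpoint folders from the local copy just before syncing. The sketch below uses only the standard library; the helper name remove_checkpoint_dirs is hypothetical, and it assumes the checkpoints live in folders named .ipynb_checkpoints (the Jupyter default) and that losing them is acceptable.

```python
import shutil
from pathlib import Path


def remove_checkpoint_dirs(root, dirname=".ipynb_checkpoints"):
    """Delete every folder called dirname under root; return their relative paths.

    Hypothetical helper for illustration; it is not part of ClearML.
    """
    root = Path(root)
    # Collect the targets first so deleting does not disturb the traversal.
    targets = [p for p in root.rglob(dirname) if p.is_dir()]
    removed = []
    for path in targets:
        if path.exists():  # may already be gone if a parent target was removed
            shutil.rmtree(path)
            removed.append(path.relative_to(root).as_posix())
    return sorted(removed)


# Intended use, assuming `dataset` is a ClearML Dataset and `local_copy`
# is the folder returned by get_mutable_local_copy():
#   remove_checkpoint_dirs(local_copy)
#   dataset.sync_folder(local_path=local_copy)
```

Jupyter recreates the checkpoint folders the next time a notebook is saved, so the cleanup would need to run before every sync.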

  
  
Posted one year ago