Answered
I Have A Local Folder A, And A Dataset B. A:

I have a local folder a, and a dataset B.
a:
a
a/.DS_Store
a/1.txt
a/b
a/b/.DS_Store
a/b/1.txt
a/b/c
a/b/c/1.txt
Dataset B:
b
b/2.txt
b/c
b/c/2.txt
I want to “merge” dataset B into the local folder A:
a
a/.DS_Store
a/1.txt
a/b
a/b/.DS_Store
a/b/2.txt
a/b/1.txt
a/b/c
a/b/c/2.txt
a/b/c/1.txt
And I want to do this with the minimal amount of copying data around.
Ideally I would do B.get_local_copy('a') and ClearML would download only the missing files and place them at the right relative paths.
For now I am doing:
local_path = B.get_local_copy()
StorageManager.upload_folder(local_path, 'file:///path/to/folder/a/')
However, this involves:
Potentially downloading all of B locally (though some of it may already be in folder 'a')
Copying from the local cache to 'a'
Is there a more elegant way to achieve this?
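To make the "only the missing files" part concrete, here is a minimal sketch of the check I have in mind. It is plain Python, not a ClearML feature: missing_files is a hypothetical helper, and the list of relative paths is assumed to come from the dataset (for example from something like B.list_files(), if your ClearML version exposes it). It compares by path only, not by content.

```python
from pathlib import Path


def missing_files(relative_paths, target_root):
    """Return the entries of relative_paths that are not files under target_root.

    relative_paths: dataset file paths relative to the merge root
                    (assumed input, e.g. ["b/2.txt", "b/c/2.txt"]).
    target_root:    the local folder to merge into (folder 'a' in the example).
    """
    target_root = Path(target_root)
    # A path counts as present only if a regular file already exists there.
    return [p for p in relative_paths if not (target_root / p).is_file()]
```

For the example above, missing_files(["b/2.txt", "b/c/2.txt"], "a") would list both files, since neither exists under a/ yet.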

  
  
Posted one year ago

Answers 11


AgitatedDove14 ideas?

  
  
Posted one year ago

As a result, I need to do something which copies the files (e.g. cp -r or StorageManager.upload_folder('b', 'a'))
but this is expensive

You are saying the copy is just wasteful (but you do have the files locally)?

  
  
Posted one year ago

What I’d like is to do Dataset.get("b", to='a') and have the download land the files directly there

  
  
Posted one year ago

Oh maybe:
cp -R b a/
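If you would rather do the same merge from Python, a rough equivalent is sketched below. It assumes Python 3.8 or newer (for dirs_exist_ok), and copy_merge is just an illustrative name. Note that it still copies the data, and files with the same name in the destination are overwritten.

```python
import shutil


def copy_merge(src_dir, dst_dir):
    """Recursively copy src_dir into dst_dir, merging with what is already there.

    Existing sub-folders are reused instead of raising an error;
    same-named files in dst_dir are overwritten by the copies.
    """
    return shutil.copytree(src_dir, dst_dir, dirs_exist_ok=True)
```

For the example in the question, copy_merge("b", "a/b") corresponds to cp -R b a/.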

  
  
Posted one year ago

so moving b in to a won’t work if some subfolders are already there

I thought that if they are already there you would merge / overwrite, isn't that what you need?
a/b/c/2.txt seems like the result of moving b from dataset B into folder b in Dataset A, what am I missing?
(My assumption is that you have both datasets locally on the same machine and that you can just copy the files from b of Dataset B into the b folder of Dataset A)

  
  
Posted one year ago

RoughTiger69
Move the files locally (i.e. based on the example, move folder b into folder a).
Create a new version with two parents ('a' and 'b').
Then sync the local root folder ('a' in your case).
Only the metadata should change (because the referenced files are already in one of the datasets).
wdyt?

  
  
Posted one year ago

AgitatedDove14 mv command requires empty folders… so moving b in to a won’t work if some subfolders are already there

  
  
Posted one year ago

Yes, but this is not the use-case.
The use-case is that I have a local folder and I want to merge a dataset into it without re-fetching the local folder…

  
  
Posted one year ago

Hi RoughTiger69 ,

If you create a child version and add the delta of the files to the child, fetching the child version will also fetch the parent's files.

  
  
Posted one year ago

If the state is:
a:
a
a/.DS_Store
a/1.txt
a/b
a/b/.DS_Store
a/b/1.txt
a/b/c
a/b/c/1.txt
Dataset B:
b
b/2.txt
b/c
b/c/2.txt
Then the command
mv b a/
returns error since a/ is not empty.
That’s exactly the issue…

As a result, I need to do something which copies the files (e.g. cp -r or StorageManager.upload_folder('b', 'a'))
but this is expensive

  
  
Posted one year ago

Oh then this should just work
cp -R --link b a/
You can create the same links from Python as well (note that --link in GNU cp creates hard links, not symbolic links).
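A minimal sketch of the Python side, under a few assumptions: link_merge is a hypothetical helper (not part of ClearML), both folders are ideally on the same filesystem, and files that already exist in the target are kept rather than overwritten. If a hard link cannot be created, it falls back to a regular copy.

```python
import os
import shutil
from pathlib import Path


def link_merge(src_root, dst_root):
    """Merge src_root into dst_root using hard links instead of copies.

    Files that already exist in dst_root are left untouched.
    Returns the relative paths (POSIX style) of the files that were added.
    """
    src_root, dst_root = Path(src_root), Path(dst_root)
    added = []
    for src in sorted(src_root.rglob("*")):
        rel = src.relative_to(src_root)
        dst = dst_root / rel
        if src.is_dir():
            # Reuse existing sub-folders, create the missing ones.
            dst.mkdir(parents=True, exist_ok=True)
            continue
        if dst.exists():
            continue  # keep what is already in the target folder
        dst.parent.mkdir(parents=True, exist_ok=True)
        try:
            os.link(src, dst)  # hard link: no file data is copied
        except OSError:
            shutil.copy2(src, dst)  # e.g. different filesystem: plain copy
        added.append(rel.as_posix())
    return added
```

For the example, link_merge("b", "a/b") would add 2.txt and c/2.txt under a/b. One caveat: a hard-linked file shares its data with the source, so editing it in 'a' also changes the source copy (for instance a cached dataset copy).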

  
  
Posted one year ago