Answered
I Have A Local Folder A, And A Dataset B. A:

I have a local folder a, and a dataset B.
a:
a
a/.DS_Store
a/1.txt
a/b
a/b/.DS_Store
a/b/1.txt
a/b/c
a/b/c/1.txt
Dataset B:
b
b/2.txt
b/c
b/c/2.txt
I want to “merge” dataset B into the local folder a:
a
a/.DS_Store
a/1.txt
a/b
a/b/.DS_Store
a/b/2.txt
a/b/1.txt
a/b/c
a/b/c/2.txt
a/b/c/1.txt
And I want to do this with the minimal amount of copying data around.
Ideally I would do B.get_local_copy('a') and clearml would only download the missing files and place them in the right relative place.
For now I am doing:
local_path = B.get_local_copy()
StorageManager.upload_folder(local_path, 'file:///path/to/folder/a/')
However, this involves:
Potentially downloading all of B locally (though some of it may already be in folder 'a')
Copying from the local cache to 'a'
Is there a more elegant way to achieve this?
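To make "only the missing files" concrete, here is a minimal sketch in plain Python that compares a dataset's file list against folder a before anything is downloaded. It makes no ClearML calls: it assumes you already have the dataset's relative file paths as strings (for example from Dataset.list_files()), and missing_files is a hypothetical helper name, not part of any library.

```python
from pathlib import Path
from typing import Iterable, List


def missing_files(relative_paths: Iterable[str], local_root: str) -> List[str]:
    """Return the relative paths that do not yet exist as files under local_root."""
    root = Path(local_root)
    # A path counts as present only if a regular file already sits at that
    # relative location; file contents are not compared in this sketch.
    return sorted(p for p in relative_paths if not (root / p).is_file())
```

For the layout above, passing ["b/2.txt", "b/c/2.txt"] with a as the root would report both files as missing, since a only holds the 1.txt files at those locations.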

  
  
Posted 2 years ago

Answers 11


What I’d like is to do something like Dataset.get("b", to="a") and have the download land the files directly there.

  
  
Posted 2 years ago

AgitatedDove14 the mv command requires the target folders to be empty, so moving b into a won’t work if some subfolders are already there.

  
  
Posted 2 years ago

If the state is:
a:
a
a/.DS_Store
a/1.txt
a/b
a/b/.DS_Store
a/b/1.txt
a/b/c
a/b/c/1.txt
Dataset B:
b
b/2.txt
b/c
b/c/2.txt
Then the command
mv b a/
returns an error, since a/b already exists and is not empty.
That’s exactly the issue…

As a result, I need to do something which copies the files (e.g. cp -r or StorageManager.upload_folder('b', 'a')), but this is expensive.

  
  
Posted 2 years ago

RoughTiger69
Move the files locally (i.e., based on the example, move folder b into folder a).
Create a new version with two parents ('a' and 'b'), then sync the local root folder ('a' in your case).
Only the metadata should change (because the referenced files are already in one of the datasets).
wdyt?
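The suggestion above could look roughly like the following sketch. It assumes that both 'a' and 'b' already exist as dataset versions whose IDs you know, which is not the case for the local folder in the original question. The function name and its arguments are placeholders; the ClearML calls used are Dataset.create, sync_folder, upload, and finalize. Whether the upload step really transfers nothing depends on the files already being referenced by one of the parents, as the post suggests.

```python
from typing import Sequence


def create_merged_version(
    local_root: str,
    parent_ids: Sequence[str],
    dataset_name: str,
    dataset_project: str,
) -> str:
    """Create a dataset version with several parents and sync it to a local folder."""
    # Imported lazily so this sketch can be read and loaded without ClearML installed.
    from clearml import Dataset

    merged = Dataset.create(
        dataset_name=dataset_name,
        dataset_project=dataset_project,
        parent_datasets=list(parent_ids),  # e.g. the IDs of versions 'a' and 'b'
    )
    # Register added, modified, and removed files relative to the parents.
    merged.sync_folder(local_path=local_root)
    # Upload anything the parents do not already reference, then close the version.
    merged.upload()
    merged.finalize()
    return merged.id
```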

  
  
Posted 2 years ago

Oh, then this should just work:
cp -R --link b a/
You can create the same links (--link makes hard links) from Python as well.
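A minimal Python sketch of the same idea, using only the standard library. link_tree is a hypothetical helper name. Two choices here are assumptions rather than a copy of cp's behavior: files that already exist in the destination are left untouched, and a plain copy is used as a fallback when hard links are not possible (for example across filesystems).

```python
import os
import shutil


def link_tree(src: str, dst: str) -> int:
    """Merge src into dst by hard-linking files; return the number of files added."""
    added = 0
    for dirpath, _dirnames, filenames in os.walk(src):
        # Recreate the directory structure of src under dst.
        rel = os.path.relpath(dirpath, src)
        target_dir = os.path.normpath(os.path.join(dst, rel))
        os.makedirs(target_dir, exist_ok=True)
        for name in filenames:
            source = os.path.join(dirpath, name)
            target = os.path.join(target_dir, name)
            if os.path.exists(target):
                continue  # keep whatever is already in dst
            try:
                os.link(source, target)  # hard link: no file data is copied
            except OSError:
                shutil.copy2(source, target)  # fallback when linking is not possible
            added += 1
    return added
```

To mirror cp -R --link b a/, the call would be link_tree("b", "a/b"), since cp places b inside a. Note that hard-linked files share their content with the source, so editing one in place also changes the other.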

  
  
Posted 2 years ago

AgitatedDove14 ideas?

  
  
Posted 2 years ago

Yes, but this is not the use-case.
The use-case is that I have a local folder and I want to merge a dataset into it without re-fetching the local folder…

  
  
Posted 2 years ago

As a result, I need to do something which copies the files (e.g. cp -r or StorageManager.upload_folder('b', 'a')), but this is expensive.

You are saying the copy is just wasteful (but you do have the files locally)?

  
  
Posted 2 years ago

Oh maybe:
cp -R b a/

  
  
Posted 2 years ago

so moving b into a won’t work if some subfolders are already there

I thought that if they are already there you would merge / overwrite, isn't that what you need?
a/b/c/2.txt seems like the result of moving b from Dataset B into folder b in Dataset A, what am I missing?
(My assumption is that you have both datasets locally on the same machine and that you can just copy the files from b of Dataset B into the b folder of Dataset A.)

  
  
Posted 2 years ago

Hi RoughTiger69,

If you create a child version and add the delta of the files to the child, fetching the child version will also fetch the parent's files.
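As a rough sketch of that flow, assuming the parent version's ID is known and the new files sit in their own local folder: add_delta_as_child and its arguments are placeholder names, while Dataset.create, add_files, upload, finalize, Dataset.get, and get_local_copy are the ClearML calls involved. Note that get_local_copy returns a cached, read-only copy, so this does not by itself place files into an existing folder such as a.

```python
def add_delta_as_child(
    parent_id: str,
    delta_folder: str,
    dataset_name: str,
    dataset_project: str,
) -> str:
    """Store only the new files in a child version, then fetch the combined result."""
    # Imported lazily so this sketch can be read and loaded without ClearML installed.
    from clearml import Dataset

    child = Dataset.create(
        dataset_name=dataset_name,
        dataset_project=dataset_project,
        parent_datasets=[parent_id],
    )
    child.add_files(path=delta_folder)  # only the delta is added to the child
    child.upload()
    child.finalize()
    # Fetching the child also brings in the files inherited from the parent.
    return Dataset.get(dataset_id=child.id).get_local_copy()
```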

  
  
Posted 2 years ago