Answered
Hi, I'm Trying To Finalize A Dataset, But Although The

Hi,
I'm trying to finalize a dataset. Although finalize(auto_upload=True) completes successfully (see image), the dataset is still in an uploading status, so is_final() returns False.
The dataset has a single parquet file.
I noticed it when trying to get a local copy.
I'm running clearml==1.11.1

ds.get_local_copy()

ValueError                                Traceback (most recent call last)
Cell In[37], line 1
----> 1 ds.get_local_copy()

File ~/..../.direnv/python-3.10.7/lib/python3.10/site-packages/clearml/datasets/dataset.py:940, in Dataset.get_local_copy(self, use_soft_links, part, num_parts, raise_on_error, max_workers)
    938     self._task = Task.get_task(task_id=self._id)
    939 if not self.is_final():
--> 940     raise ValueError("Cannot get a local copy of a dataset that was not finalized/closed")
    941 max_workers = max_workers or psutil.cpu_count()
    943 # now let's merge the parents

ValueError: Cannot get a local copy of a dataset that was not finalized/closed

Any suggestions?
image

  
  
Posted one year ago

Answers 3


Hi @OutrageousSheep60, we just released v1.12.1. Can you please check with that version?

  
  
Posted one year ago

Upgrading to 1.12.1 didn't help.
I think the issue is that when I created the dataset, I used

use_current_task=True,

If I change it to

use_current_task=False,

then it finalizes
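For anyone hitting the same thing, here is a minimal sketch of the create/finalize flow with use_current_task=False, which is the setting that let the dataset finalize in my case. The helper name create_and_finalize and the dataset_cls parameter are hypothetical (the parameter only exists so the flow can be tried with a stand-in class); the path, name and project values are placeholders, not the ones from my setup.

```python
def create_and_finalize(path, name, project, dataset_cls=None):
    """Create a dataset that is NOT attached to the current task, add files, finalize.

    `dataset_cls` is a hypothetical hook for testing; by default the real
    clearml.Dataset class is used (needs clearml installed and configured).
    """
    if dataset_cls is None:
        from clearml import Dataset as dataset_cls

    ds = dataset_cls.create(
        dataset_name=name,
        dataset_project=project,
        use_current_task=False,  # with True, the dataset stayed in "uploading" for me
    )
    ds.add_files(path=path)
    # auto_upload=True uploads pending files before closing the dataset
    ds.finalize(auto_upload=True)
    return ds
```

After this, ds.is_final() should be the thing to check before calling get_local_copy().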

  
  
Posted one year ago

Will do.
A workaround that worked for me is to explicitly complete the task; it seems the flush has a bug:
from clearml import Task

task = Task.get_task('...')
task.close()
task.mark_completed()

ds.is_final()
True
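The workaround above can be wrapped in a small helper, sketched here under the assumption that you already hold the dataset object and the task it was created in. The name ensure_finalized is hypothetical; it only uses the calls shown in this thread (is_final(), close(), mark_completed()) and does not fix the underlying flush issue.

```python
def ensure_finalized(dataset, task):
    """Return True if the dataset reports final, completing the task if needed.

    `dataset` is expected to expose is_final(); `task` is expected to expose
    close() and mark_completed(), as clearml Dataset and Task objects do.
    """
    if dataset.is_final():
        return True  # nothing to do

    # Workaround from this thread: explicitly close and complete the task
    task.close()
    task.mark_completed()
    return dataset.is_final()
```

With real objects this would be called as ensure_finalized(ds, Task.get_task('...')), using the same task ID as in the snippet above.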

  
  
Posted one year ago