Answered

Hi CLEARMLers, I'm trying to create a dataset with tagged batches of data. I first create an empty dataset with dataset_name = 'name_dataset', and then create another tagged dataset with the first batch and with parent_datasets=['name_dataset']. It's not working for me that way. Any suggestions? Can anybody send me an example?

  
  
Posted 10 months ago

Answers 9


Hi @TeenyShells80, the parent_datasets should be a list of dataset IDs or clearml.Dataset objects, not dataset names. Maybe that is the issue.
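
To illustrate the point about IDs versus names, here is a minimal sketch that looks the parent up by name first and then passes its ID to `Dataset.create`. The helper name `create_child_by_parent_name` and the `dataset_cls` parameter are made up for this example (the parameter only exists so the function can be tried with a stand-in class); the `Dataset.get` and `Dataset.create` keyword arguments are the ones that appear in this thread or in the ClearML SDK, as far as I know.

```python
def create_child_by_parent_name(parent_name, project, tags, dataset_cls=None):
    """Look up the parent by name, then pass its ID (not its name) to create()."""
    if dataset_cls is None:
        # Imported lazily so the helper can be exercised with a stand-in class.
        from clearml import Dataset as dataset_cls

    # Resolve the project/name pair to a Dataset object.
    parent = dataset_cls.get(dataset_project=project, dataset_name=parent_name)

    return dataset_cls.create(
        dataset_name=parent_name,
        dataset_project=project,
        dataset_tags=list(tags),
        # An ID string; passing the Dataset object itself should also work.
        parent_datasets=[parent.id],
    )
```

With a running ClearML server this would be called as, for example, `create_child_by_parent_name("name_dataset", "my_project", ["batch_1"])`, where the project name is a placeholder.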

  
  
Posted 10 months ago

thanks a lot Eugen, that was the issue

  
  
Posted 10 months ago

Hi @TeenyShells80, can you please elaborate on the process? Exactly what steps did you take, and which CLI commands did you run? Also, what happens when you say it's not working? Are there console logs? Please add some information 🙂

  
  
Posted 10 months ago

I'm working in an interactive session

  
  
Posted 10 months ago

Please add it as a code snippet.

  
  
Posted 10 months ago

this is the error I get

  
  
Posted 10 months ago

Traceback (most recent call last):
  File "/root/ehread-playgrounds/bbiescas/NER-ES/clearml_pipelines/./step_1_clearml_dataset.py", line 38, in <module>
    dataset = Dataset.create(dataset_name="general_ner_es",
  File "/root/.clearml/venvs-builds/3.10/lib/python3.10/site-packages/clearml/datasets/dataset.py", line 1248, in create
    parent_datasets = [cls.get(dataset_id=p) if not isinstance(p, Dataset) else p for p in (parent_datasets or [])]
  File "/root/.clearml/venvs-builds/3.10/lib/python3.10/site-packages/clearml/datasets/dataset.py", line 1248, in <listcomp>
    parent_datasets = [cls.get(dataset_id=p) if not isinstance(p, Dataset) else p for p in (parent_datasets or [])]
  File "/root/.clearml/venvs-builds/3.10/lib/python3.10/site-packages/clearml/datasets/dataset.py", line 1779, in get
    instance = get_instance(dataset_id)
  File "/root/.clearml/venvs-builds/3.10/lib/python3.10/site-packages/clearml/datasets/dataset.py", line 1678, in get_instance
    task = Task.get_task(task_id=dataset_id_)
  File "/root/.clearml/venvs-builds/3.10/lib/python3.10/site-packages/clearml/task.py", line 989, in get_task
    return cls.__get_task(
  File "/root/.clearml/venvs-builds/3.10/lib/python3.10/site-packages/clearml/task.py", line 4331, in __get_task
    return cls(private=cls.__create_protection, task_id=task_id, log_to_backend=False)
  File "/root/.clearml/venvs-builds/3.10/lib/python3.10/site-packages/clearml/task.py", line 209, in __init__
    super(Task, self).__init__(**kwargs)
  File "/root/.clearml/venvs-builds/3.10/lib/python3.10/site-packages/clearml/backend_interface/task/task.py", line 161, in __init__
    super(Task, self).__init__(id=task_id, session=session, log=log)
  File "/root/.clearml/venvs-builds/3.10/lib/python3.10/site-packages/clearml/backend_interface/base.py", line 152, in __init__
    self.id = self.normalize_id(id)
  File "/root/.clearml/venvs-builds/3.10/lib/python3.10/site-packages/clearml/backend_interface/base.py", line 187, in normalize_id
    return id.strip() if id else None
AttributeError: 'list' object has no attribute 'strip'
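
Reading the traceback: each entry `p` of `parent_datasets` that is not a `Dataset` is handed to `cls.get(dataset_id=p)`, and `normalize_id` then calls `id.strip()` on it. The `'list' object has no attribute 'strip'` message therefore suggests that, in the run that produced this error, one entry was itself a list (a nested list) rather than an ID string. A small guard along these lines would surface that earlier; `check_parent_datasets` is a hypothetical helper, not part of ClearML.

```python
def check_parent_datasets(parent_datasets):
    """Return the parents as a flat list, or raise if an entry is a container.

    Entries are expected to be dataset ID strings or Dataset objects,
    so only nested containers are rejected here.
    """
    checked = []
    for index, parent in enumerate(parent_datasets or []):
        if isinstance(parent, (list, tuple, set, dict)):
            raise TypeError(
                "parent_datasets[%d] is a %s; expected a dataset ID string "
                "or a Dataset object" % (index, type(parent).__name__)
            )
        checked.append(parent)
    return checked
```

Calling it right before `Dataset.create(...)` on the same list turns the deep `AttributeError` into a message that points at the offending entry.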

  
  
Posted 10 months ago

batches = [batch_1, batch_2, batch_3]

if __name__ == '__main__':
    print("Create the dataset in ClearML")
    dataset = Dataset.create(dataset_name="general_ner_es",
                             dataset_project='general_ner',
                             output_uri=' None ')

for batch in batches:

    df = pandasDF_from_annotations(bucket_name, batch_1)
    df.to_pickle('df.pkl')
    print("Add files to the dataset from " + str(batch))
    dataset = Dataset.create(dataset_name="general_ner_es",
                    dataset_project='general_ner',
                    output_uri='s3://es-ehrd-production-s3-ml-development/clearml/datasets/labelstudio/general_ner_es',
                    dataset_tags = [str(batch)],
                    parent_datasets=["general_ner_es"])
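
For comparison, a hedged sketch of the same loop with each new version chained to the previous one by ID instead of by name. The function name, the `batch_files` mapping (tag to local file path), and the `dataset_cls` parameter are assumptions made for this example. It also finalizes each version before using it as a parent, since as far as I know ClearML only allows inheriting from finalized datasets.

```python
def upload_batches_as_versions(batch_files, dataset_name, dataset_project, dataset_cls=None):
    """Create one tagged dataset version per batch, each parented on the previous one."""
    if dataset_cls is None:
        # Imported lazily so the loop can be tried with a stand-in class.
        from clearml import Dataset as dataset_cls

    parent_ids = []   # IDs of the previous version; empty for the first batch
    created_ids = []
    for tag, path in batch_files.items():
        dataset = dataset_cls.create(
            dataset_name=dataset_name,
            dataset_project=dataset_project,
            dataset_tags=[str(tag)],
            parent_datasets=parent_ids or None,  # IDs, never names
        )
        dataset.add_files(path=path)  # register the batch file(s)
        dataset.upload()              # push the files to storage
        dataset.finalize()            # close the version so it can be a parent
        parent_ids = [dataset.id]
        created_ids.append(dataset.id)
    return created_ids
```

In the script above this would replace the `for batch in batches:` loop, with one pickled file per batch written to its own path rather than reusing `df.pkl`.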
  
  
Posted 10 months ago

Can you add a code snippet that reproduces this for you please?

  
  
Posted 10 months ago