Hi @<1668427950573228032:profile|TeenyShells80> , the parent_datasets should be a list of dataset IDs or clearml.Dataset objects, not dataset names. Maybe that is the issue.
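A minimal sketch of that idea: look the parent up by project and name first, then pass its ID. The helper name `create_child_version` and the `dataset_cls` argument are hypothetical (the argument only exists so the sketch can be tried without a ClearML server); `Dataset.get` and `Dataset.create` are the regular SDK calls.

```python
def create_child_version(project, name, tag, dataset_cls=None):
    """Create a new dataset version whose parent is an existing dataset."""
    if dataset_cls is None:
        # Imported lazily so the sketch can be read and tested without ClearML set up.
        from clearml import Dataset as dataset_cls

    # Resolve the human-readable name to a Dataset object first.
    parent = dataset_cls.get(dataset_project=project, dataset_name=name)

    # parent_datasets takes dataset IDs (strings) or Dataset objects, not names.
    return dataset_cls.create(
        dataset_name=name,
        dataset_project=project,
        dataset_tags=[tag],
        parent_datasets=[parent.id],
    )
```

With the names from this thread, the call would be something like `create_child_version("general_ner", "general_ner_es", "batch_1")`.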
Traceback (most recent call last):
  File "/root/ehread-playgrounds/bbiescas/NER-ES/clearml_pipelines/./step_1_clearml_dataset.py", line 38, in <module>
    dataset = Dataset.create(dataset_name="general_ner_es",
  File "/root/.clearml/venvs-builds/3.10/lib/python3.10/site-packages/clearml/datasets/dataset.py", line 1248, in create
    parent_datasets = [cls.get(dataset_id=p) if not isinstance(p, Dataset) else p for p in (parent_datasets or [])]
  File "/root/.clearml/venvs-builds/3.10/lib/python3.10/site-packages/clearml/datasets/dataset.py", line 1248, in <listcomp>
    parent_datasets = [cls.get(dataset_id=p) if not isinstance(p, Dataset) else p for p in (parent_datasets or [])]
  File "/root/.clearml/venvs-builds/3.10/lib/python3.10/site-packages/clearml/datasets/dataset.py", line 1779, in get
    instance = get_instance(dataset_id)
  File "/root/.clearml/venvs-builds/3.10/lib/python3.10/site-packages/clearml/datasets/dataset.py", line 1678, in get_instance
    task = Task.get_task(task_id=dataset_id_)
  File "/root/.clearml/venvs-builds/3.10/lib/python3.10/site-packages/clearml/task.py", line 989, in get_task
    return cls.__get_task(
  File "/root/.clearml/venvs-builds/3.10/lib/python3.10/site-packages/clearml/task.py", line 4331, in __get_task
    return cls(private=cls.__create_protection, task_id=task_id, log_to_backend=False)
  File "/root/.clearml/venvs-builds/3.10/lib/python3.10/site-packages/clearml/task.py", line 209, in __init__
    super(Task, self).__init__(**kwargs)
  File "/root/.clearml/venvs-builds/3.10/lib/python3.10/site-packages/clearml/backend_interface/task/task.py", line 161, in __init__
    super(Task, self).__init__(id=task_id, session=session, log=log)
  File "/root/.clearml/venvs-builds/3.10/lib/python3.10/site-packages/clearml/backend_interface/base.py", line 152, in __init__
    self.id = self.normalize_id(id)
  File "/root/.clearml/venvs-builds/3.10/lib/python3.10/site-packages/clearml/backend_interface/base.py", line 187, in normalize_id
    return id.strip() if id else None
AttributeError: 'list' object has no attribute 'strip'
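For what it's worth, the last frame of the traceback is `return id.strip() if id else None`, so the value that reached `normalize_id` was a list rather than a string. A standalone reproduction of just that expression (no ClearML involved; the function below only copies the line shown in the traceback):

```python
def normalize_id(id):
    # Same expression as the last frame of the traceback (base.py, normalize_id).
    return id.strip() if id else None


print(normalize_id(" abc123 "))  # a string ID is simply stripped -> abc123

try:
    normalize_id(["abc123"])  # a list where a single ID string was expected
except AttributeError as err:
    print(err)  # -> 'list' object has no attribute 'strip'
```

So one element of `parent_datasets` was probably itself a list in the run that produced this traceback (for example a nested list). That is an inference from the traceback, not something confirmed in the thread.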
batches = [batch_1, batch_2, batch_3]
if __name__ == '__main__':
    print("Create the dataset in ClearML")
    dataset = Dataset.create(dataset_name="general_ner_es",
                             dataset_project='general_ner',
                             output_uri=' None ')
    for batch in batches:
        df = pandasDF_from_annotations(bucket_name, batch_1)
        df.to_pickle('df.pkl')
        print("Add files to the dataset from " + str(batch))
        dataset = Dataset.create(dataset_name="general_ner_es",
                                 dataset_project='general_ner',
                                 output_uri='s3://es-ehrd-production-s3-ml-development/clearml/datasets/labelstudio/general_ner_es',
                                 dataset_tags = [str(batch)],
                                 parent_datasets=["general_ner_es"])
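One way the per-batch loop could be arranged so that each new version inherits from the previous one by ID, as a rough sketch only. The helper name `upload_batches`, the `batch_files` mapping (batch label to local file path) and the `dataset_cls` argument are all hypothetical; `create`, `add_files`, `upload` and `finalize` are the usual `clearml.Dataset` calls. As far as I know, a parent has to be finalized before a child can be created from it, which is why `finalize()` is called inside the loop.

```python
def upload_batches(batch_files, project, name, dataset_cls=None):
    """Create one dataset version per batch, each inheriting from the previous one."""
    if dataset_cls is None:
        # Imported lazily so the sketch can be tried without ClearML set up.
        from clearml import Dataset as dataset_cls

    parent_ids = []  # the first version has no parent
    created = []
    for label, path in batch_files.items():
        ds = dataset_cls.create(
            dataset_name=name,
            dataset_project=project,
            dataset_tags=[str(label)],
            parent_datasets=parent_ids or None,  # IDs, not dataset names
        )
        ds.add_files(path=path)
        ds.upload()
        ds.finalize()  # close this version before it is used as a parent
        parent_ids = [ds.id]
        created.append(ds.id)
    return created
```

Called as, for example, `upload_batches({"batch_1": "df_1.pkl", "batch_2": "df_2.pkl"}, "general_ner", "general_ner_es")`, where the pickle file names are placeholders.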
Can you add a code snippet that reproduces this for you please?
Hi @<1668427950573228032:profile|TeenyShells80> , can you please elaborate on the process? Exactly what steps you took, what CLI commands. Also what is happening when you say it's not working? Are there console logs? Please add some information 🙂