Hi UpsetFrog68, can you provide a standalone code snippet that reproduces this intermittent behaviour?
Hi CostlyOstrich36,
I've created a standalone script with a simple pipeline that demonstrates the issue.
The pipeline contains 3 steps: create_dataset, use_dataset, use_dataset2.
I ran the same script 3 times and got a different result each time:
- The third step failed twice, then succeeded on the third attempt.
- Every retry failed, so the pipeline failed.
- Success.
Are you using a self-hosted server or app.clear.ml?
CostlyOstrich36 Any suggestions?
I also see this unstable behavior in other scenarios. For example, in a function step that just tries to get an existing dataset, this line:
dataset = Dataset.get(dataset_project='test_pipelines', dataset_name=dataset_name, only_published=True)
I sometimes get:
Traceback (most recent call last):
  File "/tmp/tmpnndxzv32.py", line 66, in get_dataset
    dataset = Dataset.get(dataset_project='test_pipelines', dataset_name=dataset_name, only_published=True)
              ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/opt/conda/lib/python3.11/site-packages/clearml/datasets/dataset.py", line 1782, in get
    raise ValueError(
ValueError: Could not find Dataset project/name/version ('test_pipelines', 'test', None)
But in some runs it finds the dataset.
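
Since the lookup fails with a ValueError in some runs and succeeds in others, one way to tell a transient failure from a persistent one is to wrap the call in a small retry loop. The following is only a minimal sketch of a workaround, not a fix for the root cause: `get_with_retry` and its parameters are hypothetical names introduced here, and it assumes that a later attempt can succeed, as the runs above suggest.

```python
import time


def get_with_retry(fetch, attempts=5, initial_delay=1.0, backoff=2.0, sleep=time.sleep):
    """Call fetch() until it stops raising ValueError or the attempts run out.

    fetch is any zero-argument callable (hypothetical helper, not part of clearml).
    The delay between attempts grows by the backoff factor after each failure.
    """
    if attempts < 1:
        raise ValueError("attempts must be at least 1")
    delay = initial_delay
    for attempt in range(1, attempts + 1):
        try:
            return fetch()
        except ValueError:
            # The traceback above shows the failed lookup surfacing as a ValueError.
            if attempt == attempts:
                raise  # out of attempts: re-raise the last error unchanged
            sleep(delay)
            delay *= backoff


# Possible usage inside the function step (requires clearml, same call as above):
# from clearml import Dataset
# dataset = get_with_retry(
#     lambda: Dataset.get(dataset_project='test_pipelines',
#                         dataset_name=dataset_name, only_published=True)
# )
```

If the call still fails after several spaced attempts, the problem is probably not a short-lived delay, and the server-side state of the dataset is worth checking instead.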