Answered
Is anyone familiar with any restrictions regarding the usage of pipeline decorators in Jupyter notebooks?

Hello everyone,

Is anyone familiar with any restrictions regarding the usage of pipeline decorators in Jupyter notebooks? I noticed that when I run a pipeline locally within a notebook (PipelineDecorator.run_locally()), the first run works, but a rerun causes a KeyError:
---------------------------------------------------------------------------
KeyError Traceback (most recent call last)
/opt/jupyter-server/notebooks/clearml_test/clearml-pipeline-text.ipynb Cell 5 line 2
1 PipelineDecorator.run_locally()
----> 2 test_pipeline()

File /opt/jupyter-server/venv-jupyter-server/lib/python3.11/site-packages/clearml/automation/controller.py:4361, in PipelineDecorator.pipeline.<locals>.decorator_wrap.<locals>.internal_decorator(*args, **kwargs)
4359 # this time the pipeline is executed only on the remote machine
4360 try:
-> 4361 pipeline_result = func(**pipeline_kwargs)
4362 except Exception:
4363 a_pipeline.stop(mark_failed=True)

/opt/jupyter-server/notebooks/clearml_test/clearml-pipeline-text.ipynb Cell 5 line 3
1 @PipelineDecorator.pipeline(name="test-pipeline", project="Test", add_pipeline_tags="True")
2 def test_pipeline():
----> 3 df = loadData()
4 df = word2vec_dummy(df, ["text",])
6 return len(df)

File /opt/jupyter-server/venv-jupyter-server/lib/python3.11/site-packages/clearml/automation/controller.py:4007, in PipelineDecorator.component.<locals>.decorator_wrap.<locals>.wrapper(*args, **kwargs)
4005 # get node and park is as launched
4006 cls._singleton._launched_step_names.add(_node_name)
-> 4007 _node = cls._singleton._nodes[_node_name]
4008 cls._retries[_node_name] = 0
4009 cls._retries_callbacks[_node_name] = retry_on_failure if callable(retry_on_failure) else \
4010 (functools.partial(cls._singleton._default_retry_on_failure_callback, max_retries=retry_on_failure)
4011 if isinstance(retry_on_failure, int) else cls._singleton._retry_on_failure_callback)

KeyError: 'loadData'

Using the same code in a .py file works normally. I suspect that some resources are not managed correctly in the notebook. I did not find any documentation about this. If nobody knows the cause, I will open a bug report.
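Since the same code runs fine as a .py file, one possible stopgap is to launch that file in a fresh interpreter from the notebook, so nothing carries over between runs. The following is only a sketch under that assumption: `run_in_fresh_process` and the file name `pipeline_script.py` are hypothetical names, not part of ClearML, and I have not verified that this avoids the error in every setup.

```python
import subprocess
import sys
from pathlib import Path


def run_in_fresh_process(script_path, timeout=None):
    """Run a Python script in a new interpreter so no in-memory state survives between runs.

    Hypothetical helper for illustration; it is not part of ClearML.
    """
    script = Path(script_path)
    if not script.is_file():
        raise FileNotFoundError(f"no such script: {script}")
    # sys.executable is the interpreter running the notebook kernel,
    # so the child process uses the same virtual environment.
    return subprocess.run(
        [sys.executable, str(script)],
        capture_output=True,
        text=True,
        timeout=timeout,
    )


# Assumed usage from a notebook cell ("pipeline_script.py" is a placeholder name):
# result = run_in_fresh_process("pipeline_script.py")
# print(result.stdout)
# print(result.stderr)
```

Each call starts from a clean process, which mirrors how the .py run behaves, at the cost of losing direct access to the pipeline's return value inside the notebook.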

  
  
Posted 11 months ago

Answers 3


Hi @AdorableDeer85
Are you referring to running the pipeline on a remote machine? Could you provide the full Task/Pipeline log?

  
  
Posted 11 months ago

Hi @AgitatedDove14,
What I mean is shown in the following minimal example notebook. Executing cell [4] works only once during the kernel's lifetime. When cell [4] is executed again (after one successful run), it crashes (see the error message below).

Notebook:

%env CLEARML_WEB_HOST=http://localhost:8080
%env CLEARML_API_HOST=http://localhost:8008
%env CLEARML_FILES_HOST=http://localhost:8081
%env CLEARML_API_ACCESS_KEY=<your_key>
%env CLEARML_API_SECRET_KEY=<your_key>
env: CLEARML_WEB_HOST=http://localhost:8080
env: CLEARML_API_HOST=http://localhost:8008
env: CLEARML_FILES_HOST=http://localhost:8081
...
from clearml import PipelineDecorator

@PipelineDecorator.component(cache=False, return_values=['value'])
def step1():
    value = 1
    return value

@PipelineDecorator.pipeline(name="test-pipeline", project="Test", add_pipeline_tags="True")
def test_pipeline():
    value = step1()

# PipelineDecorator.debug_pipeline() # works, even when run repeatedly
PipelineDecorator.run_locally() # works on first execution of cell only, not when run repeatedly
test_pipeline()
ClearML Task: created new task id=32cb8577db6140d8bdcce7e520948c30
ClearML results page: http://localhost:8080/projects/4d92231c28b748f190fde5b8b25d1a5a/experiments/32cb8577db6140d8bdcce7e520948c30/output/log
ClearML pipeline page: http://localhost:8080/pipelines/4d92231c28b748f190fde5b8b25d1a5a/experiments/32cb8577db6140d8bdcce7e520948c30



---------------------------------------------------------------------------

KeyError                                  Traceback (most recent call last)

/opt/jupyter-server/notebooks/clearml_test/clearml-pipeline-text copy.ipynb Cell 4 line 3
      1 # PipelineDecorator.debug_pipeline() # works, even when run repeatedly
      2 PipelineDecorator.run_locally() # works on first execution of cell only, not when run repeatedly
----> 3 test_pipeline()


File /opt/jupyter-server/venv-jupyter-server/lib/python3.11/site-packages/clearml/automation/controller.py:4429, in PipelineDecorator.pipeline.<locals>.decorator_wrap.<locals>.internal_decorator(*args, **kwargs)
   4427 # this time the pipeline is executed only on the remote machine
   4428 try:
-> 4429     pipeline_result = func(**pipeline_kwargs)
   4430 except Exception:
   4431     a_pipeline.stop(mark_failed=True)


/opt/jupyter-server/notebooks/clearml_test/clearml-pipeline-text copy.ipynb Cell 4 line 3
      1 @PipelineDecorator.pipeline(name="test-pipeline", project="Test", add_pipeline_tags="True")
      2 def test_pipeline():
----> 3     value = step1()


File /opt/jupyter-server/venv-jupyter-server/lib/python3.11/site-packages/clearml/automation/controller.py:4069, in PipelineDecorator.component.<locals>.decorator_wrap.<locals>.wrapper(*args, **kwargs)
   4067 # get node and park is as launched
   4068 cls._singleton._launched_step_names.add(_node_name)
-> 4069 _node = cls._singleton._nodes[_node_name]
   4070 cls._retries[_node_name] = 0
   4071 cls._retries_callbacks[_node_name] = retry_on_failure if callable(retry_on_failure) else \
   4072     (functools.partial(cls._singleton._default_retry_on_failure_callback, max_retries=retry_on_failure)
   4073      if isinstance(retry_on_failure, int) else cls._singleton._retry_on_failure_callback)


KeyError: 'step1'

  
  
Posted 10 months ago

Hi @AdorableDeer85
Sorry, I'm a bit confused here. Any chance you can share the entire notebook?
Also, is there a reason this points to "localhost" and not the IP/host of the clearml-server? Is the agent running on the same machine?

  
  
Posted 10 months ago