Answered

We're using Ray and ClearML together, and suddenly we're seeing some hanging threads, and finally we got an error message:

` 2022-01-10 09:58:56,803 [ERROR] [CrossValidationJob] : Failed: ray::cv_iteration() (pid=6703, ip=192.168.1.58)
  File "python/ray/_raylet.pyx", line 480, in ray._raylet.execute_task

<trimmed irrelevant chunks of traceback and code>

  File "***/ccmlp/mlops/clearml_ops.py", line 70, in _close
    self.task.flush(wait_for_uploads=True)
  File "***/.venv/lib/python3.8/site-packages/clearml/task.py", line 1492, in flush
    self.__reporter.wait_for_events()
  File "***/.venv/lib/python3.8/site-packages/clearml/backend_interface/metrics/reporter.py", line 261, in wait_for_events
    return self._report_service.wait_for_events(timeout=timeout)
  File "***/.venv/lib/python3.8/site-packages/clearml/backend_interface/metrics/reporter.py", line 83, in wait_for_events
    while self._thread and self._thread.is_alive() and (not timeout or time()-tic < timeout):
AttributeError: 'bool' object has no attribute 'is_alive' `
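
The last frame reads `self._thread` twice in one condition, so the call fails if the attribute holds a `bool` at the second read. That could happen if a boolean is stored there, or if another thread reassigns it between the two reads. The sketch below reproduces that failure mode and shows a defensive variant. `ReportService`, `wait_unsafe` and `wait_safe` are hypothetical names for illustration; this is not ClearML's implementation, and the reason the attribute becomes a `bool` is an assumption.

```python
import threading
from time import sleep, time


class ReportService:
    """Hypothetical stand-in for a background reporter; not ClearML's real class."""

    def __init__(self):
        self._thread = threading.Thread(target=sleep, args=(0.05,), daemon=True)
        self._thread.start()

    def wait_unsafe(self, timeout=None):
        tic = time()
        # Reads self._thread twice: fails if it is (or becomes) a bool such as True.
        while self._thread and self._thread.is_alive() and (not timeout or time() - tic < timeout):
            sleep(0.01)

    def wait_safe(self, timeout=None):
        tic = time()
        thread = self._thread  # read once so a concurrent reassignment cannot interleave
        if not isinstance(thread, threading.Thread):
            return  # nothing to wait for
        while thread.is_alive() and (not timeout or time() - tic < timeout):
            sleep(0.01)


svc = ReportService()
svc._thread = True  # assumed failure mode: a boolean replaces the thread object

try:
    svc.wait_unsafe(timeout=1)
except AttributeError as exc:
    print(exc)  # same kind of error as in the traceback above

svc.wait_safe(timeout=1)  # returns immediately instead of raising
```

Reading the attribute into a local variable once is what removes the window for a race; the `isinstance` check only guards against a non-thread value.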

  
  
Posted one year ago

Answers 23


Well, the thing is ClearML also uses dictConfig, and I think you might be overriding its settings...

  
  
Posted one year ago

I believe it may be a race condition that's only tangential to ClearML now...

  
  
Posted one year ago

Example configuration -
version: 1
disable_existing_loggers: true
formatters:
  simple:
    format: '%(asctime)s %(levelname)-9s %(name)-24s: %(message)s'
filters:
  brackets:
    (): ccutils.logger.BracketFilter
handlers:
  console:
    class: ccmlp.utils.TqdmStreamHandler
    level: INFO
    formatter: simple
    filters: [brackets]
loggers:
  # Set logging levels for specific packages
  urllib3:
    level: WARNING
  matplotlib:
    level: WARNING
  botocore:
    level: WARNING
  fsspec:
    level: WARNING
  s3fs:
    level: WARNING
  boto3:
    level: WARNING
  s3transfer:
    level: WARNING
  git:
    level: WARNING
  ray:
    level: WARNING
  PIL:
    level: WARNING
root:
  level: DEBUG
  handlers: [console]
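
For reference, `disable_existing_loggers: true` tells the standard library to disable every logger that already exists and is not named in the configuration. The sketch below shows only that stdlib behavior; the logger name `some_library.reporting` is hypothetical, and whether this setting is what affects ClearML here is not established.

```python
import logging
import logging.config

# A logger created before our configuration is applied, standing in for one
# owned by a third-party library (the name is hypothetical).
library_logger = logging.getLogger("some_library.reporting")

base_config = {
    "version": 1,
    "handlers": {"console": {"class": "logging.StreamHandler", "level": "INFO"}},
    "root": {"level": "DEBUG", "handlers": ["console"]},
}

# As in the configuration above: existing loggers not named in the config get disabled.
logging.config.dictConfig({**base_config, "disable_existing_loggers": True})
disabled_when_true = library_logger.disabled

# With False, previously created loggers stay enabled.
logging.config.dictConfig({**base_config, "disable_existing_loggers": False})
disabled_when_false = library_logger.disabled

print(disabled_when_true, disabled_when_false)
```

If the two values differ in your environment as well, setting `disable_existing_loggers: false` in the YAML is a cheap experiment to rule this out.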

  
  
Posted one year ago

I'm guessing that's not on pypi yet?

  
  
Posted one year ago

What's new in 1.1.6rc0?

  
  
Posted one year ago

Ah it is.

  
  
Posted one year ago

Another side effect btw is that some of our log files (we add a file handler to the logger) end up at 0 bytes. This specifically happens with Ray and ClearML and does not reproduce locally
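
One stdlib detail that may be relevant to the 0-byte files: `logging.FileHandler` creates its file when the handler is constructed, so an empty file means the handler existed but no record ever reached it. The sketch below illustrates this with a deliberately disabled logger; the logger name and file names are hypothetical, and this is only one possible explanation, not a diagnosis.

```python
import logging
import os
import tempfile

log_dir = tempfile.mkdtemp()
eager_path = os.path.join(log_dir, "eager.log")
lazy_path = os.path.join(log_dir, "lazy.log")

# Default: the file is opened (and therefore created) when the handler is constructed.
eager = logging.FileHandler(eager_path)
# delay=True: the file is only opened when the first record is emitted.
lazy = logging.FileHandler(lazy_path, delay=True)

logger = logging.getLogger("zero_byte_demo")  # hypothetical logger name
logger.addHandler(eager)
logger.addHandler(lazy)
logger.disabled = True  # simulate records never reaching the handlers
logger.warning("this record is dropped")

eager.close()
lazy.close()

eager_size = os.path.getsize(eager_path)
lazy_exists = os.path.exists(lazy_path)
print(eager_size, lazy_exists)
```

In this situation calling `flush()` changes nothing, because there is nothing buffered to write.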

  
  
Posted one year ago

I'll try with 1.1.5 first, then 1.1.6rc0

  
  
Posted one year ago

I thought so too, so I added flush calls just in case, but nothing's changed.
This is somewhat weird, since it always happens in the above scenario (Ray + ClearML), and always in the last task/job from Ray.

  
  
Posted one year ago

Perhaps a flush issue?

  
  
Posted one year ago

We just inherit from logging.Handler and use that in our logging.config.dictConfig. The weird thing is that it still logs most of the tasks, just not the last one.

  
  
Posted one year ago

SuccessfulKoala55, could this be related to the monkey patching of the logging platform? We have our own logging handlers that we use in this case.

  
  
Posted one year ago

What do you mean? 😄 By calling logging.config.dictConfig(...)

  
  
Posted one year ago

Might very well be - do you touch other handlers?

  
  
Posted one year ago

Well, how do you set up dictConfig?

  
  
Posted one year ago

Or do you mean the contents of the configuration, probably :face_palm: ... one moment

  
  
Posted one year ago

Can you try the latest RC? 1.1.6rc0?

  
  
Posted one year ago

Some commits related to subprocesses and thread handling 🙂

  
  
Posted one year ago

If that's the case, wouldn't it apply across the board? This happens in a single task within Ray; the other tasks (I have many in a single run) are fine.

  
  
Posted one year ago

I believe so...

  
  
Posted one year ago

Hi UnevenDolphin73, which ClearML version are you using?

  
  
Posted one year ago

1.1.4

  
  
Posted one year ago

I'll try upgrading to 1.1.5, one moment

  
  
Posted one year ago