Answered
Hi, I am successfully starting multiple tasks automatically, but they don't train to completion. They start training and then at some point they give me this error:

Hi, I am successfully starting multiple tasks automatically, but they don't train to completion. They start training and then at some point they give me this error: "Process terminated by user". As far as I can tell, I am not terminating them. Any idea why this might happen?

  
  
Posted 2 years ago

Answers 4


Hi CloudySwallow27, regarding "Process terminated by user": are you running hyperparameter optimization?
Regarding CUDA: yes, you need CUDA installed (or run it from a Docker container with CUDA). ClearML doesn't handle the CUDA installation, since this is at the driver level.
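
A quick way to confirm whether the machine (or container) actually exposes a GPU before launching tasks is sketched below. It is only an illustration, not part of ClearML: the function names `cuda_driver_visible` and `tensorflow_gpus` are made up for this example, and it assumes the NVIDIA driver ships the `nvidia-smi` tool on the PATH. The TensorFlow check is optional and is skipped if TensorFlow is not installed.

```python
import shutil
import subprocess


def cuda_driver_visible() -> bool:
    """Return True if nvidia-smi is on PATH and lists at least one GPU.

    Hypothetical helper for illustration; it only inspects the driver,
    not the CUDA toolkit or cuDNN libraries.
    """
    exe = shutil.which("nvidia-smi")
    if exe is None:
        return False
    try:
        # `nvidia-smi -L` prints one line per detected GPU.
        result = subprocess.run(
            [exe, "-L"], capture_output=True, text=True, timeout=30
        )
    except (OSError, subprocess.TimeoutExpired):
        return False
    return result.returncode == 0 and "GPU" in result.stdout


def tensorflow_gpus():
    """Return the GPU device names TensorFlow sees, or None if TensorFlow is missing."""
    try:
        import tensorflow as tf
    except ImportError:
        return None
    return [device.name for device in tf.config.list_physical_devices("GPU")]


if __name__ == "__main__":
    print("GPU visible to driver:", cuda_driver_visible())
    print("GPUs visible to TensorFlow:", tensorflow_gpus())
```

If the driver check passes but TensorFlow still reports no GPUs, the CUDA/cuDNN libraries are the likely missing piece, which matches the cuInit error discussed in this thread.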

  
  
Posted 2 years ago

Relatedly, I just noticed that the GPU is not starting. This was in the logs:
2022-04-07 20:59:54.464854: E tensorflow/stream_executor/cuda/cuda_driver.cc:271] failed call to cuInit: CUDA_ERROR_NO_DEVICE: no CUDA-capable device is detected
Do we need to use a specific instance with CUDA preinstalled, or does ClearML take care of it?

  
  
Posted 2 years ago

Thanks! I installed CUDA/cuDNN on the image and now the GPU is being utilized.

  
  
Posted 2 years ago

This is what the instance state looks like, as logged by ClearML:

  
  
Posted 2 years ago