
Hello,

Does anyone else have trouble with deleting experiments? Sometimes when deleting 10 or so experiments some errors pop out and the entire system becomes unstable (workers do not show up, cannot reset experiments etc)

This behaviour was not fixed in 1.5.0. Skimming the logs showed quite a lot of timeouts.

  
  
Posted 2 years ago

Answers 26


Any error in the apiserver log? (sudo docker logs clearml-apiserver)

  
  
Posted 2 years ago

I haven't looked, I'll let you know next time it happens

  
  
Posted 2 years ago

I would suggest (assuming the machine has enough RAM) setting it to at least -Xms4g -Xmx4g, and maybe more. You'll need at least twice that amount free for ES alone (so make sure your machine has at least 16GB RAM)

  
  
Posted 2 years ago
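
The sizing rule in the post above (ES needs at least twice the heap) can be checked with a few lines of Python. This is only a sketch: the function names are made up for illustration, and it assumes the heap is given as standard JVM flags such as -Xms4g -Xmx4g.

```python
import re
from typing import Optional

# JVM size suffixes; a bare number means bytes.
_UNITS = {"k": 1024, "m": 1024**2, "g": 1024**3}


def parse_heap_bytes(java_opts: str, flag: str = "-Xmx") -> Optional[int]:
    """Return the size given for `flag` (-Xms or -Xmx) in bytes, or None if absent."""
    match = re.search(rf"{re.escape(flag)}(\d+)([kKmMgG]?)", java_opts)
    if match is None:
        return None
    number, unit = match.groups()
    return int(number) * _UNITS.get(unit.lower(), 1)


def min_ram_for_es_bytes(java_opts: str) -> Optional[int]:
    """Rule of thumb from this thread: ES alone needs at least twice the max heap."""
    heap = parse_heap_bytes(java_opts, "-Xmx")
    return None if heap is None else 2 * heap


# Hypothetical helper names; the option string is the one suggested above.
print(min_ram_for_es_bytes("-Xms4g -Xmx4g") / 1024**3)  # 8.0 (GiB)
```

With the suggested 4g heap this gives 8 GiB for ES alone, which is in line with the advice to have a 16GB machine once the other services are counted.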

And you deleted a single experiment? Or many?

  
  
Posted 2 years ago

We didn't change anything from the defaults in your GitHub repo 😄 so it's 500M?

  
  
Posted 2 years ago

how much memory do you have assigned to ES?

  
  
Posted 2 years ago

Hi RotundHedgehog76 ,
Where exactly do you see errors?

  
  
Posted 2 years ago

Okay, thank you for the suggestions, we'll try it out

  
  
Posted 2 years ago

sure

  
  
Posted 2 years ago

So this seems to be a purely load issue - can you remind me what deployment type you are using? docker-compose, right?

  
  
Posted 2 years ago

Errors pop up occasionally in the Web UI. All we see is a dialog with the text "Error"

  
  
Posted 2 years ago

Yes, that's right. We deployed it on a GCP instance

  
  
Posted 2 years ago

This was actually a reset (of a single experiment), not a delete

  
  
Posted 2 years ago

I guess I'll let you know the next time this happens haha

  
  
Posted 2 years ago

No errors in logs, but that's because I restarted the deployment :(

  
  
Posted 2 years ago

Anything you can see in the browser's JS console or in the Developer Tools Network section?

  
  
Posted 2 years ago

Can you try to get the ES log using docker logs clearml-elastic ?

  
  
Posted 2 years ago
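
If the log is long, a quick keyword tally can show whether timeouts dominate. A minimal sketch follows, assuming the container is named clearml-elastic as in the command above; the keyword list and function names are illustrative, not anything ClearML ships.

```python
import subprocess
from collections import Counter

# Illustrative keywords; adjust to whatever the logs actually contain.
KEYWORDS = ("timeout", "timed out", "error", "exception")


def count_keywords(lines, keywords=KEYWORDS):
    """Count how many lines contain each keyword (case-insensitive)."""
    counts = Counter()
    for line in lines:
        lowered = line.lower()
        for keyword in keywords:
            if keyword in lowered:
                counts[keyword] += 1
    return counts


def docker_log_lines(container, tail=5000):
    """Return the last `tail` log lines of a container via `docker logs`."""
    result = subprocess.run(
        ["docker", "logs", "--tail", str(tail), container],
        capture_output=True, text=True, check=True,
    )
    # Container stdout and stderr arrive on separate streams; scan both.
    return (result.stdout + result.stderr).splitlines()


# Example (needs docker access, possibly via sudo):
#   print(count_keywords(docker_log_lines("clearml-elastic")))
```

The same tally works for clearml-apiserver by passing that container name instead.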

it's in the default env vars for elasticsearch in the docker compose

  
  
Posted 2 years ago

So currently it's -Xms2g -Xmx2g which means 2GB

  
  
Posted 2 years ago

Hello, a similar thing happened today. In the developer's console there was this line

https://server/api/v2.19/tasks.reset_many 504 (Gateway time-out)

  
  
Posted 2 years ago

For now, docker compose down && docker compose up -d helps

  
  
Posted 2 years ago

i think you're right, the default elastic values do not seem to work for us

  
  
Posted 2 years ago

Can you send what you have there now?

  
  
Posted 2 years ago

the entire index is 35G

  
  
Posted 2 years ago

How large are your ES indices? Maybe this is ES being inefficient?

  
  
Posted 2 years ago
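
One way to answer the index-size question is the Elasticsearch `_cat/indices` API. The sketch below assumes ES is reachable at http://localhost:9200 from wherever the script runs, which may not hold for every deployment; the function names are hypothetical.

```python
import json
from urllib.request import urlopen


def fetch_indices(base_url="http://localhost:9200"):
    """Fetch per-index stats as JSON, with sizes reported in plain bytes."""
    url = f"{base_url}/_cat/indices?format=json&bytes=b"
    with urlopen(url, timeout=30) as response:
        return json.load(response)


def summarize(rows):
    """Return (total_bytes, [(index, bytes), ...]) sorted largest first."""
    sizes = [
        # store.size can be missing or null (e.g. closed indices); treat as 0.
        (row["index"], int(row.get("store.size") or 0))
        for row in rows
    ]
    sizes.sort(key=lambda item: item[1], reverse=True)
    return sum(size for _, size in sizes), sizes


# Example (assumed URL; adjust host and port to your deployment):
#   total, per_index = summarize(fetch_indices())
#   print(f"total: {total / 1024**3:.1f} GiB")
#   for name, size in per_index[:10]:
#       print(f"{size / 1024**3:8.2f} GiB  {name}")
```

Listing the largest indices first makes it easier to see whether one index accounts for most of the total.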

Nothing at all. There are only 2 log entries from that day, and both were at 2am

  
  
Posted 2 years ago