Answered

Hello,

Does anyone else have trouble with deleting experiments? Sometimes when deleting 10 or so experiments, some errors pop up and the entire system becomes unstable (workers do not show up, experiments cannot be reset, etc.)

This behaviour was not fixed with 1.5.0. Skimming the logs showed quite a lot of timeouts.

  
  
Posted one year ago

Answers 26


Yes, that's right. We deployed it on a GCP instance

  
  
Posted one year ago

Any error in the apiserver log? (sudo docker logs clearml-apiserver)

  
  
Posted one year ago

I haven't looked, I'll let you know next time it happens

  
  
Posted one year ago

Can you try to get the ES log using docker logs clearml-elastic ?
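If it helps, here is a rough sketch for pulling only the suspicious lines out of a container log right after the problem shows up, before a restart makes the situation harder to reproduce. The container names clearml-elastic and clearml-apiserver come from this thread; the one-hour window, the keyword list, and the function names are my own assumptions, so adjust them to your setup.

```shell
#!/bin/sh
# Keep only lines that look like errors or timeouts (case-insensitive).
# The keyword list is a guess, not an official list of ClearML or ES messages.
filter_problems() {
    grep -iE 'error|timeout|timed out|exception' || true
}

# Print suspicious lines from the last hour of a container's log.
# docker logs writes to both stdout and stderr, so merge them first.
recent_problems() {
    container="$1"
    sudo docker logs --since 1h "$container" 2>&1 | filter_problems
}

# Usage (container names as mentioned in this thread):
#   recent_problems clearml-elastic
#   recent_problems clearml-apiserver
```

Redirecting the output to a file before running docker compose down keeps a copy you can post here afterwards.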

  
  
Posted one year ago

Nothing at all. There are only 2 logs from this day, and all were at 2am

  
  
Posted one year ago

And you deleted a single experiment? Or many?

  
  
Posted one year ago

Hi RotundHedgehog76 ,
Where exactly do you see errors?

  
  
Posted one year ago

For now, docker compose down && docker compose up -d helps

  
  
Posted one year ago

Errors pop up occasionally in the Web UI. All we see is a dialog with the text "Error"

  
  
Posted one year ago

Anything you can see in the browser's JS console or in the Developer Tools Network section?

  
  
Posted one year ago

So this seems to be purely a load issue - can you remind me what deployment type you are using? docker-compose, right?

  
  
Posted one year ago

This was actually a reset (of a single experiment), not a delete

  
  
Posted one year ago

Hello, a similar thing happened today. In the developer's console there was this line

https://server/api/v2.19/tasks.reset_many 504 (Gateway time-out)

  
  
Posted one year ago

We didn't change a thing from the defaults in your GitHub 😄 So it's 500M?

  
  
Posted one year ago

I would suggest (assuming the machine has enough RAM) setting it to at least -Xms4g -Xmx4g, and maybe more. You'll need at least twice that amount free for ES alone (so make sure your machine has at least 16GB of RAM)
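A minimal sketch of how the change could be applied, assuming the heap flags live in an ES_JAVA_OPTS entry of the compose file (the usual variable for the Elasticsearch Docker image) and that the file sits at /opt/clearml/docker-compose.yml. Both are assumptions about your installation, and the function names are just illustrative.

```shell
#!/bin/sh
# Show where the Elasticsearch JVM options are set in a compose file.
show_es_heap() {
    grep -n 'ES_JAVA_OPTS' "$1"
}

# Rewrite the -Xms/-Xmx values to the given size (e.g. 4g).
# A copy of the original file is kept next to it with a .bak suffix.
set_es_heap() {
    file="$1"
    size="$2"
    sed -i.bak -E "s/-Xms[0-9]+[mgMG]/-Xms${size}/; s/-Xmx[0-9]+[mgMG]/-Xmx${size}/" "$file"
}

# Usage (path is an assumption, check where your compose file actually is):
#   free -g                                          # confirm the host has enough RAM first
#   show_es_heap /opt/clearml/docker-compose.yml
#   set_es_heap  /opt/clearml/docker-compose.yml 4g
#   docker compose down && docker compose up -d      # restart so the new heap applies
```

Setting -Xms and -Xmx to the same value, as in the suggestion above, avoids heap resizing at runtime.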

  
  
Posted one year ago

Okay, thank you for the suggestions, we'll try it out

  
  
Posted one year ago

So currently it's -Xms2g -Xmx2g which means 2GB

  
  
Posted one year ago

Can you send what you have there now?

  
  
Posted one year ago

sure

  
  
Posted one year ago

No errors in logs, but that's because I restarted the deployment :(

  
  
Posted one year ago

I guess I'll let you know the next time this happens haha

  
  
Posted one year ago

How large are your ES indices? Maybe this is ES being inefficient?
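One way to check, sketched with the Elasticsearch _cat/indices API. This assumes ES is reachable on localhost:9200 from wherever you run it, which may not hold for your deployment (you might need to run it inside the clearml-elastic container instead). The function names are made up for the example.

```shell
#!/bin/sh
# List indices with their on-disk size in MB, largest first.
# The default base URL is an assumption; pass another one as the first argument.
es_indices() {
    base="${1:-http://localhost:9200}"
    curl -s "${base}/_cat/indices?h=index,store.size&bytes=mb&s=store.size:desc"
}

# Sum the second column (size in MB) of the listing and print the total in GB.
total_gb() {
    awk '{ sum += $2 } END { printf "%.1f GB\n", sum / 1024 }'
}

# Usage:
#   es_indices              # per-index sizes
#   es_indices | total_gb   # overall size
```

The per-index listing is usually more telling than the total, since a single oversized index can dominate.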

  
  
Posted one year ago

the entire index is 35G

  
  
Posted one year ago

i think you're right, the default elastic values do not seem to work for us

  
  
Posted one year ago

how much memory do you have assigned to ES?

  
  
Posted one year ago

it's in the default env vars for elasticsearch in the docker compose

  
  
Posted one year ago