Answered
Hey Hey, I'm Having Trouble With ClearML And ALBs In AWS. Could Someone Help Me?

Hey hey, I'm having trouble with ClearML and ALBs in AWS. Could someone help me? 🙂

I am currently trying to deploy ClearML in AWS. The basic infrastructure has an Application Load Balancer (ALB) and an Autoscaling Group that launches a ClearML AMI. I followed the instructions described in https://clear.ml/docs/latest/docs/deploying_clearml/clearml_server_config/#sub-domains-and-load-balancers for setting up the ALB.

The steps I did are:
- Edited /opt/clearml/config/apiserver.conf and added the domain associated with my ALB via Route53
- Created one HTTPS listener with host_header conditions ( app. , api. , files. ) that point to the respective target groups
- Created 3 HTTP Target Groups ( app. , api. , files. ) for the appropriate ports (8080, 8008, 8081) that target the ClearML server in the Autoscaling Group
- Double checked that the security groups allow access (web -> ALB and ALB -> Instance)
- Restarted the ClearML Server
Calling the app.<mydomain>.com (or api. or files. ) address results in a 504 Gateway Time-out error after 10 seconds, and all default health checks in the Target Groups are failing due to Request timed out .

Any idea how I could debug where the problem is? Thanks a lot 🙂

  
  
Posted 2 years ago

Answers 30


it's alongside the health checks tab

  
  
Posted 2 years ago

usually you can see if you are getting timeouts or wrong http code
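To tell a timeout apart from a wrong HTTP code outside the AWS console, a small probe can be run from the instance itself or from another host in the VPC. This is only a rough sketch using the Python standard library; the `probe` and `probe_all` helper names are made up for illustration, and the ports are the ones mentioned in the question (8080, 8008, 8081).

```python
import socket
import urllib.error
import urllib.request

# Opener that ignores proxy environment variables, so the direct path is tested.
_OPENER = urllib.request.build_opener(urllib.request.ProxyHandler({}))


def probe(url: str, timeout: float = 5.0) -> str:
    """Return 'HTTP <code>', 'timeout', or 'connection error: ...' for url."""
    try:
        with _OPENER.open(url, timeout=timeout) as resp:
            return f"HTTP {resp.status}"
    except urllib.error.HTTPError as exc:
        # The server answered, but with an error status (e.g. 404 or 405).
        return f"HTTP {exc.code}"
    except (urllib.error.URLError, socket.timeout, ConnectionError) as exc:
        reason = getattr(exc, "reason", exc)
        if isinstance(exc, socket.timeout) or isinstance(reason, socket.timeout):
            return "timeout"
        return f"connection error: {reason}"


def probe_all(host: str, ports=(8080, 8008, 8081)) -> dict:
    """Probe '/' on each port of host and return {port: result}."""
    return {port: probe(f"http://{host}:{port}/") for port in ports}
```

Calling `probe_all("localhost")` on the instance should show whether the services answer locally. If they answer there but the target group still reports a timeout, the problem is more likely on the network path (security groups, routing) than in ClearML itself.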

  
  
Posted 2 years ago

Thanks a lot for the help debugging!

  
  
Posted 2 years ago

it can help debugging

  
  
Posted 2 years ago

Can you pls share all 3 health checks?

  
  
Posted 2 years ago

Ok, I think that's been very helpful 🙂 I'll experiment a little, now that I know a Health Check that must work. I'll write here if I find something! Thanks a lot for the awesome support!

  
  
Posted 2 years ago

look also at the monitoring tab

  
  
Posted 2 years ago

Just to be sure we are in sync 😁

  
  
Posted 2 years ago

But I still have one thing I'd like to fix: the health check for the file server on port 8081 gives me unhealthy for path "/". Is there a valid path you know I can use there for health checks? A curl gives me

  
  
Posted 2 years ago

atm it's the way to go

  
  
Posted 2 years ago

doubled copy paste

  
  
Posted 2 years ago

And it's still unhealthy. I am starting to suspect that somehow the Autoscaling Part in between the ALB and the ClearML server could be causing the problem.

  
  
Posted 2 years ago

ok, ty very much for your feedback 😄

  
  
Posted 2 years ago

And I could access the web server even if the health check was failing. So that was not a problem in the end.

  
  
Posted 2 years ago

Currently I'm "cheating" and counting a 405 as the success code for the healthcheck.
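For anyone wondering what "counting a 405 as success" means in practice: the target group's success codes (the matcher) accept a comma-separated list such as "200,405" or a range such as "200-299". The function below is just an illustration of that matching rule in plain Python, not AWS code, and the name `matches` is made up.

```python
def matches(status: int, matcher: str) -> bool:
    """Return True if status is covered by a matcher like '200,405' or '200-299'."""
    for part in matcher.split(","):
        part = part.strip()
        if not part:
            continue
        if "-" in part:
            # Range form, e.g. "200-299" (inclusive on both ends).
            low, high = part.split("-", 1)
            if int(low) <= status <= int(high):
                return True
        elif int(part) == status:
            return True
    return False
```

With the default matcher "200" a 405 response marks the target unhealthy, while "200,405" accepts it.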

  
  
Posted 2 years ago

can you change the path in ALB healthcheck pls?

  
  
Posted 2 years ago

Yes!

  
  
Posted 2 years ago

API

  
  
Posted 2 years ago

Web Server

  
  
Posted 2 years ago

from / to /debug.ping
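If you prefer to script the change instead of clicking through the console, something like the following could work with boto3. Treat it as a sketch under assumptions: I am assuming the /debug.ping path is meant for the API server target group (port 8008), the ARN shown in the usage note is a placeholder, and `health_check_update` / `apply_update` are hypothetical helper names. Only `modify_target_group` and its `TargetGroupArn`, `HealthCheckPath` and `Matcher` parameters come from boto3 itself.

```python
def health_check_update(target_group_arn: str, path: str, success_codes: str = "200") -> dict:
    """Build keyword arguments for elbv2 modify_target_group."""
    return {
        "TargetGroupArn": target_group_arn,
        "HealthCheckPath": path,
        "Matcher": {"HttpCode": success_codes},
    }


def apply_update(update: dict) -> None:
    """Send the update to AWS (needs boto3 installed and credentials configured)."""
    import boto3  # third-party; imported lazily so the builder works without it

    boto3.client("elbv2").modify_target_group(**update)
```

Usage would look like `apply_update(health_check_update("<api-target-group-arn>", "/debug.ping"))`, with the real ARN of your API target group filled in.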

  
  
Posted 2 years ago

oops

  
  
Posted 2 years ago

In fact it's the same we are applying to helm charts for k8s

  
  
Posted 2 years ago

the goal is to get healthchecks green so ALB should be able to work

  
  
Posted 2 years ago

You are not cheating 😂

  
  
Posted 2 years ago

I'm going to ask for an update to the docs

  
  
Posted 2 years ago

in a few seconds it should become green

  
  
Posted 2 years ago

These are the settings for health check now

  
  
Posted 2 years ago

File Server

  
  
Posted 2 years ago

JuicyFox94 I think I found the problem. To my absolute shame, the security group of the ALB had no Outbound rules, i.e. no traffic was allowed out of the ALB 🙈 . Now I can access the ClearML Webserver!
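In case someone else runs into the same thing, a quick way to spot a missing outbound rule is to inspect the `IpPermissionsEgress` list that EC2 returns for a security group. The helper below is a simplified sketch: the name `egress_allows` is made up, and it only looks at protocol and port range, ignoring the rule's destination (CIDR ranges or referenced groups), so a positive result still needs a manual look.

```python
def egress_allows(security_group: dict, port: int) -> bool:
    """Return True if any egress rule of the group covers TCP traffic on port."""
    for rule in security_group.get("IpPermissionsEgress", []):
        protocol = rule.get("IpProtocol")
        if protocol == "-1":
            # "-1" means all protocols and all ports.
            return True
        if protocol == "tcp" and rule.get("FromPort", 0) <= port <= rule.get("ToPort", 65535):
            return True
    return False
```

The input is one entry of `boto3.client("ec2").describe_security_groups(GroupIds=[...])["SecurityGroups"]`. For the ALB's group you would check the ports from the question (8080, 8008, 8081); a group with an empty egress list returns False for all of them, which matches the 504 and the failing health checks described above.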

  
  
Posted 2 years ago

This gives me a 200 🙂

  
  
Posted 2 years ago