HAProxy ALOHA Documentation 15.5

TCP

Health checking is vital to ensuring the uptime of your TCP-based services. When HAProxy ALOHA detects that a server is unreachable, it removes that server from the load balancing rotation automatically. The server returns to the rotation once it becomes healthy again.

Active health checks

An active health check attempts to connect to a server at a regular interval. If the connection cannot be established, the health check fails. The minimum configuration for a health check is the check keyword on a server line.

Check a TCP port

A basic TCP-layer health check tries to connect to the server's TCP port. The check passes when the server answers with a SYN/ACK packet. Enable it by adding the check parameter to each server line that you want to monitor.

The load balancer tries to connect to port 80 on each server.

backend be_myapp
   server srv1 10.0.0.1:80 check
   server srv2 10.0.0.2:80 check

To send health check probes to a port other than the one to which normal traffic is sent, add the port parameter.

The health checks target port 8080.

backend be_myapp
   server srv1 10.0.0.1:80 check port 8080
   server srv2 10.0.0.2:80 check port 8080

Change the health check interval

By default, HAProxy ALOHA sends a health check every two seconds. Change this by adding the inter parameter to the server line.

Health checks are sent once every four seconds.

backend be_myapp
   server srv1 10.0.0.1:80 check inter 4s
   server srv2 10.0.0.2:80 check inter 4s

Use any of the following time suffixes:

  • us : microseconds

  • ms : milliseconds

  • s : seconds

  • m : minutes

  • h : hours

  • d : days
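As an illustration of the suffixes listed above, the following sketch sets a sub-second interval using ms. The 500ms value is an assumption chosen for the example, not a recommendation. In HAProxy, a value written without a suffix is interpreted as milliseconds, so an explicit suffix avoids ambiguity.

```haproxy
backend be_myapp
   # Probe every 500 milliseconds instead of the default 2 seconds
   # (500ms is an illustrative value only)
   server srv1 10.0.0.1:80 check inter 500ms
   server srv2 10.0.0.2:80 check inter 500ms
```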

Other parameters that affect the check interval are defined below:

| Parameter | Description |
|-----------|-------------|
| inter | Sets the interval between two consecutive health checks. If not specified, the default value is 2s. |
| fastinter | Sets the interval between two consecutive health checks when the server is in a transitional state: UP but going DOWN, or DOWN but going UP. If not set, then inter is used. |
| downinter | Sets the interval between two consecutive health checks when the server is in the DOWN state. If not set, then inter is used. |

The diagram below describes at which phase each of these settings applies.

[Active/Standby with VRRP]
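The three interval parameters can be combined on one server line. The sketch below uses assumed values (4s, 1s, 10s) chosen only to show the relationship: probe quickly while the server's state is uncertain, and slowly while it is known to be down.

```haproxy
backend be_myapp
   # inter 4s:      interval while the server is stable and UP
   # fastinter 1s:  interval while the server is in a transitional state
   # downinter 10s: interval while the server is DOWN
   server srv1 10.0.0.1:80 check inter 4s fastinter 1s downinter 10s
   server srv2 10.0.0.2:80 check inter 4s fastinter 1s downinter 10s
```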

Change the health check thresholds

Use the fall parameter to change the number of consecutive failed health checks that removes a server from the load balancing rotation. The default is 3.

Five failed checks will put the server into the DOWN state.

backend be_myapp
   server srv1 10.0.0.1:80 check fall 5
   server srv2 10.0.0.2:80 check fall 5

Use the rise parameter to set how many consecutive successful checks are needed to bring a down server back up. The default is 2.

Ten successful health checks are needed before the server will return to the load balancing rotation.

backend be_myapp
   server srv1 10.0.0.1:80 check fall 5 rise 10
   server srv2 10.0.0.2:80 check fall 5 rise 10
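The thresholds interact with the check interval to determine roughly how long a state change takes. The sketch below writes the documented defaults out explicitly and notes the approximate timings in comments. The figures are estimates that ignore connection timeouts and assume fastinter and downinter are left unset, so inter applies in every state.

```haproxy
backend be_myapp
   # Defaults made explicit, for illustration:
   #   marked DOWN after about fall x inter = 3 x 2s = 6s of consecutive failures
   #   marked UP after about rise x inter = 2 x 2s = 4s of consecutive successes
   server srv1 10.0.0.1:80 check inter 2s fall 3 rise 2
   server srv2 10.0.0.2:80 check inter 2s fall 3 rise 2
```

Raising fall makes the load balancer more tolerant of brief failures at the cost of slower detection. Raising rise delays the return of a server that may still be unstable.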

Passive health checks

A passive health check monitors live traffic for failed TCP connections. Passive checks will detect errors returning from any part of your proxied service, but they require active traffic to monitor.

Monitor for TCP connection errors

Use the observe layer4 parameter to monitor live traffic for TCP connection errors.

Monitor for TCP connection errors. After 10 consecutive errors, the mark-down value of the on-error parameter marks the server as down:

backend servers
   server server1 192.168.0.10:80 check observe layer4 error-limit 10 on-error mark-down

  1. Add the check parameter to the server lines you want to monitor.

    The check parameter enables an active health check probe that connects to the server's TCP port at a regular interval. Once failed passive health checks have removed a server from the load balancing rotation, a set number of successful active probes brings it back online.

  2. Add the observe layer4 parameter to each server line to activate passive health checking.

  3. Add the error-limit and on-error parameters to set the threshold for failed passive health checks and the action to take when errors exceed that threshold.
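The three steps above can be read off a single server line. The following sketch annotates each part. The rise 3 value is an assumption added for illustration, to show how the active check governs the server's return to the rotation.

```haproxy
backend servers
   # Step 1: check enables active probes; rise 3 (assumed value) means three
   #         successful probes return the server to the rotation
   # Step 2: observe layer4 watches live traffic for TCP connection errors
   # Step 3: error-limit and on-error set the threshold and the action
   server server1 192.168.0.10:80 check rise 3 observe layer4 error-limit 10 on-error mark-down
```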

Set the on-error action

The on-error parameter on the server line determines what action to take when errors exceed the threshold you set with the error-limit parameter. It takes any of the following values:

| Action | Description |
|--------|-------------|
| fastinter | Forces "fastinter" mode, which causes the active health check probes to be sent more rapidly. |
| fail-check | Counts as one failed active health check and forces "fastinter" mode. |
| sudden-death | Simulates a pre-fatal failed check. One more failed check will mark the server as down. It also forces "fastinter" mode. |
| mark-down | Marks the server as down and forces "fastinter" mode. |
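For comparison with mark-down, the sketch below uses the less aggressive sudden-death action. The error-limit of 5 and the second server's address are assumptions made for the example. When the threshold is reached, the server is not removed immediately. Instead, it is left one failed active check away from DOWN, so the active probe gets the final say.

```haproxy
backend servers
   # When passive errors reach the limit of 5, put the server one failed
   # active check away from DOWN instead of marking it down outright
   server server1 192.168.0.10:80 check observe layer4 error-limit 5 on-error sudden-death
   server server2 192.168.0.11:80 check observe layer4 error-limit 5 on-error sudden-death
```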

Active health checks (Direct Server Return)

The LB Layer7 tab configures an HAProxy reverse proxy. You can also load balance TCP by using the LB Layer4 tab, which instead configures a network router using Linux Virtual Server (LVS). While an HAProxy reverse proxy is simpler, you may prefer to use the LB Layer4 tab (LVS) when you want to load balance TCP in combination with Direct Server Return.

In that case, the director section supports checking a connection to a TCP port.

Configure TCP health checks on the LB Layer4 tab.

director exchange 10.0.0.9:443 TCP
   balance roundrobin                               # load balancing algorithm
   mode gateway                                     # forwarding mode

   option tcpcheck
   check port 21 interval 5s timeout 1s             # advanced check parameters
   server exchange1 10.0.0.13:443 weight 10 check   # server exchange1
   server exchange2 10.0.0.14:443 weight 10 check   # server exchange2

  1. Add a check parameter to each server line for which you would like to enable health checking.

  2. Add option tcpcheck in the same section. It takes no arguments.

  3. Add a check line to the section to configure other options. It uses the following syntax:

    check { [timeout <seconds>] [interval <seconds>] [source <ip>] [port <port>]
            [rise <count>] [fall <count>] [inhibit] }

    timeout <seconds>

    Period after which an attempt without a response from the server is considered as failed (default: half of interval)

    interval <seconds>

    Interval between two checks, in seconds (default: 10 seconds)

    source <ip>

    Source IP to use when performing the check

    port <port>

    Forces the destination port (default: real server port, if it exists)

    rise <count>

    A server will be considered as operational after <count> consecutive successful health checks. The default is 1.

    fall <count>

    A server will be considered as dead after <count> consecutive unsuccessful health checks. The default is 1.

    inhibit

    If a server is down, its weight is set to 0 but the server is not deleted. Established connections are not broken, but new connections are dispatched to the other servers.

