How to quickly set up a load-balanced, high-availability Apache cluster

Written by Simon Green

Topics: News, Tech

I wanted a solution for load balancing a web cluster with high availability that is simple to maintain and expand. I have found my solution in HAProxy.

Scenario

Bluhalo Labs HAProxy Test Rig


This demo scenario uses the following environment:

  • Network Configuration
    • Network: 192.168.11.0
    • Subnet Mask: 255.255.255.0
    • Gateway: 192.168.11.254 (although this is irrelevant)
  • 1 x Shared IP for the Load Balancers
    • 192.168.11.40
  • 2 x Load Balancers
    • BHLabs1 – 192.168.11.30
    • BHLabs6 – 192.168.11.39
  • 4 x Web Servers
    • BHLabs2 – 192.168.11.35
    • BHLabs3 – 192.168.11.36
    • BHLabs4 – 192.168.11.37
    • BHLabs5 – 192.168.11.38

At the start of this setup all machines are running Ubuntu 8.04 Server from a standard install with openssh-server installed and the root password set. All setup commands are run as root or with sudo.

Web Server Configuration

Basic Config

As we are only doing this as a basic test, a very simple Apache config is all that is required.
The following command installs Apache 2 along with PHP5, which gives us some basic scripting (to output the server name for testing, for example) that you may wish to play with later.

# apt-get -y install php5

Next you need to create a check file for HAProxy to look for from the load balancers. This file will be used to determine if the servers are up. This will create a blank file called check.txt in the default DocumentRoot for Apache.

# touch /var/www/check.txt

Now stick your test index.html in that directory as well.

# echo "oh hi" > /var/www/index.html
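
If you would rather see which web server answered a given request, the test page can include the machine's hostname. This is only a sketch: the function name write_test_page is made up for illustration, and it assumes the DocumentRoot you pass in already exists and is writable.

```shell
# Hypothetical helper: write a test page that names the server it lives on.
# $1 is the DocumentRoot (/var/www in this article).
write_test_page() {
  docroot="$1"
  # uname -n prints the machine's network node hostname.
  echo "oh hi from $(uname -n)" > "$docroot/index.html"
}

# Example (run as root on each web server):
#   write_test_page /var/www
```

With a page like this on every web server, refreshing the shared IP later on should show the hostname changing as requests are spread across the group.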

Log Modification

Filtering out HAProxy health checks

You don’t want to log hits to the check.txt file in your Apache logs, so put an exclusion in your VirtualHost directive. Here’s an example:

<VirtualHost *>
  ServerAdmin server-alert@bluhalo.com
  DocumentRoot /var/www/
  ErrorLog /var/log/apache2/error.log
  LogLevel warn
  CustomLog /var/log/apache2/access.log combined env=!dontlog
  SetEnvIf Request_URI "^/check\.txt$" dontlog
</VirtualHost>
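
One way to confirm the exclusion is working is to count the check.txt requests in the access log once Apache has been restarted and HAProxy has been polling for a while. A minimal sketch, assuming the log path from the VirtualHost above; count_check_hits is a made-up name.

```shell
# Hypothetical helper: count requests for /check.txt in an Apache access log.
# $1 is the log file (/var/log/apache2/access.log in the VirtualHost above).
count_check_hits() {
  # grep -c prints the number of matching lines. It exits non-zero when the
  # count is 0, so "|| true" keeps that from looking like a failure.
  grep -c ' /check\.txt ' "$1" || true
}

# Example:
#   count_check_hits /var/log/apache2/access.log
```

Entries written before the exclusion was added will still be counted, so look for a number that stops growing rather than for zero.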

Modify the access logs

HAProxy acts as a reverse proxy, so by default the web servers will log the load balancer’s IP in their logs instead of the user’s. HAProxy adds the user’s IP to the request headers in the “X-Forwarded-For” field (this is what “option forwardfor” in the HAProxy config does), so you need to modify the log configuration in your apache2.conf to take advantage of this:

# vi /etc/apache2/apache2.conf

Search for entries that start “LogFormat” …

LogFormat "%h %l %u %t \"%r\" %>s %b \"%{Referer}i\" \"%{User-Agent}i\"" combined
LogFormat "%h %l %u %t \"%r\" %>s %b" common
LogFormat "%{Referer}i -> %U" referer
LogFormat "%{User-agent}i" agent

… and swap “%h” for “%{X-Forwarded-For}i” like:

LogFormat "%{X-Forwarded-For}i %l %u %t \"%r\" %>s %b \"%{Referer}i\" \"%{User-Agent}i\"" combined
LogFormat "%{X-Forwarded-For}i %l %u %t \"%r\" %>s %b" common
LogFormat "%{Referer}i -> %U" referer
LogFormat "%{User-agent}i" agent
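
If you have several web servers to change, the same swap can be scripted rather than done by hand in vi. The following is a sketch only: swap_logformat_host is a made-up name, and it assumes the LogFormat lines start at the beginning of the line with %h as the first field, as they do in the listing above.

```shell
# Hypothetical helper: print an Apache config with %h swapped for
# %{X-Forwarded-For}i at the start of each LogFormat string.
# $1 is the config file. The result goes to stdout; the file is not modified.
swap_logformat_host() {
  sed 's/^\(LogFormat "\)%h /\1%{X-Forwarded-For}i /' "$1"
}

# Example: review the result, then copy it into place yourself.
#   swap_logformat_host /etc/apache2/apache2.conf > /tmp/apache2.conf.new
#   diff /etc/apache2/apache2.conf /tmp/apache2.conf.new
```

Writing to stdout instead of editing in place means you can diff the output against the original before committing to it.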

Load Balancer Configuration

You need to install HAProxy and Heartbeat for this setup to work. HAProxy provides your load balancing functionality and Heartbeat provides your high-availability failover functionality.

# apt-get -y install haproxy heartbeat-2

Let’s start with HAProxy, as that’s the easier one.

HAProxy

Open up the HAProxy config file:

# vi /etc/haproxy.cfg

and replace the whole file on both servers with the following:

global
  log 127.0.0.1 local0
  log 127.0.0.1 local1 notice
  maxconn 4096
  #debug
  #quiet
  user haproxy
  group haproxy

defaults
  log global
  mode  http
  option  httplog
  option  dontlognull
  retries 3
  redispatch
  maxconn 2000
  contimeout  5000
  clitimeout  50000
  srvtimeout  50000

listen bhlabslb 192.168.11.40:80
  mode http
  stats enable
  stats auth admin:password
  balance roundrobin
  option httpclose
  option forwardfor
  option httpchk HEAD /check.txt HTTP/1.0
  server  inst1 192.168.11.35:80 cookie server01 check inter 2000 fall 3
  server  inst2 192.168.11.36:80 cookie server02 check inter 2000 fall 3
  server  inst3 192.168.11.37:80 cookie server03 check inter 2000 fall 3
  server  inst4 192.168.11.38:80 cookie server04 check inter 2000 fall 3
  capture cookie vgnvisitor= len 32

  rspidel ^Set-cookie:\ IP= # do not let this cookie tell our internal IP address
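
As the server list grows it is easy to lose track of what is in the config. The sketch below pulls the backend names, addresses and cookie values out of the file and reports any cookie value used by more than one server. The cookie values only come into play if cookie-based persistence is enabled, which this config does not do, but each server should still have its own value in case it ever is. Both function names are made up for illustration.

```shell
# Hypothetical helper: list every "server" line in an HAProxy config as
# "name address cookie", one backend per line ("-" when no cookie is set).
# $1 is the config file (/etc/haproxy.cfg in this article).
list_backends() {
  awk '$1 == "server" {
    cookie = "-"
    # Look for the "cookie <value>" pair among the server options.
    for (i = 4; i < NF; i++) if ($i == "cookie") cookie = $(i + 1)
    print $2, $3, cookie
  }' "$1"
}

# Hypothetical helper: print any cookie value used by more than one server.
duplicate_cookies() {
  list_backends "$1" | awk '$3 != "-" { print $3 }' | sort | uniq -d
}

# Example:
#   list_backends /etc/haproxy.cfg
#   duplicate_cookies /etc/haproxy.cfg
```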

You also have to allow the HAProxy service to start. Set ENABLED to 1 in /etc/default/haproxy:

# Set ENABLED to 1 if you want the init script to start haproxy.
ENABLED=1
# Add extra flags here.
#EXTRAOPTS="-de -m 16"

Heartbeat

We will be using Heartbeat to pass the shared IP address (192.168.11.40) between our 2 load balancers if one goes down. For this to work, HAProxy needs to be able to bind to an address that doesn’t yet exist on the local system, which is the case on whichever load balancer is not currently holding the shared IP. In order to allow this you need to add the following to /etc/sysctl.conf:

# Allow HAProxy shared IP
net.ipv4.ip_nonlocal_bind = 1

and then run:

# sysctl -p
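
It is worth checking the setting on both load balancers, since HAProxy on the standby will fail to bind without it. A minimal sketch, with nonlocal_bind_configured being a made-up name:

```shell
# Hypothetical helper: succeed if a sysctl.conf-style file enables
# net.ipv4.ip_nonlocal_bind. $1 is the file (/etc/sysctl.conf here).
nonlocal_bind_configured() {
  grep -Eq '^[[:space:]]*net\.ipv4\.ip_nonlocal_bind[[:space:]]*=[[:space:]]*1[[:space:]]*$' "$1"
}

# Example:
#   nonlocal_bind_configured /etc/sysctl.conf && echo "configured"
#
# The running kernel's value can be read directly; 1 means enabled:
#   cat /proc/sys/net/ipv4/ip_nonlocal_bind
```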

Heartbeat requires 3 main configuration files which do not come with the install. First of all, the authkeys file. Do the following on both servers:

# vi /etc/ha.d/authkeys

add the following content, making sure you replace MyPassword with a secure string. This needs to be the same on both servers:

auth 3
3 md5 MyPassword

This file MUST be accessible only by root or Heartbeat won’t start:

# chmod 600 /etc/ha.d/authkeys
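
If you prefer to script this step, the file and its permissions can be written in one go. This is a sketch under assumptions: write_authkeys is a made-up name, and the random secret in the example is just one way of producing a hard-to-guess string.

```shell
# Hypothetical helper: write a Heartbeat authkeys file that only its owner
# can read. $1 is the target path, $2 is the shared secret.
write_authkeys() {
  (
    # umask 077 in a subshell so a new file is created with mode 600 and the
    # caller's umask is left alone.
    umask 077
    printf 'auth 3\n3 md5 %s\n' "$2" > "$1"
  )
  # Tighten the mode as well, in case the file already existed.
  chmod 600 "$1"
}

# Example: generate a random secret once, then use the same value on both
# load balancers.
#   secret=$(dd if=/dev/urandom bs=512 count=1 2>/dev/null | md5sum | cut -d' ' -f1)
#   write_authkeys /etc/ha.d/authkeys "$secret"
```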

Next on each server create the following. Run:

# uname -n

to get the kernel’s take on the local hostname, and then insert this into:

# vi /etc/ha.d/haresources

in the following syntax:

BHLabs1 192.168.11.40

and

BHLabs6 192.168.11.40

Note the hostname changes but not the IP. The IP is the shared IP. Finally the main Heartbeat config file:

# vi /etc/ha.d/ha.cf

On the first server:

#
#       keepalive: how many seconds between heartbeats
#
keepalive 2
#
#       deadtime: seconds-to-declare-host-dead
#
deadtime 10
#
#       What UDP port to use for udp or ppp-udp communication?
#
udpport        694
bcast  eth0
mcast eth0 225.0.0.1 694 1 0
ucast eth0 192.168.11.30
#       What interfaces to heartbeat over?
udp     eth0
#
#       Facility to use for syslog()/logger (alternative to log/debugfile)
#
logfacility     local0
#
#       Tell what machines are in the cluster
#       node    nodename ...    -- must match uname -n
node    BHLabs1
node    BHLabs6

and on the second server:

#
#       keepalive: how many seconds between heartbeats
#
keepalive 2
#
#       deadtime: seconds-to-declare-host-dead
#
deadtime 10
#
#       What UDP port to use for udp or ppp-udp communication?
#
udpport        694
bcast  eth0
mcast eth0 225.0.0.1 694 1 0
ucast eth0 192.168.11.39
#       What interfaces to heartbeat over?
udp     eth0
#
#       Facility to use for syslog()/logger (alternative to log/debugfile)
#
logfacility     local0
#
#       Tell what machines are in the cluster
#       node    nodename ...    -- must match uname -n
node    BHLabs1
node    BHLabs6
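
Because the node names must match uname -n exactly, a quick check on each load balancer can save some head scratching. A minimal sketch, with node_listed being a made-up name:

```shell
# Hypothetical helper: succeed if ha.cf has a "node" directive naming $2.
# $1 is the ha.cf path; $2 defaults to this machine's `uname -n`.
node_listed() {
  name="${2:-$(uname -n)}"
  awk -v n="$name" '
    $1 == "node" { for (i = 2; i <= NF; i++) if ($i == n) found = 1 }
    END { exit !found }
  ' "$1"
}

# Example (run on each load balancer):
#   node_listed /etc/ha.d/ha.cf || echo "this host is not in ha.cf"
```

The comparison is case sensitive, so BHLabs1 and bhlabs1 are treated as different names here.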

Testing

Restart all the services. On the web servers run:

# apache2ctl restart

and on the load balancers:

# /etc/init.d/heartbeat start
# /etc/init.d/haproxy start

You should then be able to hit http://192.168.11.40 and see your test webpage!
You should also have a page full of stats to please the eyes from HAProxy at http://192.168.11.40/haproxy?stats (log in with the admin:password credentials from the config). This can be turned off by removing the following 2 lines from haproxy.cfg, which you should do in a production environment:

 stats enable
 stats auth admin:password

Try the following:

  • Kill the heartbeat service on each load balancer in turn while running “watch ifconfig”. You should see the IP address move from server to server.
  • Pull the plug on either of the load balancers while running a ping from your PC to 192.168.11.40. The ping should carry on, with at most a short gap of roughly the 10-second deadtime while the other load balancer takes over the address.
  • Shut down Apache on any of the web servers. You should see it go red on the stats page within about 6 seconds (three failed checks at 2-second intervals), and it will be removed from the load-balanced group until it comes back up.
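
How quickly a dead web server drops out of the group follows from the check settings on the server lines. With “check inter 2000 fall 3”, HAProxy checks every 2000 ms and only marks a server down after 3 consecutive failures. A rough estimate, ignoring the extra time a check can spend waiting on a timeout (detection_ms is a made-up name):

```shell
# Rough estimate of how long HAProxy takes to mark a dead server as down:
# "fall" consecutive failed checks, one every "inter" milliseconds.
# $1 is inter (ms), $2 is fall. Check timeouts can add to this.
detection_ms() {
  echo $(( $1 * $2 ))
}

detection_ms 2000 3   # prints 6000 for the settings used in this article
```

Lowering inter or fall makes failures show up sooner, at the cost of more check traffic or more false alarms.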

Caveats and best practices

This setup is lacking in some important best practices:

  1. The heartbeat should run on its own network, usually a VLAN dedicated to it.
  2. The web servers should sit on a different network to the frontend on the load balancers. The load balancers would then have a second interface out to that network to connect to the web servers.

Later I will follow this up with a way to add MySQL to this configuration.


Comments

16 Comments For This Post

  1. Joe M says:

    on the subject of load balancing, the LoadMaster 2000 intelligently and efficiently distributes user traffic among web and application servers to ensure users get the best experience possible.

  2. Mike says:

    ^^^ Spam??? HA_Proxy is free…..

  3. JC says:

    Beginner’s question: How does one hook up the VIP to the external IP seen by visitors?

  4. Jake says:

    Hi Simon
    Thanks for a very easy-to-follow guide.

    I’ve been unsuccessful in getting the HAproxy cluster to work though. I’ve set it up in three different VM environments (i can install 8.04 with my eyes shut by now!) I tried it at first on 10.04.1 LTS, but that didnt work and I thought it was to do with the fact that yours is running on 8.04. Now It’s running 8.04.1 Server LTS.

    Would you be able to help me troubleshoot this?

    Summary of my setup:
    Shared IP: 10.40.1.240
    LB1: 10.40.1.241
    LB2: 10.40.1.242
    WEB1: 10.40.1.243
    WEB2: 10.40.1.244
    WEB3: 10.40.1.245

    Problem: the shared IP does not come up.

    When starting heartbeat:
    root@LB1:~# /etc/init.d/heartbeat start
    Starting High-Availability services:
    2011/02/11_17:55:39 INFO: Resource is stopped
    2011/02/11_17:55:40 INFO: Resource is stopped
    Done.
    root@LB1:~#

    Any ideas?

    Thanks!
    Jake

  5. Simon Green says:

    Hi Jake,

    If you post your relevant config files and ifconfig from each server i’ll have a look for you.

    Simon

  6. Jake says:

    Hi Simon

    Thanks for the quick response. OK here goes:

    Firstly, for individual text files for these, please click on these:
    LB1: http://bit.ly/g2KiCS
    LB2: http://bit.ly/iaXnby

    Here’s the text:

    ===========
    LB1 Config files:
    ===========

    INTERFACES:

    root@LB1:~# cat /etc/network/interfaces
    # This file describes the network interfaces available on your system
    # and how to activate them. For more information, see interfaces(5).

    # The loopback network interface
    auto lo
    iface lo inet loopback

    # The primary network interface
    auto eth0
    iface eth0 inet static
    address 10.40.1.241
    gateway 10.40.0.1
    netmask 255.255.248.0
    network 10.40.0.0
    broadcast 10.40.7.255

    HAPROXY:

    root@LB1:~# cat /etc/haproxy.cfg
    global
    log 127.0.0.1 local0
    log 127.0.0.1 local1 notice
    maxconn 4096
    #debug
    #quiet
    user haproxy
    group haproxy

    defaults
    log global
    mode http
    option httplog
    option dontlognull
    retries 3
    redispatch
    maxconn 2000
    contimeout 5000
    clitimeout 50000
    srvtimeout 50000

    listen haproxylb 10.40.1.240:80
    mode http
    stats enable
    stats auth admin:password
    balance roundrobin
    option httpclose
    option forwardfor
    option httpchk HEAD /check.txt HTTP/1.0
    server inst1 10.40.1.243:80 cookie server01 check inter 2000 fall 3
    server inst2 10.40.1.244:80 cookie server02 check inter 2000 fall 3
    server inst3 10.40.1.245:80 cookie server01 check inter 2000 fall 3
    server inst4 10.40.1.246:80 cookie server02 check inter 2000 fall 3
    capture cookie vgnvisitor= len 32

    rspidel ^Set-cookie:\ IP= # do not let this cookie tell our internal IP address

    root@LB1:~# cat /etc/default/haproxy
    # Set ENABLED to 1 if you want the init script to start haproxy.
    ENABLED=1
    # Add extra flags here.
    #EXTRAOPTS="-de -m 16"

    HEARTBEAT:

    root@LB1:~# tail -2 /etc/sysctl.conf
    # Allow HAProxy shared IP
    net.ipv4.ip_nonlocal_bind = 1

    root@LB1:~# cat /etc/ha.d/authkeys
    auth 3
    3 md5 MyPassword

    root@LB1:~# ls -l /etc/ha.d/authkeys
    -rw------- 1 root root 24 2011-02-11 17:42 /etc/ha.d/authkeys

    root@LB1:~# cat /etc/ha.d/haresources
    LB1 10.40.1.240
    LB2 10.40.1.240

    root@LB1:~# cat /etc/ha.d/ha.cf
    #
    # keepalive: how many seconds between heartbeats
    #
    keepalive 2
    #
    # deadtime: seconds-to-declare-host-dead
    #
    deadtime 10
    #
    # What UDP port to use for udp or ppp-udp communication?
    #
    udpport 694
    bcast eth0
    mcast eth0 225.0.0.1 694 1 0
    ucast eth0 10.40.1.242
    # What interfaces to heartbeat over?
    udp eth0
    #
    # Facility to use for syslog()/logger (alternative to log/debugfile)
    #
    logfacility local0
    #
    # Tell what machines are in the cluster
    # node nodename ... -- must match uname -n
    node LB1
    node LB2

    ===========
    LB2 Config files:
    ===========

    INTERFACES:

    root@LB2:~# cat /etc/network/interfaces
    # This file describes the network interfaces available on your system
    # and how to activate them. For more information, see interfaces(5).

    # The loopback network interface
    auto lo
    iface lo inet loopback

    # The primary network interface
    auto eth0
    iface eth0 inet static
    address 10.40.1.242
    gateway 10.40.0.1
    netmask 255.255.248.0
    network 10.40.0.0
    broadcast 10.40.7.255

    HAPROXY:

    root@LB2:~# cat /etc/haproxy.cfg
    global
    log 127.0.0.1 local0
    log 127.0.0.1 local1 notice
    maxconn 4096
    #debug
    #quiet
    user haproxy
    group haproxy

    defaults
    log global
    mode http
    option httplog
    option dontlognull
    retries 3
    redispatch
    maxconn 2000
    contimeout 5000
    clitimeout 50000
    srvtimeout 50000

    listen haproxylb 10.40.1.240:80
    mode http
    stats enable
    stats auth admin:password
    balance roundrobin
    option httpclose
    option forwardfor
    option httpchk HEAD /check.txt HTTP/1.0
    server inst1 10.40.1.243:80 cookie server01 check inter 2000 fall 3
    server inst2 10.40.1.244:80 cookie server02 check inter 2000 fall 3
    server inst3 10.40.1.245:80 cookie server01 check inter 2000 fall 3
    server inst4 10.40.1.246:80 cookie server02 check inter 2000 fall 3
    capture cookie vgnvisitor= len 32

    rspidel ^Set-cookie:\ IP= # do not let this cookie tell our internal IP address

    root@LB2:~# cat /etc/default/haproxy
    # Set ENABLED to 1 if you want the init script to start haproxy.
    ENABLED=1
    # Add extra flags here.
    #EXTRAOPTS="-de -m 16"

    HEARTBEAT:

    root@LB2:~# tail -2 /etc/sysctl.conf
    # Allow HAProxy shared IP
    net.ipv4.ip_nonlocal_bind = 1

    root@LB2:~# cat /etc/ha.d/authkeys
    auth 3
    3 md5 MyPassword

    root@LB2:~# ls -l /etc/ha.d/authkeys
    -rw------- 1 root root 24 2011-02-11 17:42 /etc/ha.d/authkeys

    root@LB2:~# cat /etc/ha.d/haresources
    LB1 10.40.1.240
    LB2 10.40.1.240

    root@LB2:~# cat /etc/ha.d/ha.cf
    #
    # keepalive: how many seconds between heartbeats
    #
    keepalive 2
    #
    # deadtime: seconds-to-declare-host-dead
    #
    deadtime 10
    #
    # What UDP port to use for udp or ppp-udp communication?
    #
    udpport 694
    bcast eth0
    mcast eth0 225.0.0.1 694 1 0
    ucast eth0 10.40.1.241
    # What interfaces to heartbeat over?
    udp eth0
    #
    # Facility to use for syslog()/logger (alternative to log/debugfile)
    #
    logfacility local0
    #
    # Tell what machines are in the cluster
    # node nodename ... -- must match uname -n
    node LB1
    node LB2

  7. Jake says:

    Actually, I just noticed something – i think I’ve got the ucast IP addresses wrong (on both) – that should be the local IP right?

    I’ll try swap these around and see if it comes up.

    Cheers
    Jake

  8. Jake says:

    I just changed the ucast IP addresses on both to be the local IP address for each, but even after a reboot it doesn’t make the shared IP (10.40.1.240) live…

  9. Jake says:

    Hi Simon
    Did you see my other posts? They were awaiting moderation, then just disappeared?

    Thanks
    Jake

  10. Pawel says:

    Hello Simon,

    Thanks a lot for the exhaustive HOWTO! I’m going to use it soon in order to setup very similar system.

    BTW, what software did you use for your nice diagram? Is it maybe Dia?

    Pawel

  11. Musa says:

    Hey Simon,

    Thanks for your great work here. I am configuring this on zentyal server (ubuntu 10.04 based) and when i run command ‘ /etc/init.d/haproxy start’, i get the following error:
    -su: /etc/init.d/haproxy: No such file or directory
    (i am already logged in as root)

    Thanks in advance!

  12. Musa says:

    ok..seems i hadnot installed haproxy properly..
    now that i have installed it and configured it and when i try to run/compile the haproxy file by command ‘ haproxy -f haproxy.cfg’, i get the following error:
    [ALERT] 174/130036 (20357) : Starting proxy migital1: cannot bind socket

    Any help please?

  13. Simon Green says:

    Hi Musa, is there anything else listening on the port you’re trying to bind to? Try ‘sudo netstat -ln’ to check. If that port is clear, then make sure you’re starting haproxy as root.

  14. Musa says:

    yes sir that was the issue exactly and i sorted it out…thanks for your help but i have one more issue:
    currently m monitoring hits on my webservers by avast firewall just to calculate the daily traffic and it is about, say 40 hits/sec.
    but when i today deployed the zentyal server with HAproxy configured i can see that the hits are evenly distributed to two webservers at the back end however when i detach one of the webservers, i see a strange output on my firewall. i observed that hits dont come for a minute to the attached server (most probably becoz of round-robin algo.) and after 1 minutes the hit count reaches to 200 hundred/sec for a while and then disappear again…this keeps on.
    seems HAproxy is lacking failover capability.
    i m using HAproxy first time (in deed loadbalancing first time)just let me know if my assumptions are logical..

    Regards

  15. Musa says:

    Here is my HAproxy file:

    global
    log 127.0.0.1 local0
    log 127.0.0.1 local1 notice

    maxconn 8192
    user haproxy
    group haproxy
    daemon
    #debug
    #quiet

    defaults
    log global
    mode http
    option httplog
    option dontlognull
    retries 3
    option redispatch
    maxconn 2000
    contimeout 5000
    clitimeout 50000
    srvtimeout 50000

    listen migital1 182.xx.xx.93:80
    mode http
    stats enable
    balance roundrobin
    option httpclose
    option forwardfor
    stats auth admin:migi@123
    stats refresh 10s
    server web1 10.25.7.167:80 check inter 2000 fall 3
    server web2 10.25.7.168:80 check inter 2000 fall 3

  16. Musa says:

    Also, i am facing the following issue when i try to restart HAproxy by using command ‘/etc/init.d/haproxy restart’:
    .: 12: Can’t open /etc/rc.d/init.d/functions

    Any idea sir?

