HAProxy (High Availability Proxy) is able to handle a lot of traffic. Similar to Nginx, it uses a single-process, event-driven model. This uses a low (and stable) amount of memory, enabling HAProxy to handle a large number of concurrent requests. It's used by many large companies, including GitHub, Stack Overflow, Reddit, Tumblr and Twitter.
Setting it up is pretty easy as well! We'll cover installing and setting up HAProxy to load balance between three sample NodeJS HTTP servers.
Common Setups
In this example, I'll show using HAProxy to proxy requests between three NodeJS "web servers" (NodeJS applications using Node's HTTP library). This is just for example – in reality, you'll likely see HAProxy used to distribute requests across other "real" web servers, such as Nginx or Apache.
I try to give examples that are as good for production as they are for learning, but in this case, it's not too important – from HAProxy's standpoint, a web server, is a web server, is a web server.
In a more "real" setup, web servers such as Apache or Nginx will stand between HAProxy and a web application. These web servers will typically either respond with static files or proxy requests they receive off to a Node, PHP, Ruby, Python, Go, Java or other dynamic application that might be in place.
You can see examples of Apache/Nginx proxying requests off to an application in the SFH editions/articles on Apache and Nginx.
HAProxy can balance requests between any application that can handle HTTP or even TCP requests. In this case, setting up three NodeJS web servers is just a convenient way to show load balancing between three web servers. How HAProxy sends requests to a web server or TCP endpoint doesn't end up changing how HAProxy works!
Read up on how your application might be affected by using a load balancer here.
Installation
We'll install the latest HAProxy (1.5.1 as of this writing) on Ubuntu 14.04. To do so, we can use the ppa:vbernat/haproxy-1.5 repository, which will get us a recent stable release:
sudo add-apt-repository -y ppa:vbernat/haproxy-1.5
sudo apt-get update
sudo apt-get install -y haproxy
If you're missing the add-apt-repository command on Ubuntu 14.04, installing the software-properties-common package will retrieve it. This is different from previous Ubuntu releases, which used the python-software-properties package to install add-apt-repository.
Sample NodeJS Web Server
Now that HAProxy is installed, we need a few web servers to load balance between. To keep this example simple, we'll use the previously mentioned NodeJS application, which just opens up three HTTP listeners on separate ports:
// File /srv/server.js
var http = require('http');
function serve(ip, port)
{
http.createServer(function (req, res) {
res.writeHead(200, {'Content-Type': 'text/plain'});
res.write(JSON.stringify(req.headers));
res.end("\nThere's no place like "+ip+":"+port+"\n");
}).listen(port, ip);
console.log('Server running at http://'+ip+':'+port+'/');
}
// Create three servers for
// the load balancer, listening on any
// network on the following three ports
serve('0.0.0.0', 9000);
serve('0.0.0.0', 9001);
serve('0.0.0.0', 9002);
On my server, I saved this file at /srv/server.js.
We'll bounce between these three web servers with HAProxy. They will simply respond to any request with the IP address/port of the server, along with the request headers received in the HTTP request to these listeners.
HAProxy Configuration
HAProxy's configuration can be found at /etc/haproxy/haproxy.cfg. Here's what we'll likely see by default:
global
log /dev/log local0
log /dev/log local1 notice
chroot /var/lib/haproxy
stats socket /run/haproxy/admin.sock mode 660 level admin
stats timeout 30s
user haproxy
group haproxy
daemon
# Default SSL material locations
ca-base /etc/ssl/certs
crt-base /etc/ssl/private
# Default ciphers to use on SSL-enabled listening sockets.
# For more information, see ciphers(1SSL).
ssl-default-bind-ciphers kEECDH+aRSA+AES:kRSA+AES:+AES256:RC4-SHA:!kEDH:!LOW:!EXP:!MD5:!aNULL:!eNULL
defaults
log global
mode http
option httplog
option dontlognull
timeout connect 5000
timeout client 50000
timeout server 50000
errorfile 400 /etc/haproxy/errors/400.http
errorfile 403 /etc/haproxy/errors/403.http
errorfile 408 /etc/haproxy/errors/408.http
errorfile 500 /etc/haproxy/errors/500.http
errorfile 502 /etc/haproxy/errors/502.http
errorfile 503 /etc/haproxy/errors/503.http
errorfile 504 /etc/haproxy/errors/504.http
Here we have some global configuration, and then some defaults (which we can override as needed for each server setup).
Within the global section, we likely won't need to make any changes. Here we see that HAProxy is run as the user and group haproxy, which is created during install. Running as a separate system user/group provides some extra avenues for increasing security through user/group permissions.
Furthermore, the master process is run as root – that process then uses chroot to separate HAProxy from other system areas, almost like running within its own container. It also sets itself to run as a daemon (in the background).
Within the defaults section, we see some logging and timeout options. HAProxy can log all web requests, giving you the option to turn off access logs in each web node, or conversely, to turn logs off at the load balancer while keeping them on within each web server (or any combination thereof). Where you want your logs to be generated/saved/aggregated is a decision you should make based on your needs.
If you want to turn off the logging of regular (successful) HTTP requests within HAProxy, add the option dontlog-normal directive. This tells HAProxy to only log requests that result in error responses from the web nodes. Alternatively, you can simply separate error logs from the regular access logs via the option log-separate-errors directive.
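For example, the defaults section could be extended like this to quiet down successful requests (a sketch; keep or drop these options based on your own logging needs):

```
defaults
    log     global
    mode    http
    option  httplog
    option  dontlognull
    option  dontlog-normal       # only log requests with error responses

    # ...or instead, keep all access logs but mark errored requests
    # so they can be filtered into a separate log:
    # option  log-separate-errors
```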
Load Balancing Configuration
To get started balancing traffic between our three HTTP listeners, we need to set some options within HAProxy:

- frontend – where HAProxy listens for incoming connections
- backend – where HAProxy sends incoming connections
- stats – optionally, sets up the HAProxy web tool for monitoring the load balancer and its nodes

Here's an example frontend:
frontend localnodes
bind *:80
mode http
default_backend nodes
This is a frontend, which I have named "localnodes". I named it "localnodes" because the NodeJS app, used to simulate three web servers, is just being run locally. The name of the frontend is arbitrary.

- bind *:80 – I've bound this frontend to all network interfaces on port 80. HAProxy will listen on port 80 on each available network for new HTTP connections
- mode http – This is listening for HTTP connections. HAProxy can handle lower-level TCP connections as well, which is useful for load balancing things like MySQL read databases, if you set up database replication
- default_backend nodes – This frontend should use the backend named nodes, which we'll see next
TCP is "lower level" than HTTP. HTTP is actually built on top of TCP, so every HTTP connection uses TCP, but not every TCP connection is an HTTP request.
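As a sketch of what TCP-mode balancing might look like, here's a hypothetical listener for two MySQL read replicas (the name, addresses and ports are made up for illustration):

```
listen mysql-readers
    bind *:3306
    mode tcp
    balance leastconn
    server mysql-read01 192.168.22.101:3306 check
    server mysql-read02 192.168.22.102:3306 check
```

In mode tcp, HAProxy just shuttles bytes between client and server, so the check here is a plain TCP connection test rather than an HTTP request.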
Next, let's see an example backend configuration:
backend nodes
mode http
balance roundrobin
option forwardfor
http-request set-header X-Forwarded-Port %[dst_port]
http-request add-header X-Forwarded-Proto https if { ssl_fc }
option httpchk HEAD / HTTP/1.1\r\nHost:localhost
server web01 127.0.0.1:9000 check
server web02 127.0.0.1:9001 check
server web03 127.0.0.1:9002 check
This is where we configure the servers to distribute traffic between. I've named this backend "nodes". Similar to the frontend, the name is arbitrary. Let's go through the options seen there:

- mode http – This will pass HTTP requests to the servers listed
- balance roundrobin – Use the roundrobin strategy for distributing load amongst the servers
- option forwardfor – Adds the X-Forwarded-For header so our applications can get the client's actual IP address. Without this, our application would instead see every incoming request as coming from the load balancer's IP address
- http-request set-header X-Forwarded-Port %[dst_port] – We manually add the X-Forwarded-Port header so that our applications know what port to use when redirecting/generating URLs. Note that we use the dst_port "destination port" variable, which is the destination port of the client's HTTP request
- http-request add-header X-Forwarded-Proto https if { ssl_fc } – We add the X-Forwarded-Proto header and set it to "https" if the "https" scheme is used over "http" (via ssl_fc). Similar to the forwarded-port header, this can help our web applications determine which scheme to use when building URLs and sending redirects (Location headers)
- option httpchk HEAD / HTTP/1.1\r\nHost:localhost – Sets the health check HAProxy uses to test if the web servers are still responding. If a server fails to respond without error, it is removed from the pool of servers to balance between. This sends a HEAD request with the HTTP/1.1 version and Host header set, which might be needed if your web server uses virtualhosts to detect which site to send traffic to
- server web01 127.0.0.1:9000 check (and web02, web03) – These three lines add the web servers for HAProxy to balance traffic between. They arbitrarily name each one web01 through web03, set each server's IP address and port, and add the check directive to tell HAProxy to health check the server
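On the application side, these headers can then be consumed. As a minimal sketch in Node, here's a hypothetical helper (the function name and fallback behavior are my own, not from HAProxy or Node) that reads the client IP from the X-Forwarded-For header that option forwardfor adds:

```javascript
// Hypothetical helper: pull the original client IP out of the
// X-Forwarded-For header. The header may contain a comma-separated
// chain of proxies; the left-most entry is the originating client.
function clientIp(req) {
    var xff = req.headers['x-forwarded-for'];
    if (xff) {
        return xff.split(',')[0].trim();
    }
    // No header present: fall back to the socket's address,
    // which behind HAProxy would be the load balancer's IP
    return req.connection.remoteAddress;
}

// Simulate a request object shaped like Node's http.IncomingMessage
var fakeReq = {
    headers: { 'x-forwarded-for': '172.17.42.1' },
    connection: { remoteAddress: '127.0.0.1' }
};
console.log(clientIp(fakeReq)); // 172.17.42.1
```

The same pattern applies to X-Forwarded-Port and X-Forwarded-Proto when generating URLs or redirects.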
Load Balancing Algorithms
Let's take a quick moment to go over something important to load balancing – deciding how to distribute traffic amongst the servers. The following are a few of the options HAProxy offers in version 1.5:
Roundrobin: In the above configuration, we used the fairly basic roundrobin algorithm to distribute traffic amongst our three servers. With roundrobin, each server is used in turn (although you can add weights to each server). It is limited by design to 4095 (!) servers.
Weights default to 1, and can be as high as 256. Since we didn't set any above, all servers have a weight of 1, and roundrobin simply goes from one server to the next.
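If one of your servers can handle more traffic than the others, you could weight it more heavily; for example (the weights here are made up for illustration):

```
backend nodes
    balance roundrobin
    # web01 receives roughly twice as many requests as web02 or web03
    server web01 127.0.0.1:9000 weight 2 check
    server web02 127.0.0.1:9001 weight 1 check
    server web03 127.0.0.1:9002 weight 1 check
```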
We can accomplish sticky sessions with this algorithm. Sticky sessions means user sessions, usually identified by a cookie, will tell HAProxy to always send requests from a given client to the same server. This is useful for web applications that use default session handling, which likely saves session data on the server, rather than within a cookie in the client's browser or in a centralized session store, such as Redis or Memcached.
To do so, you can add a cookie SOME-COOKIE-NAME prefix directive into the backend. Then simply add the cookie directive within each server. HAProxy will then append a cookie (or add onto an existing one) with an identifier for each server. This cookie will be sent back in subsequent requests from the client, letting HAProxy know which server to send the request to. This looks like the following:
backend nodes
# Other options above omitted for brevity
cookie SRV_ID prefix
server web01 127.0.0.1:9000 cookie web01 check
server web02 127.0.0.1:9001 cookie web02 check
server web03 127.0.0.1:9002 cookie web03 check
I suggest using cookie-based sessions or a centralized session store instead, if you have the option to do so within your web applications. Don't rely on requiring clients to always connect to the same web server to stay logged into your application.
static-rr: This is similar to the roundrobin method, except you can't adjust server weights on the fly. In return, it has no design limitation on the number of servers, as roundrobin does.
leastconn: The server with the lowest number of connections receives the connection. This is better for servers with long-running connections (LDAP, SQL, TSE), but not necessarily for short-lived connections (HTTP).
uri: This takes a set portion of the URI used in a request, hashes it, divides it by the total weight of the running servers, and uses the result to decide which server to send traffic to. This effectively ensures the same server handles the same URI endpoints.
This is often used with proxy caches and anti-virus proxies in order to maximize the cache hit rate.
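Switching between these algorithms only requires changing the balance directive within the backend; for example (a sketch):

```
backend nodes
    # Use least-connections instead of roundrobin:
    balance leastconn

    # ...or hash the URI, so requests for the same path
    # consistently reach the same server:
    # balance uri
```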
Not mentioned, but worth checking out, are the remaining balance algorithms:
- rdp-cookie – Session stickiness for the RDP protocol
- first
- source
- url_param
- hdr
Test the Load Balancer
Putting all of those directives inside of the /etc/haproxy/haproxy.cfg file gives us a load balancer!
Here's the complete configuration file at /etc/haproxy/haproxy.cfg:
global
log /dev/log local0
log /dev/log local1 notice
chroot /var/lib/haproxy
stats socket /run/haproxy/admin.sock mode 660 level admin
stats timeout 30s
user haproxy
group haproxy
daemon
# Default SSL material locations
ca-base /etc/ssl/certs
crt-base /etc/ssl/private
# Default ciphers to use on SSL-enabled listening sockets.
# For more information, see ciphers(1SSL).
ssl-default-bind-ciphers kEECDH+aRSA+AES:kRSA+AES:+AES256:RC4-SHA:!kEDH:!LOW:!EXP:!MD5:!aNULL:!eNULL
defaults
log global
mode http
option httplog
option dontlognull
timeout connect 5000
timeout client 50000
timeout server 50000
errorfile 400 /etc/haproxy/errors/400.http
errorfile 403 /etc/haproxy/errors/403.http
errorfile 408 /etc/haproxy/errors/408.http
errorfile 500 /etc/haproxy/errors/500.http
errorfile 502 /etc/haproxy/errors/502.http
errorfile 503 /etc/haproxy/errors/503.http
errorfile 504 /etc/haproxy/errors/504.http
frontend localnodes
bind *:80
mode http
default_backend nodes
backend nodes
mode http
balance roundrobin
option forwardfor
http-request set-header X-Forwarded-Port %[dst_port]
http-request add-header X-Forwarded-Proto https if { ssl_fc }
option httpchk HEAD / HTTP/1.1\r\nHost:localhost
server web01 127.0.0.1:9000 check
server web02 127.0.0.1:9001 check
server web03 127.0.0.1:9002 check
listen stats *:1936
stats enable
stats uri /
stats hide-version
stats auth someuser:password
You can start/restart/reload HAProxy with these settings. Below I restart HAProxy, just because if you've been following along line by line, you may not have started HAProxy yet:
# You can reload if HAProxy is already started
$ sudo service haproxy restart
Then start the Node server:
$ node /srv/server.js
Note that I'm assuming the Node server is being run on the same server as HAProxy for this demonstration – that's why all of the IP addresses used reference localhost, 127.0.0.1.
Then head to your server's IP address or hostname and watch it balance traffic between the three Node servers. I broke out the first request's headers a bit so we can see the added X-Forwarded-* headers:
{"host":"192.169.22.10",
"cache-control":"max-age=0",
"accept":"text/html,application/xhtml+xml,application/xml;q=0.9,image/webp,*/*;q=0.8",
"user-agent":"Mozilla/5.0 (Macintosh; Intel Mac OS X 10_9_4) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/35.0.1916.153 Safari/537.36",
"accept-encoding":"gzip,deflate,
sdch","accept-language":"en-US,en;q=0.8",
"x-forwarded-port":"80", // Look, our x-forwarded-port header!
"x-forwarded-for":"172.17.42.1"} // Look, our x-forwarded-for header!
There's no place like 0.0.0.0:9000 // Our first server, on port 9000
{"host":"192.169.22.10","cache-control":"max-age=0","accept":"text/html,application/xhtml+xml,application/xml;q=0.9,image/webp,*/*;q=0.8","user-agent":"Mozilla/5.0 (Macintosh; Intel Mac OS X 10_9_4) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/35.0.1916.153 Safari/537.36","accept-encoding":"gzip,deflate,sdch","accept-language":"en-US,en;q=0.8","x-forwarded-port":"80","x-forwarded-for":"172.17.42.1"}
There's no place like 0.0.0.0:9001 // Our second server, on port 9001
{"host":"192.169.22.10","cache-control":"max-age=0","accept":"text/html,application/xhtml+xml,application/xml;q=0.9,image/webp,*/*;q=0.8","user-agent":"Mozilla/5.0 (Macintosh; Intel Mac OS X 10_9_4) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/35.0.1916.153 Safari/537.36","accept-encoding":"gzip,deflate,sdch","accept-language":"en-US,en;q=0.8","x-forwarded-port":"80","x-forwarded-for":"172.17.42.1"}
There's no place like 0.0.0.0:9002 // Our third server, on port 9002
See how it round-robins between the three servers in the order they are defined! We also have the x-forwarded-for and x-forwarded-port headers available to us, which our application can use.
Monitoring HAProxy
You may have noticed the following directives, which I haven't discussed yet:
listen stats *:1936
stats enable
stats uri /
stats hide-version
stats auth someuser:password
HAProxy comes with a web interface for monitoring the load balancer and the servers it is set up to use. Let's go over the above options:

- listen stats *:1936 – Use the listen directive, name it stats, and have it listen on port 1936
- stats enable – Enable the stats monitoring dashboard
- stats uri / – The URI to reach it is just / (on port 1936)
- stats hide-version – Hide the version of HAProxy used
- stats auth someuser:password – Use HTTP basic authentication, with the set username and password. In this example, the username is someuser and the password is just password. Don't use that in production – in fact, make sure your firewall blocks external HTTP access to that port
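One way to lock the dashboard down further is to bind the stats listener to the loopback interface only, so it is never exposed publicly; a sketch:

```
listen stats 127.0.0.1:1936
    stats enable
    stats uri /
    stats hide-version
    stats auth someuser:password
```

You could then reach it from your workstation over an SSH tunnel, with something like ssh -L 1936:127.0.0.1:1936 user@your-server, and browse to http://localhost:1936.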
When you head to your server and port in your web browser, here's what the dashboard will look like:
For me, I reached this at http://192.168.22.10:1936 in my browser. The IP address 192.168.22.10 happened to be the IP address of my test server.
We can see the frontend we defined, listed under the name I gave it. (Actually, when this screenshot was taken, the frontend was called "localhost"; I changed it to "localnodes" afterward to be less confusing.) This section shows the status of incoming requests.
There is also the nodes section (again, the name I chose for the defined backend section), showing our defined backend servers. Each server here is green, which shows that it is "healthy". If a health check fails on any of the three servers, it will display as red and won't be included in the rotation of servers.
Finally, there is the stats section, which simply gives information about the stats page that shows this very information.