Reverse Proxy by the Pound
Deciding on a reliable proxy server for all your web services can be a frustrating exercise. Configuring the proxy server can be a pain for the administrators, too. Luckily Pound exists; it makes good sense, and is very nice to configure.
Pound provides a few basic functions. It is, paraphrased from the Pound page:
- A reverse proxy
- A load balancer
- An SSL terminator
- An HTTP sanitizer
- A failover server
- A request redirector
We'll use all of these features in examples, excluding the sanitizer part. Sanitizing is done automatically: Pound checks the HTTP (and HTTPS) headers for correctness, and only passes well-formed requests to the back end servers. Assuming the checking isn't too draconian, this is a useful feature that will catch a good number of exploit attempts.
So what is it that we'd like to do with Pound? How about easily improving two aspects of Web services: reliability and security?
Pound addresses reliability directly, by providing load balancing and failover. These two items are separate, because some load balancer products do not take care to check the state of back end servers before sending requests to them. If a back end ever fails to respond, Pound will stop sending requests to that server until it becomes responsive again; i.e. automatic failover happens.
The security benefit of Pound is that it can remove your web servers from the Internet. If you're a hosting provider or university, you've most likely dealt with compromised websites as a result of long forgotten PHP-based programs lingering about. Most exploits will run code from another website, or download a program from another website and run it. With a reverse proxy, your actual web servers don't need to be allowed access to the Internet at all, rendering this class of exploits officially neutered.
When people begin to move Web services to proxy servers, they quickly realize that even with all this newly gained failover capability, they still have a single point of failure: the proxy server. It's true, which is why running a reliable proxy is important. Extremely busy sites can also install SSL accelerator cards to lighten the CPU load. It's also possible to make the proxy servers themselves failover-capable. CARP is possibly the best option and is worth looking into. For a more universally supported option, VRRP or even OSPF can be used to achieve failover.
Enough stalling; let's see how Pound does this. Let's say we want to use a few Internet accessible IP addresses for SSL websites, and another to serve all non-SSL sites. We'll proxy to back end servers, which are not Internet accessible. For each SSL website, we need to configure a separate service in Pound.
ListenHTTP
    Address 220.127.116.11
    Port 80
    Service
        BackEnd
            Address 10.0.0.2
            Port 80
        End
        BackEnd
            Address 10.0.0.3
            Port 80
        End
    End
End
This tells Pound to listen on port 80 of the "real" (Internet-facing) IP address given in the Address line, and to proxy all requests to the back end servers listed. Failover happens automatically, so there's nothing to configure to get this important feature. SSL certificates would be configured separately, in an additional listener section for port 443.
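For the port 443 section mentioned above, a minimal sketch might look like the following. It assumes a placeholder address from the documentation range (203.0.113.10) and a hypothetical certificate path; the file named by Cert is a PEM file that typically holds both the certificate and its private key. The back ends are the same ones used above and still speak plain HTTP, since Pound terminates SSL itself.

```
# Sketch only: the address and certificate path are placeholders, not real values.
ListenHTTPS
    Address 203.0.113.10
    Port    443
    # PEM file, assumed to contain the certificate and its private key
    Cert    "/etc/pound/www.server.tld.pem"

    Service
        # Decrypted requests go to the back ends over plain HTTP
        BackEnd
            Address 10.0.0.2
            Port    80
        End
        BackEnd
            Address 10.0.0.3
            Port    80
        End
    End
End
```

Each additional SSL site would get its own ListenHTTPS section along these lines, with its own address and certificate.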
If we want to host a bunch of web sites that don't require SSL, it makes sense to "virtual host" them from the same IP address. This normally works by the Web server checking the HTTP Host header to see which site was requested, and then serving the appropriate site. You can still do this with a proxy server, and in Pound, it's configured like so:
ListenHTTP
    Address 22.214.171.124
    Port 80
    Service
        HeadRequire "Host: .*www.server.tld.*"
        BackEnd
            Address 10.0.0.1
            Port 80
        End
    End
End
You must configure multiple Service statements, one per website. The above configuration proxies all requests for www.server.tld to 10.0.0.1, port 80; HeadRequire takes a regular expression and only lets the Service handle requests that have a matching header. Multiple BackEnd statements, of course, can exist for each name-based virtual host that's configured.
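To make the "one Service per website" idea concrete, here is a sketch with two name-based sites behind one listener, plus a catch-all that uses Pound's Redirect directive. The second host name (www.other.tld), the listener address, and the redirect target are all hypothetical. It relies on Pound trying Service sections in the order they appear, so the condition-free Service at the end only sees requests that matched nothing earlier.

```
# Sketch only: host names, addresses, and the redirect target are illustrative.
ListenHTTP
    Address 203.0.113.20
    Port    80

    # First site: a single back end
    Service
        HeadRequire "Host: .*www.server.tld.*"
        BackEnd
            Address 10.0.0.1
            Port    80
        End
    End

    # Second (hypothetical) site: two back ends, balanced with failover
    Service
        HeadRequire "Host: .*www.other.tld.*"
        BackEnd
            Address 10.0.0.2
            Port    80
        End
        BackEnd
            Address 10.0.0.3
            Port    80
        End
    End

    # Catch-all: anything that matched no Service above is redirected
    Service
        Redirect "http://www.server.tld"
    End
End
```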
Proxy servers have a fundamental problem with sessions. A session, in general, is when the Web server keeps track of a client, and allows access to some data without requiring another login. If requests are served in a round-robin fashion, sessions break because (for most applications) session data is not synced between Web servers. The simple solution is to configure Pound to be session-aware. Pound supports five different types of session tracking, which gives you several flexible ways of getting clients to the right back ends. The simplest, and the most likely to work reliably, is IP-based sessions:
Session
    Type IP
    TTL 300
End
Just place that within a Service section, and client requests from the same IP that remain active within a 5-minute period will all end up at the same back end server. Header values, cookie data, and even URL contents can also be used to keep track of sessions. None of them work for all situations, so getting sessions working correctly requires a bit of understanding about how your application works.
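As an illustration of one of the alternatives, a cookie-based session might be sketched as follows. The cookie name "sessionid" is an assumption; it would have to match whatever cookie your application actually sets. The back ends are the ones from the earlier examples.

```
# Sketch only: "sessionid" is a hypothetical cookie name.
Service
    BackEnd
        Address 10.0.0.2
        Port    80
    End
    BackEnd
        Address 10.0.0.3
        Port    80
    End

    Session
        Type COOKIE
        # Name of the cookie whose value identifies the session
        ID   "sessionid"
        # Seconds of inactivity before the session is forgotten
        TTL  300
    End
End
```

Compared with IP-based tracking, this keeps clients that share one address (behind a NAT or an outbound proxy, for instance) from all landing on the same back end, but it only works if the application issues such a cookie in the first place.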
There are a few other details to configuring the Pound proxy server, but this is a great start. Once you understand a few fundamental issues with HTTP, sessions, and SSL, it's trivial to configure Pound. Go forth, and bring to the land a robust and secure Web infrastructure.