Catch (But Don't Release) with Squid Web Proxying
We all know and love Squid, the versatile HTTP caching proxy. Squid conserves bandwidth, speeds up Web surfing, and comes with all kinds of controls to rein in unruly users: bandwidth throttling, domain filtering, and user access controls, to name a few. But no matter how skillfully you configure your Squid server, it's easy to bypass it. All your users have to do is delete the references to it in their Web browser configurations. If all you're doing with Squid is caching, this makes no sense, but then some folks just like to get away with stuff. If you're using Squid for filtering, bandwidth control or any other restrictions, you will certainly have a rebel underground to deal with.
Unless you set up Squid as a transparent proxy, that is. Then you don't have to hassle with configuring individual browsers at all, and your users cannot escape your iron fist. To set this up all you need is Squid and iptables.
Transparent proxying has advantages and disadvantages. Having browsers directly configured to use your proxy results in speedier performance and avoids the problems inherent in using a transparent proxy, which we'll get to shortly. If you would rather avoid running a transparent proxy and want to know a slick trick for enforcing "voluntary" compliance, skip ahead to the end of this article.
One way to foil users who want to bypass your Squid proxy is to use some sort of central policy enforcer, like cfengine or Active Directory. Both of these have their limitations. Active Directory's group policies are fine for mass client configurations, but users will still have windows of opportunity to make mischief. The group policy proxy settings only work on Internet Explorer, and if you have to allow your users to surf the Web with Internet Explorer, well, I'm sorry.
cfengine is quite excellent for Linux/Unix systems, and works with any Web browser. Still, as good as it is, users can escape your proxy by deleting the configuration, until the next cfengine run restores it. Unless you run cfengine every few seconds, which is a bad idea.
Transparent Proxy Gotchas
Doubtless there are other tools for creating and enforcing group policies, and these are good things to use in any case. But if you don't want to use these, or want to use them in addition to a transparent proxy, you should know the limitations of transparent proxying. The descriptive names for it are "HTTP interception" and "HTTP hijacking." It violates TCP/IP standards by spoofing the source end of the connection. So the user agent (or Web browser) and origin server think they have a TCP connection with each other when they don't. The user agent is not aware that it is using a proxy, so it may not send the correct HTTP headers to the server. ident lookups won't work, HTTP proxy authentication won't work, and any anti-IP-spoofing rules you have in place must be disabled, because transparent proxying spoofs source packets.
The first step to minimizing hassles is to make sure all of your users' Web browsers are up-to-date, standards-compliant browsers. This is no big deal even on Windows, as you can use Opera, Mozilla, Netscape, K-Meleon, Firefox, Avant, and a host of others. Opera is, in my nearly-humble opinion, the best cross-platform Web browser by a country mile. Whichever you prefer, the point is there are a lot of good choices. If you must support Internet Explorer, you need a version newer than 5.5-pre-SP1. Any version before that will not "refresh" correctly with a transparent proxy, and will display other odd behaviors. (See Resources.)
None of these problems occur when a Web browser is directly configured to use a proxy; everything "just works" because the client knows it is using a proxy, so it sends the correct cache-aware HTTP headers. Then the server knows about the proxy as well, and behaves accordingly.
I'm assuming you already have Squid up and running; if you don't, see Resources for some good howtos. To enable transparent proxying you need two things: some options added to squid.conf, and some iptables rules to divert all outgoing HTTP traffic to Squid. The simplest setup is to have Squid on your firewall/Internet gateway. Add these options to squid.conf:
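Which options apply depends on the Squid release, so treat the following as a sketch and check it against the squid.conf documentation for your version. It assumes Squid listens on its default port 3128. The httpd_accel_* directives belong to the Squid 2.5 series; later releases replaced them with a flag on http_port, shown commented out.

```
# Squid 2.5 and earlier: accept intercepted requests alongside normal proxying
httpd_accel_host virtual
httpd_accel_port 80
httpd_accel_with_proxy on
httpd_accel_uses_host_header on

# Squid 2.6 and later: the four lines above are replaced by one http_port flag
# http_port 3128 transparent

# Squid 3.1 and later: the flag is renamed
# http_port 3128 intercept
```

Use only the form that matches your installed version; the older directives are not recognized by later releases.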
Since Squid is sharing a box with iptables, all the kernel configurations you need are in place. Most Squid documentation gives you a kernel configuration checklist; if your system can run iptables that's all you need to know. This rule assumes a standard multi-homed box, and diverts all outgoing HTTP traffic to Squid:
iptables -t nat -A PREROUTING -i $LAN_IFACE -p tcp --dport 80 -j REDIRECT --to-port 3128
You may use a different port than Squid's default 3128 if you like; just remember to change it in squid.conf as well. Some admins prefer 8080 because it is easy to remember.
If Squid is on a separate server from the firewall, use this rule. This example excludes the Squid server itself at 192.168.1.12, because you don't want to create a loop:
iptables -t nat -A PREROUTING -i $LAN_IFACE ! -s 192.168.1.12 -p tcp --dport 80 -j DNAT --to 192.168.1.12:3128
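The two layouts differ only in the target of the rule, so they can be wrapped in one small helper. This is a minimal sketch, not a finished firewall script: the function name squid_rule and the defaults eth1 and 3128 are placeholders, and the function only prints the command so you can review it before running it as root.

```shell
#!/bin/sh
# Sketch: print (do not run) the interception rule for either layout.
# squid_rule is a hypothetical helper name; eth1 and 3128 are placeholder defaults.

# Usage: squid_rule IFACE PORT [SQUID_IP]
# Without SQUID_IP, Squid is assumed to run on the firewall itself (REDIRECT).
# With SQUID_IP, traffic is sent to that host (DNAT), and the proxy's own
# requests are excluded so they are not looped back to it.
squid_rule() {
    iface=$1
    port=$2
    squid_ip=${3:-}
    if [ -z "$squid_ip" ]; then
        echo "iptables -t nat -A PREROUTING -i $iface -p tcp --dport 80 -j REDIRECT --to-port $port"
    else
        echo "iptables -t nat -A PREROUTING -i $iface ! -s $squid_ip -p tcp --dport 80 -j DNAT --to-destination $squid_ip:$port"
    fi
}

# Print the rule for Squid on the firewall, using the environment or the defaults.
squid_rule "${LAN_IFACE:-eth1}" "${SQUID_PORT:-3128}"
```

Calling squid_rule eth1 3128 192.168.1.12 prints the separate-server form instead. Printing rather than executing keeps the sketch safe to try on a machine where you cannot afford to lock yourself out.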
Giving Some Users A Pass
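There may be some users who require unhindered Internet access. One way to provide it is with additional iptables rules. Packets pass through the nat table's PREROUTING chain before they reach the FORWARD chain, so the exemption itself has to be an ACCEPT rule in nat PREROUTING that comes before the REDIRECT rule in your iptables script; the FORWARD rules then let the exempted port 80 requests out to the Internet. These rules exempt two workstations before Squid gets hold of them:
iptables -t nat -A PREROUTING -i $LAN_IFACE -s 192.168.1.100 -p tcp --dport 80 -j ACCEPT
iptables -t nat -A PREROUTING -i $LAN_IFACE -s 192.168.1.101 -p tcp --dport 80 -j ACCEPT
iptables -A FORWARD -i $LAN_IFACE -o $WAN_IFACE -s 192.168.1.100 -p tcp --dport 80 -j ACCEPT
iptables -A FORWARD -i $LAN_IFACE -o $WAN_IFACE -s 192.168.1.101 -p tcp --dport 80 -j ACCEPT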
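For more than a couple of exempt machines, a loop is less error-prone than hand-written rules. The sketch below only prints commands. It assumes the exemption is done with ACCEPT rules in the nat table's PREROUTING chain, inserted at position 1 with -I so they are evaluated ahead of any redirect rule already in the chain. The function name exempt_rules and the default interface eth1 are hypothetical.

```shell
#!/bin/sh
# Sketch: print one redirect-exemption rule per workstation address.
# exempt_rules is a hypothetical helper name; eth1 is a placeholder default.

# Usage: exempt_rules IFACE IP [IP ...]
exempt_rules() {
    iface=$1
    shift
    for ip in "$@"; do
        # -I PREROUTING 1 puts the rule at the top of the chain, so it is
        # matched before a REDIRECT or DNAT rule that was appended earlier.
        echo "iptables -t nat -I PREROUTING 1 -i $iface -s $ip -p tcp --dport 80 -j ACCEPT"
    done
}

# Print the exemptions for the two workstations used in the example above.
exempt_rules "${LAN_IFACE:-eth1}" 192.168.1.100 192.168.1.101
```

If your FORWARD chain drops traffic by default, the exempted machines still need matching FORWARD rules to reach the Internet.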
Enforcing Voluntary Compliance
As we learned back in the beginning, using a transparent proxy presents some problems, and having Web browsers directly configured to use your nice proxy means better performance, fewer possibly broken transactions, and hopefully fewer whiny users. So here is a nifty little trick to enforce "voluntary" compliance: first set up a simple HTTP server on port 8080 that serves up a page with detailed instructions on how to configure Web browsers to use the proxy. Then use your iptables REDIRECT rule to divert port 80 traffic to this page:
iptables -t nat -A PREROUTING -i $LAN_IFACE -p tcp --dport 80 -j REDIRECT --to-port 8080
It won't hurt to include an explanation of the benefits, like speedier performance, reduced bandwidth usage, and lower levels of network admin wrath.
Resources
- Automate Linux Configuration with cfengine, part 1
- Automate Linux Configuration with cfengine, part 2
- Internet Explorer Always Retrieves from Transparent Cache Servers
- Simple Configuration Tips Put Squid on the Menu
- Squid Puts the Squeeze on Net Wrongdoers
- The Squid FAQ is a gold mine of useful information