Get More Out of Your Pipe with Apache and mod_gzip

By Carla Schroder | Jun 23, 2004
http://www.enterprisenetworkingplanet.com/netos/article.php/3372341/Get-More-Out-of-Your-Pipe-with-Apache-and-modgzip.htm

Some Web sites seem designed to annoy and alienate visitors: teeny tiny fixed fonts, weirdo fixed page widths, ad servers on Mars with content that won't load until the ads do, and all kinds of dynamic jiggery-pokery that does everything but quickly deliver a nice, readable page.

Webmasters who are serious about running high-performance Web servers, and who want pleased and delighted visitors, have a great tool in Apache 1.3's mod_gzip. mod_gzip compresses pages on the fly, reducing their size considerably. Depending on the types of files served, you'll see size reductions ranging from 20% to 80%, and a nice increase in server efficiency. Nothing is needed on the client side except a sane modern Web browser like Mozilla, Firefox, Opera, Galeon, or Konqueror. Mozilla, Firefox, and Opera are nice cross-platform browsers with all kinds of neat features, so don't be afraid to standardize on one of them. You won't be sorry. Internet Explorer more or less handles gzipped content, but some file formats, such as .pdf, don't always render correctly, and it's a big security hole anyway. You'll want to check out the Resources section below for links that clarify some of the issues assorted browsers have with gzipped content.

mod_gzip compresses static files and any CGI-generated pages. You'll get the same results as using gzip on any file: files that are already highly compressed, like .jpg, won't compress much further, and text files will shrink a lot.
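To see the difference for yourself, here is a minimal shell sketch that gzips a chunk of repetitive text and an equal-sized chunk of random bytes. The random bytes are only a stand-in for already-compressed data such as a .jpg, and the sample text is made up for illustration. It assumes gzip, head -c, and /dev/urandom are available on your system.

```shell
# Build a few hundred lines of repetitive, HTML-like text (made-up sample content)
text=$(i=0; while [ "$i" -lt 500 ]; do
    echo '<p>The quick brown fox jumps over the lazy dog.</p>'
    i=$((i + 1))
done)

# Size of the text before and after gzip
text_size=$(( $(printf '%s\n' "$text" | wc -c) ))
text_gz=$(( $(printf '%s\n' "$text" | gzip -c | wc -c) ))

# Same number of random bytes, standing in for already-compressed data
rand_gz=$(( $(head -c "$text_size" /dev/urandom | gzip -c | wc -c) ))

echo "text:   $text_size bytes -> $text_gz bytes gzipped"
echo "random: $text_size bytes -> $rand_gz bytes gzipped"
```

The text should shrink dramatically, while the random data stays about the same size or even grows a little, which is why there is no point in gzipping images.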

It's Not Magic
But good smart design and coding. When Josephine Surfer wanders to your Web site, Apache checks the headers sent by her Web browser and looks for the "Accept-Encoding: gzip" HTTP request header. Apache smiles, and sends some headers in return: "Content-Type: text/html" and "Content-Encoding: gzip". Josephine's Web browser now knows that it will receive nice lean gzipped pages, and must expand them back into HTML.
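You can play Josephine's browser from the command line. The sketch below defines a small helper, is_gzipped (a name invented for this example), that reads HTTP response headers on standard input and reports whether the server answered with "Content-Encoding: gzip". The curl command and the localhost URL in the comment are assumptions; substitute your own server.

```shell
# Succeeds if the HTTP response headers on stdin include
# "Content-Encoding: gzip" (header names are case-insensitive).
is_gzipped() {
    tr -d '\r' | grep -qi '^content-encoding:[[:space:]]*gzip'
}

# Against a real server (hypothetical URL), send the request header a
# gzip-capable browser would send and inspect the response headers:
#
#   curl -s -o /dev/null -D - -H 'Accept-Encoding: gzip' http://localhost/ \
#       | is_gzipped && echo "compressed" || echo "not compressed"
```

Running the same curl command without the -H option should come back uncompressed, because the server only gzips for clients that ask for it.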

You, the ace Webmaster, define in httpd.conf exactly which mime types and file types will be gzipped, and there are a whole lot of them. See the Configuration section below.

Installation
mod_gzip is available in the usual package formats, and source tarballs. If your particular package does not include bales of documentation, including HTML docs and a well-commented sample configuration file, get the tarball. It's only 415 kilobytes, a small price to pay to get the good documentation.

To build mod_gzip from sources, first edit the Makefile, giving it the path to the apxs file on your system. Usually this is /usr/bin/apxs. (Debian users, you'll need the apache-dev package to get this file.) The APXS setting is the first line in the Makefile:

APXS=/usr/local/sbin/apxs

Change this to

APXS=/usr/bin/apxs

Then run

$ make
# make install
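If you would rather not edit the Makefile by hand, something like the following sketch can make the change for you. The set_apxs function is invented for this example, and it assumes the Makefile contains an APXS= line like the one shown above and that the apxs path contains no "|" characters.

```shell
# Rewrite the APXS= line of a Makefile to point at a given apxs path.
# $1 = Makefile to edit, $2 = full path to apxs
set_apxs() {
    sed "s|^APXS=.*|APXS=$2|" "$1" > "$1.new" && mv "$1.new" "$1"
}

# Example, run from the unpacked mod_gzip source directory:
#   set_apxs Makefile "$(command -v apxs)"
```

Using command -v to find apxs saves guessing whether your distribution put it in /usr/bin, /usr/sbin, or somewhere else entirely.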


Configuration
This is the tedious part, because Apache refuses to read your mind, and insists that you define every last thing in httpd.conf. First, tell Apache to load your nice new mod_gzip module; put this line with the other LoadModule entries:

LoadModule gzip_module modules/mod_gzip.so

Next, find the AddModule lines, and add:

AddModule mod_gzip.c 

Now you can stuff all the mod_gzip directives into their own IfModule section:

<IfModule mod_gzip.c>
#turn mod_gzip on. You can disable it in individual virtual hosts, 
#if you wish
mod_gzip_on                    Yes      
#let mod_gzip serve an existing pre-compressed copy of a
#static file (such as index.html.gz), instead of
#compressing the file for every request
mod_gzip_can_negotiate   Yes
#assemble all the bits of a dynamically-generated page,
#and compress it as one page                                         
mod_gzip_dechunk            Yes  
     
#there's no point in compressing small files (bytes)                               
mod_gzip_minimum_file_size  600       
#sometimes scripts create loops- you
#don't want mod_gzip to get stuck, and create
#an enormous file that eats your hard drive
#and crashes the server
mod_gzip_maximum_file_size  100000      
#maximum size of file in memory                                 
mod_gzip_maximum_inmem_size 100000    

#temp files- say yes only for debugging                                     
mod_gzip_keep_workfiles     No                                             
mod_gzip_temp_dir           /usr/local/apache/gzip 

# define the required HTTP version of the client, to
#automatically weed out requests from antique browsers,
#proxy servers, search engines, bots, and other entities
#that cannot handle compression. Only uncompressed
#pages will be served to these. 
#1000 = HTTP/1.0, 1001 = HTTP/1.1
mod_gzip_min_http             1001

#only handle requests that use these HTTP methods
mod_gzip_handle_methods GET POST

#now the fun stuff- tell which file and mime types
#you want compressed                        
mod_gzip_item_include       file \.htm$
mod_gzip_item_include       file \.html$
mod_gzip_item_include       file \.shtml$
mod_gzip_item_include       file \.shtm$                                   
mod_gzip_item_include       file \.txt$                                    
mod_gzip_item_include       file \.jsp$                                    
mod_gzip_item_include       file \.php$                                    
mod_gzip_item_include       file \.pl$   
                             
mod_gzip_item_include       mime ^text/.*                                  
mod_gzip_item_include       mime ^application/x-httpd-php                  
mod_gzip_item_include       mime ^httpd/unix-directory$                    
mod_gzip_item_include       handler ^perl-script$                          
mod_gzip_item_include       handler ^server-status$                        
mod_gzip_item_include       handler ^server-info$   
              
mod_gzip_item_exclude       file \.css$                                    
mod_gzip_item_exclude       file \.js$                                     
mod_gzip_item_exclude       mime ^image/.*
</IfModule>

Naturally, you will want to refine this for your own needs. Find the mod_gzip.conf.sample file on your system; it's a detailed reference for mod_gzip options. When you're finished, restart Apache. And that's pretty much all there is to it: a few minutes' work for an immediate large payback. If only more things in life worked like this.
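Since a typo in httpd.conf can keep Apache from coming back up, it is worth checking the syntax before restarting. Here is a minimal sketch that wraps the two steps; safe_restart is a name made up for this example, and it assumes your Apache control script supports the configtest and restart commands, as the stock apachectl does. The path in the usage comment is hypothetical.

```shell
# Restart Apache only if the configuration parses cleanly.
# $1 = control script to use (defaults to apachectl on the PATH)
safe_restart() {
    ctl=${1:-apachectl}
    if "$ctl" configtest; then
        "$ctl" restart
    else
        echo "configuration test failed; Apache was not restarted" >&2
        return 1
    fi
}

# Usage, as root:
#   safe_restart
#   safe_restart /usr/local/apache/bin/apachectl   # hypothetical path
```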

Browser Language Magic
The author of mod_gzip, Michael Schröpl, has configured his Web site to recognize the language setting in your browser, and to deliver either German or English pages accordingly. In Mozilla/Firefox/Netscape, edit -> preferences -> navigator -> languages. In Opera, file -> preferences -> languages. In other browsers, eh, you'll figure it out.

Resources