Sawing Linux Logs with Simple Tools
September 15, 2004
By Carla Schroder

So there you are with all of your Linux servers humming along happily. You have tested, tweaked, and configured until they are performing at their peak of perfection. Users are hardly whining at all. Life is good. You may kick back and indulge in a few relaxing rounds of TuxKart. After all, you earned it.

Except for one little remaining chore: monitoring your log files. [insert horrible alarming music of your choice here.] You're conscientious, so you know you can't just ignore the logs until there's a problem, especially for public services like Web and mail. Somewhere up in the pointy-haired suites, they may even be plotting to require you to track and analyze all sorts of server statistics.

Not to worry, for there are many ways to implement data reduction, which is what log parsing is all about. You want to slice and dice your logs to present only the data you're interested in viewing, unless you wish to devote your entire life to manually analyzing log files. Even if you only pay attention to log files when you're debugging a problem, having some tools to weed out the noise is helpful.

Good Ole grep

The simplest method is a keyword search. Suppose you want to separate out the 404 errors in your Apache log, and see if you have any missing files:

$ grep 404 bratgrrl.com-Aug-2004
...
212.27.41.34 - - [30/Aug/2004:02:25:13 -0700] "GET /robots.txt HTTP/1.0" 404 - "-" "Pompos/1.3 http://dir.com/pompos.html"
65.54.188.90 - - [30/Aug/2004:10:32:26 -0700] "GET /robots.txt HTTP/1.0" 404 - "-" "msnbot/0.11 (+http://search.msn.com/msnbot.htm)"
207.65.113.58 - - [12/Aug/2004:06:49:11 -0700] "GET /favicon.ico HTTP/1.1" 404 - "-" "Opera/7.21 (X11; Linux i686; U) [en]"
...

These entries are typical. This site has no robots.txt or favicon, so any requests for these files generate a 404 error. The first two are Web bots. The third entry is probably some random surfer. You can ignore these. So let's screen out robots.txt and favicon, and see what is left:

$ grep 404 bratgrrl.com-Aug-2004 | grep -v -E "favicon.ico|robots.txt"
....
200.16.116.3 - - [29/Aug/2004:20:59:27 -0700] "GET /images/142spacer.gif HTTP/1.0" 404 - "http://www.bratgrrl.com/" "Mozilla/5.0 Galeon/1.2.7 (X11; Linux i686; U;) Gecko/20030131"
200.16.116.3 - - [29/Aug/2004:21:00:08 -0700] "GET /email_crimes.html HTTP/1.0" 404 - "http://www.bratgrrl.com/" "Mozilla/5.0 Galeon/1.2.7 (X11; Linux i686; U;) Gecko/20030131"
....

Now we're getting somewhere. These two files, images/142spacer.gif and email_crimes.html, are referenced somewhere on the Web site, but they do not exist. This is something that should be fixed. How do you find the pages that refer to these files? grep can do this too. Suppose all the site files are in /var/www/bratgrrl:

$ grep -R "142spacer.gif" /var/www/bratgrrl
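
One caveat about the 404 searches above: a bare grep 404 matches that string anywhere on the line, so a byte count of 4404 gets swept up along with the real errors. Here is a minimal sketch that tests the status field itself and tallies the missing paths. It assumes the usual combined log format, where whitespace-separated field 9 is the status and field 7 is the requested path. The sample file and the 192.0.2.10 entry are made up for illustration.

```shell
# Sample access log standing in for bratgrrl.com-Aug-2004.
# The last entry is hypothetical: status 200, but a byte count containing "404".
log=$(mktemp)
cat > "$log" <<'EOF'
212.27.41.34 - - [30/Aug/2004:02:25:13 -0700] "GET /robots.txt HTTP/1.0" 404 - "-" "Pompos/1.3 http://dir.com/pompos.html"
65.54.188.90 - - [30/Aug/2004:10:32:26 -0700] "GET /robots.txt HTTP/1.0" 404 - "-" "msnbot/0.11 (+http://search.msn.com/msnbot.htm)"
200.16.116.3 - - [29/Aug/2004:20:59:27 -0700] "GET /images/142spacer.gif HTTP/1.0" 404 - "http://www.bratgrrl.com/" "Mozilla/5.0 Galeon/1.2.7 (X11; Linux i686; U;) Gecko/20030131"
192.0.2.10 - - [29/Aug/2004:21:05:00 -0700] "GET /index.html HTTP/1.0" 200 4404 "-" "ExampleAgent/1.0"
EOF

# Loose match: counts every line containing "404" anywhere.
loose=$(grep -c 404 "$log")

# Strict match: counts only lines whose status field (field 9) is 404.
strict=$(awk '$9 == 404 {n++} END {print n+0}' "$log")

# Tally the missing paths (field 7), most requested first.
missing=$(awk '$9 == 404 {print $7}' "$log" | sort | uniq -c | sort -rn)

echo "loose: $loose, strict: $strict"
printf '%s\n' "$missing"
rm -f "$log"
```

On a real server, point the awk commands at your own log instead of the sample file. Requests with spaces in the URL will shift the field positions, so treat the numbers as a first pass.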

Here's another cool grep trick for Apache logs. You doubtless noticed that the above examples were referred from http://www.bratgrrl.com. When you're checking to see where your traffic is coming from, you don't care about local referrals. Weed them out with this:

$ cat bratgrrl.com-Aug-2004 | fgrep -v bratgrrl | cut -d\" -f4 | grep -v ^-
http://www.computerbits.com/archive/2004/0800/schroder0408.html
http://www.pdxlinux.org/resources/nw_linux
www.dianagaydon.com/
http://www.netcraft.com/survey/
http://www.techsupportforum.com/computer/topic/3520-1.html
http://us.altavista.com/web/results?tlb=1&kgs=0&ienc=utf8&q=carla+schroder

Now you can see where traffic to your site is coming from, uncluttered by local references. Here's how it works, piece by piece:

  • fgrep -v bratgrrl means "look for the literal string bratgrrl, then exclude lines that contain it."
  • cut -d\" -f4 means "using quotation marks as the delimiter, print only the text in the fourth field." The fourth field is the text between the third and fourth quotation marks.
  • grep -v ^- means "exclude lines that start with a hyphen." Try running the command without this to see why.
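
The same pipeline can be stretched a little to rank referrers by how often they show up. This is only a sketch: it uses grep -F, which does the same job as fgrep (newer GNU grep releases warn that fgrep is obsolescent), and it runs against a made-up sample log. The 192.0.2.x addresses and ExampleAgent are hypothetical.

```shell
# Sample access log; the 192.0.2.x entries are hypothetical.
log=$(mktemp)
cat > "$log" <<'EOF'
200.16.116.3 - - [29/Aug/2004:20:59:27 -0700] "GET /images/142spacer.gif HTTP/1.0" 404 - "http://www.bratgrrl.com/" "Mozilla/5.0 Galeon/1.2.7 (X11; Linux i686; U;) Gecko/20030131"
192.0.2.10 - - [29/Aug/2004:21:05:00 -0700] "GET /index.html HTTP/1.0" 200 4404 "http://www.netcraft.com/survey/" "ExampleAgent/1.0"
192.0.2.11 - - [29/Aug/2004:21:06:00 -0700] "GET /index.html HTTP/1.0" 200 4404 "http://www.netcraft.com/survey/" "ExampleAgent/1.0"
192.0.2.12 - - [29/Aug/2004:21:07:00 -0700] "GET /index.html HTTP/1.0" 200 4404 "http://www.pdxlinux.org/resources/nw_linux" "ExampleAgent/1.0"
212.27.41.34 - - [30/Aug/2004:02:25:13 -0700] "GET /robots.txt HTTP/1.0" 404 - "-" "Pompos/1.3 http://dir.com/pompos.html"
EOF

# Drop local referrals, keep the referrer field, drop empty ("-") referrers,
# then count duplicates and list the busiest referrer first.
referrers=$(grep -F -v bratgrrl "$log" \
    | cut -d '"' -f 4 \
    | grep -v '^-' \
    | sort | uniq -c | sort -rn)

printf '%s\n' "$referrers"
rm -f "$log"
```

The sort | uniq -c | sort -rn tail is the part worth remembering; it turns any list of repeated lines into a ranked count.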

More Simple Stuff

Crafting clever, complex regular expressions is quite fun, and a more worthy use of one's time than comatose drooling in front of "Reality TV." However, there are many simple searches that do the job just fine. You can search /var/log/auth.log quickly to see if anyone has made an inordinate number of failed login attempts. The -i option does a case-insensitive search:

$ grep -i "fail" /var/log/auth.log
...
Sep 13 16:26:34 server02 PAM_unix[27462]: authentication failure; (uid=0) -> root for ssh service
Sep 13 16:26:36 server02 sshd[27462]: Failed password for root from 12.34.45.67 port 3210 ssh2
Sep 13 16:26:38 server02 PAM_unix[27464]: authentication failure; (uid=0) -> root for ssh service
Sep 13 16:26:40 server02 sshd[27464]: Failed password for root from 12.34.45.67 port 3210 ssh2
...

Well well, someone came a' knockin' on the SSH (secure shell) door. Knowledge is power — at this point, you could fine-tune your iptables to drop packets from the originating IP, or you could do a little sleuthing to find the source, or you could create a nice honeypot and amuse yourself trapping the no-good person trying to get into your system. You can even count the number of attempts:

$ grep "12.34.45.67" /var/log/auth.log | wc -l
8656

That's a rather persistent little twit, I'd say.
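
If you don't know the offending address in advance, you can let the log tell you. The sketch below tallies failed SSH passwords per source address, and also shows grep -c -F, which counts matching lines without wc and treats the dots in the address literally instead of as regex wildcards. It assumes sshd log lines shaped like the ones above. The 192.0.2.77 entry is made up for illustration.

```shell
# Sample auth.log; the 192.0.2.77 entry is hypothetical.
log=$(mktemp)
cat > "$log" <<'EOF'
Sep 13 16:26:34 server02 PAM_unix[27462]: authentication failure; (uid=0) -> root for ssh service
Sep 13 16:26:36 server02 sshd[27462]: Failed password for root from 12.34.45.67 port 3210 ssh2
Sep 13 16:26:40 server02 sshd[27464]: Failed password for root from 12.34.45.67 port 3210 ssh2
Sep 13 16:30:02 server02 sshd[27470]: Failed password for root from 192.0.2.77 port 4021 ssh2
EOF

# Count lines mentioning one address; -F means "fixed string", -c means "count".
count=$(grep -c -F "12.34.45.67" "$log")

# Tally failed passwords per address: print the word that follows "from",
# then count duplicates and sort the worst offender to the top.
offenders=$(awk '/Failed password/ {
    for (i = 1; i < NF; i++) if ($i == "from") print $(i + 1)
}' "$log" | sort | uniq -c | sort -rn)

echo "lines for 12.34.45.67: $count"
printf '%s\n' "$offenders"
rm -f "$log"
```

Searching for the word "from" rather than a fixed field number keeps the tally working when the message has extra words before the address.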

Syslog, The Dumping Ground

The syslog — /var/log/syslog — is a dumping ground for log entries from all kinds of daemons, such as Samba and cron:

$ grep -i samba /var/log/syslog
Sep 13 08:50:47 windbag nmbd[1123]:   become_logon_server_success: Samba is now a logon server for workgroup HOMENET on subnet 192.168.1.5
Sep 13 08:50:51 windbag nmbd[1123]: Samba server WINDBAG is now a domain master browser for workgroup HOMENET on subnet 192.168.1.5
Sep 13 08:51:06 windbag nmbd[1123]: Samba name server WINDBAG is now a local master browser for workgroup HOMENET on subnet 192.168.1.5

$ grep -i cron /var/log/syslog
Aug 18 21:18:01 windbag /USR/SBIN/CRON[1752]: (amavis) CMD (test -e /usr/bin/sa-learn && test -e /usr/sbin/amavisd-new && /usr/bin/sa-learn --rebuild >/dev/null 2>&1)

These two snippets demonstrate that you can verify that certain Samba functions are working correctly, and that your cron jobs are running when you want.
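
Since so many daemons share the syslog, it can help to see who is doing most of the talking before you start grepping for names. Here is a rough sketch that counts entries per program. It assumes the traditional layout seen above, where the fifth whitespace-separated field is the program name with an optional [pid] and a trailing colon. The sample file reuses the entries from the examples.

```shell
# Sample syslog built from the entries shown above.
log=$(mktemp)
cat > "$log" <<'EOF'
Sep 13 08:50:47 windbag nmbd[1123]:   become_logon_server_success: Samba is now a logon server for workgroup HOMENET on subnet 192.168.1.5
Sep 13 08:50:51 windbag nmbd[1123]: Samba server WINDBAG is now a domain master browser for workgroup HOMENET on subnet 192.168.1.5
Sep 13 08:51:06 windbag nmbd[1123]: Samba name server WINDBAG is now a local master browser for workgroup HOMENET on subnet 192.168.1.5
Aug 18 21:18:01 windbag /USR/SBIN/CRON[1752]: (amavis) CMD (test -e /usr/bin/sa-learn && test -e /usr/sbin/amavisd-new && /usr/bin/sa-learn --rebuild >/dev/null 2>&1)
EOF

# Take field 5, strip the "[pid]:" or ":" suffix, then count entries per program.
daemons=$(awk '{
    tag = $5
    sub(/\[.*$/, "", tag)   # remove "[1123]:" style suffixes
    sub(/:$/, "", tag)      # remove a bare trailing colon
    print tag
}' "$log" | sort | uniq -c | sort -rn)

printf '%s\n' "$daemons"
rm -f "$log"
```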

Also useful in /var/log/syslog are those strange-looking MARK messages:

Sep 13 19:10:30 windbag -- MARK --
Sep 13 19:30:30 windbag -- MARK --
Sep 13 19:50:30 windbag -- MARK --

This is where you find out if your system rebooted during the night when it wasn't supposed to; the MARK sequence will be interrupted, and you'll see shutdown and startup messages.
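
Eyeballing the MARK sequence gets old quickly, so here is a small sketch that flags suspicious gaps for you. It assumes marks arrive every 20 minutes, as in the sample above, and allows 25 minutes of slack. The two 21:xx entries are made up to create a gap. It only compares marks from the same day, so a gap that spans midnight goes unnoticed, and a gap is a hint to go look for shutdown and startup messages, not proof of a reboot.

```shell
# Sample syslog; the two 21:xx entries are hypothetical and create an 80-minute gap.
log=$(mktemp)
cat > "$log" <<'EOF'
Sep 13 19:10:30 windbag -- MARK --
Sep 13 19:30:30 windbag -- MARK --
Sep 13 19:50:30 windbag -- MARK --
Sep 13 21:10:30 windbag -- MARK --
Sep 13 21:30:30 windbag -- MARK --
EOF

max_gap=1500   # seconds; 25 minutes, a little over the assumed 20-minute interval

# Convert each MARK timestamp to seconds since midnight and compare it
# with the previous MARK on the same day.
gaps=$(awk -v max="$max_gap" '
/-- MARK --/ {
    split($3, t, ":")
    now = t[1] * 3600 + t[2] * 60 + t[3]
    if (seen && $2 == day && now - prev > max)
        printf "gap of %d minutes before %s %s %s\n", (now - prev) / 60, $1, $2, $3
    prev = now; day = $2; seen = 1
}' "$log")

printf '%s\n' "$gaps"
rm -f "$log"
```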

Next month's Scripting Clinic will show how to set up automated email alerts, so when something nasty that requires your attention shows up in your logs, you won't be left in the dark.

Resources

  • See the man pages for grep, cut, and wc.
  • Linux in a Nutshell, by Ellen Siever, is my #1 indispensable Linux command reference.
