Hone Your Scripting With a Regexp Toolbox

Regular expressions are the power tools of shell scripting. Here are a few to help with your administrative tasks.

 By Carla Schroder

The Linux command line is the ultimate power tool. Thanks to the Bash shell, regular expressions, and all the wonderful GNU tools, Linux users can do more cool, useful power-user tasks than with many other operating systems. Regular expressions are the magic incantations that let you find and replace mass quantities of text with a single command, pluck specific text or files out of gigabytes of stuff with precision, and string commands together to perform amazing feats of computing wizardry. Today I shall share with you an assortment of my favorite one-liners for all occasions.

Finding Files

Finding big files, little files, new files, old files, changed files, files by UID or GID, finding files when you're not sure of the name: these are all child's play for the wonderful find command. You can even do some simple security audits. We all know the dangers of setting the SUID bit on files, so why not generate a list of the ones on your system to keep track?

# find / \( -perm -4000 -fprintf suid-list.txt '%#m %u %p\n' \)

This creates a list in the file suid-list.txt that looks like this:

04755 root /bin/umount
04755 root /bin/mount
04754 root /usr/sbin/pppd

There should be a couple dozen or so in a typical Linux installation. (wc -l suid-list.txt counts them for you.) You don't want any homegrown SUID scripts, unless you really, really know what you're doing. man find tells what the options mean. The backslashes keep Bash from interpreting the parentheses itself, so they are passed on to the find command unmolested. This is important; without them, Bash treats the parentheses as its own syntax and the command fails with an error.
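One way to put such a list to work is to keep a known-good copy and compare later scans against it. What follows is only a minimal sketch, assuming GNU find and sort; the helper names list_suid and new_suid and the baseline file are made up for illustration, and -xdev (stay on one filesystem) is an added choice rather than part of the command above.

```shell
#!/bin/bash
# Sketch only: list_suid and new_suid are illustrative names, not standard tools.

# Print "mode owner path" for every SUID regular file under directory $1,
# sorted so that two scans can be compared line by line.
list_suid() {
    find "$1" -xdev -type f -perm -4000 -printf '%#m %u %p\n' 2>/dev/null | sort
}

# Print the entries found in a fresh scan of $1 that are missing
# from the baseline file $2 (which must come from list_suid).
new_suid() {
    list_suid "$1" | comm -13 "$2" -
}
```

As root, list_suid / > suid-baseline.txt records the current state, and a later new_suid / suid-baseline.txt prints only the SUID files that have appeared since.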

These commands search for orphaned files; these are files that belong to logins that are not in /etc/passwd, or groups that are not in /etc/group. This might happen when a user is removed from the system, and you don't track down all their leftover files for deletion or archiving, or whatever you want to do with them:

# find / -nouser
# find / -nogroup

What if you find some? If you want to keep them, you should assign them to a different user:

# find / -nouser -exec chown alrac {} \;

You can also change ownership by the UID of the files:

# find / -uid 1325 -exec chown alrac {} \;
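A mass chown is hard to undo, so it may be worth looking at what would be affected before changing anything. Here is a small sketch of that idea, assuming GNU find; list_orphans is a made-up helper name, and -xdev is an added choice that keeps the search on one filesystem.

```shell
#!/bin/bash
# Sketch only: list_orphans is an illustrative helper name.

# Print the numeric UID, numeric GID, and path of every file under $1
# whose owner has no /etc/passwd entry or whose group has no /etc/group entry.
list_orphans() {
    find "$1" -xdev \( -nouser -o -nogroup \) -printf '%U %G %p\n' 2>/dev/null
}
```

Run list_orphans /home first, review the output, and only then run the chown command on the same directory.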

How do you find the UID or GID of files? Use the stat command. Once you know a specific UID/GID to search for, find can show them all to you:

# find /var -uid 1325
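For the stat step, something like the following sketch prints the numbers you need. It assumes the GNU coreutils version of stat, whose -c option takes a format string; owner_ids is a made-up helper name.

```shell
#!/bin/bash
# Sketch only: owner_ids is an illustrative helper name; stat -c is GNU-specific.

# Print "UID GID username groupname" for the file $1.
# GNU stat prints UNKNOWN for a name that has no passwd or group entry.
owner_ids() {
    stat -c '%u %g %U %G' "$1"
}
```

The first field of the output is the number to hand to find -uid, and the second is the one for find -gid.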

Here is a slick trick for finding all files in your home directory modified after a date and time of your choosing. First create a file with the time and date you want (YYYYMMDDhhmm), then find all the files modified more recently than that file. find compares modification times, so newly created files count too:

$ touch -t 200705011200 date.txt
$ find ~ -newer date.txt

Use \! -newer to find files that are not newer than your date.txt file, that is, files last modified at or before its timestamp:

$ find ~ \! -newer date.txt
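If you want to see both tests at work before turning them loose on your home directory, a throwaway directory does the job. This is just a sketch; the directory and the file names older.txt and newer.txt are invented for the demonstration.

```shell
#!/bin/bash
# Sketch only: runs in a temporary directory; the file names are made up.
demo=$(mktemp -d)

touch -t 200705011200 "$demo/date.txt"     # reference: 1 May 2007, 12:00
touch -t 200704011200 "$demo/older.txt"    # modified a month before the reference
touch -t 200706011200 "$demo/newer.txt"    # modified a month after the reference

# Files modified after the reference file:
newer_list=$(find "$demo" -type f -newer "$demo/date.txt")

# Files modified at or before the reference time (this includes date.txt itself):
older_list=$(find "$demo" -type f \! -newer "$demo/date.txt" | sort)

printf 'newer:\n%s\nnot newer:\n%s\n' "$newer_list" "$older_list"
```

Note that the reference file shows up in the second list, since a file is never newer than itself.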

You can find files changed or created in the previous few minutes. This is handy when you've lost a new download, or want to see what a new application dumped on your system:

$ find ~ -mmin -5

A nice variation on this is files created or changed more than five minutes ago, but less than ten:

$ find ~ -mmin +5 -mmin -10
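The plus and minus signs are easy to mix up, so here is a small sketch that tries both tests on files of known age. It assumes GNU touch, which accepts relative dates with -d; the directory and file names are invented.

```shell
#!/bin/bash
# Sketch only: touch -d with a relative date is a GNU coreutils feature.
demo=$(mktemp -d)

touch -d '1 minute ago'   "$demo/fresh.txt"
touch -d '7 minutes ago'  "$demo/recent.txt"
touch -d '20 minutes ago' "$demo/stale.txt"

# Modified less than five minutes ago:
last_five=$(find "$demo" -type f -mmin -5)

# Modified more than five but less than ten minutes ago:
five_to_ten=$(find "$demo" -type f -mmin +5 -mmin -10)

printf '%s\n' "last five minutes: $last_five" "five to ten minutes: $five_to_ten"
```

Two tests written side by side are joined with an implicit "and", which is why the second search returns only the file that satisfies both.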

Everyone knows how to do simple find searches, like find / -name foo. When you do this as a non-root user, you get all those annoying "Permission denied" messages. Consign the derned things to /dev/null:

$ find / -name foo 2>/dev/null
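If you would rather look at the complaints later than throw them away, point the 2> at a file instead. A sketch in a throwaway directory, with invented names, shows that the results and the error messages travel separately:

```shell
#!/bin/bash
# Sketch only: the directory layout and errors.log are made up for the demo.
demo=$(mktemp -d)
mkdir "$demo/open" "$demo/locked"
touch "$demo/open/foo"
chmod 000 "$demo/locked"    # an ordinary user can no longer read this directory

# Results go to standard output; error messages go to errors.log.
found=$(find "$demo" -name foo 2>"$demo/errors.log")
printf 'found: %s\n' "$found"

chmod 755 "$demo/locked"    # restore access so the directory can be removed
```

Run as an ordinary user, errors.log ends up holding the "Permission denied" line for the locked directory; run as root, it stays empty because root can read the directory anyway.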

Find your biggest files, print the size, owner, and name of each one, and add up the total:

# find / -size +900M -printf '%#s %u %p\n' | awk '{print $1, $2, $3; total += $1} END \
{ print "The total size of these files is", total}'

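In this example, the backslash after END indicates that even though the line appears broken on this page, it is really one long unbroken line. If you enter it as one unbroken line, omit the backslash. If you do keep the line break, make sure nothing, not even a space, follows the backslash. The sizes and the total are in bytes.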
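The one-liner prints only the first word of a file name that contains spaces, because awk splits fields on whitespace. A variation that separates the fields with tabs and sorts the biggest files to the top might look like the sketch below. It assumes GNU find; big_files is a made-up helper name, and -xdev and -type f are added choices.

```shell
#!/bin/bash
# Sketch only: big_files is an illustrative helper name; -printf needs GNU find.

# List regular files under $1 that match the find size test $2 (for example +900M),
# largest first, as tab-separated "bytes owner path", then print the combined size.
big_files() {
    find "$1" -xdev -type f -size "$2" -printf '%s\t%u\t%p\n' 2>/dev/null |
        sort -rn |
        awk -F'\t' '{ print; total += $1 }
                    END { printf "Total: %.0f bytes\n", total }'
}
```

As root, big_files / +900M gives a result comparable to the command above, restricted to the root filesystem.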

This article was originally published on May 22, 2007