Tricks of the Shell-Scripting Trade

If you spend a lot of time managing systems from the console, you probably already do some basic scripting. Here's a collection of tips to take your scripts to the next level.

By Charlie Schluting | Posted Feb 15, 2007
Page of   |  Back to Page 1
Print ArticleEmail Article
  • Share on Facebook
  • Share on Twitter
  • Share on LinkedIn

Plenty of shell programming tutorials exist already; this isn’t yet another “howto.” We’re going to spend a little time talking about some of the frequently unused or misunderstood techniques in shell programming, and also cover some neat tricks involving the shell that may not be obvious to novice users.

For the most part, we’re talking about the bourne shell, because it exists everywhere and is compatible with bash. Bash does have some interesting features, but for our purposes the bourne shell will do just fine.

1. Implementing a lockfile
Every once in a while we’ve got a shell script that needs to run, but dangerous things can happen if we’re running two copies at the same time. The simplest method is to just bail out if a lockfile exists, but you can also implement blocking. Here’s a basic example of checking for a lock file and then bailing out:
#!/bin/sh
LOCKFILE=/tmp/mylock

if [ ! –s $LOCKFILE ]; then
      echo $$ > $LOCKFILE
      # do stuff; the bulk of the script is here
      : > $LOCKFILE
exit 0
else
      echo “PID `cat $LOCKFILE` running” | mailx –s “$0 can’t run” root
      exit 1
fi

Of course you can get much fancier. If this was inside a while loop, you could sleep a few seconds and then retry the lock. The above example uses ‘test’ with the bracket notation to check if the lockfile exists, and that it isn’t empty (the -s option to test). If the return value of test is true, the block of code runs and puts the script’s current PID into the lockfile. At the end of the “do stuff” block of code, which will probably call another script, we truncate the lockfile by using the null command. If the lockfile is non-empty, we send mail to root and bail out with an unsuccessful return code. The subject of the message includes the name of the script ($0), and the body of the message indicates which process ID is currently running.

2. Check return values, always

If you call an external program, or another script of your own, you must always check the return value of that program. Unexpected failures of commands can cause the rest of your script to misbehave, so you need to make sure everything ran properly. The bourne shell has a built-in variable, $?, that holds the return value of the last command. See item number three for an example.

3. Using return codes

Remember, the ‘if’ statement uses the return value of the statement immediately following ‘if’ to determine whether or not it should succeed or fail. So the test command can be used (with bracket notation), and so can any other command.

if [ “$?” -eq ‘0’ ]; then echo yay ; fi

The above statement will print “yay” if the last command executed returned success, else it does nothing. Remember that ‘0’ in the shell is success; it’s the opposite of when you’re programming in C. We also collapsed everything onto a single line here. The parser doesn’t see it that way, though, since the semi-colon represents a newline.

You can also put commands inside if statements. Let’s test to see if a list of machines ping before we try to scp the known_hosts file to them (Solaris ping syntax):

#!/bin/sh
for host in `cat ./hostlist.txt`
do
  echo "doing $host.."
  if ping $host 1; then
  scp /etc/ssh/ssh_known_hosts2 $host:/etc/ssh
fi
done

4. Capturing output from multiple commands

Frequently, people who want to capture output from multiple commands in a script, will end up appending output to the same file. You can use a subshell instead, and redirect all stdout (or stderr if you need to) from the subshell:

#!/bin/sh
(
cat /etc/motd
cat /etc/issue
) > /tmp/motd-issue

5. Other subshell tricks

Changing directories, and then executing a command, can be very useful when you’re piping stdout over ssh. It’s useful for local commands too. Using tar to copy the present directory into /tmp/test/:

tar cf - . | (cd /tmp/test && tar xpf -)

Note that if the cd fails (directory doesn’t exist), then tar will not be executed.

Parallel execution:
In our previous scp example, it would have run very slowly with a long list of hosts. We can make each scp command execute almost in parallel, by backgrounding the subshell process. The following portion of the script will burn through the loop very quickly, because the subshell is backgrounded, and therefore the commands run and the shell doesn’t wait for them to complete before looping again.

if ping $host 1; then
(
scp /etc/ssh/ssh_known_hosts2 $host:/etc/ssh
scp /etc/ssh/ssh_known_hosts2 $host:/etc/ssh
) &
fi

 

7. Piping data and commands over ssh

There are many examples of how piping things over ssh is useful. You can print remotely: cat file.ps | ssh $host lpr Which works because ssh sends its stdin to the stdin of the program you tell it to execute.

You can use tar to copy a directory to a remote host, over ssh:
tar cf - ./dir | ssh $host “cd /dir/ && tar xpf –“

And finally, you can execute scripts without having to manually copy the script to the host:
cat script.sh | ssh $host /bin/sh

8. Here documents

If you want to run commands over ssh in a long argument, similar to #7’s tar copy, you must take care to escape any special characters, or the local shell will try to use them. A quick and dirty way to avoid this is with a here document:
ssh $host <<EOF
#!/bin/sh
# Entire script goes here, on multiple lines
EOF

If you omit the backslash before EOF, variable substitution is also possible. Here documents are frequently used to embed a script within a script too. As a matter of fact, some software vendors distribute their software as a shell script, and the tar file containing the binaries is actually a here document. To extract a here document to /tmp/file, just use this in the script: cat <<EOF > /tmp/file and then start the document on the next line. You can use any string you want, EOF is just the standard.

9. Live shell programming

Remember how we said that the semicolon is used to indicate a newline? This implies that scripts can be written on the same line, excluding here documents. Here’s a quick example of a command I recently ran:

ssh $box "if test `uname -r` = '5.10';
then "mv /etc/dt/config/Xresources /etc/dt/config/C/ &&
/etc/init.d/dtlogin reset" ; fi"

You can add all kinds of shell logic here, and the backslash is useful to start typing on a new line--the shell won't execute the command until you hit return without a backslash at the end. Just remember to quote things that your local shell will try and interpret. Generally people will use && in succession with multiple commands, or if you want the second (or third, or…) command to execute regardless of the return value of the previous command, just use the semicolon.

You can also just type “for” at the command prompt, and it will bring you to a new line waiting for input. An entire shell script embedded inside a loop can be written in real-time in this manner.

Knowing all these shell tricks certainly makes one’s life easier, but you really shouldn’t be spending all your time running commands on remote hosts. We’ll begin to talk about configuration management next week, which, in theory, can make it so you'll never need to login to a remote server to administer it.

Comment and Contribute
(Maximum characters: 1200). You have
characters left.
Get the Latest Scoop with Enterprise Networking Planet Newsletter