Monday, March 22, 2010

UNIX Special Environment Variables

There are several special variables the shell uses, and there are special variables the system defines for each user. SunOS and Solaris systems use different environment variables. If in doubt, check the manual pages. I'll describe some of the important Solaris variables.


  • PWD - always the current directory
  • RANDOM - a different number every time you access it
  • $$ - the current process ID (of the script, not the user's shell)
  • PPID - the parent process's ID (but not always, for functions)
  • $? - exit status of the last command run by the script
  • PS1 - your prompt. "PS1='$PWD:> '" is interesting.
  • $1 to $9 - arguments 1 to 9 passed to your script or function (you can actually have higher, but you need to use braces for those)

PATH - Sets searchpath

The "PATH" environment variable lists directories that contain commands. When you type an arbitrary command, the directories listed are searched in the order specified. The colon is used to separate directory names. An empty string corresponds to the current directory. Therefore the searchpath
:/usr/bin:/usr/ucb
contains three directories, with the current directory being searched first. This is dangerous, as someone can create a program called "ls" and if you change your current directory to the one that contains this program, you will execute this trojan horse. If you must include the current directory, place it last in the searchpath.
/usr/bin:/usr/ucb:
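
One way to audit a searchpath is to look for an empty entry or a literal ".", since both mean the current directory. The following is only a sketch, assuming a POSIX-style shell with functions; the name path_has_cwd is made up for this example.

```sh
#!/bin/sh
# path_has_cwd: hypothetical helper, succeeds if the given searchpath
# contains the current directory (an empty entry or a literal ".").
path_has_cwd() {
    case ":$1:" in
        *::*|*:.:*) return 0 ;;   # empty entry or "." found
        *)          return 1 ;;
    esac
}

for p in ":/usr/bin:/usr/ucb" "/usr/bin:/usr/ucb:" "/usr/bin:/usr/ucb"
do
    if path_has_cwd "$p"
    then echo "$p searches the current directory"
    else echo "$p does not search the current directory"
    fi
done
```

Wrapping the value in colons turns a leading or trailing empty entry into "::", so one pattern catches every position.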

HOME - Your home directory

The "HOME" variable defines where the "cd" goes when it is executed without any arguments. The HOME environment variable is set by the login process.

CDPATH - cd searchpath

When you execute the "cd" command, and specify a directory, the shell searches for that directory inside the current working directory. You can add additional directories to this list. If the shell can't find the directory in the current directory, it will look in the list of directories inside this variable. Adding the home directory, and the directory above the current directory is useful:
CDPATH=$HOME:..
export CDPATH

IFS - Internal Field Separator

The "IFS" variable lists the characters used to terminate a word. I discussed this briefly earlier. Normally, whitespace separates words, and this variable contains a space, a tab and a new line. Hackers find this variable interesting, because it can be used to break into computer systems. A poorly written program may carelessly execute "/bin/ps." A hacker may redefine the PATH variable, and define IFS to be "/." When the program executes "/bin/ps," the shell will treat this as "bin ps." In other words, the program "bin" is executed with "ps" as an argument. If the hacker has placed a program called "bin" in the searchpath, then the hacker gains privileged access.
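
IFS also has honest uses. Here is a small sketch that splits a colon-separated record by changing IFS for a moment and then restoring it. It assumes a POSIX-style shell (for "set --"), and the sample record is made up.

```sh
#!/bin/sh
# Split a colon-separated record by temporarily changing IFS.
# The sample record is made up for illustration.
record="barnett:x:1001:100"

old_ifs=$IFS        # remember the normal separators
IFS=:
set -- $record      # unquoted on purpose, so the shell splits on ":"
IFS=$old_ifs        # restore the separators right away

user=$1
uid=$3
echo "user=$user uid=$uid"
```

Restoring IFS immediately keeps the rest of the script from being surprised by the changed word splitting.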

PS1 - Normal Prompt

The "PS1" variable specifies the prompt printed before each command. It is normally "$ ." The current directory cannot be placed inside this prompt. Well, some people make a joke, and tell a new user to place a period inside this variable. A "." does signify the current directory; however, most users prefer the actual name.

PS2 - Secondary Prompt

The "PS2" environment variable defines the secondary prompt. This is the prompt you see when you execute a multi-line command, such as "for" or "if." You also see it when you forget to terminate a quote. The default value is "> ."

MAIL - Incoming mail

The "MAIL" variable specifies where your mailbox is located. It is set by the login process.

MAILCHECK - How often to check for mail

The "MAILCHECK" variable specifies how often to check for mail, in seconds. The default value is 600 seconds (10 minutes). If you set it to zero, every time the shell types a prompt, it will check for mail.

SHACCT - Accounting file

This variable defines the accounting file, used by the acctcom and acctcms commands.

MAILPATH - searchpath for mail folders

The "MAILPATH" variable lists colon-separated filenames. You can add a "%" after the filename, and specify a special prompt for each mailbox.
In addition, several environment variables are specified by the login process. "TERM" defines the terminal type, and "USER" or "LOGNAME" defines your user ID. "SHELL" defines your default shell, and "TZ" specifies your time zone. Check the manual pages, and test your own environment to find out for sure. The external program "env" prints all current environment variables.

Bourne Shell Variables - Alternate Formats

Earlier, I discussed simple variables in the Bourne shell. Now is the time to go into more detail. Suppose you wanted to append a string to a variable. That is, suppose you had a variable "X" with the value of "Accounts," but you wanted to add a string like ".old," or "_new" making "Accounts.old" or "Accounts_new," perhaps in an attempt to rename a file. The first one is easy. The second requires a special action. In the first case, just add the string
mv $X $X.old
The second example, however, does not work:
mv $X $X_new # WRONG!
The reason? Well, the underscore character is a valid character in a variable name. Therefore the second example evaluates two variables, "X" and "X_new." If the second one is undefined, the variable will have a value of nothing, and the shell will convert it to
mv Accounts
The mv command will take the offered arguments, and complain, as it always wants two or more arguments. A similar problem will occur if you wish to add a letter or number to the value of a variable.

Using quoting and shell variables

There are several solutions. The first is to use shell quoting. Remember, quoting starts and stops the shell from treating the enclosed string from interpretation. All that is needed is to have a quote condition start or stop between the two strings passed to the shell. Place the variable in one string, and the constant in the other. If the variable "x" has the value "home," and you want to add "run" to the end, all of the following combinations are equal to "homerun:"
$x"run"
$x'run'
$x\run
$x''run
$x""run
"$x"run

Using curly braces with variables

There is another solution, using curly braces:
${x}run
This is a common convention in UNIX programs. The C shell also uses the same feature. The UNIX make utility uses this in makefiles, and requires braces for all variable references longer than a single letter. (Make uses either curly braces or parentheses).
This form for variables is very useful. You could standardize on it as a convention. But the real use comes from four variations of this basic form, briefly described below:
+--------------------------------------------------------------+
| Form               Meaning                                   |
+--------------------------------------------------------------+
| ${variable?word}   Complain if undefined                     |
| ${variable-word}   Use new value if undefined                |
| ${variable+word}   Opposite of the above                     |
| ${variable=word}   Use new value if undefined, and redefine. |
+--------------------------------------------------------------+
Why are these forms useful? If you write shell scripts, it is good practice to gracefully handle unusual conditions. What happens if the variable "d" is not defined - and you use the command below?
d=`expr $d + 1`
You get "expr: syntax error"
The way to fix this is to have it give an error if "d" is not defined.
d=`expr "${d?'not defined'}" + 1`
The "?" generates an error: "sh: d: not defined"
If instead, you wanted it to silently use zero, use
d=`expr "${d-0}" + 1`
This uses "0" if "d" is undefined.
If you wish to set the value if it's undefined, use "="
echo $z
echo ${z=23}
echo $z
The first echo outputs a blank line. The next 2 "echo" commands output "23."
Note that you can't use
new=`expr "${old=0}" + 1`
to change the value of "old" because the expr command is run as a subshell script, and changing the value of "old" in that shell doesn't change the value in the parent shell.
I've seen many scripts fail with strange messages if certain variables aren't defined. Preventing this is very easy, once you master these four methods of referring to a Bourne shell variable. Let me describe these in more detail.

${variable?value} - Complain if undefined

The first variation is used when something unusual happens. I think of it as the "Huh???" option, and the question mark acts as the mnemonic for this action. As an example, assume the following script is executed:
#!/bin/sh
cat ${HOME}/Welcome
But suppose the environment variable "HOME" is not set. Without the question mark, you might get a strange error. In this case, the program cat would complain, saying file "/Welcome" does not exist. Change the script to be
#!/bin/sh
cat ${HOME?}/Welcome
and execute it, and you will get the following message instead:
script: HOME: parameter null or not set
As you can see, changing all variables of the form "$variable" to "${variable?}" provides a simple method to improve the error reporting. Better still is a message that tells the user how to fix the problem. This is done by specifying a word after the question mark. Word? Yes, the manual page says a word. In a typical UNIX-like way, that word is very important. You can place a single word after the question mark. But only one word. Perfect for one-word insults to those who forget to set variables:
cat ${HOME?Dummy}/Welcome
This is a perfect error message if you wish to develop a reputation. Some programmers, however, prefer to keep their jobs and friends. If you fall into that category, you may prefer to give an error message that tells the user how to fix the problem. How can you do that with a single word?
Remember my discussion earlier on quoting? And how the shell will consider a whitespace to be the end of the word unless quoted? The solution should be obvious. Just quote the string, which makes the results one word:
cat ${HOME?"Please define HOME, and try again"}/Welcome
Simple, yet this makes a shell script more user-friendly.
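
If you want to try this form without ending your own script, one approach is to run the test inside a sub-shell, because a non-interactive shell exits when the check fails. This is only a sketch; CONFIG_DIR is a made-up variable name.

```sh
#!/bin/sh
# Run the check in a sub-shell "( ... )" so that the failure does not
# terminate this script; a non-interactive shell exits when ${var?} fails.
unset CONFIG_DIR    # CONFIG_DIR is a made-up variable name

if (: "${CONFIG_DIR?Please define CONFIG_DIR, and try again}") 2>/dev/null
then
    status="set"
else
    status="missing"
fi
echo "CONFIG_DIR is $status"
```

Remove the "2>/dev/null" to see the message the shell prints.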

${variable-default} - Use default if undefined

The next variation doesn't generate an error. It simply provides a value for a variable that didn't have one. Here is an example, with user commands shown after the "$" prompt:
$ echo Y is $Y
Y is
$ echo Y is ${Y-default}
Y is default
$ Y=new
$ echo Y is ${Y-default}
Y is new
$
Think of the hyphen as a mnemonic for an optional value, as the hyphen is used to specify an option on a UNIX command line. Like the other example, the word can be more than a single word. Here are some examples:
${b-string}
${b-$variable}
${b-"a phrase with spaces"}
${b-"A complex phrase with variables like $HOME or `date`"}
${b-`command`}
${b-`wc -l
${b-`ypcat passwd | wc -l`}
Any command in this phrase is only executed if necessary. The last two examples count the number of lines in the password file, which might indicate the maximum number of users. Remember - you can use these forms of variables in place of the simple variable reference. So instead of the command
echo Maximum number of users are $MAXUSERS
change it to
echo Maximum number of users are ${MAXUSERS-`wc -l
If the variable is set, then the password file is never checked.

${variable+value} - Change if defined

The third variation uses a plus sign, instead of a minus. The mnemonic is "plus is the opposite of the minus." This is appropriate, as this form does act as the opposite of the previous one. In other words, if the variable is set, then ignore the current value, and use the new value; if it is not set, use nothing. This can be used as a debug aid in Bourne shell scripts. Suppose you wanted to know when a variable was set, and what the current value is. A simple way to do this is to use the echo command, and echo nothing when the variable has no value by using:
echo ${A+"Current value of A is $A"}
This command does print a blank line if A does not have a value. To eliminate this, use either the Berkeley version of echo, or the System V version of echo:
/usr/bin/echo ${A+"A = $A"}"\c"
/usr/ucb/echo -n ${A+"A = $A"}

${variable=value} - Redefine if undefined

Don't forget that these variations are used when you reference a variable, and do not change the value of the variable. Well, the fourth variation is different, in that it does change the value of the variable, if the variable is undefined. It acts like the hyphen, but if used, redefines the variable. The mnemonic for this action? The equals sign. This should be easy to remember, because the equals sign is used to assign values to variables:
$ echo Y is $Y
Y is
$ echo Y is ${Y=default}
Y is default
$ echo Y is $Y
Y is default
$
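
A common way to use this form is with the ":" command, which does nothing but still has its arguments expanded. The sketch below sets defaults at the top of a script; TMPDIR_DEMO and EDITOR_DEMO are made-up names.

```sh
#!/bin/sh
# The ":" command does nothing, but its arguments are still expanded,
# so each line assigns a default only when the variable is unset.
# TMPDIR_DEMO and EDITOR_DEMO are made-up names for this sketch.
unset TMPDIR_DEMO
EDITOR_DEMO=vi

: ${TMPDIR_DEMO=/tmp}
: ${EDITOR_DEMO=ed}

echo "TMPDIR_DEMO=$TMPDIR_DEMO EDITOR_DEMO=$EDITOR_DEMO"
```

The first variable picks up its default, while the second keeps the value it already had.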

Undefining Variables

As you use these features, you may wish to test the behavior. But how do you undefine a variable that is defined? If you try to set it to an empty string:
A=
you will discover that the above tests do not help. As far as they are concerned, the variable is defined. It just has the value of nothing, or null as the manual calls it. To undefine a variable, use the unset command:
unset A
or if you wish to unset several variables
unset A B C D E F G
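
To see the difference between a null variable and an undefined one, you can build a small test from the "+" form. This is a sketch assuming a shell with functions; is_set is a made-up helper name.

```sh
#!/bin/sh
# is_set: hypothetical helper, succeeds if the named variable is defined,
# even when its value is null. It relies on the "+" form described above.
is_set() {
    eval "[ -n \"\${$1+x}\" ]"
}

A=
is_set A && echo "A is defined (but null)"
unset A
is_set A || echo "A is undefined"
```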

${x:-y}, ${x:=y}, ${x:?y}, ${x:+y} forms

As you can see, there is a difference between a variable that has a null value, and a variable that is undefined. While it might seem that all one cares about is defined or undefined, life is rarely so simple. Consider the following:
A=$B
If B is undefined, is A also undefined? No. Remember, the shell evaluates the variables, and then operates on the results. So the above is the same as
A=
which defines the variable, but gives it an empty, or null value. I think most scripts don't care to know the difference between undefined and null variables. They just care if the variables have a real value or not. This makes so much sense, that later versions of the Bourne shell made it easy to test for both cases by creating a slight variation of the four forms previously described: a colon is added after the variable name:
+-----------------------------------------------------------------------+
| Form                Meaning                                           |
+-----------------------------------------------------------------------+
| ${variable:?word}   Complain if undefined or null                     |
| ${variable:-word}   Use new value if undefined or null                |
| ${variable:+word}   Opposite of the above                             |
| ${variable:=word}   Use new value if undefined or null, and redefine. |
+-----------------------------------------------------------------------+
Notice the difference between "${b-2}" and "${b:-2}" in the following example:
$ # a is undefined
$ b=""
$ c="Z"
$ echo a=${a-1}, b=${b-2}, c=${c-3}
a=1, b=, c=Z
$ echo a=${a:-1}, b=${b:-2}, c=${c:-3}
a=1, b=2, c=Z

Order of evaluation

One last point - the special word in one of these formats is only evaluated if necessary. Therefore the cd and pwd commands in the following are only executed if the word is needed:
echo ${x-`cd $HOME;pwd`}
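Also - the word itself is expanded in the current shell, and not a sub-shell, which is why the "=" form can change a variable. Commands inside backquotes are a different matter: POSIX shells run them in a sub-shell, so the cd above does not change the current directory of the calling shell, and neither does the one below, as it executes the commands in a new shell, which then exits.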
echo `cd $HOME;pwd`
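
You can watch the "only if necessary" rule in action by having the backquoted command leave a mark in a scratch file each time it runs. This is a sketch; the scratch file name is arbitrary.

```sh
#!/bin/sh
# Count how often the backquoted command runs by appending to a scratch file.
log=/tmp/lazy.$$
: >"$log"

x=already-set
first=${x-`echo ran >>$log; echo fallback`}    # x is set: command skipped
unset x
second=${x-`echo ran >>$log; echo fallback`}   # x is unset: command runs

runs=`wc -l <"$log" | tr -d ' '`
rm -f "$log"
echo "first=$first second=$second runs=$runs"
```

The scratch file ends up with a single line, because the command ran only for the second assignment.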

Special Variables in the Bourne Shell

Earlier, I discussed Bourne shell variables, and various ways to use them. So far I have only given you the foundation of shell programming. It's time for discussing special Bourne shell variables, which will allow you to write useful scripts. These special variables are identified by the dollar sign, and another character. If the character is a number, it's a positional parameter. If it's not a letter or number, it's a special purpose variable.

Positional Parameters $1, $2, ..., $9

The most important concept in shell scripts is passing arguments to a script. A script with no options is more limited. The Bourne shell syntax for this is simple, and similar to other shells, and awk. As always, the dollar sign indicates a variable. The number after the dollar sign indicates the position on the command line. That is, "$1" indicates the first parameter, and "$2" indicates the second. Suppose you wanted to create a script called rename that takes two arguments. Just create a file with that name, that contains the following:

#!/bin/sh
# rename: - rename a file
# Usage: rename oldname newname
mv $1 $2

Then execute "chmod +x rename" and you have a new UNIX program. If you want to add some simple syntax checking to this script, using the techniques I discussed earlier, change the last line to read:
mv ${1?"missing: original filename"} ${2?"missing new filename"}
This isn't very user friendly. If you do not specify the first argument, the script will report:
rename: 1: missing: original filename

As you can see, the missing variable, in this case "1," is reported, which is a little confusing. A second way to handle this is to assign the positional variables to new names:

#!/bin/sh
# rename: - rename a file
# Usage: rename oldname newname
oldname=$1
newname=$2
mv ${oldname:?"missing"} ${newname:?"missing"}

This will report the error as follows:
rename: oldname: missing
Notice that I had to add the colons before the question mark. Earlier I mentioned how the question mark tests for undefined parameters, while the colon before the question mark complains about empty parameters as well as undefined parameters. Otherwise, the mv command might have complained that it had insufficient arguments.
The Bourne shell can have any number of parameters. However, the positional parameter variables are limited to numbers 1 through 9. You might expect that $10 refers to the tenth argument, but it is the equivalent of the value of the first argument with a zero appended to the end of the value. The other variable format, ${10}, ought to work, but doesn't. The Korn shell does support the ${10} syntax, but the Bourne shell requires work-arounds. One of these is the shift command. When this command is executed, the first argument is moved off the list, and lost. Therefore one way to handle three arguments follows:
#!/bin/sh
arg1=$1;shift;
arg2=$1;shift;
arg3=$1;shift;
echo first three arguments are $arg1 $arg2 and $arg3
The shift command can shift more than one argument. The above example could be:
#!/bin/sh
arg1=$1 arg2=$2 arg3=$3;shift 3
echo first three arguments are $arg1 $arg2 and $arg3
This technique does make it easier to add arguments, but the error message is unfriendly. All you get is "cannot shift" as an error. The proper way to handle syntax errors requires a better understanding of testing and branching, so I will postpone this problem until later.
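
As a preview, here is one way shift is often combined with a loop to walk through more than nine arguments. It is only a sketch: the argument list is set inside the script so the example is self-contained, and "set --" assumes a POSIX-style shell.

```sh
#!/bin/sh
# Walk through any number of arguments with shift.
set -- one two three four five six seven eight nine ten eleven

count=0
last=
while [ $# -gt 0 ]
do
    count=`expr $count + 1`
    last=$1     # $1 is always the next unprocessed argument
    shift
done
echo "saw $count arguments, the last was $last"
```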

$0 - Scriptname

There is a special positional parameter, at location zero, that contains the name of the script. It is useful in error reporting:
echo $0: error
will report "rename: error" when the rename script executes it. This variable is not affected by the shift command.

$* - All positional parameters

Another work-around for the inability to specify parameters 10 and above is the "$*" variable. The "*" is similar to the filename meta-character, in that it matches all of the arguments. Suppose you wanted to write a script that would move any number of files into a directory. If the first argument is the directory, the following script would work:

#!/bin/sh
# scriptname: moveto
# usage:
# moveto directory files.....
directory=${1:?"Missing"};shift
mv $* $directory

If this script was called "moveto" then the command
moveto /tmp *
could easily move hundreds of files into the specified directory. However, if any of the files contain a space in the name, the script would not work. There is a solution, however, using the $@ variable.

$@ - All positional parameters with spaces

The "$@" variable is very similar to the "$*" variable. Yet, there is a subtle, but important distinction. In both cases, all of the positional parameters, starting with $1, are listed, separated by spaces. If there are spaces inside the variables, then "$@" retains the spaces, while "$*" does not. An example will help. Here is a script, called EchoArgs, that echoes its arguments:


#!/bin/sh
# Scriptname: EchoArgs
# It echoes arguments
#First - make sure we are using the Berkeley style echoes
PATH=/usr/ucb:$PATH;export PATH
E="echo -n"
# echo the name of the script
${E} $0:
# now echo each argument, but put a space
# before the argument, and place single quotes
# around each argument
${E} " '${1-"?"}'"
${E} " '${2-"?"}'"
${E} " '${3-"?"}'"
${E} " '${4-"?"}'"
${E} " '${5-"?"}'"
${E} " '${6-"?"}'"
${E} " '${7-"?"}'"
echo

Second, here is a script that tests the difference:

#!/bin/sh
EchoArgs $*
EchoArgs $@
EchoArgs "$*"
EchoArgs "$@"

Now, let's execute the script with arguments that contain spaces:
./TestEcho "a b c" 'd e' f g
The script outputs the following:
./EchoArgs: 'a' 'b' 'c' 'd' 'e' 'f' 'g'
./EchoArgs: 'a' 'b' 'c' 'd' 'e' 'f' 'g'
./EchoArgs: 'a b c d e f g' '?' '?' '?' '?' '?' '?'
./EchoArgs: 'a b c' 'd e' 'f' 'g' '?' '?' '?'
As you can see, $* and $@ act the same when they are not contained in double quotes. But within double quotes, the $* variable treats spaces within variables, and spaces between variables the same. The variable $@ retains the spaces. Most of the time $* is fine. However, if your arguments will ever have spaces in them, then the $@ is required.
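
The same comparison can be made by counting arguments instead of printing them. This sketch assumes a POSIX-style shell; count_args is a made-up helper.

```sh
#!/bin/sh
# count_args: hypothetical helper that reports how many arguments it got.
count_args() {
    echo $#
}

set -- "a b c" 'd e' f g

unquoted=`count_args $*`      # split on every space
star=`count_args "$*"`        # everything joined into one word
at=`count_args "$@"`          # original four arguments preserved
echo "unquoted=$unquoted star=$star at=$at"
```

By the same reasoning, a script that passes filenames along to another command is safer with quotes, for example: mv "$@" "$directory".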

$# - Number of parameters

The "$#" variable is equal to the number of arguments passed to the script. If newscript returned $# as a result, then both
newscript a b c d
and
newscript "a b c" 'd e' f g
would report 4. The command
shift $#
"erases" all parameters because it shifts them away, so they are lost forever.

$$ - Current process ID

The variable "$$" corresponds to the process ID of the current shell running the script. Since no two processes have the same identification number, this is useful in picking a unique temporary filename. The following script selects a unique filename, uses it, then deletes it:

#!/bin/sh
filename=/tmp/$0.$$
cat "$@" | wc -l >$filename
echo `cat $filename` lines were found
/bin/rm $filename

Another use of this variable is to allow one process to stop a second process. Suppose the first process executed
echo $$ >/tmp/job.pid
A second script can kill the first one, assuming it has permissions, using
kill -HUP `cat /tmp/job.pid`
The kill command sends the signal specified to the indicated process. In the above case, the signal is the hang-up, or HUP signal. If you logged into a system from home, and your modem lost the connection, your shell would receive the HUP signal.
I hope you don't mind a brief digression into signals, but these concepts are closely related, so it is worthwhile to cover them together. Any professional-quality script should terminate gracefully. That is, if you kill the script, there should be no extra files left over, and all of the processes should quit at the same time. Most people just put all temporary files in the /tmp directory, and hope that eventually these files will be deleted. They will be, but sometimes the temporary files are big, and can fill up the /tmp disk. Also some people don't mind if it takes a while for a script to finish, but if it causes the system to slow down, or it is sending a lot of error messages to the terminal, then you should stop all child processes of your script when your script is interrupted. This is done with a trap command, which takes one string, and any number of signals as arguments. Therefore a script that kills a second script could be written using:
#!/bin/sh
# execute a script that creates /tmp/job.pid
newscript &
trap 'kill -HUP `cat /tmp/job.pid`' 0 HUP INT TERM
# continue on, waiting for the other to finish
Signals are a very crude form of inter-process communication. You can only send signals to processes running under your user name. HUP corresponds to a hang-up, INT is an interrupt, like a control-C, and TERM is the terminate command. You can use the numbers associated with these signals if you wish, which are 1, 2, and 15. Signal number zero is special. It indicates the script is finished. Therefore setting a trap at signal zero is a way to make sure some commands are done at the end of the script. Signal 1, or the HUP signal, is generally considered to be the mildest signal. Many programs use this to indicate the program should restart itself. Other signals typically mean stop soon (15), while the strongest signal (9) cannot be trapped because it means the process should stop immediately. Therefore if you kill a shell script with signal 9, it cannot clean up any temporary files, even if it wanted to.
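
The same trap mechanism handles the temporary-file problem mentioned above. Here is a sketch that removes its scratch file whether the script ends normally or is interrupted. It assumes a shell with functions, and the file name is arbitrary.

```sh
#!/bin/sh
# Remove the temporary file when the script exits or is interrupted.
tmpfile=/tmp/demo.$$

cleanup() {
    rm -f "$tmpfile"
}
trap cleanup 0                 # signal 0: normal end of the script
trap 'cleanup; exit 1' 1 2 15  # HUP, INT and TERM

date >"$tmpfile"
lines=`wc -l <"$tmpfile" | tr -d ' '`
echo "$tmpfile holds $lines line(s)"
```

The second trap calls exit so that the script actually stops after cleaning up, instead of carrying on where it was interrupted.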

$! - ID of Background job

The previous example with $$ requires the process to create a special filename. This is not necessary if your script launched the other script. This information is returned in the "$!" variable. It indicates the process ID of the process executed with the ampersand, which may be called an asynchronous, or background process. Here is the way to start a background process, do something else, and wait for the background job to finish:
#!/bin/sh
newscript &
trap "kill -TERM $!" 0 1 2 15
# do something else
wait $!
I used the numbers instead of the names of the signals. I used double quotes, so that the variable $! is evaluated properly. I also used the wait command, which causes the shell to sleep until that process is finished. This script will run two shell processes at the same time, yet if the user presses control-C, both processes die. Most of the time, shell programmers don't bother with this. However, if you are running several processes, and one never terminates (like a "tail -f") then this sort of control is required. Another use is to make sure a script doesn't run for a long time. You can start a command in the background, and sleep a fixed amount of time. If the background process doesn't finish by then, kill it:
#!/bin/sh
newscript &
sleep 10
kill -TERM $!
The $! variable is only changed when a job is executed with a "&" at the end. The C shell does not have an equivalent of the $! variable. This is one of the reasons the C shell is not suitable for high-quality shell scripts. Another reason is the C shell has a command similar to trap, but it uses one command for all signals, while the Bourne shell allows you to perform different actions for different signals.
The wait command does not need an argument. If executed with no arguments, it waits for all processes to be finished. You can launch several jobs at once using
#!/bin/sh
job1 & pid=$!
job2 & pid="$pid $!"
job3 & pid="$pid $!"
trap "kill -15 $pid" 0 1 2 15
wait
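
When wait is given a process ID, its own exit status is the exit status of that job, so you can tell which background job failed. In this sketch the two sub-shells stand in for real jobs.

```sh
#!/bin/sh
# Start two background jobs and collect the exit status of each one.
# The sub-shells are stand-ins for real commands.
(sleep 1; exit 0) & pid1=$!
(sleep 1; exit 3) & pid2=$!

status1=0; wait $pid1 || status1=$?
status2=0; wait $pid2 || status2=$?
echo "first job exited with $status1, second job exited with $status2"
```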

$? - error status

The "$?" variable is equal to the error return of the previous program. You can remember this variable, print it out, or perform different actions based on various errors. You can use this to pass information back to the calling shell by exiting a shell script with the number as an argument. Example:
#!/bin/sh
# this is script1
exit 12
Then the following script
#!/bin/sh
script1
echo $?
would print 12.

$- - Set variables


The variable "$-" corresponds to certain internal variables inside the shell. I'll discuss this next.

Options and debugging


The Bourne Shell set command is somewhat unusual. It has two purposes: setting certain shell options, and setting positional parameters. I mentioned positional parameters earlier. These are the arguments passed to a shell script. You can think of them as an array, of which you can only see the first nine values, by using the special variables $1 through $9. As I mentioned earlier, you can use the shift command to discard the first one, and move $2 to $1, etc. If you want to see all of your variables, the command
set
will list all of them, including those marked for export. You can't tell which ones are marked. But the external command env will do this.
You can also explicitly set these variables, by using the set command. Therefore the Bourne shell has one array, but only one. You can place anything in this array, but you lose the old values. You can keep them, however, by a simple assignment:
old=$@
set a b c
# variable $1 is now equal to "a", $2=b, and $3=c
set $old
# variable $1 now has the original value, as does $2, etc.
This isn't perfect. If any argument has a space inside, this information isn't retained. That is, if the first argument is "a b," and the second is "c," then afterwards the first argument will be "a," the second "b," and the third "c." You may have to explicitly handle each one:
one=$1;two=$2;three=$3
set a b c
# argument $1 is "a", etc.
set "$one" "$two" "$three"
# argument $1, $2 and $3 are restored
If you wanted to clear all of the positional parameters, try this:
set x;shift
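
The same "set x;shift" trick is handy when you use set to break a string into words. In this sketch the sample line is made up; a real script might use the output of a command instead.

```sh
#!/bin/sh
# Use "set" to load the positional parameters with the words of a string.
# The sample line is made up; a real script might use command output.
line="Mon Mar 22 10:15:00 EST 2010"

set x $line; shift    # the leading "x" protects against words starting with "-"
weekday=$1
year=$6
echo "weekday=$weekday year=$year count=$#"
```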

Special options

As you recall, the dollar sign is a special character in the Bourne shell. Normally, it's used to identify variables. If the variable starts with a letter, it's a normal variable. If it starts with a number, it's a positional parameter, used to pass parameters to a shell script. Earlier, I discussed the $*, $@, $#, $$, and $! special variables. But there is another class of variables, or perhaps the proper term is flags or options. They are not read. That is, you don't use them in strings, tests, filenames, or anything like this. These variables are boolean variables, and are internal to the shell. That is, they are either true or false. You cannot assign arbitrary values to them using the "=" character. Instead, you use the set command. Also, you can set them and clear them, but you cannot read them. At least, not like other variables. You read them by examining the "$-" variable, which shows you which ones are set.
Excuse me, but I am going too fast. I'm teaching you how to run, before I explained walking. Let's discuss the first flag, and why it's useful.

X - Bourne Shell echo flag

If you are having trouble understanding how a shell script works, you could modify the script, adding echo commands so you can see what is happening. Another solution is to execute the script with the "x" flag. There are three ways to set this flag. The first, and perhaps easiest, is to specify the option when executing the script. To demonstrate, assume the file script is:
#!/bin/sh
a=$1
echo a is $a
Then if you type
sh -x script abc
the script will print out
a=abc
+ echo a is abc
a is abc
Notice that built-in commands are displayed, while external commands are displayed with a "+" before each line. If you have several commands separated by a semicolon, each part would be displayed on its own line.
The "x" variable shows you each line before it executes it. The second way to turn on this variable is to modify the first line of the script, i.e.:
#!/bin/sh -x
As you can see, the first way is convenient if you want to run the script once with the variable set, while the second is useful if you plan to repeat this several times in a row. A large and complex script, however, is difficult to debug when there are hundreds of lines to watch. The solution is to turn the variable on and off as needed, inside the script. The command
set -x
turns it on, while
set +x
turns the flag off again. You can, therefore, turn the "echo before execute" flag on or off when convenient.

V - Bourne Shell verbose flag

A similar flag is the "v," or verbose flag. It is also useful in debugging scripts. The difference is this: The "v" flag echoes the line as it is read, while the "x" flag causes each command to be echoed as it is executed. Let's examine this in more detail. Given the script:
#!/bin/sh
# comment
a=${1:-`whoami`};b=${2:-`hostname`}
echo user $a is using computer $b
typing "sh -x script" causes:
+ whoami
a=barnett
+ hostname
b=grymoire
+ echo user barnett is using computer grymoire
user barnett is using computer grymoire
However, "sh -v script" reports
#!/bin/sh
# comment
a=${1:-`whoami`};b=${2:-`hostname`}
echo user $a is using computer $b
user barnett is using computer grymoire
As you can see, the comments are echoed with the verbose flag. Also, each line is echoed before the variables and the commands in backquotes are evaluated. Also note the "x" command echoes the assignment to variables a and b on two lines, while the verbose flag echoed one line. Perhaps the best way to understand the difference is the verbose flag echoes the line before the shell does anything with it, while the "x" flag causes the shell to echo each command. Think of it as a case of Before and After.

Combining flags

You can combine the flags if you wish. Execute a script with
sh -x -v script
or more briefly
sh -xv script
Inside, you can use any of these commands
set -x -v
set -xv
set +x +v
set +xv
The first line of a script has an exception. You can use the format
#!/bin/sh -xv
but the following will not work:
#!/bin/sh -x -v
UNIX systems only pass the first argument to the interpreter. In the example above, the shell never sees the "-v" option.

U - unset variables

Another useful flag for debugging is the "u" flag. Previously, I mentioned how the variable form "${x:?}" reports an error if the variable is null or not set. Well, instead of changing every variable to this form, just use the "-u" flag, which will report an error for any unset variable.
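Here is a small sketch of the difference the flag makes. The variable name undefined_name is made up. I run each test in a subshell, with the error message thrown away, so the failure doesn't stop the rest of the script.

```shell
#!/bin/sh
unset undefined_name

# Without -u, an unset variable silently expands to nothing.
if (set +u; echo "value: $undefined_name") >/dev/null 2>&1
then without_u="no error"
else without_u="error"
fi

# With -u, the same expansion is an error and the subshell exits.
if (set -u; echo "value: $undefined_name") >/dev/null 2>&1
then with_u="no error"
else with_u="error"
fi

echo "without -u: $without_u, with -u: $with_u"
```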

N - Bourne Shell non-execute flag

A simple way to check a complex shell script is the "-n" option. If set, the shell will read the script, and parse the commands, but not execute them. If you wanted to check for syntax errors, but not execute the script, use this command.
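To illustrate, this sketch builds two throwaway scripts, one of them missing its "fi", and checks both with "sh -n". The file names under /tmp are my own invention. Neither echo command inside the test scripts is ever executed.

```shell
#!/bin/sh
# Build two throwaway scripts: one valid, one with a missing "fi".
good=/tmp/good.$$
bad=/tmp/bad.$$
printf 'if true\nthen\n echo ok\nfi\n' >$good
printf 'if true\nthen\n echo oops\n' >$bad

# sh -n parses the file and reports syntax errors through its exit status.
sh -n $good 2>/dev/null && good_status=passed || good_status=failed
sh -n $bad 2>/dev/null && bad_status=passed || bad_status=failed

rm -f $good $bad
echo good script $good_status, bad script $bad_status
```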

E - Bourne Shell exit flag

I haven't discussed the exit status much. Every external program or shell script exits with a status. A zero status is normal. Any positive value is usually an error. I normally check the status when I need to, and ignore it when I don't care. You can ignore errors by simply not looking at the error status, which is the "$?" variable I mentioned last time. (If the program prints error messages, you have to redirect the messages elsewhere). Still, you may have a case where the script isn't working the way you expect. The "-e" flag can be used for this: if any error occurs, the shell script will immediately exit. This can also be used to make sure that any errors are known and anticipated. This would be very important if you wanted to modify some information, but only if no errors have happened. You wouldn't want to corrupt some important database, would you? Suppose the following script is executed:
#!/bin/sh
word=$1
grep $word my_file >/tmp/count
count=`wc -l </tmp/count`
echo I found $count words in my_file
The script searches for a pattern inside a file, and prints out how many times the pattern is found. The grep program, however, exits with an error status if no words are found. If the "e" option is set, the shell terminates before executing the count program. If you were concerned about errors, you could set the "e" option at the beginning of the script. If you find out later that you want to ignore the error, bracket it with instructions to disable the option:
set +e # ignore errors
grep $word my_file >/tmp/count
set -e
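If you want to see the effect for yourself, here is a minimal sketch. It writes a two-line test script to a temporary file and runs it with and without the "e" option. The file names and the search word are made up for the demonstration.

```shell
#!/bin/sh
# Throwaway files for the demonstration (hypothetical names).
data=/tmp/my_file.$$
demo=/tmp/demo.$$
echo "apple banana" >$data
cat >$demo <<EOF
grep cherry $data >/dev/null
echo reached the end
EOF

# Without -e the failed grep is ignored and the echo still runs.
without_e=`sh $demo`

# With -e the script stops at the failed grep, so nothing is printed.
if with_e=`sh -e $demo`
then status=completed
else status=stopped
fi

rm -f $data $demo
echo "without -e: $without_e"
echo "with -e: script $status, output was '$with_e'"
```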

T - Bourne Shell test one command flag

Another way to make a script exit quickly is to use the "t" option. This causes the shell to execute one more line, then exit. This would be useful if you wanted to check for the existence of a script, but didn't want it to complete. Perhaps the script takes a long time to execute, and you just care if it's there. In this case, executing
sh -t script
will do this for you.

A - Bourne Shell mark for export flag

Previously, I mentioned you had to explicitly export a variable to place it in the environment, so other programs can find it. That is, if you execute these commands
a=newvalue
newscript
the script newscript will not know the value of variable "a" unless you first place the variable in the environment with
export a
A second way to do this is to assign the variable right before executing the script:
a=newvalue newscript

This is an unusual form, and not often used. There is no semicolon on the line. If there were a semicolon between the assignment and newscript, the variable "a" would not be made an environment variable.
Another way to do this is to set the "a" option:
set -a
If set, all variables that are modified or created will be exported. This could be very useful if you split one large script into two smaller scripts, and want to make sure all variables defined on one script are known to the other.
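A quick sketch of the idea, using two made-up variable names. One is assigned before the option is turned on and one after, and a child shell reports which of the two it can see.

```shell
#!/bin/sh
unset before after

# Assigned before "set -a": stays local to this shell.
before=local_only

set -a
# Assigned while -a is on: exported automatically.
after=exported
set +a

# A child shell only sees what is in the environment.
seen=`sh -c 'echo "before=$before after=$after"'`
echo "$seen"
```

The child prints an empty value for "before" and the value "exported" for "after".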

K - Bourne Shell keyword flag

While many of the options I have discussed are useful for debugging, or working around problems, other options solve subtle problems. An obscure option is the "k" switch. Consider the following Bourne shell command
a=1 myscript b=2 c d=3
When myscript executes, four pieces of information are passed to the program: The environment variable "a" has the value 1. Three arguments are passed to the script: "b=2," "c," and "d=3."
Any assignment on the same line as a command is made an environment variable, but only if it's before the command. The "-k" option changes this: all three assignments become environment variables, and the script only sees one argument.

H - Bourne Shell hash functions flag

I've read the manual page, and the "-h" option was unclear to me. It seems to be a way to speed up program execution by pre-storing the paths for each command. The Bash manpage says this is enabled by default.

The $- variable

As I mentioned, you can use the set command to change the value of these flags. However, it cannot be used to check the values. There is a special variable, called "$-," which contains the current options. You can print the values, or test them. It has another use. Suppose you had a complex script that called other scripts, and you wanted to debug all of them. You could modify every script and add the option you wanted. Or you could pass the current options along to each script. That is, assume newscript contains
#!/bin/sh
myscript arg1 arg2
If this is replaced by
#!/bin/sh
sh -$- myscript arg1 arg2

then if you typed "sh -x newscript," myscript would also see the "-x" option.
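Testing the variable is just a matter of pattern matching. This sketch uses a case statement to see whether the letter "x" appears in "$-". The function name trace_state is my own, not something the shell provides.

```shell
#!/bin/sh
# Report whether the "x" flag is currently set, by looking inside $-.
trace_state() {
    case $- in
        *x*) echo on ;;
        *)   echo off ;;
    esac
}

set +x
first=`trace_state`

set -x
second=`trace_state`
set +x

echo "before: $first, after set -x: $second"
```

The same pattern works for any of the other flag letters.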

- - Bourne Shell hyphen option

I should mention that you can set options as well as positional parameters on the same set command. That is, you can type
set -xvua a b c
There is another special option, that isn't really an option. Instead, it solves a special problem. Suppose you want one of these parameters to start with a hyphen? That is, suppose you have the following script, called myscript:
#!/bin/sh
# remember the old parameters
old=$@
set a b c
# $1, $2, $3 are changed.
# now - put them back
set $old
Looks simple. But what happens if you execute this script with the following arguments:
myscript -d abc
Can you see what will happen? You will get an error, when the system reports
-d: bad option(s)
The set command thinks the "-d" argument is a shell option. The solution is to set the special hyphen-hyphen flag. This tells the shell that the rest of the arguments are not options, but positional parameters:
#!/bin/sh
# remember the old parameters
old=$@
set a b c
# $1, $2, $3 are changed.
# now - put them back, NOTE the change
set -- $old
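To try this without creating a separate file, the sketch below fakes the command line with a set command of its own. I save the arguments with "$*", which joins them into one string, so arguments that contain spaces would not survive the round trip.

```shell
#!/bin/sh
# Pretend the script was called as: myscript -d abc
set -- -d abc

# Remember the old parameters as a single string.
old="$*"

set a b c           # $1, $2, $3 are now a, b and c
changed="$1 $2 $3"

set -- $old         # "--" stops -d from being read as a shell option
restored="$1 $2"

echo "changed: $changed, restored: $restored"
```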
