*please forgive some of the formatting. vi to webpage is not the best conversion *

Using Perl on the Command Line

Most of us have at least heard of Perl as a scripting language, but what many of us do not know is that Perl also serves as a wonderful command-line Unix power tool. Perl can be used by itself on the command line or in combination with other standard Unix tools to form commands that have the power of even lengthy shell scripts. One of Perl's greatest strengths is that it was built on the backs of several utilities that were already very effective in their own right -- namely sed and awk. In this tutorial my aim is to go over some of the different command-line switches available to Perl, and then show you some examples of command-line Perl in action, both by itself and with other Unix utilities.

While it is extremely helpful to know something about Perl and regular expressions ahead of time to help you formulate your own commands, some of the Perl power commands I will demonstrate can be used by anyone. Note also that this entire tutorial is written for the Unix/Linux environment. While these examples may work in Windows, there is no guarantee that they will work without some tweaking.

Perhaps the best way to segue into using Perl strictly on the command line is to first review what I'll call Perl's interactive mode. Like many interpreters, Perl can be called on the command line without any arguments. When this is done, Perl reads its script from standard input, one line at a time as you type. Nothing is executed yet: Perl stops reading when it receives a Ctrl-D (end of file), and only then parses and runs everything you typed to produce a result:

Code:
# perl
print "Hello World!\n";
[ctrl-d]
Hello World!
#
While this can be handy if you have a short script to write and do not want to be bothered saving it out to a file, this method is rather clunky and there is very little to be gained by it. It can also be very difficult to write a script this way, as once you press Enter there is no way to go back and change an earlier line. It is easy to stumble into syntax errors, and correcting them can be a chore.

A better way to use Perl on the command line is to make use of its robust command-line switches, each designed for efficiency and power. Perl has command-line switches to create looping constructs, to edit files in-place, and to split each input line into an array, along with a few others. To use these switches for one-off commands, you must first tell Perl that your script is supplied on the command line itself rather than contained in a file. To do this, use the -e switch. With the -e switch, Perl treats the argument that follows as the script and does not look for a script filename among its arguments. To demonstrate this syntax, here is the same simple script from the interactive mode example above, rewritten as a command-line Perl script:

Code:
# perl -e 'print "Hello World!\n";'
Hello World!
#
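Notice that the meat of your script (the individual Perl command) is enclosed in single quotes. Within these quotes you may use double-quotes without a problem. A literal single-quote takes more care: the shell does not treat a backslash inside single quotes as an escape, so you must close the quoted string, add an escaped quote, and reopen it ('\''), or use a Perl escape such as \x27 inside a double-quoted Perl string.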

From the above example you can see that the -e cannot do much of anything by itself other than tell Perl that the command following is to be read from the command-line. We haven't gained much in terms of efficiency or time consumption by using the command-line switches versus the interactive mode for such a small script, but it does get better.

The next switch to know is the -n switch, which causes the commands enclosed in the single quotes to iterate inside a standard while loop. Intermediate Perl programmers may have noticed by now that a good number of scripts share a common loop construct that looks like this:

Code:
while(<>) {
  ... # the meat of your script
}
The -n switch mimics this construct and allows you to enter only the code that would normally go where the ... is in the above example. Here is an example of the steps needed to create a simple looping script designed to print out all the services in the /etc/services file beginning with a 'z', implemented as a fully developed Perl script:

Code:
# vi service_search
Code:
#!/usr/bin/perl -w
while (<>) {
  if ($_ =~ /^z.*/) {
    print;
  }
}
exit;
Code:
# chmod 700 service_search
# ./service_search /etc/services
zip		  6/ddp	   #Zone Information Protocol
z39.50		210/tcp	   wais		#ANSI Z39.50
z39.50		210/udp	   wais		#ANSI Z39.50
zannet		317/tcp
zannet		317/udp
zserv		346/tcp	   #Zebra server
zserv		346/udp	   #Zebra server
zion-lm		1425/tcp   #Zion Software License Manager
zion-lm		1425/udp   #Zion Software License Manager
zephyr-clt	2103/udp   #Zephyr serv-hm connection
zephyr-hm	2104/udp   #Zephyr hostmanager
zebrasrv	2600/tcp   #zebra service
zebra		2601/tcp   #zebra vty
And below is the same script as above written as a one-line Perl command:

Code:
# perl -ne 'print if $_ =~ /^z.*/;' /etc/services
zip		  6/ddp	   #Zone Information Protocol
z39.50		210/tcp	   wais		#ANSI Z39.50
z39.50		210/udp	   wais		#ANSI Z39.50
zannet		317/tcp
zannet		317/udp
zserv		346/tcp	   #Zebra server
zserv		346/udp	   #Zebra server
zion-lm		1425/tcp   #Zion Software License Manager
zion-lm		1425/udp   #Zion Software License Manager
zephyr-clt	2103/udp   #Zephyr serv-hm connection
zephyr-hm	2104/udp   #Zephyr hostmanager
zebrasrv	2600/tcp   #zebra service
zebra		2601/tcp   #zebra vty
As you can see, both produce the same results, but the latter method was significantly shorter in terms of time and keystrokes.

A quick side note about the command-line method above: a leading dash (-) is not necessary for each switch. You may join most switches together into one clump, so -ne means the same as -n -e. The exceptions are switches that take an argument (such as -e, whose code must follow it directly), which I will touch on later.

The next command-line switch to know is the -p switch. -p is very similar to the -n switch in that it iterates the specified commands over a while loop construct, but with one difference -- a print command is added to print each line of the source data. Here's how it looks in a standard setting:

Code:
while (<>) {
  ...
} continue {
    print;
}
Essentially, this will execute the code you specify in ..., as well as print every single line in the data stream. You can use this method to do some very rudimentary editing of a configuration file as it streams past. For an example, let's use the /etc/services file again. Suppose you wanted to add a warning line before each line that had "irc" in the name of the service so future admins would pay extra attention to them. Here's the full-blown Perl script for that process:

Code:
# vi irc_warn
Code:
#!/usr/bin/perl -w
while (<>){
  if ($_ =~ /irc/) {
    print "*** Warning! IRC Service ***\n";
    print $_;
  }
  else {
    print $_;
  }
}
Code:
# chmod 700 irc_warn
# ./irc_warn /etc/services
...
x11		6000/tcp   #6000-6063 are assigned to X Window System
x11		6000/udp
x11-ssh		6010/tcp   #Unofficial name, for convenience
x11-ssh		6010/udp
xdsxdm		6558/tcp
xdsxdm		6558/udp
*** Warning! IRC Service ***
ircd		6667/tcp   #Internet Relay Chat (unofficial)
acmsoda		6969/tcp
acmsoda		6969/udp
...
Naturally, the results of the above script would actually print out every line of your current /etc/services file with the addition of the warning line, "*** Warning! IRC Service ***" before each line containing the term "irc" in the service name. You could redirect the output of the script to a file and have a newly-edited copy of your /etc/services file with some added security warnings. Now, here is the same process as the above script at a much lower cost in keystrokes and a much better result in efficiency:

Code:
# perl -pe 'print "*** Warning! IRC Service ***\n" if $_ =~ /irc/;' /etc/services
...
x11		6000/tcp   #6000-6063 are assigned to X Window System
x11		6000/udp
x11-ssh		6010/tcp   #Unofficial name, for convenience
x11-ssh		6010/udp
xdsxdm		6558/tcp
xdsxdm		6558/udp
*** Warning! IRC Service ***
ircd		6667/tcp   #Internet Relay Chat (unofficial)
acmsoda		6969/tcp
acmsoda		6969/udp
...
Again, we get the same result as the above script, but at a greatly reduced overall cost.

The next helpful switch in Perl's lineup is the -a switch. -a turns on autosplit mode, and needs to be used with either -n or -p. Each line read by the while loop that -n or -p creates is broken up using the Perl split command and stored in an array called @F. By default the split is on whitespace (runs of spaces or tabs, with leading whitespace ignored, much like awk), but you can change the delimiter by using the -F switch (discussed later) immediately followed by a new delimiter. Here is what the autosplit construct looks like in standard Perl code:

Code:
while(<>) {
  @F = split(' ', $_);
  ...
}
Autosplit is an especially useful component to command-line Perl, as it helps to quickly parse through delimited data for specific values. Delimited data is more common than you may think. For example, it may someday become necessary to see at a glance all of the disabled users in the /etc/passwd file (a file delimited by colons). If you were bent on the long way of doing this, you could create a script like this:

Code:
#!/usr/bin/perl -w
while(<>) {
  @F = split /:/, $_;
  if ($F[1] eq "*") {
    print "$F[0]:$F[1]\n";
  }
}
And now here is the same script as above condensed onto a single command-line command:

Code:
# perl -F: -ane 'print "$F[0]:$F[1]\n" if $F[1] eq "*";' /etc/passwd
Note: for those of you running shadow passwords, remember to substitute /etc/passwd with /etc/shadow.

Again, the command-line version of the script is obviously less work, and the result is the same. The above example also contained the -F switch that I mentioned earlier. As shown above, you provide the delimiter for the -F switch immediately after it (no spaces). If your delimiter is more than a single character, you can enclose the entire term in "", //, or ''. The -F switch takes any pattern as a delimiter, so regular expressions may be used.

The autosplit mode is also useful for data that does not seem to be delimited. With a clever use of the -F switch, you can extract very specific pieces of data in a short amount of time. For example, if you wanted to get system messages out of the /var/adm/syslog file without the timestamp at the beginning, you might create a complicated delimiter like this:

Code:
# perl -F"\w{3}\s+\d{1,2}\s\d{2}:\d{2}:\d{2}\smyhost\s" -ane 'print "$F[1]";' /var/adm/syslog
This effectively divides each line into 2 segments. Since the delimiter (the timestamp) is the first thing on the line, the first segment is just null. The second segment is everything after the timestamp delimiter which is simply the rest of the line. Cool, eh?

The final command-line switch for Perl I will cover here is the -i switch. -i is used for in-place editing. One of the unfortunate things about Perl's predecessors (sed and awk) is that they were traditionally stream editors. They could not affect the file itself, only stream any modifications to the screen. The results then had to be captured through redirection or a secondary process after the editing had completed. With -i, Perl writes the modifications straight back to the original file(s), and if you supply an extension it keeps a backup copy of each original first. To get the backup, simply include the extension you would like to use right after the -i. Be sure to include the "." if you want it, otherwise the extension will just be smooshed onto the end of the filename. This mechanism can be invaluable for individuals working with mass edits on multiple files, like webmasters. Here is an example of a mass edit of multiple PHP files in a single directory using the full-script approach:

Code:
$oldargv = "";
while(<>) {
  if ($ARGV ne $oldargv) {
    rename($ARGV, $ARGV . '.bak');
    open (FILE, ">$ARGV");
    select(FILE);
    $oldargv=$ARGV;
  }
  s/header\.php/footer\.php/;
} continue {
    print; 
}
Our full-script approaches are starting to become a bit more lengthy. To compare, once again, here is the command-line approach:

Code:
# perl -pi.bak -e 's/header\.php/footer\.php/;' *.php
Two methods with the same results, but the latter is definitely preferable! The extension used in this case was ".bak", so this command first saves the original of each PHP file (matched by "*.php") as "filename.php.bak". Next, the file is rewritten under its original name with the first instance of "header.php" on each line changed to "footer.php". Add the g modifier (s/.../.../g) if you want every instance on a line changed.

When building your command-line power tools with the Perl command-line options, remember the many shortcuts Perl has for simple statements. For example, the standard if-else construct in Perl looks like this:

Code:
if (condition) {
  ...
} else {
 ...
}
This is a common construct, but remember that Perl provides a conditional operator (?:) that can shorten this block of code to one line:

Code:
condition ? satisfied result : else result;
Here is a conditional example to demonstrate the conditional construct described above:

Code:
if ($_ =~ /yes/) {
  print;
} else {
  next;
}
This would translate into the following conditional statement:

Code:
$_ =~ /yes/ ? print : next;
Perl has a fair number of these "shorthand" statements that can make your command-line tools shorter and more efficient. Keep these in mind when formulating your tools.

All these switches and constructs are powerful in their own right, but in combination with other standard Unix tools, you can build some extremely powerful commands. The Unix/Linux find command, for example, is a wonderful utility for searching a file system for files matching very specific criteria. Imagine executing a mass edit on all the regular files under your web server root directory that have any of the permission bits in 755 set. It can be done as easily as this:

Code:
find /var/www/htdocs -type f -perm /755 -print -exec perl -pi.bak -e 's/footer/footer2/;' {} \;
Here is the breakdown of the above command:

find -- start the find command
/var/www/htdocs -- the directory you want to search under
-type f -- only consider regular files (Perl cannot edit a directory in place)
-perm /755 -- match files with any of the permission bits in 755 set (GNU find syntax; older versions of find wrote this as +755, and a bare -perm 755 matches that mode exactly)
-print -- write the name of each matching file to the screen (optional; it only lets you see what is being edited)
-exec -- tells find to run the command following this switch for each file
perl -pi.bak -e 's/footer/footer2/;' -- our Perl command to swap "footer" with "footer2" for each result
{} -- find replaces this with the name of the current file
\; -- marks the end of the command given to -exec

With commands built like the one above, Perl can give you the ability to affect vast portions of your system with a single command. Unfortunately, that can be a blessing and a curse. Be sure that you are intimately familiar with these switches before you attempt mass changes on a system level! I hope this tutorial has given you a good foundation of Perl command-line options to build some robust Unix/Linux power tools. I encourage you to learn more about the Perl switches available to you as well as the other Unix/Linux tools available to you. A lot of system administration is finding ways to make your job easier. Perl and the standard Unix/Linux tools have the potential for making your job a breeze.

Feel free to PM me with any questions concerning the above material. I'll do my best to answer you promptly. At least 1 example shown here came from the book Programming Perl by Wall, Christiansen, and Schwartz. I highly recommend it. The rest of this tutorial is pure Ros.

-- roswell1329