|| Perl Tutorial, Part 1 ||
By Ch4r/Niels |

- Binary Universe - | #binaryuniverse
- DI-Security - | #damageinc
- st0rage - | #st0rage
-Reflections -

| Copy Info |

This tutorial may be redistributed as long as it remains completely unchanged and full credit is given to me, Ch4r/Niels.

| Shouts |

Shouts to mu, dlab, Cryptic, Oropix, deep, CreepyNodque, Sintigan, ScM, Tele, Ic3D4ne, Ee77, ponyboy, Inviz, and everyone that I forgot.

| Introduction |

If you bothered to read the title, you'd know that this is a Perl tutorial. This tutorial doesn't aim to teach you every function in the Perl scripting language - it's meant to be a simple introduction to the language that teaches you the basics and should provide you with enough to take your first step down the path of Perl coding. If you're looking for some more material on Perl after reading this, I'd recommend O'Reilly's Perl books, Google, (look in the Perl section), and (Okidan's site, which has several tutorials and ebooks related to Perl as well as Cryptography, Security, etc).

So what is Perl? Perl, often expanded as Practical Extraction and Report Language, is a scripting language that was originally created by Larry Wall for working with text files. Unlike compiled languages such as C/C++ or Java, Perl programs are not compiled in a separate step before you run them; it's an interpreted language, meaning that you simply tell the interpreter to read the file containing the Perl script and it runs the script right away (strictly speaking, it parses the whole file first and then executes it). This makes development quicker in the sense that there is no compile time to wait through, but it also means that the programs may run a bit slower than the equivalent in a compiled language.

If you have any comments on this tutorial, please feel welcome to contact me in whatever way you please with your feedback. Also note that I'm far from mastering Perl, so if you see something that looks messed up please contact me. Enjoy...

| What You Need |

You will need a few things to get started with Perl programming. First, you will need an interpreter to interpret and run the scripts you write. If you're using some form of *nix (Linux, BSD, Mac OS, Solaris, etc) you probably already have a Perl interpreter installed - you can check by opening a shell and entering the command 'perl -v', which prints the version of the installed interpreter. If you get a 'command not found' error, you probably don't have Perl installed.

If you're running Windows (yuck) or for some other reason don't have Perl installed, you will need to download and install it. While we're on the topic of operating systems, I should also say this: I use Linux in this tutorial, but this should also work on other operating systems.

| Hello, World! |

Similar to many other programming texts, we're going to begin by creating a simple program that
prints "Hello, World!" to standard output. Open up a text editor such as pico, vim, Notepad, etc
and enter the following code (the ----- is just to show where the code is, don't enter that):


-----
#!/usr/bin/perl

print "Hello, world!\n";
-----


Save the file under any name with a .pl extension (hello.pl, for example) and then open a command line interface. Change directories to the directory you saved the script in and type 'perl' followed by the name of the file. The output should be "Hello, world!":

bash-2.05$ perl hello.pl
Hello, world!

Let's go over the source... the first line, "#!/usr/bin/perl", tells the system where the Perl interpreter is installed on your computer. If you use Windows, or if you always run the script by typing 'perl' followed by the file name, you don't really need this line. If you use *nix and want to run the script directly as a program, it must be the first line of your script, and /usr/bin/perl must be replaced with wherever Perl is installed. If you're having problems and are on *nix that could be why, but note that Perl is usually installed to /usr/bin/perl. The line that reads 'print "Hello, world!\n";' tells the interpreter to print the text 'Hello, world!' followed by a newline. The print function prints the text between the quotes to standard output, and the '\n' sequence tells Perl to print a newline. A semicolon is required at the end of each statement in Perl, similar to several other languages.

| Strings |

Strings are simply several characters put together, usually forming something meaningful. For example, the word "perl" is a string as is the sentence "Binary Universe pwns all" or the progression of random characters "vre896q7cdrnhfqy7r8_@09". In Perl, strings are printed to standard output (usually the screen) using the print function. For example, we saw the line

print "Hello, world!\n";

in the script. As we already saw, this prints the string "Hello, world!" followed by a newline. The \n (newline) character is what is called an escape sequence. There are many escape sequences that perform various tasks such as \t, which prints a tab, and \a, which is a beep. Note that \n is by far the most commonly used escape sequence.

So what if we wanted to print the characters "\" and "n" themselves instead of printing a newline? In that case, we have two options. The first is to replace the double quoted string "blah blah blah\n" with 'blah blah blah\n'. The single quotes tell the interpreter that the string should be interpreted literally, meaning that whatever is typed will be printed whether or not it is an escape sequence. Replacing the print statement in the script with a single quoted string would print "Hello, world!\n" instead of "Hello, world!" and a newline.

The second option is to replace "\n" with "\\n". This works because \\ in a double quoted string means that the character \ is printed, leaving the character "n" unaffected by the previous backslash. Thus, the "n" is printed as well, making the output "Hello, world!\n".
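As a quick sketch of the two options just described (the strings are only for illustration), the following prints the same text three ways:

```perl
print "Hello, world!\n";      # double quotes: \n is printed as a newline
print 'Hello, world!\n';      # single quotes: prints the characters \ and n
print "\n";                   # so we end the line by hand
print "Hello, world!\\n";     # \\ prints one backslash, followed by a plain n
print "\n";
```

The second and third strings both show a literal \n on the screen; only the first one actually ends the line, which is why the extra print "\n" statements are there.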

Now let's make things a tad more complex - look at the following example:

print "Hello, " . "world" . "!\n";

If you substitute the print statement in the script with the above one, the same result
is achieved. This is because the dot is used for concatenation, meaning that it is used for
joining several strings together. In the above example it joins the strings "Hello, ", "world",
and "!\n" together to be printed.

The same effect could be achieved by using commas instead of dots:

print "Hello, ", "world", "!\n";

This works the way it does because the print function takes a list of strings to print separated by commas (although normally only one string is printed and there is no need for commas).

| Numbers and Numeric Expressions |

Mathematical expressions (eg 2+2, 6-5, 2+2+6-5, etc) can be evaluated in Perl. The following are some commonly used operators:

+ - addition
- - subtraction
* - multiplication
/ - division
% - modulus/remainder


print 20+3; #Prints 23
print 20-3; #Prints 17
print 20*3; #Prints 60
print 20/3; #Prints 6.66666666666667
print 20%3; #Prints 2

This should be fairly straightforward with the exception of one thing - wtf does the "#" mean? In Perl, # denotes a comment, meaning that everything following the # until the end of the line is to be ignored by the interpreter. So when we enter "#Prints 23" the interpreter doesn't care because it simply ignores that part of the line seeing as it is commented out.

One very important concept to remember when writing code with mathematical statements in it is operator precedence. For example, 5+5*5 returns 30 because the interpreter first evaluates 5*5, which yields 25, and then adds five to it. The same concept holds true in basic arithmetic. If we wanted the interpreter to evaluate 5+5 first and then multiply the result by five, we would simply enclose the statement 5+5 in parentheses: (5+5)*5. This returns 50, not 30.
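Here is a small sketch of precedence in action. The second expression is my own example, not one from the text above. Note one catch when printing a parenthesized expression: if the first thing after print is an opening parenthesis, Perl treats that pair of parentheses as the argument list for print, so the whole expression gets a second pair of parentheses around it.

```perl
print 5 + 5 * 5;        # multiplication first: 5 + 25, prints 30
print "\n";
print 20 - 3 * 2 + 1;   # 20 - 6 + 1, prints 15
print "\n";

# print (5 + 5) * 5; would be read as "print (5 + 5)" and show 10,
# so the expression is wrapped in one more pair of parentheses.
print((5 + 5) * 5);     # addition first: 10 * 5, prints 50
print "\n";
```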

| Variables |

A variable is simply a series of characters used to represent a string or number. For example, if I assign the variable "$variable" the string value "I am a variable", wherever we use $variable we are actually using the string "I am a variable" (with a few exceptions such as if it is used in a single quoted string). So if we print "$variable" we are actually printing "I am a variable".

Scalar variables in Perl always begin with the $ character followed by the name of the variable (a scalar variable is one variable that is assigned one value; we'll see examples of non-scalar variables later). The character following the $ must be either an underscore or a letter, and all the characters after that may be underscores, letters, or numbers. Following are what would be valid variable names:

$variable
$_variable
$variable_2

And now, what are invalid variable names that generate errors:

$9lives    (the first character after the $ is a number)
$my var    (a space is not a letter, number, or underscore)

Values are assigned to variables using the = operator and can be printed using the print statement. Here's another modification to the program:

-----
#!/usr/bin/perl
# using variables

$hello_var = "Hello, world!\n";
print $hello_var;
-----

In this example first the string "Hello, world!\n" (with \n being a newline) is assigned to the variable $hello_var and $hello_var is printed to standard output. Since $hello_var is a variable containing the string "Hello, world!\n", "Hello, world!\n" is printed. Variables can also be printed as part of a double quoted string, or concatenated together with other strings in the print statement.

To print the string "$hello_var" to standard output literally instead of the contents of the variable $hello_var we have the same possibilities that arose when we wanted to print the string "\n" literally instead of a newline: using a single quoted string instead, or placing a backslash before the value we want to print literally:

print "\$hello_var is $hello_var";
print '$hello_var is' . " $hello_var";

The first prints "$hello_var is Hello, world!" and then a newline (we didn't need to add the \n escape sequence at the end because the variable $hello_var already contained a newline) and the second example prints the same thing using a single quoted string concatenated with a double quoted string.
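To see the different cases side by side, here is a minimal sketch. The variable $name and its value are made up for this example.

```perl
$name = "Perl";

print "I am learning $name\n";              # double quotes: prints I am learning Perl
print 'I am learning $name';                # single quotes: prints I am learning $name
print "\n";
print "The variable \$name holds $name\n";  # \$ prints a literal dollar sign
print "I am learning " . $name . "\n";      # concatenation: same output as the first line
```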

| Boolean Tests & Control Structures |

Most useful programs execute different code based on whether a given test returns true or false. The concept of having a script act differently based on whether something is true or false is a very important one in your future as a coder, as you will find yourself implementing boolean tests in your programs frequently if you pursue much of a future in coding at all.

One of the most commonly used features of the Perl language (and many other programming languages for that matter) is the concept of the 'if' control structure. The if control structure receives an expression, and if the expression is true it executes the block of code enclosed in {} braces directly after it. If the previous sentence seems confusing, don't worry -- after seeing some examples you'll find the if control structure easy to understand. Here is the basic syntax of the if control structure:

-----
if (boolean_test) {
    statements to execute
    if boolean test is true
    go here
}
-----

For example, the following code tests to see if 5 > 3 (five is greater than three) and if it is prints the message "It's true...5 is greater than 3!", and then a newline:

-----
if (5 > 3) {
    print "It's true...5 is greater than 3!\n";
}
-----

If we switched the 5 and the 3 so that it read "if (3>5) {...}", the print statement would _not_ be executed because 3 is _not_ greater than 5. The if control structure can be read in English simply as "if [test_here] then [code_to_execute_if_test_returns_true]". In this case, "if five is greater than three then print 'It's true...5 is greater than 3!'".

We aren't limited to using > in an if conditional. We can test whether a number is less than, equal to, or not equal to another as well as performing other tests. In addition, we can specify multiple tests that must return true for the body of the control structure to execute. Some of the operators we could use are as follows:

== - equal to
> - greater than
< - less than
>= - greater than or equal to
<= - less than or equal to
!= - not equal to
|| - logical or
&& - logical and

These are fairly straightforward and can be implemented in the same fashion as > was in our example with the exception of the last two, which require an explanation.

||, logical or, is used when multiple tests are implemented in the if control structure. If any one of the tests joined together with the || operator(s) is true, then the expression as a whole returns true and the body of the if structure is executed. For example, the following executes the print statement if any of the tests 5 < 3, 4 < 7, or 10 < 9 is true:

-----
if (5 < 3 || 4 < 7 || 10 < 9) {
    print "One of the conditions was true\n";
}
-----

&&, logical and, works the same way except all of the conditions must be true for the expression as a whole to return true. If you substitute all of the || operators in the previous example with && operators, the print function is _not_ executed as not all of the conditions were true (namely, 5 is not less than three and 10 is not less than 9). However, if we use the following the print statement _is_ executed:

-----
if (3 < 5 && 4 != 9 && 6 >= 5 && 8 == 8) {
    print "All of the conditions were true\n";
}
-----

A useful variation of the if control structure is the if-else control structure. The if-else control structure functions in the same way as the if control structure but in addition to executing a block of code if the given condition is true, it executes an alternate block of code if the given condition is false. Example:

-----
if (3 > 5) {
    print "Three seems to be greater than five.\n";
}
else {
    print "Well, three is not greater than five.\n";
}
-----

In this example, the Perl interpreter first checks whether 3 > 5. As three is not greater than five, the interpreter skips over the body of the if control structure. It then arrives at the else structure and seeing as 'else' specifies an alternate block of code to execute when the condition is false, the print statement declaring that three is not greater than five is executed.
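The tests above all compare fixed numbers, so their outcome never changes. As a sketch of how the same pieces look with variables, here is an if-else that combines && and ||. The variables $age and $has_ticket and their values are hypothetical, picked only for this example.

```perl
$age = 20;
$has_ticket = 1;    # 1 stands for "yes", 0 for "no"

# Both tests must be true for the first block to run.
if ($age >= 18 && $has_ticket == 1) {
    $message = "Both conditions are true\n";
}
else {
    $message = "At least one condition is false\n";
}
print $message;

# Only one of the two tests needs to be true here.
if ($age < 18 || $age > 65) {
    $discount = "yes";
}
else {
    $discount = "no";
}
print "Discount: $discount\n";
```

Try changing $age to 70 or $has_ticket to 0 and running it again to see the other branches execute.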

So far we've seen two conditional control structures, if and if-else. There is, however, another widely used type of control structure that is used as a method of repetition - a loop. The first type of loop covered here is the while loop. This loop is given a test, similar to the if control structure, and while the test is true it continues to execute the block of code enclosed in braces. Here's an example:

-----
$i = 0;
while ($i <= 10) {
    print "$i is not more than 10\n";
    $i = $i + 1;
}
-----

This loop prints the following to standard output:

0 is not more than 10
1 is not more than 10
2 is not more than 10
3 is not more than 10
4 is not more than 10
5 is not more than 10
6 is not more than 10
7 is not more than 10
8 is not more than 10
9 is not more than 10
10 is not more than 10

How does this work? The while loop is given a condition -- $i <= 10. $i is set to 0 and zero is less than or equal to ten, so the body of the loop is executed. The body of the loop consists of a print statement followed by a statement that assigns the variable $i a value of itself plus one. $i is now 1. As one is less than or equal to ten, the process repeats. This continues until the final iteration of the loop, when $i is 10. Ten is less than or equal to ten, so the block is again executed. Now $i is again assigned the value of itself plus one, which equals 11. As eleven is not less than or equal to ten, the body of the while loop is not executed and execution of the script continues past the while loop.

Note that Perl offers us a couple of commonly used shortcuts to rewrite the expression "$i = $i + 1". The first of these allows us to replace "$i = $i + n" with "$i += n" (where n is a number). This is not simply limited to adding a given amount to a variable though -- the same notation can be implemented for subtracting, multiplying, or dividing.

The following chart lists some expressions that can be rewritten with this shorter notation, and then shows the equivalent using Perl's +=/-=/*=//= shortcut.

Expression        Shorter Equivalent
$i = $i + 17      $i += 17
$j = $j * 12      $j *= 12
$a = $a / 27      $a /= 27
$k = $k + 3       $k += 3
$v = $v - 7       $v -= 7
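A short sketch that runs one variable through each of these shortcuts in turn (the starting value and the amounts are arbitrary):

```perl
$i = 10;
$i += 17;     # same as $i = $i + 17, so $i is now 27
$i -= 7;      # same as $i = $i - 7,  so $i is now 20
$i *= 12;     # same as $i = $i * 12, so $i is now 240
$i /= 8;      # same as $i = $i / 8,  so $i is now 30
print "$i\n"; # prints 30
```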

Perl also offers a second shortcut that is used to add one or subtract one from a specific variable. The syntax for this shortcut is simply:

$variable++; #increments $variable
$variable--; #decrements $variable

Thus, we could rewrite the while loop used previously with "$i++" instead of "$i = $i + 1" and achieve the same result:

-----
$i = 0;
while ($i <= 10) {
    print "$i is not more than 10\n";
    $i++; #we could also use $i += 1
}
-----

The for/foreach loop is a bit trickier to understand than the while loop. The following is the same as our previous while loop that prints "$i is not more than 10\n" and then adds one to $i, but it is implemented with a for loop instead:

-----
for ($i = 0; $i <= 10; $i++) {
    print "$i is not more than 10\n";
}
-----

The most confusing area is the part in parentheses directly after the keyword "for", which in the while, if, and if/else control structures held a test that needed to return true for the body of the structure to be executed. In the for (and foreach, as we'll discuss in a moment) control structure, this area is broken down into three sections which are separated by semicolons.

The first section is where the counter variable is assigned a value. As the for loop is used primarily to repeat something a specific number of times, it usually uses a variable to keep track of how many times the body of the loop has been executed. In the while loop, we executed the body of the loop 11 times (0 -- 10) and used the $i variable to keep track of how many times we had iterated (gone through) the loop. The variable used for this purpose is referred to as the counter, as it counts the number of times we have iterated through the loop. In this case, the counter is $i and here it is assigned a value of 0:

for ($i = 0; $i <= 10; $i++) {

The second section of the above line, which begins immediately after the first semicolon and is terminated with the second semicolon, supplies the condition that must be met for the body of the loop to be executed. In this case, the variable $i must be less than or equal to ten ($i <= 10) for another iteration of the loop to take place.

The last of the three sections between the parentheses, which begins directly after the second (and final) semicolon, is the action that must be performed on the counter variable at the end of each iteration of the loop. In this case, 1 is added to the current value of $i ($i++).

Note that the variables used in the three different sections of the first line do not have to be the same; we could have used completely different variables such as:

for ($i = 0; $j <= 10; $k++)

However, this doesn't make much sense and defeats the purpose of the for loop, which is to have a cleaner and more organized way of iterating through a loop a specific number of times.
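To make the three sections a little more concrete, here is a sketch with two counting loops of my own: one counts up and keeps a running total, the other counts down. The variable names $sum, $n, and $countdown are made up for this example.

```perl
# Count up from 1 to 10 and add each value to a running total.
$sum = 0;
for ($i = 1; $i <= 10; $i++) {
    $sum += $i;
}
print "The sum of 1 through 10 is $sum\n";   # prints 55

# Count down instead: start high, test with >=, and subtract one each time.
$countdown = "";
for ($n = 3; $n >= 1; $n--) {
    $countdown = $countdown . "$n ";
}
print $countdown . "\n";                     # prints 3 2 1
```

Once the first loop has finished, $i holds 11: the last section ran one final time, after which the test $i <= 10 failed.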

Also note that the foreach loop works exactly the same way as the for loop. The following accomplishes the same thing as the for and while loops we used before:

-----
foreach ($i = 0; $i <= 10; $i++) {
    print "$i is not more than 10\n";
}
-----

There are alternate uses for the for/foreach loop and we will cover them in upcoming sections.

Note that these are not the only control structures that Perl provides. You may also hear about or see the until and unless control structures. The until control structure is the exact opposite of a while loop: it executes its body as long as the condition it is given is false. The unless control structure is the opposite of the if control structure: it executes its body if the condition it is given is false. Finally, we'll also cover the if/elsif/else control structure later in this tutorial.
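Since until and unless are only described in words above, here is a minimal sketch of both. The comments show the while/if test each one corresponds to; the variable $note is just there for illustration.

```perl
$i = 0;
until ($i > 3) {        # keeps looping as long as $i > 3 is false
    print "$i\n";       # prints 0, 1, 2 and 3 on separate lines
    $i++;
}

$note = "";
unless ($i == 0) {      # runs because $i == 0 is false ($i is now 4)
    $note = "\$i is no longer zero";
    print "$note\n";
}
```

The until loop here behaves like "while ($i <= 3)", and the unless behaves like "if ($i != 0)".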

| Arrays |

We discussed scalar variables earlier -- scalar variables were one variable assigned one value. Now we'll discuss arrays, which are one variable assigned multiple values. Arrays will prove quite useful for organizing data, and although the idea of one variable with several values may sound like a confusing idea as well as one that isn't necessary, you'll soon see that it is actually quite easy to understand and quite useful. Following is a diagram that charts scalar variables and array variables, and how they are organized:

name  -> $variable
value -> "value"

name  -> @array
value -> "value 1"    27    "another value"    34565
index -> 0            1     2                  3

The first diagram shows the anatomy of a scalar variable. A variable, in this case one named $variable, is assigned a value, in this case the string "value". Simple enough; we've been doing that since almost the beginning of this tutorial.

The second diagram shows the anatomy of an array. A variable, in this case one named @array, is assigned multiple values, in this case "value 1", 27, "another value", and 34565. If you take a look at the bottom row, you'll notice that each value has an index number: the first value has an index of 0, the second value has an index of 1, the third an index of 2, and the fourth an index of 3. As an array variable can hold many values, we need a way to define which value we are referring to when we use the name of the array in our program. The index numbers assist us in this -- when we want to refer to an element of an array variable, we use the name of the array and the index number of the element. For example, we could refer to the string "another value", which is an element of the array @array, as "@array, index 2" (as that string has an index number of 2 in @array).

You may have noticed by now that the variable name is "@array" and not "$array". Array variables are prefixed with the @ symbol, while scalar variables are prefixed with the $ symbol. To confuse things even more, when we refer to an individual element of an array, we prefix the array name with $ and not @. This is because an individual element of an array is scalar data by itself, and when a variable holds one piece of data, scalar data, it is prefixed with $. When we are referring to the array as a whole, however, it holds several pieces of data, so we prefix it with @.

All this discussion is useless if we do not know how to implement it within our Perl scripts though. The following code is used to declare an array that holds the values "value one", "another value - value two", 39, and 908.

@new_array = ("value one", "another value - value two", 39, 908);

It was previously mentioned that a specific element of an array is accessed by using the name of the array and the index number of the element. How is that implemented in our Perl code though? Take a look at the following example:

-----
print $new_array[0] . "\n";
-----

This line prints the element at index 0 in @new_array concatenated with a newline (note that we used $new_array as opposed to @new_array, as a specific element of an array is by itself scalar data -- one piece of data). Thus, the above is the same as the line:

-----
print "value one\n";
-----

If you study the first example you'll be able to tell that to access an individual element of an array, we use the name of the array followed by the index number, which is enclosed in square brackets. Once we've specified its location, we can treat it like any other scalar variable (if you're saying 'huh? But it's in an array, so it isn't a scalar variable!', the answer is that while it's in an array, it is a scalar variable by itself. This is an important concept to grasp, which is why I've repeated it several times :P)... we can perform mathematical expressions on it, assign it a value, print it, and perform a variety of other operations on it.
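As a sketch of treating array elements like ordinary scalar variables, the following reuses @new_array from above. The variable $total and the replacement string are made up for this example.

```perl
@new_array = ("value one", "another value - value two", 39, 908);

print $new_array[1] . "\n";               # prints another value - value two

$new_array[2] = $new_array[2] + 1;        # arithmetic on an element: 39 becomes 40
$new_array[0] = "a replacement string";   # assigning a new value to an element

$total = $new_array[2] + $new_array[3];   # 40 + 908
print "Total: $total\n";                  # prints Total: 948
print "First element is now: $new_array[0]\n";
```

As the last line shows, an element can also be placed directly inside a double quoted string, just like any other scalar variable.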

I mentioned earlier in this tutorial that the for/foreach loop had other uses and that I would cover them later on. "Later on" has arrived. It turns out that the for/foreach loop can be used to iterate through the elements of an array. The foreach loop is in this way close to English in that it says "foreach @array", which translates to "foreach element of @array". As an example, the following code is used to print each element of the array we used previously, @new_array:

-----
foreach (@new_array) {
    print $_;
}
-----

There is one question concerning the above code that will probably arise, and it is worth an explanation -- you, the reader, are probably sitting there going "Wtf does '$_' mean?!" The answer is that in the above example, the current element of @new_array for each iteration is stored in $_.

If we wanted to change the variable name that this value was stored in, we would simply add the variable to store the value in immediately before the part of the first line that reads "(@new_array)". For example, the following code does the same thing as the last bit of code used to demonstrate the foreach loop but it stores the current element of the array in $a_var as opposed to $_:

-----
foreach $a_var (@new_array) {
    print $a_var;
}
-----
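Here is one more sketch of the named-variable form, this time doing a little work with each element instead of only printing it. The array @prices, its values, and the variables $price and $total are all hypothetical.

```perl
@prices = (3, 7, 12);
$total = 0;

foreach $price (@prices) {
    print "Item costs $price\n";   # $price holds the current element
    $total += $price;              # add it to the running total
}

print "Total: $total\n";           # prints Total: 22
```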

| Input |

So far we've seen how to send data to standard output, but receiving input from users is often a requirement for a useful script. This is actually a fairly easy task in Perl. The following example receives some input from a user and stores it in the variable $teh_inputz0r:

-----
chomp($teh_inputz0r = <STDIN>);
-----

<STDIN> is the main component here. The less than and greater than symbols denote a file handle to be read from and "STDIN" is the name of the file handle (in this case, standard input; yes, standard input is represented as a file). An understanding of how file handles work is not needed to understand standard input though, and file handles will not be covered in this tutorial. In other words, <STDIN> represents standard input, which is usually input received from the keyboard. In this case, we're assigning the input to a variable, $teh_inputz0r ($teh_inputz0r = <STDIN>).

The chomp function simply removes the ending newline of a string. When the user enters text that is assigned to $teh_inputz0r, it is terminated with a newline. The chomp function removes that trailing newline from the string.
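To see what chomp actually changes, here is a small sketch that uses a fixed string in place of real keyboard input, so it can be run without typing anything. The variable $line and its contents are made up; the square brackets are only there to make the newline visible.

```perl
$line = "some typed text\n";       # stands in for a line read from <STDIN>

print "Before chomp: [$line]\n";   # the closing bracket lands on the next line
chomp($line);                      # removes the trailing newline from $line
print "After chomp:  [$line]\n";   # now both brackets are on the same line
```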

| Wrapping It Up |

I've decided to release my Perl tutorial in several parts. This is part one, and part two, along with a possible part three, will introduce more concepts such as hashes, functions, regular expressions, sockets, and more. This tutorial should have provided a very basic, although not quite complete, introduction to Perl and my future tutorials will build upon that. Expect them to be out in not too long.

I've decided to add an extra feature to this tutorial to demonstrate some very basic ways the information presented in this tutorial can be used. This is a simple script that covers most of the topics introduced in this text. It receives several numbers as input from the user and finds the average of them. Note that before it does this, it asks the user to enter how many numbers they will be entering. This is a (somewhat simple) example of what arrays can do that simple scalar variables cannot. Here's the script:

-----
# Finds the average of numbers entered by the users

print "This script allows you to enter however many numbers you choose and then finds the mean of those numbers. How many numbers will you be entering? ";
chomp ($count = <STDIN>);

# Read each number and add it to the end of @num_array
for ($i = 0; $i < $count; $i++) {
    print "Enter number: ";
    chomp ($num = <STDIN>);
    push @num_array, $num;
}

# Add all of the numbers together...
foreach (@num_array) {
    $average += $_;
}

# ...and divide the total by how many numbers there were
$average /= $count;
print "The average is $average.\n";

# The end.
-----

I hope you enjoyed this tutorial and learned something from it. Won't be long until part two's here!


/* Edit: it appears that the formatting got a little messed up. I'll try to fix that. */