A fast start to PERL programming

Posted on January 09, 2000 | Filed in Scripts

This tutorial may be reproduced without modification for non-commercial, non-profit purposes. All other forms of usage require the permission of the author, Ranjan Chari (email: [email protected]).


This PERL tutorial aims to be reasonably comprehensive while remaining fast. It should leave the reader at a point where he or she knows enough to read and understand other PERL code, and to pick up new PERL functions and modules from their help or manual pages and put them to use. A PERL reference book from O’Reilly is highly recommended.


Chapter 1: Some Basics


Unix has a concept of ‘shell scripting’: you type some UNIX commands into a plain text file, mark it executable, and running it does whatever the commands would do if typed one after another. The same concept applies to the “BAT” file under DOS. PERL is similar, but it is an interpreted language: each time you run a program, the interpreter reads the whole source file, strips out comments and other irrelevant material, compiles the code into an internal form and then executes it. It does not write any object code to disk. The advantage is that no extra files lie around to confuse you and eat hard disk space. The disadvantage is that the program is compiled every time it is executed, even for a small job, which slows things down.

In practice the compile step is fast, and persistent environments such as mod_perl keep compiled code in memory between runs, so this point should not worry you much. But since PERL has an interpreter of its own, one little thing needs to be done every time you write a PERL program.

So start every bit of code with this:

#!/usr/bin/perl

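The ‘#!’ line tells a UNIX system which interpreter to run when the file is executed directly. The text following ‘#!’ is the path to the PERL interpreter. This is a UNIX example; under DOS the line is not used to locate the interpreter (although PERL still reads any switches on it), so there you run the program by passing the file name to perl. If you always start your programs as ‘perl filename’ you can leave the line out. But put it in anyway.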

To get the path in UNIX,

type in:

% which perl

And you’ll get the path to the PERL interpreter if it’s already installed. If not, you will need to install PERL yourself.

#!/usr/bin/perl -w

You can ask the PERL interpreter to print extra warnings with the ‘-w’ switch. It points out a lot of useful things such as variables that are used only once, use of undefined values and other dangerous constructs. Use it in a development environment.
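Here is a small sketch of what the extra checking looks like in practice. It assumes a PERL recent enough to have the ‘use warnings’ pragma, which does the same job as ‘-w’, and it also turns on ‘use strict’, a separate pragma that insists variables are declared with ‘my’. The variable names are made up for illustration.

```perl
#!/usr/bin/perl
use strict;     # insist that variables are declared with 'my'
use warnings;   # pragma equivalent of the -w switch

my $total = 10;
my $count;      # declared but never given a value

# Collect warnings in @warnings instead of letting them go to STDERR.
my @warnings;
$SIG{__WARN__} = sub { push @warnings, $_[0] };

my $sum = $total + $count;   # warns about an uninitialized value; $count counts as 0

print "Sum is $sum\n";
print "Perl warned ", scalar(@warnings), " time(s)\n";
```

The program still runs and prints a sum of 10, but the warning tells you that $count was never set, which is usually a mistake.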

Note: People call PERL code ‘scripts’ or ‘programs’. I feel ‘program’ is a better definition, as this is a full featured programming language and capable of making some very complex applications. Also if you’re in a commercial situation, calling your code a program should get you more money and respect, which you surely deserve for being a PERL programmer.

PERL comments begin with a ‘#’ sign and run to the end of the line. So put plenty of comments in using this.

Example:

# foo bar. - this is a comment

You don’t need to close the comment line with another ‘#’ but if it looks nice do it.

PERL is also incredibly relaxed about a lot of things. You can use white space liberally, so use it to lay out the program in a clean and readable way.

Line breaks sometimes create problems, because editors on different systems end lines differently. So keep each statement, and especially each quoted string, on a single line.


Chapter 2: Our first program and some simple functions


One advantage of PERL this chapter should illustrate is that this is an incremental language. One can use just a small set of functions and get going and do some interesting stuff. And pick up some more little by little. One can read about 5 pages worth from a good PERL book and be on the road to PERL programming.

Let’s start out with a simple program.

#!/usr/bin/perl -w
print "This is my first program \n";

Save it into a file called whateveryoulike.pl

Execute it like this:

% perl whateveryoulike.pl

Output:

This is my first program

‘print’ here is the function that prints to standard output.

Anything that needs to get printed, including formatting characters, goes between the double quotes.

‘\n’ means a new line. So if there’s any output after the statement it comes on a new line.

‘;’ denotes the end of a statement. You’ll get used to putting this in automatically soon enough.

Taking in some input:

# Print out a message
print "Enter your name buddy ! ";
# Wait for the user to type something followed by "Enter".
# <STDIN> reads that line. $buddy will contain the value entered.
$buddy = <STDIN>;
# The value still ends with the new line from the "Enter" key, so it needs to be stripped out. 'chomp' does this.
chomp ($buddy);
# You can now print the value of $buddy inside a print statement with some formatting.
print "Hello, $buddy \n";

Chapter 3: Scalar Values


What is a scalar value?

A scalar value is the most basic of PERL data types. A scalar holds a single value: a number (integer or floating point) or a string of text of any length.

Scalar values can be ‘treated’ with PERL operators to generate another scalar value.

Stuff like this is also acceptable:

4.56e45 or -6.8e34 or -3e-20. PERL stores numeric constants as double precision floating point values, so you don’t have to specify anything about a numerical value. You can however turn on an integer mode in PERL (the ‘use integer’ pragma) but it’s not turned on by default.

You shouldn’t start a decimal number with ‘0’ though, because PERL uses a leading ‘0’ to mark octal and hexadecimal values. So don’t confuse PERL.

Octal numbers start with a ‘0’ and Hexadecimal numbers start with a ‘0x’.
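To see the effect of a leading ‘0’, here is a short sketch. The variable names are just for illustration; oct() is a built-in that converts text written in octal or hex notation into a number.

```perl
my $decimal = 10;      # plain decimal ten
my $octal   = 010;     # leading 0: octal, so this is eight
my $hex     = 0x1f;    # leading 0x: hexadecimal, so this is thirty-one

print "$decimal $octal $hex\n";   # prints "10 8 31"

# This only applies to numbers typed into the program.
# For text such as "0x1f" read from input, use oct() to convert it.
my $from_text = oct("0x1f");      # 31
```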

Strings:

Perl allows you to assign strings of any length to a scalar variable.

The shortest possible string would be one with no characters in it. And the largest can be anything that your computer’s memory can hold.

Let’s say you wanted to print some strings.

You have two options.

1.   The Single Quote.

The single quote is used to tell the PERL interpreter the beginning and the end of the string.

print 'This is a test ! \n ';

Will print : This is a test ! \n

Inside single quotes, \n is not treated as a control character that puts in a new line. It is printed literally, as a backslash followed by ‘n’.

2.   The Double Quote.

The double quote is used when certain control characters are needed along with the string.

The same example written with double quotes would yield:

This is a test !

Followed by a new line.

Some things to keep in mind about using strings.

a) You can join two strings together using a period (.).

print 'This is a test '. "\n". "Absolutely ! \n";

This is perfectly fine with PERL.

b)  Suppose you wanted to put a quote character or a backslash into a string: just put a backslash (\) before it. Inside single quotes only ' and \ need this; inside double quotes, " and \ do.

print ' \' ';
print " \" \n";

The Backslash and Control Characters:

Backslashes before certain characters achieve certain tasks or special actions. These actions are also called “escape sequences”.

Here’s a list:

\\: backslash

\": Double Quote

\a: bell

\e: escape

\f: formfeed

\n: new line

\r: return

\t: tab

\l: lowercase the following letter

\L: lowercase all letters until \E

\u: uppercase next letter

\U: uppercase all letters until \E

\Q: backslash-quote all non-alphanumerics until \E

\E: Terminate \L, \U, or \Q

\cC: any control character, in this case Control-C.

\007: Any Octal ASCII Value

\x7f: Any Hex ASCII Value
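A few of the escape sequences from the list in action. This is only a sketch, and the variable names are invented for the example.

```perl
my $name = "perl";

my $first_upper = "\u$name";          # "Perl"
my $all_upper   = "\U$name\E rocks";  # "PERL rocks"
my $quoted      = "\Qa.b\E";          # "a\.b" (the dot gets a backslash)
my $tabbed      = "one\ttwo";         # a tab between the two words

print "$first_upper\n$all_upper\n$quoted\n$tabbed\n";
```

Remember that all of these only work inside double quotes. Inside single quotes they stay as plain text.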

Scalar Operators:

Addition: 5.6 + 7.1

Subtraction: 3.4 - 2.2

Multiplication: 2.5 * 5.6

Division: 8/2

Exponent: 2 ** 3 ( Two raised to the power three )

Modulus: 10 % 4 (returns the remainder, which is 2. 10.4 % 4.1 gives the same result, because both values are first converted to integers.)
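The operators above can be tried out with a few lines like these. It is a minimal sketch with made-up variable names.

```perl
my $sum       = 5 + 7;        # 12
my $quotient  = 8 / 2;        # 4
my $fraction  = 7 / 2;        # 3.5 (division does not drop the fraction)
my $power     = 2 ** 3;       # 8
my $remainder = 10 % 4;       # 2
my $truncated = 10.4 % 4.1;   # also 2: the values become 10 and 4 first

print "$sum $quotient $fraction $power $remainder $truncated\n";   # prints "12 4 3.5 8 2 2"
```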

Comparison Operators:

Comparison operators are used to compare numerical or string values and return either a true or a false value.

Comparison                  Numeric   String
Equal                       ==        eq
Not equal                   !=        ne
Less than                   <         lt
Greater than                >         gt
Less than or equal to       <=        le
Greater than or equal to    >=        ge
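Picking the numeric or the string form matters, because the same two values can compare differently. A small sketch, with variable names chosen just for this example:

```perl
my $left  = "10";
my $right = "10.0";

my $same_number = ($left == $right) ? "yes" : "no";   # "yes": 10 equals 10.0
my $same_string = ($left eq $right) ? "yes" : "no";   # "no": the text differs

my $numeric_less = (9 < 10)      ? "yes" : "no";      # "yes": 9 is less than 10
my $string_less  = ("9" lt "10") ? "yes" : "no";      # "no": "9" sorts after "1"

print "$same_number $same_string $numeric_less $string_less\n";   # prints "yes no yes no"
```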

The String Repetition operator: ‘x’.

This operator takes a string and makes concatenated copies.

Example:

(2+1) x 4 : will return “3333”

“Hello” x 3: will return “HelloHelloHello”

Assigning and manipulating Scalar Variables:

$a = $b + 5; # Add 5 to $b and then assign it to $a.

$a = ($c = 6); # Assign $c the value of 6 and then make $a the resultant value.

$b = 1 + ($c = 3); # Assign 3 to $c, add 1 to the resultant value and then make $b equal to the sum.

$b= $b * 10; # ( Assign the value of $b multiplied by 10 to $b)

Perl Shorthand – Binary assignment Operators:

Perl tries to state expressions in a very concise way and gives you some new syntax.

A normal notation: $a = $a + 3;

Binary Assignment: $a +=3;

A normal notation: $a = $a -3;

Binary Assignment: $a -=3;

A normal notation: $a = $a * 3;

Binary Assignment: $a *=3;

A normal notation: $a = $a ** 3;

Binary Assignment: $a **= 3;

Autoincrement and Autodecrement:

++$a; # prefix autoincrement

--$a; # prefix autodecrement

$a = ++$b; # The value of $b is incremented by 1 and then assigned to $a.

$a = $b++; # The value of $b is assigned to $a and then $b is incremented by 1.
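The difference between the prefix and postfix forms is easiest to see side by side. A short sketch, using an invented $count variable and a couple of the binary assignment operators as well:

```perl
my $count = 5;

my $before = ++$count;   # $count becomes 6 first, so $before is 6
my $after  = $count++;   # $after gets 6, then $count becomes 7

$count -= 2;             # binary assignment: $count is now 5
$count **= 2;            # $count is now 25

print "$before $after $count\n";   # prints "6 6 25"
```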

Chop and Chomp:

Chop is used to take a variable string and truncate the last character from it.

chop($a);

Chomp is used to remove a trailing new line from a variable. If there isn’t a new line at the end then nothing gets removed.

Example:

chomp ($a);
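A quick sketch of how the two differ. The strings are made up for the example.

```perl
my $line = "hello\n";
chomp($line);            # removes the trailing new line: "hello"
chomp($line);            # no new line left, so nothing is removed

my $word = "hello";
chop($word);             # removes the last character whatever it is: "hell"

print "$line $word\n";   # prints "hello hell"
```

This is why chomp is the safer choice for cleaning up input: calling it twice does no harm.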

Variable Interpolation and Double Quote Interpolation:

When a double-quoted string is assigned to a scalar variable, PERL checks for scalar variables within the string and substitutes their values.

$a = "Hello";
$b = "$a World !"; # $b is now "Hello World !"

Now suppose you wanted to embed something like “$a” as a string. You need to put single quotes around the value to prevent it from getting translated. Alternatively you can use a backslash (\).

$a = '$b'; # Now $a is equal to the text '$b'
$amigo = '$friend';
$a = "Hi $amigo"; # Now $a is equal to 'Hi $friend' - No double substitution takes place.
$a = "Hi \$amigo"; # Now $a is equal to 'Hi $amigo'.

Basic Input and Output:

To take in any information from the user you use <STDIN>.

The syntax is:

$a = <STDIN>; # Read a line of input and assign it to $a.
chomp ($a); # Get rid of the new line
print "You entered the value $a "; # Print out the value that was entered at the prompt by the user.
print ("You entered the value $a"); # The same thing. Use parentheses if it helps clarity.

Chapter 4: Arrays and Lists


A list is an ordered collection of scalar data, and an array is a variable that holds a list. Array names are denoted by ‘@’. Items can be picked out by specifying their position. For example, to extract the third item from the array ‘@a’, use $a[2]: a single item is a scalar, so it takes the ‘$’ prefix. The form @a[2] also works, but it is a one-item slice and draws a warning under ‘-w’. Note that the first item is at position ‘0’.

$a = 1;
$b = 8;
# The range operator (..) builds the list of integers from $a to $b.
@x = ($a .. $b);
print @x;

Output: 12345678

$a = 0.1;
$b = 8.1;
# The range operator uses only the integer part of its values, so the list starts at 0.
@x = ($a .. 8,9,10);
print @x;

Output: 012345678910

@a = ("Dragonfly", "Carl Cox", "Dj Stein", "Scooter", "Flying Rhino");
print "@a \n";
print "$a[0] \n";
print "$a[1] \n";
print "$a[2] \n";
print "$a[3] \n";
print "$a[4] \n";

Output: Dragonfly Carl Cox Dj Stein Scooter Flying Rhino
Dragonfly
Carl Cox
Dj Stein
Scooter
Flying Rhino

To extract more than one item use this:

print "@a[3,4]";

Another way to create a list of strings is by using the ‘qw’ operator, which stands for “quote words”.
It creates a list from the strings that lie between the parentheses, separated by white space.
This looks much cleaner but can only be used when each string inside the list is a single word.

@a = qw(Amiga Amstrad Atari);
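The sketch below shows that ‘qw’ gives the same list as the quoted form, and why it is limited to single words. The array names are invented for the example.

```perl
my @quoted = ("Amiga", "Amstrad", "Atari");
my @words  = qw(Amiga Amstrad Atari);      # the same three items

# qw splits on white space, so a two-word name becomes two items.
my @djs = qw(Carl Cox);                    # ("Carl", "Cox"), not one name

print scalar(@words), " ", scalar(@djs), "\n";   # prints "3 2"
print "$words[1]\n";                             # prints "Amstrad"
```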

More operations on arrays:

@a = (1,2,3);
@b = (Batman, Robin, Joker);
@c = (@a,@b); # @c is now '1,2,3,Batman, Robin, Joker'
$d = @c; # $d is now equal to 6 which is the number of items in the list.
($d) = @a; # $d gets the first element in the list which is '1'.
$e = $a[-1]; # $e gets the last element in the list or the first element backwards which is '3'.
print $#a; # Outputs 2, which is the position of the last item in the array (one less than the item count).
$a[0]++; # increments the first item by '1'. @a is now 2,2,3.

Push and Pop:

These two functions add and remove items from the right side of the list respectively.

@a= (Batman,Robin, Joker);
push(@a, "Penguin"); #Add the new item 'Penguin' to the right side of the list.
pop (@a); # Remove the last item from the right side of the list.

Unshift and Shift:

These two functions add and remove items from the left side of the list respectively.

@a= (Batman,Robin,Joker);
unshift(@a, "Penguin"); # @a is now 'Penguin, Batman, Robin, Joker'
shift (@a); # @a is now 'Batman, Robin, Joker'

Reverse:

@a= (Batman,Robin,Joker);
@b = reverse(@a); # @b is now 'Joker, Robin, Batman'

It’s obvious that Reverse reverses the order of items in the list.

Sort:

@a= (Batman,Robin,Joker);
@b =sort(@a); # @b is now 'Batman, Joker, Robin'

Sort sorts the items in ascending order without altering the original list. If there were numbers inside the list even they would be sorted as if they were strings.
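Since numbers are sorted as strings by default, you need to tell sort how to compare them if you want numeric order. One common way, sketched here, is a comparison block using the ‘<=>’ operator. Inside the block, $a and $b are special variables that sort fills in with the two items being compared. The array names are made up.

```perl
my @numbers = (10, 9, 100, 1);

my @as_strings = sort @numbers;                 # (1, 10, 100, 9)
my @as_numbers = sort { $a <=> $b } @numbers;   # (1, 9, 10, 100)
my @descending = sort { $b <=> $a } @numbers;   # (100, 10, 9, 1)

print "@as_strings\n@as_numbers\n@descending\n";
```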

Note: You may wonder why the double quotes have been left out while declaring the strings in the examples above. Well, they are optional: PERL treats a bare word as a string, but only as long as ‘use strict’ is not in effect and the word is not the name of a function. In pristine, “Queen’s Perl” syntax, the double quotes should be used. But this is the nature of PERL. You develop your own style of doing things. And if possible stick to a clean and easy way.


Chapter 5: Control Structures


[if – unless – elsif]

Control structures allow you to execute blocks of statements in a program based on certain conditions. This allows you to write programs that do different things under different situations.

Let’s take a look at some examples to illustrate the various types of control structures available in PERL.

Example:

print "How much money do you have ? The movie ticket costs \$5. ";
# Take in an input and chomp the trailing newline.
$moolah = <STDIN>; chomp ($moolah);
# Check a condition and do something if the condition is true.
if ($moolah < 5) { print "Sorry Bud ! Not enough money !"; }
# If the first condition is not true then do this.
else { print "Great ! That should cover it !"; }

Suppose you left out the else condition, the program would just do nothing if the input was not less than 5.

Let’s do this another way.

Example:

print "How much money do you have ? The movie ticket costs \$5. ";
$moolah = <STDIN>;
chomp ($moolah);
# unless runs its block only when the condition is FALSE.
unless ($moolah >= 5) { print "Sorry Bud ! Not enough money !"; }

Now suppose instead of two conditions you have 5 conditions to check for, and each deserved a suitable response. You use the elsif statement.

Example:

print " How many people are eating today ? \n";
$hungryfolks = <STDIN>;
chomp ($hungryfolks);
# Use == to compare numbers. A single = would assign the value and always be true here.
if ($hungryfolks == 1) { print " a 10\" pizza should be good for a start"; }
elsif ($hungryfolks == 2) {print " a 16\" pizza should be good for a start";}
elsif ($hungryfolks == 3) {print " a 18\" pizza should be good for a start";}
elsif ($hungryfolks == 4) {print " a 21\" pizza should be good for a start";}
else {print "You need a buffet !";}

[while – until]

Example:

print "How many times do you want me to say Hello to you ? \n";
$hello = <STDIN>;
chomp ($hello);
$counter = 0;
while ($counter < $hello)
{
print " Hello \n";
$counter++;
}

Example:

print "How many times do you want me to say Hello to you ? \n";
$hello = <STDIN>;
chomp ($hello);
$counter = 0;
# >= rather than == so that the loop also stops for an input such as 2.5.
until ($counter >= $hello)
{
print " Hello \n";
$counter++;
}

[do – while – until]

Example:

print "How many times do you want me to say Hello to you ? \n";
$hello = <STDIN>;
chomp ($hello);
$counter = 0;
do { print " Hello \n"; $counter++; }
until $counter >= $hello;

Example:

print "How many times do you want me to say Hello to you ? \n";
$hello = <STDIN>;
chomp ($hello);
$counter = 0;
do { print " Hello \n"; $counter++; }
while $counter < $hello;

[for loop]

Example:

for ($counter = 0; $counter < 5; $counter ++)
{
print "Hello ! \n";
}

[foreach loop]

Example:

@a = (0,1,2,3,4);
foreach (@a)
{
print " Hello World \n";
}
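The loop above just counts the items. More often you want to use each item inside the loop. A minimal sketch, with made-up names: when no loop variable is given, each item lands in the special variable $_, and you can also name a loop variable of your own.

```perl
my @names = ("Batman", "Robin", "Joker");
my @greetings;

# With no loop variable, each item is placed in $_ in turn.
foreach (@names) {
    push @greetings, "Hello, $_ !";
}

# A named loop variable is often easier to read.
my $total = 0;
foreach my $number (1, 2, 3, 4) {
    $total += $number;
}

print "$_\n" foreach @greetings;
print "Total: $total\n";    # prints "Total: 10"
```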

Chapter 6: Hashes


A hash is a collection of values in which each value is associated with a unique key, instead of a position as in an array. The key is used to reference the associated value. Hash names are denoted by ‘%’.

Example:

$a{"email"} = "billgates\@whitehouse.gov"; # 'email' is the key and 'billgates@whitehouse.gov' is the value.
print $a{"email"};

Output: billgates@whitehouse.gov

Note that the backslash is used before the ‘@’ to prevent PERL from treating it as the start of an array name.

Example:

%a = ("John","Hyundai","Sammy","Toyota","Romola","Subaru");
print "$a{John} \n";
print "$a{Sammy} \n";
print "$a{Romola} \n";

Output:
Hyundai
Toyota
Subaru

Example:

Another way to write a hash, if it looks more fun to you:

%a = (
Hyundai => "4 Door Sedan",
BMW => "Two Door Sports",
Hummer => "Utility SUV"
);
print "$a{Hyundai} \n";
print "$a{BMW} \n";
print "$a{Hummer} \n";

Output:

4 Door Sedan
Two Door Sports
Utility SUV

The ‘values’ function:

This function returns all the values inside a hash table.

Example:

%a = (
Hyundai => "4 Door Sedan",
BMW => "Two Door Sports",
Hummer => "Utility SUV"
);
@cars =values(%a); print @cars;

Output (the order may differ on your system):

4 Door SedanUtility SUVTwo Door Sports

The ‘delete’ function:

This function deletes a key-value pair from a hash table.

%a = (
Hyundai => "4 Door Sedan",
BMW => "Two Door Sports",
Hummer => "Utility SUV"
);
delete $a{"Hyundai"};
@cars =values(%a); print @cars;

Output:

Utility SUVTwo Door Sports

From the examples above you would notice that the values do not come back in the order in which they were written. PERL stores the key-value pairs in an internal order of its own, which helps it look up a key quickly. There is no way to put your own order into the storage mechanism, but you can sort the keys when you read them out.
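When the order of the output matters, a common approach is to fetch the keys and sort them yourself. The sketch below uses the built-in ‘keys’ and ‘exists’ functions, which are not covered above, on the same car data. The variable names are invented for the example.

```perl
my %cars = (
    Hyundai => "4 Door Sedan",
    BMW     => "Two Door Sports",
    Hummer  => "Utility SUV",
);

# keys returns the keys in no particular order, so sort them first.
my @lines;
foreach my $make (sort keys %cars) {
    push @lines, "$make: $cars{$make}";
}
print "$_\n" foreach @lines;

# exists checks whether a key is present in the hash.
my $has_bmw   = exists $cars{BMW}   ? 1 : 0;   # 1
my $has_volvo = exists $cars{Volvo} ? 1 : 0;   # 0
```

This prints the BMW line first, then Hummer, then Hyundai, every time.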


Chapter 7: Subroutines & User defined functions


Example:

hello( );
sub hello
{
print "Hello World ! \n";
}

Output: Hello World !

Example:

$a = 1;
$c = $a + b();
print $c;
sub b
{
$b = 2*$a;
}

Output: 3

Passing Arguments to a Subroutine:

Example:

sub hello
{
print "Hello, $_[0] ! \n";
}
hello ("PERL");

Example:

sub hello
{
print "Hello, $_[0] ! \n";
print "Hello, $_[1] ! \n";
}
hello ("PERL", "Internet");

As you can see, subroutines can occur anywhere in a program and are not evaluated unless called for.
Arguments passed to a subroutine are assigned to a special array denoted as: @_

The examples above use the standard syntax for a single item in an array, so $_[0] is the first argument and $_[1] is the second.
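In longer programs it is common to copy the arguments out of @_ into named variables and to hand a result back to the caller. Here is a minimal sketch of that style. The subroutine names ‘area’ and ‘double’ are made up, and ‘my’ is used to keep the variables private to each subroutine.

```perl
use strict;
use warnings;

# Copy the arguments out of @_ into named, private variables.
sub area {
    my ($width, $height) = @_;
    return $width * $height;
}

# With no explicit return, the value of the last expression is returned.
sub double {
    my ($number) = @_;
    2 * $number;
}

my $result = area(3, 4) + double(5);
print "$result\n";    # prints "22"
```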


Chapter 8: More useful functions:


Chapter 9: Regular Expressions:


Regular Expressions are templates that are used to match strings. They are useful for parsing data and reformatting it or for searching for strings within some data. Regular expressions are a very powerful feature of PERL. Although regular expressions are used by several languages, PERL uses a superset of almost all the regular expression operators available.

To start off, a regular expression is written between two forward slashes (‘/’).

Example: Matching a string ‘mig’.

while (<>)
{
if (/mig/)
{ print "I found the string \"mig\" \n"; }
}

So here ‘amigo’, ‘mig’, ‘amiga’ will generate a response.

Let’s expand on this a bit.

Example:

while (<>) {
if (/.ig/)
{
print "I found the string $_ \n";
}
}

Here, the reg. exp. has the special character (.). A dot matches any single character, so this pattern matches any character followed by “ig”. So ‘Solveig’, ‘mig’, ‘dig’, ‘trigger’ will all match.

Note: By default the dot matches any character except a new line.

Example: Matching a character set – [a-z]

while (<>) {
if (/[a-z]ig/)
{
print "I found the string $_ \n";
}
}

This pattern matches any lower case letter from a to z followed by “ig”. So in this case abig, 123abig and mig will all match. There has to be at least one lower case letter immediately before the “ig” for a match to occur.

The same is true for – [0-9] where a single numeral has to match.

Substitute on Match. (s).

This command option takes a match and replaces it with the substitution string.

Example:

while (<>) {
if (s/eat/leat/)
{print "I found the string $_ \n";}
}
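By default only the first match in the string is replaced. The sketch below, with an invented sample string, also shows the ‘g’ modifier, which is not covered above and replaces every match.

```perl
my $text = "eat, sleep, eat";

(my $first_only = $text) =~ s/eat/code/;    # replaces the first match only
(my $every_one  = $text) =~ s/eat/code/g;   # the 'g' modifier replaces every match

# The substitution also reports how many replacements were made.
my $copy  = $text;
my $count = ($copy =~ s/eat/code/g);        # 2

print "$first_only\n$every_one\n$count\n";
```

The first line printed is "code, sleep, eat" and the second is "code, sleep, code".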

More syntax:

/ab*c/ : Match an ‘a’, then zero or more (b)s, then a ‘c’.
/[abcde]/ : Match exactly one of any of the letters.
/[ab\[]/: To put in a special character and match it literally use a backslash as is the case with PERL everywhere.
/[a-z0-9]/: Match any lower case letter or any numeral.

Negated Character Class (^): A caret at the start of a character class denotes a negated class, where the match takes place on any character not in the list.

/[^abcde]/

Multipliers:

(+) one or more of the immediately previous character.

(?) Zero or one of the immediately previous character.

/a{2,5}/ Match between 2 to 5 (a)s.
/a{2}/ Match exactly 2 (a)s.
/a{2,}/ Match two or more.
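To compare the multipliers, the sketch below tries four of them against a few sample words. The words are made up, and the patterns use ‘^’ and ‘$’ (not covered above) to tie the match to the start and end of the word so that the whole word has to fit. Each result is 1 for a match and 0 for no match.

```perl
my @results;
foreach my $word ("ac", "abc", "abbbc", "aac") {
    my $star     = ($word =~ /^ab*c$/)     ? 1 : 0;   # zero or more b
    my $plus     = ($word =~ /^ab+c$/)     ? 1 : 0;   # one or more b
    my $optional = ($word =~ /^ab?c$/)     ? 1 : 0;   # zero or one b
    my $counted  = ($word =~ /^ab{2,5}c$/) ? 1 : 0;   # two to five b
    push @results, "$word: $star $plus $optional $counted";
}
print "$_\n" foreach @results;
```

For example "abbbc" gives "1 1 0 1": it has too many (b)s for ‘?’ but fits the other three.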

Alternation:

Matching one of several alternative strings.

/string|anotherstring/

Remembering a matched string:

/amiga(.)amstrad(.)apple\1\2/;

Here, amiga is matched followed by a character followed by amstrad followed by a character followed by apple followed by the first character matched, which is now represented by \1, and followed by the second character, which is now represented by \2.

Example:

while (<>) {
if (/amiga(.)amstrad(.)apple\1\2/)
{
print "I found the string $_ \n";
}
}

In the example above: ‘amiga1amstrad2apple12’ will match.
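After a successful match, the remembered pieces are also available outside the pattern in the special variables $1 and $2. A small sketch using the same pattern, with invented variable names:

```perl
my $line = "amiga1amstrad2apple12";

my ($first, $second);
if ($line =~ /amiga(.)amstrad(.)apple\1\2/) {
    # The remembered characters are now in $1 and $2.
    ($first, $second) = ($1, $2);
    print "Matched with $first and $second\n";   # prints "Matched with 1 and 2"
}

# The last two characters must repeat the remembered ones, so this one fails.
my $mismatch = ("amiga1amstrad2apple21" =~ /amiga(.)amstrad(.)apple\1\2/) ? 1 : 0;   # 0
```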

The split Function:

Suppose you had a string like

$string ="abc::def::ghi::jkl::mno";

You can use the split function to break the string apart at every occurrence of a separator pattern and return the pieces in between as a list of values.

$string ="abc::def::ghi::jkl::mno";
@alphabet = split(/::/, $string);
print "$alphabet[0] \n";
print "$alphabet[1] \n";
print "$alphabet[2] \n";
print "$alphabet[3] \n";
print "$alphabet[4] \n";

Output:
abc
def
ghi
jkl
mno

The join Function:

The join command does the reverse of split. It concatenates a series of values from a list and puts in a string between each element.

$rebuild = join ("::", @alphabet);

The command above will return the original string.
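Putting the two together, a short round trip looks like this. It is only a sketch; the $csv variable is made up to show that join can use a different separator from the one split removed.

```perl
my $string   = "abc::def::ghi::jkl::mno";
my @alphabet = split(/::/, $string);
my $rebuild  = join("::", @alphabet);

print scalar(@alphabet), "\n";                    # prints "5"
print "Round trip OK\n" if $rebuild eq $string;

# The separator for join does not have to match the one used by split.
my $csv = join(",", @alphabet);                   # "abc,def,ghi,jkl,mno"
```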


Chapter 10: Advanced Input/Output:


Chapter 11: Objects:


Chapter 12: Socket Programming:


Sockets are ‘virtual devices’ that act as end points for network communication. The IO::Socket module in PERL facilitates socket programming. In this chapter we’ll see the usage of sockets and how to apply it towards bi-directional communication. The IO::Socket module is standard with PERL distributions 5.004 onwards.

Example:

#!/usr/bin/perl
# A really cool library module for fiddling with Sockets. PERL has native support for sockets but this
# module does things more elegantly.
use IO::Socket;
# Use any vacant port. A high number is better as it is less likely to be in use already.
$server_port = 8200;

$server = IO::Socket::INET->new(LocalPort => $server_port,
Type => SOCK_STREAM,
Reuse => 1,
Listen => 2)
or die "Cannot be a TCP Server on $server_port : $@ \n";
# Wait for clients. Send each one the message, then hang up.
while ($socket = $server->accept()) {
print $socket "Ranjan's Used Comic Book Store \n";
close ($socket);
}
close ($server);

Now if you were to run this script on a server, you should be able to telnet to that port on the server and get the message. The ‘Reuse’ option tells the system to allow re-use of the port straight after the program exits. This is useful when you want a service to use the same port even after abnormal program termination. The ‘Listen’ option specifies how many connection requests may be queued up before further clients trying to connect get a message like “connection refused”.
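Instead of telnet you can also write the client in PERL. The sketch below is self-contained so that it can be run as is: it opens a listening socket on the local machine, connects a client to it, and passes the message across inside one program. The loopback address 127.0.0.1 and the use of port 0 (which asks the system for any free port) are assumptions made for this demonstration. In real use the client would be a separate program and would connect to the host and port of your server.

```perl
use strict;
use warnings;
use IO::Socket;

# Server side: listen on the local machine only, on any free port.
my $server = IO::Socket::INET->new(
    LocalAddr => '127.0.0.1',
    LocalPort => 0,
    Proto     => 'tcp',
    Listen    => 2,
    Reuse     => 1,
) or die "Cannot listen: $@\n";
my $port = $server->sockport();

# Client side: connect to the server by host and port.
my $client = IO::Socket::INET->new(
    PeerAddr => '127.0.0.1',
    PeerPort => $port,
    Proto    => 'tcp',
) or die "Cannot connect: $@\n";

# Server side: accept the waiting connection, send one line, hang up.
my $connection = $server->accept();
print $connection "Ranjan's Used Comic Book Store\n";
close($connection);

# Client side: read the line the server sent.
my $message = <$client>;
chomp($message);
print "Server said: $message\n";

close($client);
close($server);
```

The ‘PeerAddr’ and ‘PeerPort’ options are what make a socket a client: they name the machine and port to connect to, where ‘LocalPort’ and ‘Listen’ make it a server.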


Chapter 13: A look at CGI:


Well, what is CGI? It stands for Common Gateway Interface. If you want to take in some information from a form on a webpage and do something with it, such as printing an appropriate response or sending email or anything else that involves server-side processing, there is a need for a CGI script. PERL is very handy in these matters. Most guestbooks, feedback forms, database forms, quizzes, surveys and other things where information is accepted on the web use PERL as the CGI scripting language.

Now information gets sent to a script via a HTTP request. Values passed from the form to the CGI script follow the CGI specification.

Suppose you wanted a form to accept two values, an email address and a name, and pass them on to the CGI script. The string that gets passed to the script from the form is of the form http://www.yourdomain.com/cgi-bin/[email protected]&name=yourname

If you know the form field names then it is not required that you use the form. You can achieve the same effect by constructing the request string and typing it into the browser’s address bar.

It is up to the script sitting at the server to parse this string, separate all the values and do something useful with them.
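To give an idea of what that parsing involves, here is a rough sketch done by hand. The query string is a made-up example of what follows the ‘?’ in a request. Values are separated by ‘&’, each name is joined to its value by ‘=’, a ‘+’ stands for a space, and other special characters arrive as ‘%’ followed by two hex digits. This is for illustration only and leaves out many details that a real script has to handle.

```perl
use strict;
use warnings;

# Hypothetical query string, as it might arrive after the '?' in a request.
my $query = "email=someone%40example.com&name=Jane+Doe";

my %form;
foreach my $pair (split(/&/, $query)) {
    my ($key, $value) = split(/=/, $pair, 2);
    $value = "" unless defined $value;
    foreach ($key, $value) {
        tr/+/ /;                                  # '+' stands for a space
        s/%([0-9A-Fa-f]{2})/chr(hex($1))/ge;      # %40 becomes '@'
    }
    $form{$key} = $value;
}

print "$form{name} <$form{email}>\n";   # prints "Jane Doe <someone@example.com>"
```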

To make a long story short and to simplify programming, PERL 5.004 and onwards come with a really nifty module called CGI.pm. (It was removed from the core distribution in PERL 5.22.) If your version of PERL does not have this module you can get it from cpan.org. CPAN is a repository of PERL modules and user contributed libraries that are reusable for a variety of tasks. A general search will display all modules available for a particular task.

CGI.pm will parse all incoming CGI requests for you.

Let’s take a look at an example:

The form:

<html>
<head><title>My First CGI Script</title></head>
<body>

<form method="post" action="http://www.yourserver.com/cgi-bin/myscript.pl" name="myform"><p>Name:
<input type="text" name="name" size="50" maxlength="50"></p><p>Email: 
<input type="text" name="email" size="50" maxlength="50"></p><p><input type="submit" name="submit" value="Submit"><input type="reset" name="reset" value="Reset"></p></form>

</body>

</html>

The CGI script:

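#!/usr/bin/perl
use CGI qw(:standard);
# param() returns the value of the named form field.
my $name= param('name');
my $email= param('email');
# Send the HTTP header and the start of the page, then the message.
print header, start_html("My first CGI script");
print p("Hi $name, your email address is $email.");
# Close the page properly.
print end_html;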
Output: Hi yourname, your email address is youremail.

To execute a script that lies within your CGI-BIN directory, or wherever you choose to place it on the server, you need to give it read and execute permissions. Under UNIX the command to do so is : chmod a+rx xyz.pl

It is up to you what you want to call your script. You may optionally like to keep your scripts with the suffix .cgi. That’s fine as long as your server likes them called that way.


© 2000 Ranjan Chari