Thursday, October 4, 2007

Newbie: Learn Perl in 10 easy lessons (Part II)

Learning Perl: Lesson 6

In the previous lessons we've used a lot of functions that are built into the language, for instance "split", "print" and "chomp". What we haven't done yet is define our own functions. In Perl we call them subroutines.


A subroutine is basically a few lines of code you give a name to. You can then run these lines by simply calling the subroutine from your script. You can pass arguments from the script to the subroutine, and the subroutine may or may not return a result.

If you've never used any programming language before, you probably don't know what to think right now. But don't worry, it's very simple. Let's look at an example to make things a little bit clearer.

Have a look at the following script:
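The original listing is not reproduced in this copy of the post, so here is a minimal sketch of what such a script might look like. It assumes the file is read line by line and matched with a regular expression. The variable names ($file, $in, $line) are only illustrative, and a missing file produces a warning instead of stopping the script.

```perl
#!/usr/bin/perl
use strict;
use warnings;

my $file = '/etc/myProgram/myFile';

# Show the lines which contain "house"
if (open(my $in, '<', $file)) {
    while (my $line = <$in>) {
        print $line if $line =~ /house/;
    }
    close($in);
} else {
    warn "Cannot open $file: $!\n";
}

# Show the lines which contain "dog" (the same code, copied and pasted)
if (open(my $in, '<', $file)) {
    while (my $line = <$in>) {
        print $line if $line =~ /dog/;
    }
    close($in);
} else {
    warn "Cannot open $file: $!\n";
}
```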

As you probably guessed, this script does something pretty stupid. It shows you the lines which contain "house" and then the lines which contain "dog" within the file /etc/myProgram/myFile. It's not what it does that is shocking though, it's how it does it.

Look at the code. It does exactly the same thing twice: once for "house" and once for "dog". If there were a bug in the way it shows the lines containing "house", would I remember to fix the same bug for "dog" as well? Of course not, I'm only human after all.

The idea is simple. Instead of showing the lines which contain "dog" or "house", we'll define a subroutine which can show lines containing "something". And then we'll call this subroutine with "dog" and "house".
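A version with a subroutine could be sketched as follows. The name showLines comes from the text; everything inside it (the lexical filehandle $in, the warning on a missing file, the \Q...\E quoting) is an assumption about how one might write it, not the original listing.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Show the lines of /etc/myProgram/myFile which contain the given keyword
sub showLines {
    my ($keyword) = @_;
    my $file = '/etc/myProgram/myFile';

    if (open(my $in, '<', $file)) {
        while (my $line = <$in>) {
            # \Q...\E makes sure the keyword is matched literally
            print $line if $line =~ /\Q$keyword\E/;
        }
        close($in);
    } else {
        warn "Cannot open $file: $!\n";
    }
}

# Call the subroutine twice, with a different argument each time
showLines("house");
showLines("dog");
```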

As you can see, we reduced the amount of code and, more importantly, we removed the duplicated code. The first part of the script defines a subroutine called showLines. The second part simply calls it twice, first with the argument "house" and then with the argument "dog".

Within the subroutine, we simply replaced the references to "dog" and "house" with a variable representing the argument. When you call a subroutine with an argument or a list of arguments, the subroutine is executed and the arguments are stored in the following array variable: @_.

The line my ($keyword) = @_; simply assigns the values of the arguments to a list of variables. In this case the list holds a single variable, $keyword, which receives "house" on the first call and "dog" on the second.
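The same kind of assignment works with more than one argument. Here is a small sketch using a made-up subroutine called describe (it is not part of the lesson's script) which takes two arguments:

```perl
use strict;
use warnings;

# Hypothetical subroutine taking two arguments
sub describe {
    # $name receives the first argument, $age the second
    my ($name, $age) = @_;
    return "$name is $age years old";
}

print describe("Alice", 30), "\n";   # prints: Alice is 30 years old
```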

Local variables

The assignment is preceded by "my" because we want the $keyword variable to be local to the subroutine. When a variable is declared with the keyword "my", it only exists within the block of code where it is declared, in this case the subroutine. This way, we do not have to worry about whether another variable with the same name is defined elsewhere in the script. It's always a good idea to make variables local inside your subroutines, so always declare them with the "my" keyword.
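To see this in action, here is a short sketch in which the script and the subroutine both use a variable named $keyword. The subroutine showKeyword is made up for this illustration.

```perl
use strict;
use warnings;

my $keyword = "house";   # variable defined in the script

sub showKeyword {
    my ($keyword) = @_;        # a separate variable, local to the subroutine
    $keyword = uc($keyword);   # only the local copy is changed
    return $keyword;
}

print showKeyword("dog"), "\n";   # prints: DOG
print $keyword, "\n";             # prints: house
```

The variable in the script keeps its value because the subroutine only ever touches its own copy.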


Let's go further. Instead of a subroutine which prints the lines containing "something" from the file /etc/myProgram/myFile, we'll define a subroutine which prints the lines containing "something" from "some" file. This time we'll use two arguments: the file to look into and the keyword to search for.
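A two-argument version might look like the sketch below. The order of the arguments (file first, keyword second) follows the sentence above, but it is an assumption, as is the rest of the body.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Show the lines of the given file which contain the given keyword
sub showLines {
    my ($file, $keyword) = @_;

    if (open(my $in, '<', $file)) {
        while (my $line = <$in>) {
            print $line if $line =~ /\Q$keyword\E/;
        }
        close($in);
    } else {
        warn "Cannot open $file: $!\n";
    }
}

showLines('/etc/myProgram/myFile', "house");
showLines('/etc/myProgram/myFile', "dog");
```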

We could even define a subroutine which searches for a keyword in a list of files. For this we would pass two arguments: a keyword, and an array containing filenames.
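Such a subroutine could be sketched like this. The second filename, /etc/myProgram/myOtherFile, is made up for the example, and the body is one possible way of writing it, not the original listing.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Show the lines containing the keyword in each of the given files
sub showLines {
    my $keyword = shift;   # removes the first element of @_ and returns it
    my @files   = @_;      # what is left in @_ is the list of filenames

    foreach my $file (@files) {
        if (open(my $in, '<', $file)) {
            while (my $line = <$in>) {
                print $line if $line =~ /\Q$keyword\E/;
            }
            close($in);
        } else {
            warn "Cannot open $file: $!\n";
        }
    }
}

my @files = ('/etc/myProgram/myFile', '/etc/myProgram/myOtherFile');
showLines("house", @files);
```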

In this example, note how the subroutine retrieves the arguments. Perl flattens the arguments into a single list, so @_ initially contains the keyword followed by each of the filenames. By using Perl's "shift" function we remove the first element of @_ and assign it to the local variable $keyword. We're then left with @_ containing only the filenames.

Returning something

Subroutines can return values. In our previous example, the subroutine did not return anything useful. We simply called it and it printed some lines on the screen. In this example, we'd like the subroutine to do the same, but also to count the number of lines it found and return that number.
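Here is a sketch of such a subroutine, based on the keyword-and-filenames version described earlier. The counter name $count and the final message are assumptions; $nb is the variable name used in the text.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Show the matching lines and return how many were found
sub showLines {
    my $keyword = shift;
    my @files   = @_;
    my $count   = 0;

    foreach my $file (@files) {
        if (open(my $in, '<', $file)) {
            while (my $line = <$in>) {
                if ($line =~ /\Q$keyword\E/) {
                    print $line;
                    $count++;
                }
            }
            close($in);
        } else {
            warn "Cannot open $file: $!\n";
        }
    }

    return $count;
}

my $nb = showLines("house", '/etc/myProgram/myFile');
print "Found $nb line(s)\n";
```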

As you can see, the subroutine simply returns the number of lines by using the "return" keyword. In the script, the return value is assigned to the variable $nb.


Thanks to subroutines you can now define code which can be called more than once. Through the use of arguments and return values, you can make your subroutines more flexible. By extracting common behavior and defining it within subroutines, you'll reduce the cost of maintaining your scripts and the number of bugs they might contain. In most cases this will also make your scripts easier to extend. In the next lesson we'll look at how to define our subroutines outside of our script, so that they can be called and used by more than one script.
