One of the great things about Perl is that it gives you a hacker's license. It evolved from a programmer's need to get work done, and it just grew. It's designed for people who like to just get things done. And to that end, there are a lot of shortcuts you can take and a lot of things you can omit.
I think that a lot of the shortcuts make code more difficult to read. So I tend to go out of my way to be explicit and to clarify what's going on. I've been programming long enough that I've had to go back to code that I wrote a long time ago, and have gotten confused. So I've realized the importance of clarity in coding.
That being said, I'm a hacker at heart, so I'd say: experiment a little bit. Figure out which shortcuts work for you, and which don't. Different people have different degrees of hacker in them, and will have different ideas of what makes code readable.
These are not meant to be rules, or God forbid, standards of coding practices. But these are just an outline of what has worked for me. And if you're trying to read my code, this is an outline of what to expect.
In Perl, you don't have to name your iterator, but I believe that you always should. Let's start with an example I don't like:
#!/bin/sh
#! -*- perl -*-
eval 'exec $PERLLOCATION/bin/perl -x $0 ${1+"$@"} ;'
    if 0;

# Assume $material was already defined

$materialExp = $materialExpHash{$material};

foreach (@patches) {
    next if (! /$materialExp$/);
    &bindMaterial($_, $material);
}
Listing 3.1.1 for code_untested/bindMaterial-1.pl
I don't like this because it uses the $_ variable a lot. Don't
get me wrong. I'm not saying to ban this special variable. I use it
myself. However, many beginners are a little put off when they see
$_, roll their eyes, and complain about how incomprehensible
Perl is. Well, it doesn't have to be.
In the above example, we skip every patch whose name does not end with
the expression stored for the material. Now, in our regular expression
test, we just have a regular expression comparison sitting there. In this
case, since we don't define an iterating variable, the name of each
patch is stored in the special variable, $_. Many Perl functions and
operators will operate on the $_ variable if no other variable is
specified. This includes the regular expression match operator. Finally,
we send the $_ variable to bindMaterial.
While this is done explicitly, it still puts people off just to see the
$_.
Perl allows you to define the iterating variable. Perl localizes it
to the foreach either way, but I still declare it with my so that it
is a lexical variable visible only inside the loop. As long as I'm
explicitly declaring my iterators, I'll also iterate over the materials:
#!/bin/sh
#! -*- perl -*-
eval 'exec $PERLLOCATION/bin/perl -x $0 ${1+"$@"} ;'
    if 0;

foreach my $material (@materials) {
    my $materialExp = $materialExpHash{$material};
    foreach my $patch (@patches) {
        if ($patch =~ /$materialExp$/) {
            &bindMaterial($patch, $material);
        }
    }
}
Listing 3.1.2 for code_untested/bindMaterial-2.pl
Here, we've given the name $patch to the iterator variable. I
think this looks more readable. Also, note that since we defined an
iterator variable, we need to use it explicitly in the regular expression
test. I think this is a good thing. Since we're using nested loops, I
think it is especially good to declare the iterating variable.
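To try the nested loops without the rest of a pipeline, here is a minimal sketch. The hash contents, the patch names, and the stand-in bindMaterial (which only records what it was asked to bind) are all made up for illustration; the real bindMaterial is assumed to live elsewhere.

```perl
use strict;
use warnings;

# Hypothetical data: each material maps to a pattern that the names
# of its patches end with.
my %materialExpHash = (
    skin  => '_skin',
    cloth => '_shirt',
);
my @materials = sort keys %materialExpHash;
my @patches   = ('L_arm_skin', 'R_arm_skin', 'torso_shirt', 'head_standin');

# Stand-in for the real bindMaterial: just record what would be bound.
my @bindings = ();
sub bindMaterial {
    my ($patch, $material) = @_;
    push @bindings, "${patch}:${material}";
    return;
}

foreach my $material (@materials) {
    my $materialExp = $materialExpHash{$material};
    foreach my $patch (@patches) {
        if ($patch =~ /$materialExp$/) {
            bindMaterial($patch, $material);
        }
    }
}

print "bound [@bindings]\n";
```

Because both iterators have names, the inner loop can refer to $material and $patch at the same time, which would not be possible if both loops relied on $_.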
As you read from a file handle, it is normal to do something like:
#!/bin/sh
#! -*- perl -*-
eval 'exec $PERLLOCATION/bin/perl -x $0 ${1+"$@"} ;'
    if 0;

# Suppose RIBFILE names coordinate systems by using
# CoordinateSystem "coordSysName"

$ribFileName = $ARGV[0];
@namedCoordinateSystems = ();

open(RIBFILE, $ribFileName) or die "can't open $ribFileName!";
while (<RIBFILE>) {
    next if (! /^CoordinateSystem/);
    my(@tokens) = split(/\s+/);
    push @namedCoordinateSystems, $tokens[1];
}
close(RIBFILE);

print "I have the named coord systems [@namedCoordinateSystems]\n";
Listing 3.1.3 for code_untested/getRibCoords-1.pl
Again, we are overusing $_, mostly implicitly. When a read
from the filehandle RIBFILE is the only thing in the condition of a
while loop, the current line is stored in the variable, $_. Also,
the regular expression and the split will operate on $_ by
default. I don't like having this extra layer of stuff to think about. I
like to name this variable just to remind myself what I was doing with the
current line.
#!/bin/sh
#! -*- perl -*-
eval 'exec $PERLLOCATION/bin/perl -x $0 ${1+"$@"} ;'
    if 0;

# Suppose RIBFILE names coordinate systems by using
# CoordinateSystem "coordSysName"

$ribFileName = $ARGV[0];
@namedCoordinateSystems = ();

open(RIBFILE, $ribFileName) or die "can't open $ribFileName!";
while (<RIBFILE>) {
    my $currline = $_;

    next if ($currline !~ /^CoordinateSystem/);
    my(@tokens) = split(/\s+/, $currline);
    push @namedCoordinateSystems, $tokens[1];
}
close(RIBFILE);

print "I have the named coord systems [@namedCoordinateSystems]\n";
Listing 3.1.4 for code_untested/getRibCoords-2.pl
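A variation on the same idea is to name the line right in the while condition, so $_ never appears at all. This is only a sketch: the RIB text is made up, and it is read through an in-memory filehandle with a lexical handle ($ribFile) so that it runs without a file on disk. Those are choices made for the example, not part of the listings above.

```perl
use strict;
use warnings;

# Hypothetical RIB text, read through an in-memory filehandle so the
# sketch runs without a file on disk.
my $ribText = <<'END_RIB';
CoordinateSystem "shadowCam"
Sphere 1 -1 1 360
CoordinateSystem "projector"
END_RIB

my @namedCoordinateSystems = ();

open(my $ribFile, '<', \$ribText)
    or die "can't open in-memory RIB text: $!";
while (defined(my $currline = <$ribFile>)) {
    # Skip anything that is not a CoordinateSystem statement.
    next if ($currline !~ /^CoordinateSystem/);
    my @tokens = split(/\s+/, $currline);
    push @namedCoordinateSystems, $tokens[1];
}
close($ribFile);

print "I have the named coord systems [@namedCoordinateSystems]\n";
```

Note that the second token still carries the double quotes from the RIB file, just as it would in the listings above.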
When dealing with an array, you can get the number of elements in the array by evaluating the array in scalar context. That is, we could say:
while (@remainingActions) {
    my $currAction = shift @remainingActions;
    # do something with the action
    #
}
Inside the while, the array gets implicitly evaluated as a scalar.
When we shift off the last action, there will be zero elements left in the
array, so we hit the termination condition of the while loop. This is a
compact, valid way to do this. Personally, I like bringing attention to the
fact that there is a conversion to a scalar, so I do it explicitly. Also, I
like to make it explicit that the terminating condition of the loop is that
there are 0 elements left. So you will see that a lot of my while loops look like:
while (scalar(@remainingActions) > 0) {
    my $currAction = shift @remainingActions;
    # do something with the action
    #
}
This chapter isn't called "My Idiosyncrasies" for nothing.
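To watch the scalar conversion happen, here is a small sketch of the explicit loop. The action names are made up; the only point is to record the element count that scalar() reports on each pass.

```perl
use strict;
use warnings;

my @remainingActions = ('load', 'shade', 'render');   # made-up actions
my @counts = ();

while (scalar(@remainingActions) > 0) {
    # Record how many actions are left before taking the next one.
    push @counts, scalar(@remainingActions);
    my $currAction = shift @remainingActions;
    print "doing ${currAction}\n";
}

# An empty array is 0 in scalar context, which is false.
my $leftOver = scalar(@remainingActions);
print "counts seen: [@counts], left over: ${leftOver}\n";
```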
In Perl, you don't really need to explicitly return out of a
subroutine. Often I forget to. But I don't like myself for that. When
possible, I try to use the return statement. If you forget the
return, then the subroutine will just return the last evaluated value in
the code block, which is not always desirable. When you use:
return;
this will return an undef in scalar context and the empty list in list
context, so you can actually use this to trap for errors. In The Perl
Cookbook [CHRI98], they bring up the good point that if your subroutine is
returning an array, then you don't want to signal failure with return
undef, but with a plain return, which in list context behaves like:
return ();
If you're wondering why, the Cookbook explains this. But basically, you are probably going to receive the return value into an array. If you return undef, then you hand back a list that contains one undef. An undef is a valid value for an array element. So you will actually get an array with one element, namely undef. Generally, it will be more productive to use the empty list as an error or "no data" condition than a 1-element array with an undef.
I just prefer not to implicitly return values either. I think it
makes it easier to read the code if I can go back to the subroutine and just
look for the return.
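Here is a minimal sketch of how the return value is received in list context. Per the perlfunc entry for return, a bare return gives the empty list in list context and undef in scalar context, while return undef always gives a single undef. The two subroutine names are made up for the example.

```perl
use strict;
use warnings;

# Hypothetical lookups that fail: one with a bare return, one with
# an explicit undef.
sub findPatchesBare {
    return;
}

sub findPatchesUndef {
    return undef;
}

my @fromBare   = findPatchesBare();
my @fromUndef  = findPatchesUndef();
my $scalarBare = findPatchesBare();

my $bareCount  = scalar(@fromBare);
my $undefCount = scalar(@fromUndef);

print "bare return gave ${bareCount} element(s)\n";
print "return undef gave ${undefCount} element(s)\n";
print "bare return in scalar context is ",
      (defined($scalarBare) ? "defined" : "undef"), "\n";
```

A caller that tests the array for emptiness will treat the first result as "no data" and the second as one (useless) result.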
It's useful to know the order of precedence of Perl operators. You can read the man pages and see a table. But let's say you have an if block that you want to evaluate if one of 3 conditions is satisfied:
if ( $hr < 10 && $ampm eq 'pm' || $name1 eq 'Jennifer' && $name2
     eq 'Love' && $name3 eq 'Hewitt' || $cashBox < 200) {
    &admitGuest();
}
Okay, maybe this really wasn't a production example. I just felt like using
the name "Jennifer Love Hewitt" in an example.
Now, I just think that's a mess. Consult the precedence tables, and you'll see that it's okay. But I can never remember whether and (&&) or or (||) has higher precedence, and it's hard for me to group terms together. Since I can never remember precedence, and I know I won't remember the precedence rules when I debug the code later (yes, my code will usually have bugs), I prefer to see:
if (    (($hr < 10) && ($ampm eq 'pm'))
     || (($name1 eq 'Jennifer') && ($name2 eq 'Love')
         && ($name3 eq 'Hewitt'))
     || ($cashBox < 200)) {
    &admitGuest();
}
Potayto, potahto. But I like throwing a lot of parens in there so I can
keep track of what is bound to what. It doesn't hurt to spread things out
with multiple lines, indents, and intelligently applied whitespace.
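If you would rather check the grouping than look it up, a brute-force comparison is one way to do it. This sketch uses four made-up boolean flags instead of the guest-list variables. It compares the terse form against the fully parenthesized form, and also against the grouping you would get if || bound tighter than &&.

```perl
use strict;
use warnings;

my $mismatches    = 0;   # terse form vs. fully parenthesized form
my $wrongGrouping = 0;   # terse form vs. the "|| binds first" misreading

foreach my $p (0, 1) {
    foreach my $q (0, 1) {
        foreach my $r (0, 1) {
            foreach my $s (0, 1) {
                my $terse    = ($p && $q || $r && $s) ? 1 : 0;
                my $explicit = (($p && $q) || ($r && $s)) ? 1 : 0;
                my $misread  = ($p && ($q || $r) && $s) ? 1 : 0;

                $mismatches++    if ($terse != $explicit);
                $wrongGrouping++ if ($terse != $misread);
            }
        }
    }
}

print "terse vs explicit mismatches: ${mismatches}\n";
print "terse vs misread mismatches: ${wrongGrouping}\n";
```

The first count stays at zero because && binds tighter than ||, so the parentheses only spell out what Perl already does. The second count is nonzero, which is exactly the confusion the parentheses guard against.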
I would also make a note about multiline expressions. I personally like to start each continuation line with an operator, as a reminder that I should look back at the previous line to see the beginning of the expression. It also reminds me that the line is not starting a new statement. I should mention that this goes against the recommendations of perlstyle. In this case (and, or), the perlstyle man page (I think it was written by Tom Christiansen) actually agrees. But he says that in most cases, you should break a long line after an operator, leaving the operator at the end of the previous line. I personally prefer to start lines with the operators. Christiansen actually writes a lot of intelligent stuff, so you may want to consider listening to him. But everyone has their own style.
Actually, one other thing - indents. Christiansen recommends 4 space indents. I like to just hit the tab key. 4 spaces makes sense: you will be able to fit more code on a line, and you can nest deeper more easily. But if it gets to be an issue, I tend to start breaking things out into subroutines. I don't say the tab-indent is a good thing. That's just what I do, and it is honestly a product of laziness (of course, Larry Wall, the inventor of Perl, recognizes the 3 virtues of a programmer as laziness, impatience, and hubris).
foreach my $leftPatch (sort {$a->{'patchName'} cmp $b->{'patchName'}}
                       grep($_->{'patchName'} =~ /^L_/,
                            @{$scene->{'patchList'}})) {
    # do something with left patches
}
Well, that's an awful lot going on. I tend to be rather loose about making
new variables. I'm not designing operating systems. I usually write
scripts that do something and get out, so quite frankly, even if I blow 100
Meg of RAM, I don't really care much. That's usually still small enough to
land on a desktop machine on the queue anyway. So, with that in mind, I
would tend to really write the above code as:
@patchList = @{$scene->{'patchList'}};
@leftPatches = grep( $_->{'patchName'} =~ /^L_/, @patchList);
foreach my $leftPatch (sort
                       {$a->{'patchName'} cmp $b->{'patchName'}}
                       @leftPatches) {
    # do something with left patches
}
Memory is cheap again, and so variables are too. I like making these
variables even if I just use them once because it breaks down my thought
process and kind of provides self documentation in the code. Some people
might even take that one step further and separate the sort criterion into
a subroutine by itself:
sub byPatchName {
    # Okay, okay, so I don't ALWAYS say 'return'. Sort routines
    # are a notable exception.
    $a->{'patchName'} cmp $b->{'patchName'};
}
@patchList = @{$scene->{'patchList'}};
@leftPatches = grep( $_->{'patchName'} =~ /^L_/, @patchList);
# I don't think I'm allowed to use & down here otherwise I would.
foreach my $leftPatch (sort byPatchName @leftPatches) {
    # do something with left patches
}
This is a good idea, though usually I'm too lazy to do this. It's good from
a code re-use standpoint too. One thing though is that if you do this, be
careful about where you define the subroutine - you might not get as much
re-use as you hoped, especially if you use hash values defined inside the
code block you're executing the sort from. I think they call these
closures, especially if I use a variable lexically defined in that block.
Yeah, yeah, that probably sounds like spouting gibberish. Try not to worry
about it.
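To see the broken-out version run end to end, here is a sketch with a made-up $scene. The real scene structure is not shown in this chapter, so the hash below only assumes a 'patchList' of hashes that each carry a 'patchName'.

```perl
use strict;
use warnings;

# Made-up scene data in the shape the snippets above assume.
my $scene = {
    'patchList' => [
        { 'patchName' => 'R_arm' },
        { 'patchName' => 'L_leg' },
        { 'patchName' => 'torso' },
        { 'patchName' => 'L_arm' },
    ],
};

# Sort criterion: compare two patch records by name. $a and $b are
# the package variables that sort sets for each comparison.
sub byPatchName {
    $a->{'patchName'} cmp $b->{'patchName'};
}

my @patchList   = @{$scene->{'patchList'}};
my @leftPatches = grep( $_->{'patchName'} =~ /^L_/, @patchList);

my @sortedNames = ();
foreach my $leftPatch (sort byPatchName @leftPatches) {
    push @sortedNames, $leftPatch->{'patchName'};
}

print "left patches in order: [@sortedNames]\n";
```

Each intermediate array can be printed or inspected on its own, which is most of the appeal of splitting the expression up.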
When I expand variables inside double quotes, I like to use the
curly braces. It just explicitly says to me "This is a perl expansion."
This helps a lot (to me) when I stick MEL generators inside of Perl. Since
I can have variables in MEL (and I usually do) that also have the form,
$variableName, it helps me distinguish a perl expansion since
only Perl variables can have the form ${variableName}.
Also, suppose I'm generating some texture maps that have the
form, patchName_color.tx for the base color. I would say:
$textureName = "${patchName}_color.tx";
Note that there is a problem with:
# BAD
$textureName = "$patchName_color.tx";
Since underscore (_) is a valid part of variable names in Perl,
Perl will think the variable name is $patchName_color.
Since I really wanted $patchName and
$patchName_color probably doesn't even exist, it will probably
evaluate to the empty string. But using the curly braces protects us from
this. So I see 2 advantages in using the curly braces: