DISCLAIMER: THESE PAGES ARE STILL UNDER CONSTRUCTION. NO CODE EXAMPLE HAS BEEN TESTED YET.

Perl - my idiosyncrasies

Avoid Barewords - subroutine calls and quoting



What is a bareword?

A bareword is any combination of letters, digits, and underscores that is not qualified by any symbols or quotes. That is, while $apple is a scalar and @apple is an array, apple is a bareword.

There are a couple of areas in Perl where a bareword is allowed. A bareword string does not need to be qualified with quotes, and a subroutine call doesn't always need the &. However, it is my belief that in most cases, taking these shortcuts leads to less readable code.
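A minimal sketch of the distinction, assuming no subroutine named apple exists anywhere in the program. The variable $fruit is a hypothetical name used only for illustration, and strict checking is relaxed inside one block so that the bareword is accepted at all.

```perl
use strict;
use warnings;

my $apple = 'a scalar';        # $apple: a scalar variable
my @apple = ('an', 'array');   # @apple: an array (a separate variable)

my $fruit;
{
    no strict 'subs';          # allow bareword strings inside this block only
    no warnings 'reserved';    # lowercase barewords normally draw a warning
    $fruit = apple;            # no sub named apple exists: the string 'apple'
}

print "scalar:   $apple\n";
print "array:    @apple\n";
print "bareword: $fruit\n";
```

Outside the inner block, the same assignment would be rejected by use strict, which is one way to let Perl enforce the habit described on this page.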

Calling a subroutine

Perl allows more than one way to do a lot of things, including calling a subroutine. For instance,

  1 #!/bin/sh
  2 #! -*- perl -*-
  3 eval 'exec $PERLLOCATION/bin/perl -x $0 ${1+"$@"} ;'
  4  if 0;
  5 
  6 sub sayHello {
  7 	my $message	= shift;
  8 	my $name	= shift;
  9 
 10 	print $message, $name, "\n";
 11 	return;
 12 }
 13 
 14 &sayHello("Hello ", "kitty.");

Listing 3.2.1 for code_untested/subroutineCallEx1.pl

This is the way I like to do things. But you should be aware that there is more than one way to call a subroutine. In this case, since the call uses parentheses, we don't actually need to preface it with &. That is,

	# Valid, but I don't like it.
	sayHello("Hello ", "kitty.");
is also considered a valid way of calling sayHello.

It is worth noting that print is an intrinsic part of the Perl language, so you do not call it with &. Because of this, the & and the parentheses give us an immediate clue that sayHello is a user-defined subroutine, and not part of the core language. As you are debugging your code, you will know immediately to search your code for details about this function rather than your Perl reference books.

We don't even need the parentheses, as long as the subroutine has been declared before the call.

	# Valid, but I don't like it.
	sayHello "Hello ", "kitty.";
is actually considered valid syntax as well. But I don't like this either because it makes you think for a moment about context.
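The order of declaration matters for these call styles. The following is a minimal sketch under a few assumptions: sayHello is changed to return its text rather than print it, and sayLater and neverDeclared are hypothetical names introduced only for this illustration. It shows a parenthesised call to a subroutine defined later in the file, a paren-less call to one already declared, and a paren-less call to an unknown name, which is trapped in a string eval because it does not compile.

```perl
use strict;
use warnings;

# Parentheses let us call a subroutine that is defined further down the file.
my $early = sayLater("Hello ", "kitty.");

# Declared before use, so the paren-less call below compiles.
sub sayHello {
    my $message = shift;
    my $name    = shift;
    return $message . $name . "\n";
}
my $bare = sayHello "Hello ", "kitty.";

# Without parentheses, a name Perl has never seen is not parsed as a call.
# The string eval traps the compile error so the script keeps running.
my $compiled = eval q{ neverDeclared "Hello ", "kitty."; 1 };
my $error    = $@;

print $early, $bare;
print "paren-less call to an unknown name failed to compile\n" unless $compiled;

# Defined last on purpose, to show that the first call above still works.
sub sayLater {
    return join('', @_) . "\n";
}
```

The failed eval may also print a diagnostic on STDERR; that is expected here.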

There are also some more implications to this. In a Perl subroutine, you simply pass in a list of values, and it is up to the subroutine to interpret them. This makes it very convenient to pass in variable length parameter lists and make optional parameters. In the above example, we could call sayHello like this as well:

	# just says Hello without a name.
	sayHello "Hello";

	# just prints a newline - no params in
	sayHello;
I really don't like these, especially that last one. We just have a bareword, so the chances are that it is a call to a previously declared subroutine. But what if it isn't? It could also be interpreted simply as a string (I'll get to that in a minute), and it really doesn't give you any context as to what is happening, whereas something as simple as:
	# just says Hello without a name.
	&sayHello("Hello");

	# just prints a newline - no params in
	&sayHello();
clearly tells you that you're using a user-defined subroutine. Perl is great in that it allows this flexibility in coding styles. This happens to be mine, and I think it makes the code more readable. By the way, this is one of the big problems I have with C and C++: there is nothing to distinguish barewords from variables or function references, or at least very little...
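One detail about the & style is worth a sketch: the parentheses are not just decoration. The example below assumes a variant of sayHello that returns its text and fills in empty strings for missing arguments, and a hypothetical wrapper named greetAll that exists only to show what each call style passes along.

```perl
use strict;
use warnings;

# Returns the greeting instead of printing it, so the result is easy to check.
sub sayHello {
    my $message = shift;
    my $name    = shift;
    $message = '' unless defined $message;   # optional first argument
    $name    = '' unless defined $name;      # optional second argument
    return $message . $name . "\n";
}

# Hypothetical wrapper, used only to compare the two call styles.
sub greetAll {
    my $explicit = &sayHello();   # empty parentheses: sayHello gets no arguments
    my $implicit = &sayHello;     # no parentheses: sayHello sees greetAll's @_
    return ($explicit, $implicit);
}

my ($explicit, $implicit) = &greetAll("Hello ", "kitty.");
print "explicit: $explicit";
print "implicit: $implicit";
```

With the & prefix and no parentheses, the called subroutine shares the caller's @_, so keeping the parentheses, even when they are empty, avoids passing arguments by accident.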

There are, of course, exceptions. As you can see above in line 10 of Listing 3.2.1, I didn't use the parentheses with print. All I can say is that these are guidelines, not rules that are carved in stone. In this case, it is my belief that print is such a fundamental part of Perl (it's used within the first 3 programs most people write, and usually in the first) that it is immediately recognizable as intrinsic Perl, and the following strings are immediately recognized as arguments to print.

But there's another reason. Parentheses get awkward with the built-ins that take something other than a plain list as their first argument. For instance, in print, if you wanted to print to your user-defined filehandle, OUTPUT_FILE, you would say:

	print OUTPUT_FILE $message, $name, "\n";
Note here that there is no comma between OUTPUT_FILE and $message. In print, this is the format to write to a filehandle. If you wrap the whole thing in parentheses, it looks like an ordinary list that is missing a comma:
	#Legal, but misleading
	print(OUTPUT_FILE $message, $name, "\n");
still compiles, because the filehandle is not an element of the list, but a reader is tempted to "fix" it by adding a comma after OUTPUT_FILE, and that is an error. Note that you will run into a similar situation with intrinsic commands that allow code blocks. For instance, grep, map, and sort.
	#Okay but weird
	print OUTPUT_FILE ($message, $name, "\n");
is legal, but I never use this format. Again, because in my arbitrary judgement, I think print is so fundamental that everyone can read the line without parentheses.

By the way, in case it wasn't clear,

	#Legal
	print($message, $name, "\n");
is legal, and a fine syntax, but I just usually choose not to do this.
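A minimal sketch of printing to a filehandle, under the assumption that an in-memory file (available in Perl 5.8 or later) may stand in for a real OUTPUT_FILE so that nothing is written to disk. The $buffer variable and the braces form are additions for illustration only.

```perl
use strict;
use warnings;

my ($message, $name) = ("Hello ", "kitty.");
my $buffer = '';

# Assumption for this sketch: the "file" lives in $buffer, not on disk.
open(OUTPUT_FILE, '>', \$buffer) or die "cannot open in-memory file: $!";

print OUTPUT_FILE $message, $name, "\n";      # usual form: no comma after the handle
print OUTPUT_FILE ($message, $name, "\n");    # parentheses after the handle
print {*OUTPUT_FILE} $message, $name, "\n";   # braces make the handle slot explicit

# A comma after the handle does not compile. The string eval traps the
# error so that the rest of the script still runs.
my $compiled = eval q{ print OUTPUT_FILE, $message, $name, "\n"; 1 };

close(OUTPUT_FILE) or die "cannot close in-memory file: $!";

print($buffer);    # parenthesised print to STDOUT
print "the comma version was rejected\n" unless $compiled;
```

Each of the three accepted forms writes the same line, so $buffer ends up holding three copies of it.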

I'll discuss function call syntax a little bit more in the section on my idiosyncrasies about modules and objects.


Use quotes for your strings

In the previous example, we saw that we could omit the & and the parentheses in a function call. Another situation where we may encounter barewords is string definitions. Let's look at a slightly altered version of the last program:

  1 #!/bin/sh
  2 #! -*- perl -*-
  3 eval 'exec $PERLLOCATION/bin/perl -x $0 ${1+"$@"} ;'
  4  if 0;
  5 
  6 sub sayHello {
  7 	my $message	= shift;
  8 	my $name	= shift;
  9 
 10 	print $message, " ", $name, "\n";
 11 	return;
 12 }
 13 
 14 $message = "Hello";
 15 $name = "kitty";
 16 
 17 &sayHello($message, $name);

Listing 3.2.2 for code_untested/subroutineCallEx2.pl
Now, we could actually omit the quotes in this example, since all of the strings consist only of letters. We could, in fact, omit the subroutine & and parentheses as well and wind up with:

  1 #!/bin/sh
  2 #! -*- perl -*-
  3 eval 'exec $PERLLOCATION/bin/perl -x $0 ${1+"$@"} ;'
  4  if 0;
  5 
  6 sub sayHello {
  7 	my $message	= shift;
  8 	my $name	= shift;
  9 
 10 	print $message, " ", $name, "\n";
 11 	return;
 12 }
 13 
 14 $message = Hello;
 15 $name = kitty;
 16 
 17 # Ick!
 18 sayHello $message, $name;
 19 
 20 # Even worse
 21 sayHello Hello, kitty;
 22 
 23 # And worse still
 24 sayHello Hello;
 25 

Listing 3.2.3 for code_untested/subroutineCallEx3.pl
I think these last couple of calls are unreadable. Note that now a bareword could evaluate to either a bare string or a subroutine call. In fact, in the last line, who is to say Hello is not a previously defined subroutine that we are getting values from? It is especially because of this ambiguity that I would recommend always putting quotes around your strings.
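The ambiguity can be made concrete with a short sketch. It assumes a hypothetical subroutine named Hello that is not part of the listings above, and it relaxes strict checking in one block to show what the bareword does on its own.

```perl
use strict;
use warnings;

# Hypothetical subroutine that happens to share its name with our string.
sub Hello { return 'value returned by sub Hello' }

my $quoted = 'Hello';      # quotes: always the string
my $called = &Hello();     # & and parentheses: always the subroutine

my $bare;
{
    no strict 'subs';
    $bare = Hello;         # bareword: a call here, because sub Hello exists
}

# With "strict subs" on, a bareword that is not a known subroutine is a
# compile-time error. The string eval traps it so the script keeps running.
my $compiled = eval q{ my $name = kitty; 1 };
my $error    = $@;

print "quoted: $quoted\n";
print "called: $called\n";
print "bare:   $bare\n";
print "strict says: $error" unless $compiled;
```

Quoting the string and marking the call both remove the guesswork, and use strict turns the remaining bareword strings into compile errors.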


The exceptions - hash keys

As I've said before, "No rules. Just tools." (quoting Glenn Vilppu). And to every rule, there are exceptions. Though I generally try to avoid barewords like the plague, there is one situation in which I am starting to use them more.

Suppose you are defining a hash table:

	%elementFiles = (
		'sabrina'	=> 'sabrina_take001.mb',
		'hilda'		=> 'hilda_flyingRig_take203.mb',
		'zelda'		=> 'pinaydisEffect_final.mb',
		'salem'		=> 'sal_4ped_wip.mb',
			);
You can actually omit the quotes on the hash keys, since they contain only letters, numbers, and underscores. Now, as before, Perl would let us drop the quotes from a plain alphanumeric string anywhere, though I usually recommend against it. In this case, though, the operator immediately following the key is =>, which has the super secret property that if it sees a bareword right before it, it treats that bareword as a string. So when we see =>, we know there is no ambiguity over whether the bareword is a simple string or a function call. That being said, I am starting to use this format more:
	%elementFiles = (
		sabrina		=> 'sabrina_take001.mb',
		hilda		=> 'hilda_flyingRig_take203.mb',
		zelda		=> 'pinaydisEffect_final.mb',
		salem		=> 'sal_4ped_wip.mb',
			);
As a note, I still won't use a bareword for the hash value on the right hand side. I also still have mixed feelings about using a bareword as a hash subscript:
	# Jury's still out on this one:
	$transport{tarzan} = 'vines';
I still feel uncomfortable neglecting the quotes on tarzan, but sometimes I'll take that shortcut. I tend to do it a little more with environment variables:
	# This feels a little more comfortable, but not entirely.
	$ENV{MAYA_LOCATION} = '/usr/local/bin/maya';
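A minimal sketch of how both hash forms treat a bareword key. The extra entries ('jane', 'cheetah') and the subroutine named tarzan are hypothetical, added only to show that the key stays a string even when a subroutine of the same name exists.

```perl
use strict;
use warnings;

my %transport = (
    tarzan => 'vines',       # => quotes the bareword on its left
    'jane' => 'elephant',    # explicit quotes mean the same thing
);
$transport{cheetah} = 'trees';   # a simple identifier in braces is quoted too

# Hypothetical subroutine sharing its name with a key.
sub tarzan { return 'jane' }

my $as_string = $transport{tarzan};        # key 'tarzan', not a call
my $as_call   = $transport{ &tarzan() };   # explicit call: key 'jane'

print "string key: $as_string\n";
print "called key: $as_call\n";
```

Because the braces quote a simple identifier, a subscript that is meant to be a subroutine call has to be written as an explicit call, which fits the & and parentheses style used on this page.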

The exceptions (not really) - qw

Now and then, it's just convenient to use barewords. But I'm torn because I don't like them. This happens mostly when I have a list of predefined strings. For instance, suppose I want to define some error codes:

	@errorCodes = ('SUCCESS', 'FAILURE', 'FAILURE_FILE_NOT_FOUND',
			'FAILURE_PARAM_OUT_OF_RANGE', 'FAILURE_CHILD_NOT_EXEC',
			'FAILURE_OUT_OF_MEMORY',
			'FAILURE_PACKET_NOT_RETURNED');
All those codes are just letters and underscores, and I'd like to just write out the list. In Perl, there is the qw mechanism to do this. See the index in Programming Perl [WALL00]. But basically it does exactly what I'm looking for. You can use the operator qw() to list bareword strings between the parentheses without guilt:
	@errorCodes = qw(SUCCESS FAILURE FAILURE_FILE_NOT_FOUND
			FAILURE_PARAM_OUT_OF_RANGE FAILURE_CHILD_NOT_EXEC
			FAILURE_OUT_OF_MEMORY
			FAILURE_PACKET_NOT_RETURNED);
and this does the same thing as the first errorCodes example above. I would actually try to make this more readable by giving each code its own line:
	@errorCodes = qw(
			SUCCESS
			FAILURE
			FAILURE_FILE_NOT_FOUND
			FAILURE_PARAM_OUT_OF_RANGE
			FAILURE_CHILD_NOT_EXEC
			FAILURE_OUT_OF_MEMORY
			FAILURE_PACKET_NOT_RETURNED
		);
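A quick way to convince yourself that the two spellings build the same list is sketched below, using a shortened set of the codes. The %codeIndex lookup table is a hypothetical addition, not part of the original examples.

```perl
use strict;
use warnings;

my @quoted = ('SUCCESS', 'FAILURE', 'FAILURE_FILE_NOT_FOUND');
my @words  = qw(
    SUCCESS
    FAILURE
    FAILURE_FILE_NOT_FOUND
);

# qw splits on whitespace, so newlines and indentation inside it are ignored.
my $same = "@quoted" eq "@words" ? 'yes' : 'no';
print "same list: $same (", scalar(@words), " elements)\n";

# Hypothetical lookup table mapping each code to its position in the list.
my %codeIndex;
@codeIndex{@words} = (0 .. $#words);
print "FAILURE is code $codeIndex{FAILURE}\n";
```

Note that qw does not interpolate variables and treats commas as part of the words, so it suits only plain lists like this one.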

© 2001 Steve Hwan, hostname: @pacbell.net, username: svhwan
You should probably use the word "PERL" in the subject line to get my attention.
Last Modified: Wed Apr 18 01:53:11 2001