DISCLAIMER: THESE PAGES ARE STILL UNDER CONSTRUCTION. NO CODE EXAMPLE HAS BEEN TESTED YET.

Perl - Doing it Better

File Globbing is Evil - rendering RIBs, converting Texture Maps



First off, there are some notes on doing something to every file in a directory in: [CHRI98] pp. 318-320 (recipe 9.5). Also, there is an excellent recent reference in one of Randal Schwartz's articles. [SCHW01-2]

Coincidentally, this article came out just as I was about to start writing about this, so I will just more or less summarize what he wrote and encourage you to find and read his article. Actually, a lot of the Perl that I know came from books by these two guys (Christiansen and Schwartz), so I really encourage you to read their works. As always, I'll try to make my points with a short practical example:

Remember, though, that the point here is to learn an alternative to globbing, not to render RIBs or make texture maps. I just wanted some practical examples.

Render a bunch of RenderMan RIBs.
The basic methodology here is:
  1. Look at each file in the directory
  2. Use Pixar's 'render' on each file in the directory

EX 4.1.1: rendering RIBs, the naive approach

    #!/bin/sh
    #! -*- perl -*-
    eval 'exec $PERLLOCATION/bin/perl -x $0 ${1+"$@"} ;'
     if 0;

    foreach $file (<*>) {
        system "render $file";
    }

Listing 4.1.1 for code_untested/renderRibs-1.pl

Okay, now this will work, and it's probably the quickest way to do this. But there are a ton of things I don't like about this method, and it can be done better:

  1. $file should have a local scope to the foreach. I'm strongly opposed to scoping to objects, but I think that scoping to code blocks is a good thing, and good practice.
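  2. The system line passes a single string instead of a list. Perl hands a single string to a shell whenever it contains shell metacharacters, which costs CPU overhead, and a filename containing spaces or other special characters gets split or misinterpreted along the way.
  3. You will want to refine your glob to match only RIB files.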
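
To make the second gripe concrete, here is a minimal sketch of the difference between the string form and the list form of system. It is not one of the render scripts: it uses the running perl itself ($^X) as a stand-in for 'render', and the filename is made up, so both of those are my assumptions. It also assumes a UNIX-like shell and a perl path without spaces.

```perl
use strict;
use warnings;

# Hypothetical filename containing a space; no such file needs to exist.
my $file = 'shot 01.rib';

# A child program that reports how many arguments it received via its
# exit status. $^X is the running perl, standing in for 'render'.
my @child = ($^X, '-e', 'exit scalar(@ARGV)');

# List form: each element reaches the child as exactly one argument.
system(@child, $file);
my $list_count = $? >> 8;

# String form: the string is split on whitespace before the child runs,
# so the filename arrives as two separate arguments.
system("$^X -e 'exit scalar(\@ARGV)' $file");
my $string_count = $? >> 8;

print "list form:   $list_count argument(s)\n";
print "string form: $string_count argument(s)\n";
```

On a UNIX-like system this should report one argument for the list form and two for the string form, which is exactly the kind of surprise a renderer would see with a filename like "shot 01.rib".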

EX 4.1.2: rendering RIBs, a little better

    #!/bin/sh
    #! -*- perl -*-
    eval 'exec $PERLLOCATION/bin/perl -x $0 ${1+"$@"} ;'
     if 0;

    foreach my $file (<*.rib>) {
        system 'render', $file;
    }

Listing 4.1.2 for code_untested/renderRibs-2.pl

This has addressed my first couple gripes, but it still has my main one, and the point of this section:

Globbing is Evil.

At first glance, globbing looks easier, especially if you come from a DOS background (though the UNIX command line supports globbing too). The first problem is that, depending on your Perl version, there may be extra overhead: older Perls actually start a shell to evaluate the glob. On top of that, I just think the syntax is really bad. The problem I see is that it looks a lot like the "read-from-filehandle" operator, <FILEHANDLE>.
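
As a rough illustration of that look-alike problem, the sketch below uses angle brackets both ways in the same script. The scratch directory and the file names are made up for the example, and it assumes the File::Temp and Cwd modules are available.

```perl
use strict;
use warnings;
use Cwd qw(getcwd);
use File::Temp qw(tempdir);

# Build a scratch directory holding a few made-up files.
my $start = getcwd();
my $dir   = tempdir(CLEANUP => 1);
for my $name (qw(a.rib b.rib notes.txt)) {
    open(my $out, '>', "$dir/$name") or die "Cannot create $name: $!";
    print {$out} "first line\nsecond line\n";
    close $out;
}
chdir $dir or die "Cannot chdir to $dir: $!";

# Angle brackets around a pattern: this is a glob.
my @globbed = <*.rib>;
@globbed = sort @globbed;

# Angle brackets around a filehandle: this reads lines.
open(my $in, '<', 'a.rib') or die "Cannot open a.rib: $!";
my @lines = <$in>;
close $in;

# Step back out so the scratch directory can be cleaned up.
chdir $start or die "Cannot chdir back to $start: $!";

print "glob returned: @globbed\n";
print "read returned: ", scalar(@lines), " lines\n";
```

The two statements look nearly identical on the page, yet one lists filenames and the other reads file contents.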

The real way to do this is to use directories.

EX 4.1.3: rendering RIBs, the right way

    #!/bin/sh
    #! -*- perl -*-
    eval 'exec $PERLLOCATION/bin/perl -x $0 ${1+"$@"} ;'
     if 0;

    opendir(CURRDIR, ".") or die "Cannot open current directory: $!";
    foreach my $file (readdir(CURRDIR)) {
        next if ($file !~ /\.rib$/);
        system 'render', $file;
    }
    closedir CURRDIR;

Listing 4.1.3 for code_untested/renderRibs.pl

I think this is the best way to do this task. I could launch into a discussion about what I don't like about "system", but right now, this is actually the proper usage for it. Note that readdir returns every entry in the directory, so the loop has to filter for the .rib extension itself; I'll get further into that in my next example.
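
If you prefer a lexical directory handle over the bareword CURRDIR, the loop in EX 4.1.3 could be sketched along these lines. The helper name rib_files is hypothetical, and the print stands in for the actual render call so the sketch can be run anywhere.

```perl
use strict;
use warnings;

# rib_files is a hypothetical helper, not part of the listings above.
# It returns the sorted names of the .rib entries in the given directory.
sub rib_files {
    my ($dir) = @_;
    opendir(my $dh, $dir) or die "Cannot open directory $dir: $!";
    # readdir returns every entry, including '.' and '..', so filter by name.
    my @ribs = sort { $a cmp $b } grep { /\.rib$/ } readdir($dh);
    closedir $dh;
    return @ribs;
}

foreach my $file (rib_files('.')) {
    # Stand-in for the real work; swap in: system 'render', $file;
    print "would render: $file\n";
}
```

The handle $dh goes away with the subroutine's scope, which fits the block-scoping habit argued for earlier.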

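You've probably noticed that instead of the intuitive glob,

    <*.rib>

we now have a regular expression test:

    !~ /\.rib$/

Using regular expressions is really a separate topic, but just briefly: the !~ operator is true when the string does not match the pattern, \. matches a literal dot (an unescaped dot would match any character), rib matches those three letters, and $ anchors the match to the end of the filename.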
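
To see what the test !~ /\.rib$/ accepts and rejects, here is a small sketch that runs it over a handful of made-up names (none of them need to exist on disk).

```perl
use strict;
use warnings;

# Made-up names showing which ones the loop in EX 4.1.3 would skip.
my @names = ('shot01.rib', 'shot01.rib.bak', 'crib', 'shot01.RIB', '.rib');
my %skipped;
foreach my $name (@names) {
    # \. is a literal dot; $ anchors the match at the end of the string.
    $skipped{$name} = ($name !~ /\.rib$/) ? 1 : 0;
    printf "%-16s %s\n", $name, $skipped{$name} ? 'skipped' : 'rendered';
}
```

Note that the match is case sensitive, so shot01.RIB is skipped, and that a bare ".rib" passes the test even though it has no basename.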

How about another one? Same idea, but a little more dressing around the edges:

Convert a set of TIFFs to a set of RenderMan texture maps.

EX 4.1.4: converting TIFFs to RMan tx files, the naive approach

    #!/bin/sh
    #! -*- perl -*-
    eval 'exec $PERLLOCATION/bin/perl -x $0 ${1+"$@"} ;'
     if 0;

    # assume all files have the form: basename.0001.tif
    foreach $file (<*.tif>) {
        ($basename, $frame) = $file =~ /(.*)\.(\d+)\.tif$/;
        system "txmake $basename.$frame.tif $basename.$frame.tx";
    }

Listing 4.1.4 for code_untested/makeTiffIntoTx-1.pl

Once again, this probably is the quickest way to accomplish the task at hand, but it is very sloppy. I have problems with the scoping, as well as system invoking a shell. Also, of course, the globbing is a problem.

In this case, I'll make an additional point about extra variables. It's my personal belief that local lexical (my) variables are cheap, and sometimes, making a couple extra steps with additional variables actually helps organize your thoughts and makes your code easier to read and figure out.

Also, you might want to redirect the textures to a separate directory.

EX 4.1.5: converting TIFFs to RMan tx files, a better approach

    #!/bin/sh
    #! -*- perl -*-
    eval 'exec $PERLLOCATION/bin/perl -x $0 ${1+"$@"} ;'
     if 0;

    $txDir = "/productionName/textures";
    opendir(CURRDIR, ".") or die "Cannot open current directory: $!";
    foreach my $file (readdir(CURRDIR)) {
        my($basename, $frame) = $file =~ /([^\/]*)\.(\d+)\.tif$/;
        next if ($& eq '');        # skip if file didn't match
        my $newName = $txDir . "/" . $basename . '.' . $frame . '.tx';

        system 'txmake', $file, $newName;
    }
    closedir CURRDIR;

Listing 4.1.5 for code_untested/makeTiffIntoTx-2.pl

One of the first things you may question is: Why didn't I protect the $txDir with a my? This goes back to the very fine point in the first example. I favor using scoping for code blocks, but not for objects. My feeling (and a lot of the software community will oppose me on this one, but some of the production community should agree) is that "my" should only be used inside a code block. That is, basically for temporary variables. However, if a variable persists, even if it only persists within an object, I think it should be made public. I don't think accessor methods are sufficient. They promote the arrogance that you are the ultimate coding authority, and that no one will find a better or different use for your variables than you originally intended. I will hammer on this point repeatedly.

However, for temporary variables that only exist inside a code block (like a subroutine, or a foreach loop) then scoping them really is a good idea.

My one complaint, and it might be yours as well, is that the regular expression is a little difficult to read. Unfortunately, that's the case with a lot of regular expressions. No matter how good you are, it's just a pain to keep track of how many levels of escape you're down, and sometimes, it's difficult to keep track of the parentheses.

Okay, I do have another complaint. What in the heck is $& ? Perl is often criticized for having very obscure, unreadable variables like this. Actually, it means "last match," kind of like using & in a substitution in 'vi'. Now admittedly, I'm often guilty of letting a couple of these types of variables get through. But if you use the English module, you can make things a little more readable. The long names aren't easy to remember, so you'll have to consult the Perl references a lot as you write the code. But it will make things easier on yourself and others when you reread the code later.

EX 4.1.6: converting TIFFs to RMan tx files, a little better approach

    #!/bin/sh
    #! -*- perl -*-
    eval 'exec $PERLLOCATION/bin/perl -x $0 ${1+"$@"} ;'
     if 0;

    use English;

    $txDir = "/productionName/textures";
    opendir(CURRDIR, ".") or die "Cannot open current directory: $!";
    foreach my $file (readdir(CURRDIR)) {
        my($basename, $frame) =
            $file =~ /([^\/]*)      # basename - non-slash chars
                \.                  # literal dot
                (\d+)               # frame number - digits
                \.tif$/x;           # extension - ends w/ ".tif"

        next if ($MATCH eq '');     # skip if file didn't match
        my $newName = $txDir . "/" . $basename . '.' . $frame . '.tx';

        system 'txmake', $file, $newName;
    }
    closedir CURRDIR;

Listing 4.1.6 for code_untested/makeTiffIntoTx.pl

$MATCH is just the contents of the last regular expression match. That is, I'm saying if the filename didn't match the "basename.####.tif" format, then skip this file.
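
Since $MATCH (like $&) only reports the last successful match, another option is to test the result of the match itself. The sketch below is one way that might look; the helper name tx_name and the sample filenames are hypothetical, and the texture directory is the same placeholder used in the listings.

```perl
use strict;
use warnings;

# tx_name is a hypothetical helper: given a texture directory and a filename
# of the form basename.0001.tif, it returns the .tx path, or undef otherwise.
sub tx_name {
    my ($txDir, $file) = @_;
    # In list context a failed match returns an empty list, so the
    # assignment is false and we can bail out without consulting $MATCH.
    my ($basename, $frame) = $file =~ /([^\/]*)\.(\d+)\.tif$/
        or return;
    return $txDir . '/' . $basename . '.' . $frame . '.tx';
}

foreach my $file ('wood.0001.tif', 'wood.tif', 'readme.txt') {
    my $newName = tx_name('/productionName/textures', $file);
    print defined($newName) ? "$file -> $newName\n" : "$file skipped\n";
}
```

Only wood.0001.tif fits the basename.####.tif pattern here, so it is the only name that gets a .tx path; the other two are skipped.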

From here, I'll leave it to you to make this a production-ready script. There are a couple things you may want to think about:


© 2001 Steve Hwan, hostname: @pacbell.net, username: svhwan
You should probably use the word "PERL" in the subject line to get my attention.
Last Modified: Fri Apr 20 02:19:14 2001