DISCLAIMER: THESE PAGES ARE STILL UNDER CONSTRUCTION. NO CODE EXAMPLE HAS BEEN TESTED YET.

Introduction to Perl

Getting input, file handling - @ARGV, <STDIN>, backticks, open



One of the things you're going to need to do often is get input from the user. There are a few common sources for this:

command line arguments - @ARGV

As noted in a previous page, @ARGV is a special variable. This holds the arguments passed in from the command line. That is, if the user invoked:

	% myScript.pl arg1 arg2 arg3
This would set @ARGV to ('arg1', 'arg2', 'arg3'). Note that you can also assign to @ARGV yourself.

If you are used to working with C or C++, you should already be familiar with the argc, argv way of passing parameters to your program, where argc (arg count) holds the number of parameters and argv (arg values) holds the args themselves. In Perl, you don't get argc directly, but you can get the number of parameters with scalar(@ARGV). In C, the first arg (argv[0]) is the name of the program. This is not the case in Perl: as you can see in the myScript.pl example above, $ARGV[0] is really the first arg. If you want the path the script was invoked with, you can use $0. In a future page in "Tips and Tricks," I'll talk about how to use $0 to make a common addition to @INC.
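
To make the C comparison concrete, here is a minimal sketch of looking at the arguments. The describeArgs helper is a name I made up for illustration; scalar(@ARGV) and $0 are the only Perl features it leans on.

```perl
use strict;
use warnings;

# Hypothetical helper: describe a list of arguments, one line per argument.
sub describeArgs {
	my @args = @_;
	my $count = scalar(@args);    # the closest thing to C's argc
	my @lines = ("Got $count argument(s)");
	for my $i (0 .. $#args) {
		push @lines, "  arg $i: $args[$i]";
	}
	return @lines;
}

# $0 is the script name; it is not part of @ARGV.
print "Script: $0\n";
print "$_\n" for describeArgs(@ARGV);
```

Invoked as in the myScript.pl example above, this would report three arguments, with arg 0 being arg1 rather than the script name.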


Getting some input - <STDIN> and <>

Now it's time to get a little bit of input. The easiest case to explain is the one where you are piping something into STDIN. Inside Perl, there are 3 standard file handles:

	STDIN - standard input
	STDOUT - standard output
	STDERR - standard error

For right now, we're just going to think about the first one. You could use this to get input from the user interactively, and Schwartz has an example of that in his book, but in my 5+ years of Perl programming I don't think I've ever once prompted the user for input, so I'm not going to go into that example. The way I usually use STDIN is to pipe in some input, that is, to get data from another program. For instance,
	% ls | processLs.pl
or
	% cat logfile | processLogfile.pl
The second one is actually pretty common: piping a text file into a Perl script. We could read the contents of the file by catting it through STDIN:
#!/bin/sh
#! -*- perl -*-
eval 'exec $PERLLOCATION/bin/perl -x $0 ${1+"$@"} ;'
 if 0;

# Print only the lines between WorldBegin and WorldEnd.
my $okayToPrint = 0;
while (<STDIN>) {
	my $currLine = $_;
	if ($currLine eq "WorldBegin\n") {
		$okayToPrint = 1;
	} elsif ($currLine eq "WorldEnd\n") {
		$okayToPrint = 0;
	} elsif ($okayToPrint) {
		# some line between WorldBegin and WorldEnd
		print $currLine;
	}
}

Listing 2.7.1 for code_untested/extractSceneFromRib-1.pl
Okay, now the color commentary...

Now, when dealing with UNIX commands, you will notice that a lot of them let you pass in filenames. For instance, cat, more, and grep have this behavior. If you pipe in an input stream, they will read it. If you provide a list of files, they will open each file in order and operate on it.

In Perl, you can do this too. Let's say we wanted our scene extractor to behave this way. Then we could use the <> operator, which reads from the files named in @ARGV, or from STDIN if there are none:

#!/bin/sh
#! -*- perl -*-
eval 'exec $PERLLOCATION/bin/perl -x $0 ${1+"$@"} ;'
 if 0;

# Print only the lines between WorldBegin and WorldEnd.
my $okayToPrint = 0;
while (<>) {
	my $currLine = $_;
	if ($currLine eq "WorldBegin\n") {
		$okayToPrint = 1;
	} elsif ($currLine eq "WorldEnd\n") {
		$okayToPrint = 0;
	} elsif ($okayToPrint) {
		# some line between WorldBegin and WorldEnd
		print $currLine;
	}
}

Listing 2.7.2 for code_untested/extractSceneFromRib-2.pl
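
If it helps to see what <> is doing for you, the sketch below does roughly the same job by hand. readAllInput is a hypothetical helper, not something built into Perl, and the real <> differs in details (for instance, it warns and moves on when a file can't be opened instead of dying).

```perl
use strict;
use warnings;

# Hypothetical helper: return every line from the named files, in order,
# or from STDIN when no files are named. Roughly what while (<>) reads.
sub readAllInput {
	my @files = @_;
	my @lines;
	if (@files) {
		foreach my $file (@files) {
			open(my $fh, '<', $file)
				or die "Couldn't open $file for reading: $!";
			push @lines, <$fh>;    # list context: all remaining lines
			close($fh);
		}
	} else {
		push @lines, <STDIN>;
	}
	return @lines;
}
```

A script would call it as readAllInput(@ARGV), so that both "cat scene.rib | script" and "script scene.rib" end up with the same lines (scene.rib is just a placeholder filename).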


Getting some data - backticks

Suppose we want to process the output of a command. For instance, we might have a report generator, or something that lists jobs on your render queue, or maybe just a simple UNIX command. In the previous section, I went over how to do this by piping into STDIN.

However, sometimes, you might want to read from several programs sequentially. Or sometimes, it's just a little inconvenient to pipe in. In Perl, you can actually read in directly from the other process. If you've done csh or sh scripting, then you've probably done some backtick expressions. It works the same way in Perl:

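	@fileList = `ls -l`;
	foreach my $fileLine (@fileList) {
		chomp $fileLine;
		# $unkn is the hard link count. The limit of 9 keeps a
		# filename that contains spaces in one piece.
		my($perm, $unkn, $user, $group, $size, $mo, $day, $time, $name)
						= split(/\s+/, $fileLine, 9);
		# The leading "total" line has no size field, so skip it.
		if (defined($size) && $size>1000000) {
			print "Big file ${size}\t${user}\t${name}\n";
		}
	}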
That script will print out the size, username, and filename for every file in the current directory bigger than 1 Meg (well, 1,000,000 bytes anyway, which is not quite 1 Meg). The new thing here is the backticks. This runs ls -l in the shell and returns the output into @fileList, with 1 line per element in the list.

And if anyone's interested, chomp removes a trailing newline from a string. I think I talked about split when I talked about control structures on a previous page.
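
One thing worth seeing in action is that backticks behave differently depending on what you assign them to. This is only a small sketch, and it assumes a UNIX-like shell where echo is available; the variable names are mine.

```perl
use strict;
use warnings;

# Assumes a UNIX-like shell with "echo" available.
my @lines = `echo one; echo two`;   # list context: one element per line
my $blob  = `echo one; echo two`;   # scalar context: all output in one string

chomp(@lines);                      # chomp on an array strips every element
print "Number of lines: ", scalar(@lines), "\n";
print "First line: $lines[0]\n";
print "Length of the single string: ", length($blob), "\n";
```

The list form is what the ls -l example above uses. The scalar form is handy when the command only prints a single line.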

Later on, I'll get to why I don't like using backticks anymore. But for a quickie script, they get things up and running fast, and you can always go back and formalize it later.


file handling - open/close

Earlier on this page, I touched on file access. Well, just STDIN anyway. Of course, Perl can read from any file. To read from a file, we say:

	# Supposing the first arg passed in is my filename:
	$filename = $ARGV[0];

	open(INPUT_FILE, '<', $filename)
		or die "Couldn't open $filename for reading you GSPs! ($!)";
	while (<INPUT_FILE>) {
		my $currentLine = $_;
		# Process each line of the file in $currentLine ...

	}
	close(INPUT_FILE);
Here, we see we open a filehandle named INPUT_FILE. We can read from it just as we did from STDIN. In this case, I want to open and process a file of my choice, so I get the filename from the command line (@ARGV).
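
To show the loop doing some actual processing, here is a minimal sketch that counts the lines containing a given word. The countLinesContaining helper and its two-argument command line are assumptions of mine, and it uses a lexical filehandle ($in) instead of a bareword like INPUT_FILE, which needs Perl 5.6 or later.

```perl
use strict;
use warnings;

# Hypothetical helper: count the lines in $filename that contain $word.
sub countLinesContaining {
	my ($filename, $word) = @_;
	open(my $in, '<', $filename)
		or die "Couldn't open $filename for reading: $!";
	my $count = 0;
	while (my $line = <$in>) {
		chomp $line;
		# index returns -1 when $word does not appear in $line
		$count++ if index($line, $word) >= 0;
	}
	close($in);
	return $count;
}

# Only runs when given exactly a filename and a word on the command line.
if (@ARGV == 2) {
	my ($filename, $word) = @ARGV;
	print countLinesContaining($filename, $word), "\n";
}
```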

Though it's not input, now's probably a good time to mention how we output to a file. It's very similar to getting input, and it looks like how we would redirect output on the command line: we use the > symbol. Let's take the backtick example, and let's say we want to send the results to a logfile named on the command line:

	$outputLog = $ARGV[0];
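	open(OUTPUT_LOG, '>', $outputLog)
		or die "Couldn't open $outputLog for writing: $!";

	@fileList = `ls -l`;
	foreach my $fileLine (@fileList) {
		chomp $fileLine;
		# The limit of 9 keeps a filename that contains spaces in one piece.
		my($perm, $unkn, $user, $group, $size, $mo, $day, $time, $name)
						= split(/\s+/, $fileLine, 9);
		# The leading "total" line has no size field, so skip it.
		if (defined($size) && $size>1000000) {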
			print OUTPUT_LOG "Big file ${size}\t${user}\t${name}\n";
		}
	}
	close(OUTPUT_LOG);

There are actually a couple of variations on this in open. You can use open to open a filehandle to a pipe, though I hardly do that anymore. I do, however, use open to fork off another process instead of using system or backticks. This is a somewhat advanced concept, though, and I'll save it for the "Doing it Better" section. (BTW, I think there is a bug in the ActiveState release, so that trick actually won't work on NT/Win32.)
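
For the curious, here is a rough sketch of the simpler variation, a filehandle opened on a pipe. It is not the forking trick I'm saving for later. countOutputLines is a hypothetical helper, and the example assumes a UNIX-like system where ls is available.

```perl
use strict;
use warnings;

# Hypothetical helper: run a shell command and count the lines it prints.
sub countOutputLines {
	my ($command) = @_;
	# The '-|' mode tells open to run $command and let us read its output.
	open(my $pipe, '-|', $command)
		or die "Couldn't run $command: $!";
	my $count = 0;
	while (my $line = <$pipe>) {
		$count++;
	}
	close($pipe);
	return $count;
}

# Assumes a UNIX-like system with ls available.
print "ls -l printed ", countOutputLines('ls -l'), " line(s)\n";
```

Unlike backticks, this reads the command's output one line at a time instead of collecting all of it first.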


© 2001 Steve Hwan, hostname: @pacbell.net, username: svhwan
You should probably use the word "PERL" in the subject line to get my attention.
Last Modified: Tue Apr 24 09:46:38 2001