DISCLAIMER: THESE PAGES ARE STILL UNDER CONSTRUCTION. NO CODE EXAMPLE HAS BEEN TESTED YET.

Perl - Doing it Better

Piping to/from a child process without system or backtick - gzipped tar files



Okay, now you know how to use system and backticks (``), but they're not always the right approach, and there are more robust ways to run another program. If you don't care about tracking the process, they are fine ways of doing something, but they share one weakness that this page keeps coming back to: they can end up invoking a shell.

So it's kind of a running theme with me that invoking a shell is a bad thing. That means that your program will start another process as the shell, then the shell will transfer control to the command you asked for. That's an extra level of process invocation that is completely unnecessary. Also, it makes you dependent on having a sh on your system, which is okay on UNIX, but might not be the case on other platforms. Actually, it might not be okay in UNIX. I've been having a lot of problems lately with sh meaning different things on different flavors of UNIX...

I should mention that Randal Schwartz discusses a lot of these concepts in his column [SCHW99-10] which I only discovered recently. Also, there is a pretty good discussion in Programming Perl [WALL00] in the description of open. (The 2nd edition is pretty good, but the 3rd edition is more complete.) Finally, the Perl Cookbook [CHRI98] also has a lot of examples of how to launch processes in chapter 16 (Process Management and Communication - especially Recipes 16.1-16.5, and 16.10). These are the sources I've learned from.

The trick we'll use is to open a filehandle with "-|". This starts a child process and lets us read that process's output through the filehandle. The 3rd edition of Programming Perl [WALL00] also notes a form of "open" that takes 3 or more arguments. It looks very useful, but it isn't available in earlier versions of Perl, so I would wait a little while before relying on it. I think 3-argument open only arrived in 5.6 (5.006), and the list form for pipes may need something newer still. I think IRIX (SGI) still ships with version 5.4 (5.004) or even 5.003.
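For what it's worth, the list form of a piped open can be sketched like this. It is a minimal sketch, assuming a Perl new enough to accept a command list after "-|" (5.8 or later, as far as I know) and a platform with a real fork. The helper name read_from_command is made up for illustration, and the usage line runs perl itself ($^X) only so that the command exists everywhere.

```perl
use strict;
use warnings;

# Hypothetical helper: run a command WITHOUT a shell and return its output lines.
sub read_from_command {
    my @cmd = @_;
    # "-|" plus a list: Perl forks and execs @cmd directly, no shell involved.
    open(my $fh, "-|", @cmd) or die "Can't run $cmd[0]: $!\n";
    my @lines = <$fh>;
    close($fh);            # waits for the child; its status is left in $?
    return @lines;
}

# Usage: the arguments "a b" and "*" reach the child untouched,
# because no shell ever gets a chance to split or expand them.
my @out = read_from_command($^X, "-e", 'print "$_\n" for @ARGV', "a b", "*");
print @out;
```

On an older Perl the same effect needs the explicit fork/exec dance described in the rest of this page.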

Basically, I would suggest using fork, exec and open instead of system and backticks (``) for really important production/pipeline stuff. If you're just writing a hacky little script, then system and backticks are okay. Additionally, when using system and exec, I would suggest passing the command as a list (one argument per element) rather than as a single string, so that shell metacharacters never cause a shell to be invoked.
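The list form of system, together with decoding the status it leaves in $?, can be sketched as follows. The helper name run_list and the keys of the hash it returns are assumptions made up for this sketch; the child command is perl itself ($^X) so the example does not depend on tar being installed.

```perl
use strict;
use warnings;

# Hypothetical helper: run a command in list form and decode $?.
sub run_list {
    my @cmd = @_;
    # The block form { $cmd[0] } guarantees no shell, even for a one-element list.
    my $rc = system { $cmd[0] } @cmd;
    return { failed_to_start => 1, error => "$!" } if $rc == -1;
    return {
        exit_code => $? >> 8,     # what the child passed to exit()
        signal    => $? & 127,    # non-zero if a signal killed the child
    };
}

my $result = run_list($^X, "-e", "exit 3");
print "exit code: $result->{exit_code}\n";
```

The shift by 8 is needed because $? packs the exit code into the high byte and the signal number into the low bits.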


Maybe I should take a moment and explain what system and backticks really do. Or at least give close approximations (my guesses).

fork/exec/open - what system really does

What does system really do? Let's start from the outside looking in. What do we see as a user? The first thing you see is that you execute another program, using a shell (if necessary) to evaluate special shell metacharacters. Now I'm spewing gibberish. Let's just start by saying "execute another program."

This is where fork and exec come in. If you are going to make a lot of pipeline scripts, you really should learn this, and it's not too difficult. A good reference is UNIX Network Programming [STEV90] and there's a bit of discussion in the function reference on fork in Programming Perl. [WALL00] I'll try and summarize what I have gathered about UNIX boxes. I really have no idea how Windows NT works though.

Basically, every script, every program, even the shell itself running on your UNIX box is a "process," with a unique process identification (process id, or pid). Often in shell scripting (and Perl for that matter), you can actually access this with $$, but I digress.

In the beginning, there was nothing... well, not quite. When your system boots up it will always run a program called init. This in turn calls a lot of programs, which in turn call other programs. Among others are the programs that you interact with, such as the shell. Now, when I say that a program calls other programs, there's a little more to it than that. Except for "init," each process has a parent, so as far as the system is concerned there's really one big hierarchy of processes. On most (System V and POSIX compliant) UNIX systems (except BSD), you can view all the processes running with:

	ps -ef
and on BSD, you can get close to that with:
	ps -aux
and the things to look for are the columns PID and PPID. Note that "init" should have a really low number (normally 1). This is the first process that was executed. You may see a bunch of daemons running like "lpd," "httpd," or "ftpd." Actually, I'm a little behind on this. I think a lot of the internet-related processes are now handled by "inetd" (internet daemon), which calls the others on an as-needed basis. (I could be wrong about that though.)

This is all building up to the mechanism by which processes spawn other processes. It's quite simple, and usually referred to as "fork and exec" (they go together hand in hand). Basically, with fork, a process creates a copy of itself (the child). When it does this, a new process ID is created for the child. This was actually difficult for me to grasp, but here's my interpretation: There are now 2 copies of the current process. But I think at first, they are probably sharing code segments in memory. My best guess is that data segments are copied. That is, the parent and child follow the same list of instructions, but each one has its own set of variables/data.

As you can probably tell, I still don't fully comprehend it. But I think you get the idea. You're running through the code, and suddenly, at the point of the fork, think of the program running twice simultaneously, as the original(parent) process and new(child) process.
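One way to convince yourself that each process gets its own variables is a sketch like the following. The function name fork_copies_data is made up, and using the child's exit code to report its value back is just a convenience for this demo. It leans on the return value of fork (explained next) to tell the two processes apart.

```perl
use strict;
use warnings;
use POSIX ();

# Hypothetical demo: returns (parent's $counter, child's exit code).
sub fork_copies_data {
    my $counter = 1;
    my $pid = fork;
    die "Can't fork: $!\n" unless defined $pid;
    if ($pid == 0) {
        # Child: this changes only the child's copy of $counter.
        $counter += 41;
        POSIX::_exit($counter);   # leave at once, reporting the value as exit code
    }
    waitpid($pid, 0);             # parent: wait for the child to finish
    return ($counter, $? >> 8);
}

my ($parent_value, $child_value) = fork_copies_data();
print "parent still has $parent_value, child ended with $child_value\n";
```

Both processes start from the same value, but the child's addition never shows up in the parent.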

The tricky part now is that you have these 2 identical processes running, but you want them to do different things. So when you're coding, how do you write something that behaves differently depending on whether you're in the parent or the child process? That's just how fork works: it returns a pid. If you are in the parent, then fork returns the pid of the child process. But if you are in the child process, you get a 0. (If the fork failed, you get undef. There's additional stuff to know if things go wrong, but we'll deal with that in a minute.) Anyway, for now, we'll say:

Make a tar file.

EX 4.1.1: creating a tar file - system under the hood

In this case, I'll take a simple example of creating a tar file, and honestly, you'd probably just run "tar" directly. But you could use this to fire off a process to the queue. Also, the quickie way to do this would be:

	$dirName = $ARGV[0];
	$status = system("tar","cvf", $dirName. '.tar', $dirName);
And honestly (as you'll see later), this is how I would do it for this task. But since this heading was "what system really does," then let's see what this looks like under the hood. Just to clarify, system calls C code that does this. I'm using Perl itself to describe the equivalent steps to what is going on under the hood.
#!/bin/sh
#! -*- perl -*-
eval 'exec $PERLLOCATION/bin/perl -x $0 ${1+"$@"} ;'
 if 0;

$dirName = $ARGV[0];

# This is pretty much lifted directly
# from "Programming Perl" by Wall et al.
if ($pid=fork) {	# assign result of fork to $pid,
			# see if it is non-zero.
	# Parent process here
	# Child pid is in $pid
} else {
	# neglecting to test if fork even worked is actually pretty sloppy.

	# Child process here
	# parent process pid is available with getppid

	# exec will transfer control to the tar process, and will
	# finish (exit) when the tar is done.
	exec("tar", "cvf", $dirName . ".tar", $dirName);
}

# wait for the tar to complete and get the status (like system does)
waitpid($pid,0);
$status = $?;

Listing 4.1.1 for code_untested/makeTar-1.pl

That's a little sloppy, but (hopefully) it gets the point across. The flaw here is that I ignore the possibility of an error when fork-ing the new process. In this case, if there was a problem forking, then the parent process will actually transfer control to "tar" itself: $pid is undefined, which is false, so the else branch runs. Now, honestly, this is how I usually do it, but it's not that good, especially since Programming Perl tells you how to do it better, being much more careful about error conditions. The next example does it that way.

I guess I threw exec in there assuming that I'd have covered it in an earlier page. It's basically a lot like system, except it never returns. That is, it transfers control from the current process to the new program, and lets go of its previous list of instructions (code segment). For that matter, it also lets go of its copy of the parent's data segment. If you pass any arguments in the exec call, that list is given to the new process. (Now would be a good time to read UNIX Network Programming [STEV90] by the way...) There are also options to pass down the environment too, but that's not worth worrying about at the moment.
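The one case where exec does return is when the program could not be started at all, and then the child is still running a copy of your script. A minimal sketch of guarding against that follows. The helper name spawn_and_wait is made up, and the exit code 127 is only borrowed from the shell's convention for "command not found"; neither comes from the listings on this page.

```perl
use strict;
use warnings;
use POSIX ();

# Hypothetical helper: fork, exec @cmd in the child, return the exit code.
sub spawn_and_wait {
    my @cmd = @_;
    my $pid = fork;
    die "Can't fork: $!\n" unless defined $pid;
    if ($pid == 0) {
        # Child. On success exec never returns, so the "or" branch
        # runs only if the program could not be started.
        exec { $cmd[0] } @cmd
            or POSIX::_exit(127);
    }
    waitpid($pid, 0);             # parent: wait and collect the status
    return $? >> 8;
}

print spawn_and_wait($^X, "-e", "exit 0"), "\n";
print spawn_and_wait("/no/such/program"), "\n";
```

Without the "or" branch, a child whose exec failed would fall through and carry on executing the parent's code, which is rarely what you want.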

EX 4.1.2: creating a tar file - being more careful - system under the hood

#!/bin/sh
#! -*- perl -*-
eval 'exec $PERLLOCATION/bin/perl -x $0 ${1+"$@"} ;'
 if 0;

use Errno qw(EAGAIN);	# needed for the EAGAIN test below

$dirName = $ARGV[0];

# This is pretty much lifted directly
# from "Programming Perl" by Wall et al.
FORK: {
	if ($pid=fork) {	# assign result of fork to $pid,
				# see if it is non-zero.
		# Parent process here
		# Child pid is in $pid
	} elsif (defined($pid)) {
		# Child process here
		# parent process pid is available with getppid

		# exec will transfer control to the tar process, and will
		# finish (exit) when the tar is done.
		exec("tar", "cvf", $dirName . '.tar', $dirName);
	} elsif ($! == EAGAIN) {
		# EAGAIN is the supposedly recoverable fork error
		sleep 5;
		redo FORK;
	} else {
		#weird fork error
		die "Can't fork: $!\n";
	}
}

# wait for the tar to complete and return status (like system does)
waitpid($pid,0);
$status = $?;

# This part would be executed by both parent and child, except
# we used "exec" in the child, so effectively, this is only the parent.

Listing 4.1.2 for code_untested/makeTar.pl

At least that's the thorough, complete way of doing it. And I would encourage you to do it that way.

EX 4.1.3: Another example - running perl -c in a directory

Rather than running tar, this example runs perl -c. It also integrates earlier notes on File::Find. Not a big deal, but it is something that I have used on the code repository for this site itself.

#!/bin/sh
#! -*- perl -*-
eval 'exec $PERLLOCATION/bin/perl -x $0 ${1+"$@"} ;'
 if 0;

use Errno qw(EAGAIN);	# needed for the EAGAIN test below
use File::Find;

# This is pretty much lifted directly
# from "Programming Perl" by Wall et al.

sub usePerlDashC {
	my $filename = shift;
	my $pid;
FORK: {
		if ($pid=fork) {	# assign result of fork to $pid,
					# see if it is non-zero.
			# Parent process here
			# Child pid is in $pid
		} elsif (defined($pid)) {
			# Child process here
			# parent process pid is available with getppid

			# exec will transfer control to perl -c, and the child
			# will finish (exit) when the check is done.
			exec("perl", "-c", $filename)
				or die "Can't exec perl: $!\n";
		} elsif ($! == EAGAIN) {
			# EAGAIN is the supposedly recoverable fork error
			sleep 5;
			redo FORK;
		} else {
			#weird fork error
			die "Can't fork: $!\n";
		}
	}

	# wait for the perl -c to complete and return status (like system does)
	waitpid($pid,0);
	my $status = $?;

	return $status;
}


&File::Find::find( sub {
			# only files ending in .pl or .pm
			if ($_ =~ /\.p[lm]$/) {
				print "checking [$_]\n";
				&usePerlDashC($_);
			}
		}, "."
	);

Listing 4.1.3 for code_untested/checkPerl.pl


fork/exec/open - what backticks really do

Fundamentally, backticks really are a similar deal to system.

Let's take another simple project:

Read a file into an array. If the file is gzipped, then unzip the file before reading it in.

EX 4.1.4: reading a gzipped file - a naive approach

#!/bin/sh
#! -*- perl -*-
eval 'exec $PERLLOCATION/bin/perl -x $0 ${1+"$@"} ;'
 if 0;
$fileName = $ARGV[0];

if ($fileName =~ /\.gz$/) {	# File ends with ".gz"
	@contents = `gzip -f -c -d ${fileName}`;
} else {
	@contents = `cat ${fileName}`;
}

Listing 4.1.4 for code_untested/readFile-1.pl

What an atrocious piece of code that is. Why don't I like it?

  1. Well, as noted above, handing backticks (or system or exec) a single command string means Perl may invoke a shell. It does so whenever the string contains shell metacharacters, and since the file name is interpolated into the string, you can't be sure it won't. (A file name containing a space would also be split into two arguments.) You will be making assumptions that you have a sh under there (Bourne shell or Korn shell), and you have the unnecessary cost of starting up that shell.
  2. The method of determining if the file is gzipped is horrible. For example, a binary gzipped RIB file usually has the extension ".rib", not ".rib.gz".
  3. You pay the memory penalty of reading the entire file into memory at once. Also, you might be more efficient treating each line separately, for instance if you were filtering through a RIB file.
  4. Using "cat" to read the file is just plain silly. We should use filehandles.
To address most of these, we can open file handles as pipes.

EX 4.1.5: reading a gzipped file - a bit better

  1 #!/bin/sh
  2 #! -*- perl -*-
  3 eval 'exec $PERLLOCATION/bin/perl -x $0 ${1+"$@"} ;'
  4  if 0;
  5 $fileName = $ARGV[0];
  6 
  7 if ($fileName =~ /\.gz$/) {	# File ends with ".gz"
  8 	open(INPUTFILE, "gzip -f -c -d ${fileName} |")
  9 		or die "couldn't open $fileName: $!";
 10 } else {
 11 	open(INPUTFILE, $fileName)
 12 		or die "couldn't open $fileName: $!";
 13 }
 14 
 15 while (<INPUTFILE>) {
 16 	my $line = $_;
 17 	# Here, we could just process the file line by line, but
 18 	# for this example, we'll just push it onto an array.
 19 	push @contents, $line;
 20 }
 21 close(INPUTFILE);

Listing 4.1.5 for code_untested/readFile-2.pl

Well, this was a little better. We're at least using file handles, and this should address the last 2 issues in the list above. I also like this because we can work on the input one line at a time.

But there's still the fundamental theme of this note: the gzipped branch still passes a single command string to "open" in line 8, so Perl may hand it to a shell (it will if the file name contains any shell metacharacters), and a file name with spaces in it will be split into several arguments.

EX 4.1.6: reading a gzipped file - almost there - backticks under the hood

#!/bin/sh
#! -*- perl -*-
eval 'exec $PERLLOCATION/bin/perl -x $0 ${1+"$@"} ;'
 if 0;
$fileName = $ARGV[0];

if ($fileName =~ /\.gz$/) {	# File ends with ".gz"
	my $pid;
	if (not defined($pid = open(INPUTFILE, "-|"))) {
		die "can't fork: $!";
	}
	if ($pid) {
		# parent process - do nothing
	} else {
		# child process
		system( "gzip", "-f", "-c", "-d", $fileName);
		exit 0;
	}
} else {
	open(INPUTFILE, $fileName)
		or die "couldn't open $fileName: $!";
}

while (<INPUTFILE>) {
	my $line = $_;
	# Here, we could just process the file line by line, but
	# for this example, we'll just push it onto an array.
	push @contents, $line;
}
close(INPUTFILE);

Listing 4.1.6 for code_untested/readFile-3.pl

Now, we're getting a little more complicated. But basically, in the case of a gzipped file, think of this as a safer way to start up another program (using open with "-|" instead of system or backticks) and listen to its output. This is just explicitly opening that other process and keeping track of the new process ID (pid). Above, I had an extensive description of fork and exec. Well, using open with "-|" (read from handle) or "|-" (write to handle) is just using fork under the hood. But it has the additional bonus that "-|" will take the STDOUT (output) of the child process and read it in through the filehandle. Similarly, if you use "|-" then writing to the filehandle writes to the STDIN (input) of the child process.

As a note, I've tried to do this in the ActiveState version of Perl 5.6, but it wouldn't allow me to use the 2 arg form of "|-". Basically, what this means is: stick to UNIX and Linux. Stay away from NT. Actually, you can probably get away with the 3+ arg form of open for most things you try to do. But I actually regard this as a bug, because the shipped pages imply that it can handle the 2 arg form too.

Backticks just basically use this mechanism, reading from the STDOUT of a child process, and waiting for the process to complete.
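Put together, a backtick-like function that takes a list (and so never needs a shell) might look like the sketch below. The name capture_list and its return convention are made up for illustration, and 127 is again just the borrowed shell convention for a command that could not be started. This is an approximation of the mechanism, not what Perl literally runs internally.

```perl
use strict;
use warnings;
use POSIX ();

# Hypothetical helper: a backtick-like call that takes a list, so no shell.
# Returns (array ref of output lines, exit code).
sub capture_list {
    my @cmd = @_;
    my $pid = open(my $from_child, "-|");   # implicit fork
    die "Can't fork: $!\n" unless defined $pid;
    if ($pid == 0) {
        # Child: STDOUT is already connected to the parent's filehandle.
        exec { $cmd[0] } @cmd or POSIX::_exit(127);
    }
    my @lines = <$from_child>;   # parent: read until the child closes its STDOUT
    close($from_child);          # waits for the child and sets $?
    return (\@lines, $? >> 8);
}

my ($lines, $code) = capture_list($^X, "-e", 'print "one\ntwo\n"; exit 2');
print "got ", scalar(@$lines), " lines, exit code $code\n";
```

Unlike plain backticks, this hands back the exit code alongside the output, so the caller does not have to remember to look at $?.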

What more could a programmer want? Well, I still have a couple nit-picky issues with this.

  1. The combination of system/exit can be shortened to a single exec.
  2. That darned /.gz$/ business. That's just such an unbelievably crappy way to determine if the file is gzipped. It imposes this file naming convention on the user, and sometimes it is beyond their control. For instance, if they were using MTOR to generate binary gzipped RIB files, their files would end with ".rib".

The way that I'll address that last one will be to cheat. Basically, I will open up the file twice. The first time, I will just look at the first few bytes to decide if it has the gzip magic number. This is something that is automatically put in by gzip, and is unlikely to occur by chance in another file.

EX 4.1.7: reading a gzipped file - a good solution

#!/bin/sh
#! -*- perl -*-
eval 'exec $PERLLOCATION/bin/perl -x $0 ${1+"$@"} ;'
 if 0;
$fileName = $ARGV[0];

# gzip identification bytes; every gzip file starts with these two.
$magicNumber_GZIP = pack("C*", 0x1f, 0x8b);

# Just peek at the first 2 bytes in the file and
# close it again.
open(INPUTFILE, $fileName) or die "Couldn't open file $fileName: $!";
binmode(INPUTFILE);
read(INPUTFILE, $magicNumber, 2);
close(INPUTFILE);

if ($magicNumber eq $magicNumber_GZIP) {    # file starts w/gzip magic number
	my $pid;
	if (not defined($pid = open(INPUTFILE, "-|"))) {
		die "can't fork: $!";
	}
	if ($pid) {
		# parent process - do nothing
	} else {
		# child process
		exec( "gzip", "-f", "-c", "-d", $fileName)
			or die "can't exec gzip: $!";
	}
} else {
	open(INPUTFILE, $fileName)
		or die "couldn't open $fileName: $!";
}

while (<INPUTFILE>) {
	my $line = $_;
	# Here, we could just process the file line by line, but
	# for this example, we'll just push it onto an array.
	push @contents, $line;
}
close(INPUTFILE);

Listing 4.1.7 for code_untested/readFile.pl

Sorry about that. That

$magicNumber_GZIP = pack("C*", 0x1f, 0x8b);
just came out of nowhere, didn't it? I found it by running "od" (octal dump) on a few gzipped files and seeing that they all began with the same sequence of bytes. The first two, 0x1f 0x8b, are the identification bytes that the gzip format defines. In my files the next two were 0x08 0x08, but the third byte is the compression method and the fourth is a flags byte, which is 0x08 only when gzip stored the original file name (it doesn't when it compresses from STDIN). So the test compares only the first two bytes.

Anyway, this is pretty much the way I think this task should be done. We no longer have any extra shells open, and we check for errors in most cases, so we're opening files safely. And no "system" calls or back ticks in sight.
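The peek at the start of the file is easy to pull out into its own function. Here is a minimal sketch, assuming that comparing only the first two bytes (0x1f 0x8b, the gzip identification bytes) is enough for your files. The name looks_gzipped is made up for illustration.

```perl
use strict;
use warnings;

# Hypothetical helper: true if the file starts with the gzip ID bytes.
sub looks_gzipped {
    my ($path) = @_;
    open(my $fh, "<", $path) or die "Couldn't open $path: $!\n";
    binmode($fh);                       # bytes, not text
    my $got = read($fh, my $head, 2);   # number of bytes actually read
    close($fh);
    return defined($got) && $got == 2 && $head eq "\x1f\x8b";
}

# Usage: report on each file named on the command line.
for my $file (@ARGV) {
    print "$file: ", (looks_gzipped($file) ? "gzipped" : "not gzipped"), "\n";
}
```

A file shorter than two bytes simply counts as not gzipped, so an empty file does not trip the check.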

Well, I should say that "system" and backticks have their place, especially for quickie scripts, or scripts where you really don't care about the exit status or return values from the child process, but just want to invoke something. However, if you are working on a core system or pipeline script, you should make your script a little more robust.


What I'm not telling you - signal handling

Really, I'm not trying to hide anything from you. Right now, I'm just not very familiar with signal handling. Basically, this is handling what happens when processes die or get killed. For example, if you hit Ctrl-C, does it hit the child process, the parent process, or both? Well, I'm not really good at that stuff yet, so this discussion will be incorporated into the above examples once I learn it.

But what I'm trying to say here is that the above doesn't tell you exactly what system and backticks do, but it comes pretty close.


piping processes to each other without new shells

To demonstrate some of these concepts, I'll use the following:

Make a gzipped tar file (commonly referred to as a "tarball"). Do not use any temporary files (use streams).

Now, for anyone on a Linux box, you realize you can already do this on the UNIX prompt by:

	tar zcvf directoryName.tar.gz directoryName

And if you're on a Sun or IRIX box or something, you know you can do this in 2 steps:

	tar cvf directoryName.tar directoryName
	gzip directoryName.tar
or even a single step with piping between the 2 processes:
	tar cvf - directoryName | gzip -f -c > directoryName.tar.gz
and you could always make an alias or something for that...

No trade secrets there... But it will be a good demonstration of the concepts I'm going to go over. Also, since I work a lot on IRIX boxes, I do wind up using tarball.pl quite a bit.

EX 4.1.8: gzipped tar files - a naive approach

  1 #!/bin/sh
  2 #! -*- perl -*-
  3 eval 'exec $PERLLOCATION/bin/perl -x $0 ${1+"$@"} ;'
  4  if 0;
  5 
  6 $dirName = $ARGV[0];
  7 
  8 system "tar cvf - ${dirName} | gzip -f -c > ${dirName}.tar.gz";

Listing 4.1.8 for code_untested/tarball-1.pl

This is a very simple implementation, but not very clean. The main thing is that our string contains shell metacharacters (| and >), so we're invoking a shell. By now, I shouldn't have to tell you how much I dislike invoking a shell. But avoiding it is a little trickier this time; it actually took me about three days to figure out how to do this, and it's still not quite polished.

Now, my gut instinct would be to fork/exec a process and send everything to that. But we cannot simply say:

	exec( "tar", "cvf", "-", $dirName,
			"|", "gzip", "-f", "-c", ">", "$dirName.tar.gz");
The problem is that | and > are shell metacharacters, and in the list form there is no shell to interpret them: tar would simply receive "|", "gzip", ">" and the rest as more names to archive. We need an additional trick to connect the two processes ourselves.

The trick here is to redirect the STDOUT, and there is a description of how to do this in Programming Perl [WALL00] in the description of open.

The strategy here will be to invoke the two processes, and make them talk to each other. Easier said than done. I basically work backwards, redirecting STDOUT for each process.

EX 4.1.9: gzipped tar files - a better approach

  1 #!/bin/sh
  2 #! -*- perl -*-
  3 eval 'exec $PERLLOCATION/bin/perl -x $0 ${1+"$@"} ;'
  4  if 0;
  5 # Perl script tarball-2.pl
  6 
  7 # make a gzipped tar file out of a directory
  8 
  9 $| = 1;
 10 
 11 $inFile = $ARGV[0];
 12 $outFile = $inFile . '.tar.gz';
 13 
 14 if ($gzippid = open(PIPE_TO_GZIP, "|-")) {
 15 	# parent
 16 } else {
 17 	# child
 18 	open(STDOUT,">$outFile");
 19 	binmode(STDOUT);
 20 	select(STDOUT);
 21 	$| = 1;
 22 	exec('gzip','-f','-c');
 23 }
 24 binmode(PIPE_TO_GZIP);
 25 
 26 open(SAVEOUT,">&STDOUT");
 27 open(STDOUT,">&PIPE_TO_GZIP");
 28 binmode(STDOUT);
 29 select(STDOUT);
 30 $| = 1;
 31 system('tar','cvf','-',$inFile);
 32 open(STDOUT,">&SAVEOUT");
 33 print "Now I'm showing off STDOUT being restored...\n";

Listing 4.1.9 for code_untested/tarball-2.pl

Again, working backwards, we first start in line 14, opening a pipe to our gzip process. Normally with gzip, you would say:

	gzip filename.tar
but that's when you're passing in the name of a file. In our case, we're actually receiving our file piped into STDIN (of the gzip). In this context, gzip will want to pipe out to its STDOUT.

We are activating the gzip inside a child process though, so we can actually redirect the STDOUT without affecting the parent process (the main program). In this case (line 18), I redirect the STDOUT of the process to the output file. So any output that comes from this child process will write to the output file. Then in line 22, I transfer control to the gzip. I do not give it an input file, so it is just waiting for something to come in through STDIN.

The business in lines 19-21 and 28-30, I don't know if these are really necessary. But since I just redirected STDOUT, I thought it would be safe. $| just tells Perl not to buffer its output.

So now we have an open filehandle, PIPE_TO_GZIP that is waiting for some input.

Next, I get ready to do the tar. In this form of tar (line 31), we can output the tar file to STDOUT rather than to a file. (The key there is passing in the "-" as an argument.) And using the same technique as the gzip above, we redirect the STDOUT in line 27. In this case, we will invoke a process that will spew output to a file handle, and conveniently enough, we also have a filehandle(PIPE_TO_GZIP) just waiting for input. So in line 27, we get ready to send all of our STDOUT to the PIPE_TO_GZIP.

Though it's unnecessary, just for grins, I was playing around with saving the STDOUT filehandle and restoring it (lines 26 and 32, as described in Programming Perl [WALL00]). In this case, it's unnecessary because after the tar is done, the program exits. But I just wanted to have that code sitting around someplace, because you never know when you might want to do that.
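The save-and-restore idea on its own can be sketched as a small function. This is a minimal sketch, assuming a Perl with 3-argument open (5.6 or later); the name run_with_stdout_to is made up, and the usage line runs perl itself ($^X) and writes to a file called hello.txt purely for illustration.

```perl
use strict;
use warnings;

# Hypothetical helper: run @cmd with its STDOUT going to $file,
# then put our own STDOUT back. Returns the command's exit code.
sub run_with_stdout_to {
    my ($file, @cmd) = @_;
    open(my $saved, ">&", \*STDOUT) or die "Can't dup STDOUT: $!\n";
    open(STDOUT, ">", $file)        or die "Can't redirect STDOUT to $file: $!\n";
    my $status = system { $cmd[0] } @cmd;   # child inherits the redirected STDOUT
    open(STDOUT, ">&", $saved)      or die "Can't restore STDOUT: $!\n";
    close($saved);
    return $status >> 8;
}

my $code = run_with_stdout_to("hello.txt", $^X, "-e", 'print "hello\n"');
print "child exited with $code, and STDOUT works again\n";
```

The child never knows anything was redirected; it just writes to its STDOUT, which happens to be the file.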

Now, observant people might have noticed that after this extremely long explanation about how to avoid system, I used one right there in line 31. Well, you may also recall I said it was silly to do all this extra stuff for a process as simple as that. Also, now that I look it over, I'm very sloppy about the $gzippid in line 14. I should really check to see if it is defined first.

Honestly, that's the version that I've been using in production now for months, but if you're really picky, we can clean it up a little bit.

By the way, notice that in this case we are piping out to gzip with "|-", whereas in the previous example we were reading a file through gzip using "-|".
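The writing direction can be isolated in a sketch like the following. The helper name feed_command and the exit codes 126 and 127 are assumptions made for illustration, not something taken from tarball.pl.

```perl
use strict;
use warnings;
use POSIX ();

# Hypothetical helper: send @$lines to the STDIN of @cmd, with the
# command's STDOUT going to $out_file. Returns the command's exit code.
sub feed_command {
    my ($out_file, $lines, @cmd) = @_;
    my $pid = open(my $to_child, "|-");     # implicit fork
    die "Can't fork: $!\n" unless defined $pid;
    if ($pid == 0) {
        # Child: its STDIN is already the pipe. Point STDOUT at the
        # output file, then become the command.
        open(STDOUT, ">", $out_file) or POSIX::_exit(126);
        exec { $cmd[0] } @cmd or POSIX::_exit(127);
    }
    print {$to_child} @$lines;   # parent: this lands on the child's STDIN
    close($to_child);            # sends end-of-file and waits for the child
    return $? >> 8;
}
```

Something like feed_command("notes.txt.gz", \@lines, "gzip", "-c") would then compress @lines into notes.txt.gz without a shell and without a temporary file.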

EX 4.1.10: gzipped tar files - cleaned up

#!/bin/sh
#! -*- perl -*-
eval 'exec $PERLLOCATION/bin/perl -x $0 ${1+"$@"} ;'
 if 0;
# Perl script tarball.pl

# make a gzipped tar file out of a directory

$| = 1;

$inFile = $ARGV[0];
$outFile = $inFile . '.tar.gz';

if (not defined($gzippid = open(PIPE_TO_GZIP, "|-"))) {
	die "Can't fork: $!\n";
} elsif ($gzippid == 0) {
	# child: send gzip's output to the file, then become gzip
	open(STDOUT,">$outFile") or die "Can't write $outFile: $!\n";
	binmode(STDOUT);
	select(STDOUT);
	$| = 1;
	exec('gzip','-f','-c') or die "Can't exec gzip: $!\n";
}
# only the parent gets here
binmode(PIPE_TO_GZIP);

open(SAVEOUT,">&STDOUT");
open(STDOUT,">&PIPE_TO_GZIP");
binmode(STDOUT);
select(STDOUT);
$| = 1;

if (not defined($tarpid=fork)) {
	die "Can't fork: $!\n";
} elsif ($tarpid == 0) {	# child process
	exec("tar", "cvf", "-", $inFile) or die "Can't exec tar: $!\n";
}

# wait for the tar to complete and return status (like system does)
waitpid($tarpid, 0);
$status = $?;

open(STDOUT,">&SAVEOUT");
close(PIPE_TO_GZIP);	# gzip sees end-of-file and finishes
print "Now I'm showing off STDOUT being restored...\n";

Listing 4.1.10 for code_untested/tarball.pl

So hopefully, you've found that a little helpful in understanding how processes work in Perl (and UNIX, for that matter). Though this example is a very simple one, hopefully you can see how you could generalize it to string an arbitrary number of processes together. Or perhaps you could even generalize the open section to take in a filehandle to redirect to (see the fork section for saving or redirecting handles) and an array or array reference to pass into the exec, and make your own generalized system/backtick function. That would give you the best of both worlds: the ability to use the list form to avoid the shell, and the ability to read the output of the process.
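As a starting point for that generalization, here is a rough sketch of stringing any number of commands together. It swaps in Perl's pipe function to create each connection rather than the open "|-" trick used above, and it assumes a Perl with 3-argument open. The name run_pipeline, its argument layout, and the exit codes 126 and 127 are all made up for illustration.

```perl
use strict;
use warnings;
use POSIX ();

# Hypothetical helper: run @stages (array refs of command + args) as a
# pipeline, writing the last stage's output to $out_file. No shell is used.
# Returns the exit code of each stage, in order.
sub run_pipeline {
    my ($out_file, @stages) = @_;
    my (@pids, $from_prev);
    for my $i (0 .. $#stages) {
        my @cmd  = @{ $stages[$i] };
        my $last = ($i == $#stages);
        my ($read, $write);
        unless ($last) {
            pipe($read, $write) or die "Can't create pipe: $!\n";
        }
        my $pid = fork;
        die "Can't fork: $!\n" unless defined $pid;
        if ($pid == 0) {
            # Child: wire up STDIN and STDOUT, then become the command.
            if ($from_prev) {
                open(STDIN, "<&", $from_prev) or POSIX::_exit(126);
            }
            if ($last) {
                open(STDOUT, ">", $out_file) or POSIX::_exit(126);
            } else {
                open(STDOUT, ">&", $write) or POSIX::_exit(126);
            }
            exec { $cmd[0] } @cmd or POSIX::_exit(127);
        }
        # Parent: drop its copies of the pipe ends so end-of-file can propagate.
        push @pids, $pid;
        close($from_prev) if $from_prev;
        close($write)     if $write;
        $from_prev = $read;
    }
    my @codes;
    for my $pid (@pids) {
        waitpid($pid, 0);
        push @codes, $? >> 8;
    }
    return @codes;
}
```

With this, the tarball task would come down to something like run_pipeline("$dir.tar.gz", ["tar", "cf", "-", $dir], ["gzip", "-c"]), and a third or fourth stage is just another array reference in the list.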


© 2001 Steve Hwan, hostname: @pacbell.net, username: svhwan
You should probably use the word "PERL" in the subject line to get my attention.
Last Modified: Sun Dec 2 19:22:05 2001