Work: The Gentle Art of Software Testnig, part II

Jeffreys Copeland & Haemer

(Server/Workstation Expert, September 2000)



Quality isn't really a free variable. The only possible values are ``excellent'' and ``insanely excellent,'' depending on whether lives are at stake or not. Otherwise you don't enjoy your work, you don't work well, and the project goes down the drain.
                        --- eXtreme Programming eXplained, Kent Beck


Animal testing is a terrible idea; they get all nervous and give the wrong answers.
                        --- Stephen Fry


Well, buckaroos, we're back.

Last month, we talked about eXtreme Programming (XP), particularly about eXtreme Testing (XT) -- writing tests as you code, or even before you code, instead of after you're done.

This month, we want to drive the point home with an example.

Our first problem is how to find a good example.

We need a problem that we can explain, and write tests for before we start on the code. We also need to be able to show and explain both the tests and the code when we're done. All this needs to fit in our 2000-word column limit.

Our solution? Theft.

After we gave him a copy of our last column, our friend, Dave Taenzer, pointed us at a wonderful article by Kent Beck and Erich Gamma, at http://members.pingnet.ch/gamma/junit.htm, which takes on this same chore in Java.

Perfect. We'll just solve the problem all over again in Perl.

Look at how much we get from this simple intellectual piracy: XP advocates writing tests before writing code. What tests should we write before we begin coding? Easy. The tests we want to pass are already specified in the Beck/Gamma article. We'll just translate them into Perl, using the Test and Test::Harness modules (which will give us a chance to talk about those modules, too).

Moreover, whenever we can't figure out how to explain something, we can just claim that the solution is trivial and refer you to the original article.

Useful tip: if you're going to steal, steal from people who know what they're doing.


We won't include any of their code here -- we're pirates, not plagarists -- but to make comparisons as easy as possible for those reading the articles side-by-side, we'll try to parallel the Beck and Gamma design as closely as we can.



Money, money, money, money ...

First we'll sketch the problem. The goal is to create a class that stores and adds collections of money.

Beck and Gamma work in Switzerland, an international banking center, so the objects must be able to hold more than one kind of currency. We also need two basic operations: addition and a test for equality.



The Tests
Tests first. The structure of all our tests are similar, so we'll go through one line-by-line to explain how they're laid out.

The first thing to test is whether we've implemented comparisons correctly. We know from the outset what tests we'll use: the very same ones Kent and Eric do.

Here's our dramatic reading of the first set,

   1   #!/usr/bin/perl -w
   2   # $Id: test_equals,v 1.5 2000/06/13 03:08:40 jsh Exp jsh $

   3   use strict;
   4   use Test;
   5   use Money_fixture;

   6   ok($f12chf != undef);
   7   ok($f12chf, $f12chf);
   8   ok($f12chf, new Money(CHF=>12));
   9   ok($f12chf ne $f14chf);

  10   BEGIN { plan tests => 4 }

Line 1 is a shebang line to invoke the perl interpreter, with the '-w' flag, to warn of possible errors, Line 2 is automatically-generated, configuration-management information from RCS. Line 3 invokes the nit-picky strict pragma. These three lines are our nod to the homily, ``Cleanliness is next to Godliness.''

Line 4 brings in the Test module, which defines some utility functions for testing. We'll be using plan() and ok(). As with all other things Perl, Test.pm is available from the CPAN, (http://www.perl.com/CPAN/).

Line 5 introduces our test fixture: a context to run the tests in. Typically, a test fixture is a set of routines, constants, and variables to be shared among a set of tests. We'll show you ours in a minute; for now, we'll say that it creates things like $f12chf, a Money object containing a dozen Swiss francs.

Lines 6 through 9 use ok(), which tests an observed outcome against an expected one.

In Test.pm version 1.14, ok() accepts strings, numbers, regular expressions, and even function pointers as arguments. Function pointers are dereferenced and evaluated, and regular expressions trigger a regex match, so you can look for patterns, not just exact matches. Invoked with a single argument, ok() just looks to see whether it's non-zero.

The first test asks whether $f12chf has a value. It had better. It also tests the use of !=. The second asks if $f12chf equals itself; the third, whether it has the same value as a newly created object containing 12 more Swiss francs; and the fourth, whether it has a different value from $f14chf, which contains 14 Swiss francs. The fourth also tests the use of ne.

The last line calls plan() to say we'll be running four tests. Putting the call inside a BEGIN{} provides this information at compile time. Test.pm forces us to do this, and aborts the compilation if we don't.

Now's a good time to show our test fixture.

   1   #!/usr/bin/perl -w
   2   # $Id: Money_fixture.pm,v 1.2 2000/06/13 01:56:03 jsh Exp $

   3   package Money_fixture;

   4   use strict;
   5   use Exporter;
   6   use Money;

   7   our (@ISA, @EXPORT);
   8   @ISA = qw(Exporter);
   9   @EXPORT = qw($f12chf $f14chf $f26chf $f7usd $f21usd $fmb1 $fmb2);

  10   our $f12chf = new Money(CHF=>12);
  11   our $f14chf = new Money(CHF=>14);
  12   our $f26chf = new Money(CHF=>26);
  13   our $f7usd  = new Money(USD=>7);
  14   our $f21usd = new Money(USD=>21);
  15   our $fmb1   = new Money(CHF=>12, USD=>7);
  16   our $fmb2   = new Money(CHF=>14, USD=>21);

  17   1;

As advertised earlier, the module just creates a few Money objects. Some only have one currency type, others have two. Lines like

our $fmb1   = new Money(CHF=>12, USD=>7);
are what we think calls to our constructor should look like. Remember, we still haven't written any code; writing the tests is forcing us to define the interfaces.

By letting our constructor, which we call new() out of habit, take currency/amount pairs as arguments, it can also be an initializer. (Beck and Gamma use separate classes for objects that hold a single kind of currency and more than one kind. This seems like overkill for Perl's relaxed view of the world.)

The keyword our is new to Perl 5.6, and declares lexically-scoped globals. Earlier Perls needed a use vars pragma.

These objects in hand, we can show our tranlation of Beck and Gamma's second bucket, which checks whether bags of mixed currency compare correctly.

   1   #!/usr/bin/perl -w
   2   # $Id: test_bag_equals,v 1.4 2000/06/13 03:08:40 jsh Exp jsh $

   3   use strict;
   4   use Test;
   5   use Money_fixture;

   6   ok($fmb1 != undef);
   7   ok($fmb1, $fmb1);
   8   ok($fmb1 != $f12chf);
   9   ok($f12chf != $fmb1);
  10   ok($f7usd != $fmb1);
  11   ok($fmb1 != $fmb2);

  12   BEGIN { plan tests => 6 }



Adding Addition

What should addition look like? The simplest design we can think of is this:

$a = new Money(USD=>10);
$b = new Money(USD=>20);
$c = $a + $b;
Let's assume we can do that. Here's the Beck and Gamma simple test case:
   1   #!/usr/bin/perl -w
   2   # $Id: test_simple_add,v 1.4 2000/06/13 03:08:40 jsh Exp jsh $

   3   use strict;
   4   use Test;
   5   use Money_fixture;

   6   ok($f26chf, $f12chf + $f14chf);

   7   BEGIN { plan tests => 1 }
and here are their tests for mixed-currency addition:
   1   #!/usr/bin/perl -w
   2   # $Id: test_mixed_add,v 1.4 2000/06/13 03:08:40 jsh Exp jsh $

   3   use strict;
   4   use Test;
   5   use Money_fixture;

   6   ok($fmb1, $f12chf + $f7usd);
   7   ok($fmb2, $f21usd + $f14chf);
   8   ok($fmb1 + $fmb2, $f7usd + $f21usd + $f26chf);
   9   ok($fmb1 + new Money(CHF=>-12), $f7usd);

  10   BEGIN { plan tests => 4 }

Each of our tests is just a Perl program; we'll run individual tests by executing them from the command line.

Having written all the tests, and designed the interfaces, all that's left is writing the a class. And how will we know when we've written it? The tests will pass.


The Code
Here's our first attempt:

   1   #!/usr/bin/perl -w
   2   # $Id: Money.orig.pm,v 1.1 2000/06/13 03:08:40 jsh Exp jsh $

   3   use strict;
   4   package Money;

   5   sub new {
   6    my $class = shift;
   7    bless {@_}, $class;
   8   }

   9   sub equals {
  10    use overload ('==' => \&equals, 'eq' => \&equals );

  11    my ($s1, $s2) = @_;
  12    my @k1 = keys %$s1;
  13    my @k2 = keys %$s2;
  14    return 0 unless @k1 == @k2;
  15    foreach (@k1) {
  16     return 0 unless $s1->{$_} == $s2->{$_};
  17    }
  18    1;
  19   }

  20   sub not_equals {
  21    use overload ('!=' => \&not_equals, 'ne' => \&not_equals );
  22    !equals @_;
  23   }

  24   sub stringify {
  25    use overload('""' => \&stringify);

  26    my $s = "";
  27    while (my ($k,$v) = each %{$_[0]}) {
  28     $s .= ", " if $s;
  29     $s .= "$k => $v";
  30    }
  31    $s;
  32   }

  33   sub plus {
  34    use overload('+' => \&plus);

  35    my ($m1, $m2) = @_;
  36    my %s = (%$m1, %$m2);
  37    my $s = {};
  38    foreach ( keys %s ) {
  39     $s->{$_} =  (defined $m1->{$_} ? $m1->{$_} : 0) +
  40        (defined $m2->{$_} ? $m2->{$_} : 0);
  41    }
  42    $s;
  43   }

  44   1;

We won't do a full exegesis -- our focus is the testing, not the code -- but the perl is straightforward. Two noteworthy points:

Oh, but wait: we need to test stringify.
#!/usr/bin/perl -w
# $Id: test_printing,v 1.1 2000/06/13 03:08:40 jsh Exp jsh $

use strict;
use Test;
use Money_fixture;

ok("$f12chf", "CHF => 12");
ok("$fmb1", "CHF => 12, USD => 7");

BEGIN { plan tests => 2 }
Writing this new test reveals a design problem. A test requires predictable output, so stringify(), needs to print hash elements in a predictable order. We fix stringify to look like this:
sub stringify {
 use overload('""' => \&stringify);

 my $m = shift;
 my $s;
 foreach (sort keys %$m) {
  $s .= $s ?  ", $_ => $m->{$_}" : "$_ => $m->{$_}";
 }
 $s;
}

At last, we're ready to run the test suite. As soon as we do -- voila! -- bugs. Comparison and simple addition seem to work fine, but our mixed addition fails.

1..4
ok 1
ok 2
not ok 3
# Test 3 got: 'HASH(0x80ea938)' (./test_mixed_add at line 12)
#   Expected: 'HASH(0x82339a0)'
not ok 4
# Test 4 got: 'HASH(0x8234f44)' (./test_mixed_add at line 13)
#   Expected: 'USD => 7'
In fact, our addition results aren't even printing in the error output. We already know that Money objects print correctly, so ... addition must not be giving us Money objects.

Perl's lack of strict-typing has let us write a plus() that doesn't return the kind of object we want. Here's our next version.

sub plus {
 use overload('+' => \&plus);

 my ($m1, $m2) = @_;
 my %s = (%$m1, %$m2);
 my $s = new Money;
 foreach ( keys %s ) {
  $s->{$_} =  (defined $m1->{$_} ? $m1->{$_} : 0) +
     (defined $m2->{$_} ? $m2->{$_} : 0);
 }
 $s;
}
Should we stop and write a test to check that plus() returns a Money object? No. We already have one. That's how we knew it didn't!

Okay, now the third test passes, but the fourth still doesn't, for the same reason that Beck and Gamma's first version didn't pass their tests: Our code thinks that CHF=>0, USD=>7 is different from USD=>7. Paralleling the original paper, we'll let plus() eliminate `0' values before returning.

 my $sum = (defined $m1->{$_} ? $m1->{$_} : 0) +
    (defined $m2->{$_} ? $m2->{$_} : 0);
 $s->{$_} = $sum if $sum;

With this change, everything passes and we're done.


Test Harnesses

Having illustrated unit tests, Beck and Gamma move up a level to show a test harness: a program that runs a suite of tests, and reports the results in a stereotyped way.

Beck and Gamma develop a TestSuite object and a TestRunner tool to drive and oversee the Java tests in their paper.

Since there already is a Test::Harness module, built to work with Test.pm, we'll use it to write our own tiny harness.

   1   #!/usr/bin/perl -w -s
   2   # $Id: money_suite,v 1.5 2000/06/13 03:08:40 jsh Exp jsh $

   3   use strict;
   4   use Test::Harness;
   5   our $m;# -m switch for "mail me"

   6   eval { runtests(@ARGV ? @ARGV : <test_*>); };
   7   catch($@);

   8   sub catch {
   9    my $results = shift;
  10    print $results;
  11    mail_me ($results) if $m;
  12   }

  13   sub mail_me {
  14    my $message = shift;
  15    my $author = (getpwuid($<))[0];
  16    open(MAIL, "|/usr/lib/sendmail -oi -t")
  17     or die "Can't fork sendmail: $!\n";

  18   print MAIL <<"EOF";
  19   From: $0 (Your Friendly Test Harness)
  20   To: $author;
  21   Subject: Test summary

  22   $message

  23   EOF
  24    close(MAIL) or die "Argh!  Can't close(MAIL)\n: $!\n";
  25   }
Here's what its output looks like:
test_bag_equal......ok
test_equal..........ok
test_mixed_ad.......ok
test_printin........ok
test_simple_ad......ok
test_subclassin.....ok
All tests successful.
Files=6, Tests=21,
  3 wallclock secs
  ( 2.65 cusr +  0.19 csys =  2.84 CPU)
Nearly all the work is done by line 6, which runs every test specified on the command line (or all tests whose name starts with test_, by default). runtest() normally prints its timing and statistics summary (the last two lines of our sample output) to STDERR. By enclosing our call in an eval(), the string is captured in $@.

Our call to catch() prints this to STDOUT, but if we set up a cron job to invoke our harness with a -m flag, catch() will use mail_me() to mail us the summary. This lets us to promote conformance tests into regularly scheduled regression tests.


We Finish With Testing (for Now)

Okay. We're done.

Dave Taenzer, who started us on this journey by giving us a book, says software development methodologies are like fad diets: no matter how many haven't worked, people are always eager to try another. ``The high-carbohydrate-low-fat diet? No, no. You want the grapefruit-and-protein diet.'' ``You've been using strict typing? No wonder you've been having problems! You want typeless languages.''

Despite this mild cynicism, Dave likes both eXtreme Programming eXplained and Design Patterns: Elements of Reusable Object-Oriented Software. In our experience, Dave's recommendations are worth listening to.

We are also longtime fans and practitioners of eXtreme Testing: developing conformance tests and test harnesses from the beginning, and then writing code to pass to the conformance tests, instead of testing after the fact. We always like anything that plays to our prejudices.

We hope we've helped some readers see that good testing needn't mean armies of low-level, ``keyboard monkeys'' walking through testing scripts. Good testing is fun, intellectually challenging and technical.

We've skipped some things that deserve mentioning. We'll mention three and shut up:


Until next time, happy trails.