Implications of Arrow's "Impossibility Theorem" for Voting Methods 

Kenneth Arrow proved no voting method can satisfy a certain set of desirable 
criteria, implying no voting method is ideal.  But this does not mean we 
should abandon the search for the best (non-ideal) voting method.  In 
particular, since the set of nominees is endogenous, the effect of the 
voting method on the set of nominees should be included in the analysis.  

There are often gains to be had by an organization or society by making a collective choice 
from a set of alternatives available to them, rather than having each individual act independently 
(uncertain how others will act).  However, since there are many ways to aggregate individuals' 
reports of their preferences in order to reach a collective choice, the gain (or loss) may depend 
on the procedure by which the collective choice is made.  To study this we need to model the 
nature of individuals' preferences and consider various criteria by which various aggregation 
methods can be compared. 

We make some useful abbreviations.  We use letters like i, j, etc., to denote individuals who 
vote.  Assume the group is choosing from a (possibly large) set of possible alternatives, which 
we call X.  We use letters like x, y, z, etc., as abbreviations for alternatives in X.  Assume 
the alternatives are mutually exclusive, in that at most one can be elected, and assume X is 
complete, in that it includes all possible outcomes.  Thus one and only one alternative in X  
will be elected.  The individuals might not be asked to consider every alternative in X, 
particularly if X is large, so we refer to the alternatives under consideration as the "agenda" 
and call them A.  We can also think of A as the set of "nominated" alternatives, those which 
appear on the ballot.  A is not determined by nature but is affected by nomination decisions 
made by individuals--perhaps only a small number of individuals are required to add an 
alternative to the agenda.  Whether or not individuals have incentives to nominate certain 
alternatives will depend on their beliefs about how the action would affect the outcome 
in the short and long term. 

We model each individual as behaving as if she has "preferences" regarding alternatives.  
Every preference is a relative comparison of some pair of alternatives.  That is, for any 
pair of alternatives, say x and y, each individual has a preference for x over y or has a 
preference for y over x or is indifferent between x and y.  We assume each individual's 
preferences are "self-consistent": Each individual who prefers x over y and y over z  
also prefers x over z, and each individual who is indifferent between x and y and 
between y and z is also indifferent between x and z.  Such self-consistent preferences 
are called "orderings" of the alternatives, in the same sense that numbers can be ordered 
from largest to smallest.  Alas, no individual's preferences can be directly observed; 
all we can observe are behaviors such as how they choose from a set of options, 
or how they answer polls (not necessarily honestly), or how they mark ballots.  
We don't attempt here to model the educational processes by which individuals 
acquire preferences, nor how preferences may change with time; we are concerned 
mostly with preferences as they are when society votes (hopefully after due deliberation, 
but not necessarily). 

Individuals' preferences may be intense, or mild, or in-between.  Depending on the 
criteria we impose on the voting method, information about preference intensities 
might not be admissible when voting, or might be ignored when tallying the outcome.  

Without loss of generality, we assume that when society votes, individuals mark ballots.  
The collection of all ballots is input to a tallying procedure, called a "choice function," 
which we will call C.  To avoid overly constraining the analysis, we will not assume C  
always chooses a single winner; in the cases where C chooses more than one we assume 
a subsequent procedure, such as flipping a coin, will be used to pick one of those chosen 
by C.  Thus our first criterion is simply the following: 

Prime directive:  The choice function must choose one or more of the 
nominated alternatives (if at least one alternative has been nominated) 
and not choose any non-nominated alternatives.  

The prime directive should not be interpreted as banning "write-in" candidates, which 
we would treat as "just-in-time" nominees.  Besides ensuring that at least one of the 
nominees will be chosen, its purpose is to ensure that no alternative left unranked 
by every voter will be chosen.

The next two criteria are straightforward and very mild constraints: 

Unanimity:  No alternative that is ranked by all voters below another 
alternative, say x, may be chosen if x is one of the nominees.

Non-dictatorship:  No voter may be so privileged that, regardless of the 
other voters' votes, the choice is always his top-ranked nominee (or from 
among his top-ranked nominees, when he votes indifference at the top).
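
As an illustration, the unanimity criterion can be checked mechanically.  The 
following is a minimal sketch, not part of the original analysis.  It assumes each 
ballot is a strict ranking written as a tuple from most to least preferred, and the 
function name barred_by_unanimity is hypothetical. 

```python
def barred_by_unanimity(ballots, nominees):
    """Return the nominees that every ballot ranks below some other single nominee.

    Assumption: each ballot is a tuple of alternatives, most preferred first,
    with no indifference, and every nominee appears on every ballot.
    """
    barred = set()
    for y in nominees:
        for x in nominees:
            # y is barred if all voters rank the same nominee x over y.
            if x != y and all(b.index(x) < b.index(y) for b in ballots):
                barred.add(y)
                break
    return barred


ballots = [("a", "b", "c"), ("b", "a", "c"), ("a", "c", "b")]
print(barred_by_unanimity(ballots, ["a", "b", "c"]))
```

In this made-up profile every voter ranks a over c, so unanimity forbids choosing c 
whenever a is nominated. 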

Our next criterion serves to limit the amount of information that must be elicited from 
the voters, so they only need to express preferences regarding nominated alternatives (A).  
This is justifiable since the set of possible alternatives X might be very large, so a voting 
method that needs preference information regarding all of X would exhaust all participants.  
Or, if the voting method needs info about some non-nominated alternatives (in addition to 
info about the nominees) then it would not be obvious which alternatives outside A should 
be included, and if any individuals are given the power to decide which other alternatives 
will be voted on, they might be able to manipulate the outcomes in their favor.  Also, 
game theory predicts that this constraint is actually quite mild, since if the voters know 
that alternatives outside A cannot be chosen then their optimal voting strategies would 
elect the same alternatives as would be elected if those outside alternatives could not 
appear in their votes. 

Independence from Irrelevant Alternatives (IIA):  The choice function 
must neglect all information about non-nominated alternatives.  

The next criterion further constrains the information that may be considered by C.  
Specifically, we require C to ignore information about the intensity of voters' preferences, 
so "mild" preferences will be treated the same as "intense" preferences.  In other words, 
two ballots that rank the alternatives in the same order must be treated the same.  The 
justification for this is that, if intensity information were not ignored, it would create a 
strong incentive for each voter to exaggerate her intensities by dividing the alternatives 
into two groups and voting the maximum possible intensity between the two groups 
(and indifference within each group).  To see this, suppose pre-election polls indicate 
the two likely front-running candidates are x and y.  Then each voter who prefers x over y  
has an incentive to report the maximum possible intensity for x over y, to avoid partially 
wasting the power of her vote.  Similarly, those who prefer y over x have an incentive to 
report the maximum possible intensity for y over x.  If the voters who prefer x over y  
believe those who prefer y over x will vote the maximum intensity for y over x, they would 
be foolish not to vote the maximum intensity for x over y, etc.  While doing so, it would 
be most effective for those voting x over y to also cast the maximum possible vote for 
candidates preferred over x (in other words, indifference between them and x, since x  
already is getting their maximal vote) and the minimal possible vote for candidates less 
preferred than y (indifference between them and y), etc.  This may not be obvious at first, 
but we presume most voters would quickly learn the strategy since it is so easy.  The result 
would be that voters would express much less information in their votes than if the choice 
function ignores all intensity information.  Thus, we have our next criterion: 

Ordinality:  The choice function must neglect all "intensity" information.  
In other words, only "ordinal" information may affect the choice.  

The next criterion requires that the choice function accept a considerable amount and 
diversity of information from each voter about her preferences, if she wishes to express it.  
Since Kenneth Arrow was analyzing whether and how voters' preferences might be 
aggregated to reach a collective decision, and since there is no a priori reason to expect 
voters' preferences to adhere to any pre-ordained pattern, it makes sense to require the 
method of aggregating preferences to work no matter what the voters' preferences may be.  

Universal Domain:  The choice function must accept from each voter 
any ranking of the alternatives.  

On the other hand, we are not really limited to Arrow's framework, which was designed 
merely to try to aggregate voters' (sincere) preferences.  Although it is reasonable to require 
the voting method to work for any collection of preferences the voters may have, it does 
not necessarily follow that no constraints should be placed on the expressions voters may 
make when voting.  For instance, the so-called Approval voting method constrains each 
voter to partitioning the alternatives into two subsets, which is equivalent to a non-strict 
ordering that has at most two "indifference classes."  It is not a priori obvious that the use 
of voting methods such as Approval, which constrain the voters from completely expressing 
their preference orderings, is worse for society, so the universal domain criterion should 
be considered controversial until other arguments not explored by Arrow are examined 
(assuming those arguments support the conclusion that it is better not to constrain the voters 
from expressing orderings).  In other words, other criteria for comparing voting methods, 
in addition to Arrow's criteria, need to be evaluated. (My own conclusion is that there are 
solid reasons why it is better not to constrain the voters' expressions, but that is beyond 
the scope of this document.)  

Kenneth Arrow's theorem [1951, 1963] states that, if X includes at least 3 alternatives, 
then no choice function that satisfies all of the criteria listed above also satisfies the 
following criterion: 

Choice consistency:  For all pairs of alternatives, say x and y, if the votes 
are such that x but not y would be chosen from some set of nominees that 
includes both, then y must not be chosen from any set of nominees that 
includes both. (The literature usually calls this rationality, but I prefer 
the less loaded term choice consistency.)  

(A proof of Arrow's theorem is provided in the appendix.)

Choice consistency is demanding, and requiring that it be satisfied is controversial.  It requires 
the same self-consistency of the social choice function as we observe in individuals' choices.  
If choice consistency is satisfied then it would be impossible to manipulate outcomes by 
manipulating the agenda or by misrepresenting preferences when voting. (That claim may 
not be obvious.)  But it is easy to construct an example that shows that for most voting 
methods, in particular all methods that reduce to majority rule when there are only two 
nominees, choice consistency is too demanding to satisfy:  

Suppose the nominees are x, y and z.  Suppose about a third of the voters rank 
"x over y over z" and about a third rank "y over z over x" and the rest rank 
"z over x over y."  Each voter's preferences are self-consistent (an ordering of 
the alternatives), but the social choice is not: About 2/3 of the voters rank x over y, 
so the choice by majority rule from the agenda {x,y} would be x, which means 
choice consistency requires that y not be chosen from {x,y,z}.  Also, about 2/3 
rank y over z, so the choice by majority rule from {y,z} would be y, which means 
choice consistency requires that z not be chosen from {x,y,z}.  And about 2/3 
rank z over x, so the choice by majority rule from {x,z} would be z, which means 
choice consistency requires that x not be chosen from {x,y,z}.  Thus choice 
consistency requires that none be chosen from {x,y,z}.  But the prime directive 
requires that at least one must be chosen. 
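
The three-faction example can be reproduced with a short sketch.  It assumes 
three factions of exactly equal size (the text says only "about a third" each) 
and strict rankings; the function names are hypothetical. 

```python
from itertools import permutations

# Assumed profile: one ballot per faction, each faction the same size.
# Each ballot lists the alternatives from most to least preferred.
profile = {
    ("x", "y", "z"): 1,
    ("y", "z", "x"): 1,
    ("z", "x", "y"): 1,
}


def pairwise_support(profile, a, b):
    """Return the number of voters who rank a over b."""
    return sum(count for ballot, count in profile.items()
               if ballot.index(a) < ballot.index(b))


def majority_winner(profile, a, b):
    """Return the majority-rule choice from the agenda {a, b}, or None on a tie."""
    for_a = pairwise_support(profile, a, b)
    for_b = pairwise_support(profile, b, a)
    if for_a == for_b:
        return None
    return a if for_a > for_b else b


def excluded_by_pairwise_majorities(profile, agenda):
    """Return the alternatives that lose at least one pairwise majority contest."""
    excluded = set()
    for a, b in permutations(agenda, 2):
        if majority_winner(profile, a, b) == a:
            excluded.add(b)
    return excluded


for a, b in [("x", "y"), ("y", "z"), ("x", "z")]:
    print(a, b, "->", majority_winner(profile, a, b))
print(excluded_by_pairwise_majorities(profile, ["x", "y", "z"]))
```

Every alternative loses one pairwise contest, so a method that reduces to 
majority rule on two nominees and satisfies choice consistency would have 
nothing left to choose from {x,y,z}. 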

(The example also illustrates a fundamental principle of which few people are aware: 
Whenever there are more than two alternatives, there is more than one majority.) 

Arrow's result spawned a great deal of activity researching implications and variations of his 
theorem.  Many social scientists seem to have interpreted the literature to mean no voting 
method is reasonable, when it actually only means that no voting method is ideal.  The search 
for the best (non-ideal) method is still important, and no doubt much of the effort will need 
to be empirical work, since otherwise it will remain disputed which of the many possible 
weakenings of choice consistency are the most important to satisfy.  One approach might 
be to calculate for each possible voting method the fraction of the possible voting scenarios 
in which choice consistency is violated, seeking voting methods that minimize this fraction.  
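
A toy version of such a calculation is sketched below.  It rests on several 
assumptions that are not in the text: three voters, three nominees, strict 
rankings, and equal weight for every possible profile.  It counts the profiles 
in which every nominee loses some pairwise majority contest, which (by the 
example above) are scenarios in which any method that reduces to majority rule 
on two nominees must violate choice consistency.  The count is therefore only a 
lower bound for such methods, and the function names are hypothetical. 

```python
from fractions import Fraction
from itertools import permutations, product


def every_nominee_loses(ballots, alternatives):
    """True if each alternative loses at least one pairwise majority contest."""
    def beats(a, b):
        votes_for_a = sum(1 for ballot in ballots
                          if ballot.index(a) < ballot.index(b))
        return 2 * votes_for_a > len(ballots)

    return all(any(beats(b, a) for b in alternatives if b != a)
               for a in alternatives)


def cycle_fraction(num_voters, alternatives):
    """Fraction of all strict-ranking profiles in which every nominee loses a contest."""
    orderings = list(permutations(alternatives))
    total = 0
    cyclic = 0
    # Enumerate every way of assigning one ordering to each voter.
    for ballots in product(orderings, repeat=num_voters):
        total += 1
        if every_nominee_loses(ballots, alternatives):
            cyclic += 1
    return Fraction(cyclic, total)


print(cycle_fraction(3, ("x", "y", "z")))
```

The enumeration grows quickly with the number of voters and nominees, so a 
serious study would need sampling and, more importantly, a defensible model of 
which profiles are likely in practice. 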

Another approach stems from the argument that in large public elections, where many voters 
are not strategically minded, the easiest manipulations are accomplished (if the voting method 
is not immune to this) by finding and nominating candidates that are similar to and/or inferior 
to other nominees.  Requiring satisfaction of independence from clone alternatives 
(defined below) would avoid the incentive for parties to nominate as many candidates 
per office as possible (which would make a farce of elections) and also significantly reduce 
incentives for parties to nominate only one candidate (or no candidates) per office.  Instead, 
parties might have a net incentive to try to increase the turnout of their supporters on election 
day by nominating a diversity of candidates, each of whom inspires enthusiasm amongst 
one or more factions of supporters. (While turning out to vote for a favorite candidate, 
many of those voters would presumably also rank their party's other nominees over other 
parties' candidates.) 

Independence from clone alternatives:  Call a group of alternatives "clones" if 
no voter ranks any nominee outside the group between any members of the group.  
The election chances of every candidate outside a group of clones must be the 
same when one of the clones is nominated as when two or more of the clones 
are nominated. (See Tideman 1987.)
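
The definition of clones can be tested directly against a set of ballots.  
This is a minimal sketch that assumes strict rankings written as tuples from 
most to least preferred; the function name is_clone_set and the ballots are 
made up for illustration. 

```python
def is_clone_set(ballots, group):
    """True if no ballot ranks an outside alternative between members of group.

    Assumption: each ballot is a strict ranking (a tuple, most preferred first).
    """
    group = set(group)
    for ballot in ballots:
        positions = [i for i, alt in enumerate(ballot) if alt in group]
        # The group members must occupy consecutive positions on the ballot.
        if positions and positions[-1] - positions[0] + 1 != len(positions):
            return False
    return True


ballots = [
    ("a1", "a2", "b", "c"),
    ("b", "a2", "a1", "c"),
    ("c", "b", "a1", "a2"),
]
print(is_clone_set(ballots, {"a1", "a2"}))
print(is_clone_set(ballots, {"a1", "b"}))
```

Here a1 and a2 are adjacent on every ballot and so form a group of clones, 
whereas a1 and b do not because the first ballot ranks a2 between them. 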

Another potentially important criterion is motivated by the observation that people do not 
like to compromise, and hate compromising unnecessarily, and may be unsure how much 
they need to compromise.  Thus it may be easier to persuade voters to lower "greater evil" 
candidates than to persuade voters to raise "compromise" candidates equal to or over more 
preferred candidates.  This difference matters in scenarios where a minority who prefer a 
"greater evil" candidate over a "compromise" candidate can cause the greater evil to be 
elected by strategically misrepresenting their preferences, since it bears on the ability of 
the majority who prefer the compromise to thwart (and deter) that strategy.  It also matters 
in scenarios where a minority-preferred "greater evil" candidate would be elected even 
without strategic misrepresentation (which can easily occur using some voting methods, 
such as plurality rule, majority runoff, and instant runoff) since it bears on the ability of 
the majority to defeat the greater evil; if the majority cannot reliably coordinate their 
voting strategies, then the only option remaining to them is to deter some of their favorite 
candidates from competing (for instance by forming a large party and choosing only one 
nominee using a primary election, which may not nominate the best candidate), which may 
depress voter turnout.  The minimal defense criterion embodies the solution to these 
problems, by making a lowering strategy effective when wielded by a majority: 

Minimal defense:  The voting method must satisfy both of the following properties:  
(1)  Each voter must be allowed to vote any ordering of the alternatives, 
       with indifferences allowed (at least at the bottom of each ordering). 
(2)  For all pairs of nominees, say x and y, if the number of voters who rank both 
       "x over y" and "y no higher than tied for bottom" exceeds the number of 
       voters who rank "y higher than tied for bottom" then y must not be chosen. 
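
Property (2) can be expressed as a simple count.  The sketch below assumes each 
ballot is a list of tiers (sets of alternatives the voter ranks as tied), ordered 
from top to bottom, with every nominee appearing in some tier.  The 
representation, the function name, and the numbers in the example are 
assumptions made for illustration. 

```python
def barred_by_minimal_defense(ballots, nominees):
    """Return the nominees that property (2) of minimal defense forbids choosing.

    Assumption: each ballot is a non-empty list of sets, top tier first,
    and the last set is the voter's bottom tier.
    """
    barred = set()
    for y in nominees:
        # Voters who rank y higher than tied for bottom.
        y_above_bottom = sum(1 for b in ballots if y not in b[-1])
        for x in nominees:
            if x == y:
                continue
            # Voters who rank x over y and rank y no higher than tied for bottom.
            x_over_bottom_y = sum(
                1 for b in ballots if y in b[-1] and x not in b[-1]
            )
            if x_over_bottom_y > y_above_bottom:
                barred.add(y)
                break
    return barred


# Hypothetical electorate: 55 voters put the "greater evil" E at the bottom.
ballots = (
    [[{"A"}, {"C"}, {"E"}]] * 30
    + [[{"C"}, {"A"}, {"E"}]] * 25
    + [[{"E"}, {"C"}, {"A"}]] * 45
)
print(barred_by_minimal_defense(ballots, ["A", "C", "E"]))
```

In this example 55 voters rank C over E with E at the bottom, while only 45 rank 
E above the bottom, so a method satisfying minimal defense must not choose E. 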

Use of the voting strategy enabled by satisfaction of minimal defense (that is, downranking 
"greater evil" alternatives to tied for bottom) may create an equilibrium that elects the same 
alternative that would be chosen if no one misrepresented any preferences.  In such cases 
the strategy should be deemed benevolent, not manipulative, and provides an argument 
why the voting method should allow voters to express indifference in their orderings, 
at least at the bottom. (A second argument is that when the number of nominees is large, 
it would be tedious to force each voter to strictly rank every alternative.  Allowing the 
voter to leave some nominees unranked, then treating those left unranked as if they 
had been ranked below those explicitly ranked, would be a useful shortcut.)  
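
The shortcut for unranked nominees might be implemented as follows.  This is a 
sketch under the same assumed representation, in which a ballot is a list of 
tiers (sets of tied alternatives) from top to bottom; the function name 
complete_ballot is hypothetical. 

```python
def complete_ballot(ranked_tiers, nominees):
    """Append all unranked nominees as a single tied bottom tier.

    Assumption: ranked_tiers is a list of sets, top tier first, containing
    only the nominees the voter explicitly ranked.
    """
    mentioned = set().union(*ranked_tiers)
    unranked = set(nominees) - mentioned
    completed = [set(tier) for tier in ranked_tiers]
    if unranked:
        completed.append(unranked)
    return completed


print(complete_ballot([{"a"}, {"b"}], ["a", "b", "c", "d"]))
```

A voter who ranks only a and b is treated as ranking c and d tied for bottom, 
below both of the explicitly ranked nominees. 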

Many criteria have been advocated in the social choice literature.  Some appear to have only 
aesthetic value, such as requiring that the outcome must not change if two additional opposite 
orderings are added to the set of votes.  Another criterion, reinforcement, requires that if the 
same alternative would be chosen by two separate groups of voters then it must be chosen 
if both groups' votes are combined together.  In principle, voting methods that do not satisfy 
reinforcement might be manipulated by individuals who have the power to decide whether 
and how the voters are partitioned, but it is simple for the rules to prevent this by not granting 
such power to a minority.  So reinforcement seems much less important than independence 
from clones and minimal defense.  

Another criterion that seems less important is participation, which requires that the winning 
alternative must not be ranked by any voter below the alternative that would have won had 
that voter abstained.  Besides its aesthetic appeal, the argument may be made that voter 
turnout is already so low that there should be no new incentives to abstain.  There are two 
flaws with this argument.  One flaw is that the argument neglects the voter's option of misrepresenting 
some of her preferences: if a voter has the information about other voters' likely votes 
that leads her to believe she would prefer the outcome if she abstains more than if she votes 
her sincere preferences, that information points the way to a strategic vote that would 
result in an outcome she would prefer at least as much as if she abstains.  The other flaw is 
that it is an "all else being equal" argument and all else is not equal.  That is, there may be 
voting methods that do not satisfy participation yet create much greater incentives for 
non-voters to vote than do methods that do satisfy participation.  For instance, the 
nominations may depend on the voting method, and voters' decisions whether or not to 
vote may depend on which alternatives are nominated; some individuals might vote only 
if they enthusiastically support at least one nominee.  There might even be some individuals 
who will not vote if aware the voting method is easily manipulated by strategic nomination 
of clones or inferior alternatives. 

Appendix - A brief rigorous proof of Arrow's theorem

Note: A proof written in more concise mathematical notation is provided in the 
document "Proof of Arrow's Theorem."

(We patterned this after a proof by John Geanakoplos [2001].)  Assume X includes at 
least 3 alternatives and C is a social choice function that satisfies all of Arrow's criteria 
except perhaps non-dictatorship.  We will show that C violates non-dictatorship.  

Call any collection of votes, one vote per voter, a "profile."  Without loss of generality 
due to satisfaction of universal domain and ordinality, assume each vote in a profile 
may be any ordering of X.  

First we establish the following claim:  

(1) For any alternative, say x, given any profile in which each vote that does not 
rank x strictly top ranks x strictly bottom, one of the following conditions holds: 
        (1.1)  C chooses x alone from every agenda that includes x.  
        (1.2)  C does not choose x from any agenda that includes another alternative. 

Proof of 1:  Pick any alternative x.  Pick any profile such that each vote that does 
not rank x alone at the top ranks x alone at the bottom, and call this profile V.  
Suppose 1.1 does not hold given V.  We must show 1.2 holds.  Since 1.1 does not 
hold given V, there must be at least one alternative distinct from x, say y, such that 
C chooses y (and perhaps also x) from some agenda that includes x, given V.  
By choice consistency and the prime directive, the following must hold: 
        (1.3)  Given V, C chooses y (and perhaps also x) from {x,y}.  
Since there are at least three alternatives, we can arbitrarily pick an alternative distinct 
from x and y, say z.  Construct another profile, call it V', that is the same as V except 
z is moved immediately over y in every vote in V'.  That is, the following holds: 
        (1.4)  Every vote in V' ranks z over y.  
Since x is at the top or bottom of each vote in V, both of the following hold: 
        (1.5)  For each voter, the relative ordering of x and y is the same in V' as in V.  
        (1.6)  For each voter, the relative ordering of x and z is the same in V' as in V.  
By 1.4 and unanimity, the following must hold: 
        (1.7)  Given V', C does not choose y from {x,y,z}.  
By 1.7 and the prime directive, the following must hold: 
        (1.8)  Given V', C must choose x and/or z from {x,y,z}. 
Next we show the following statement holds: 
        (1.9)  Given V', C does not choose x from {x,y,z}.  
Suppose to the contrary C chooses x from {x,y,z} given V'.  By 1.7 and choice 
consistency, C does not choose y from {x,y} given V'.  By 1.5 and IIA, 
C does not choose y from {x,y} given V.  But this contradicts 1.3, 
so the contrary assumption cannot hold, establishing 1.9.  
By 1.7, 1.9 and the prime directive, C chooses z alone from {x,y,z} given V'.  
By choice consistency, C chooses z alone from {x,z} given V'.  By 1.6 and IIA, 
the following holds: 
        (1.10)  C chooses z alone from {x,z} given V.  
By choice consistency, C does not choose x from any agenda that includes z, 
given V.  Since z was any arbitrary alternative distinct from x and y, 
the following holds:  
        (1.11)  Given V, C does not choose x from any agenda that includes 
                  at least one alternative that is not x or y.  
Statement 1.11 is close to statement 1.2, which we have been aiming to establish; 
it remains only to show that, given V, C does not choose x from any agenda that 
includes y.  By 1.10, 1.3 would still hold if we swapped the labels of y and z.  
Therefore, by the same reasoning that followed 1.3 and led to 1.11, the 
following statement (like 1.11 with y and z swapped) must also hold: 
        (1.12)  Given V, C does not choose x from any agenda that includes 
                  at least one alternative that is not x or z.  
Together, 1.11 and 1.12 imply 1.2.  Since x was picked arbitrarily and V was 
any set of votes in which x is ranked strictly top or bottom by every voter, 
claim (1) is established.

Next we establish the following claim:  

(2) For any alternative, say x, there is an individual voter, call her dx, who "dictates" 
over all pairs of alternatives distinct from x. (By "dictating over a pair," for instance 
y and z, we mean that if dx ranks y over z then C does not choose z from any agenda 
that includes y, no matter how the other voters vote.) 

Proof of 2:  Let n denote the number of voters (assumed to be at least 1) and label 
the voters from 1 to n.  Pick any alternative, say x.  Consider any set of admissible 
votes, call it V0, such that the following condition holds: 
        (2.1)  Each vote in V0 ranks x strictly bottom (below all other alternatives). 
By unanimity, the following statement must hold:  
        (2.2)  Given V0, C does not choose x from any agenda that includes 
                  another alternative. 
Construct a sequence of n admissible sets of votes, labeled V1 to Vn, such that 
all three of the following conditions hold for each integer k from 1 to n: 
        (2.3)  For all pairs of alternatives y and z distinct from x, the relative ordering of 
                  y and z is the same in each vote in Vk as in the corresponding vote in V0.  
        (2.4)  For each voter from k+1 to n, her vote in Vk ranks x strictly bottom. 
        (2.5)  For each voter from 1 to k, her vote in Vk ranks x strictly top.  
By 2.5, every vote in Vn ranks x strictly top.  By unanimity and the prime 
directive, the following statement must hold: 
        (2.6)  Given Vn, C chooses x alone from every agenda that includes x.  
By 2.1, 2.4 and 2.5, x is ranked either strictly top or strictly bottom by every vote in 
each of V0, V1, V2, ..., Vn.  Therefore, by (1) one of the following two statements 
must hold for each integer k from 0 to n: 
        (2.7)  Given Vk, C chooses x alone from every agenda that includes x.  
        (2.8)  Given Vk, C does not choose x from any agenda that includes 
                  another alternative. 
By 2.6, 2.8 does not hold for k = n.  Thus we can let d denote the smallest integer 
between 0 and n (inclusive) such that 2.8 does not hold for k = d.  By 2.2, d is not 0.  
Thus the following two statements hold: 
        (2.9)   Given Vd, C chooses x alone from every agenda that includes x.  
        (2.10)  Given Vd-1, C does not choose x from any agenda that includes 
                   another alternative.  
Now pick any pair of distinct alternatives distinct from x, say y and z (which we can 
do since there are at least 3 alternatives).  Construct a set of votes, call it V', that is 
the same as Vd except voter d ranks y top. (Thus voter d ranks x next-to-top in V'.)  
In V', voters 1 to d-1 rank x over y and voters d to n rank y over x, as in Vd-1.  
Thus, by IIA and 2.10, C does not choose x from {x,y} given V'.  
By the prime directive, C chooses y alone from {x,y} given V'.  
By choice consistency, C does not choose x from {x,y,z} given V'.           (2.11) 
In V', voters 1 to d rank x over z and voters d+1 to n rank z over x, as in Vd.  
Thus, by IIA and 2.9, C chooses x alone from {x,z} given V'.  
By choice consistency, C does not choose z from {x,y,z} given V'.           (2.12)
By 2.11, 2.12 and the prime directive, C chooses y alone from {x,y,z} given V'.  
By choice consistency, the following must hold: 
        (2.13)  Given V', C does not choose z from any agenda that includes y.  
Since the relative orderings of y and z in V0 were arbitrary, this means the relative 
orderings of y and z in V' were arbitrary except for voter d who ranks y over z.  It 
follows by IIA and 2.13 that given any set of votes in which voter d ranks y over z, 
C does not choose z from any agenda that includes y.  Thus voter d dictates over 
y and z.  Since y and z were picked arbitrarily, being any two alternatives distinct 
from x, it follows that voter d dictates over every pair of alternatives distinct from x.  
Since x too was picked arbitrarily, this means we can find a voter like d for any 
alternative, establishing claim (2).  

Since X includes at least three alternatives, we can arbitrarily pick three distinct alternatives 
and label them x, y and z.  By (2), there exist voters i, j and k (not necessarily distinct) 
such that i dictates over y and z, j dictates over x and z, and k dictates over x and y.  
We will show i = j = k.  Suppose the contrary, meaning we are dealing with two or 
three "dictatorial" voters.  By universal domain these two or three voters (like every voter) 
can vote any ordering of the alternatives, so we can find a set of admissible votes in which 
i ranks y over z and j ranks z over x and k ranks x over y.  Since the two or three voters 
dictate over their respective pairs of alternatives, this means C chooses no alternative from 
{x,y,z} given these votes, which contradicts the prime directive.  Thus the contrary 
assumption cannot hold, so i = j = k.  Since x, y and z were picked arbitrarily, it follows 
that this voter i = j = k is a unique voter who dictates over all pairs of alternatives.  That is, 
for all pairs of alternatives, for instance x and y, if voter i ranks x over y then C does 
not choose y from any agenda that includes x, regardless of how the other voters vote.  
Thus C cannot choose any alternative that voter i ranks below some nominee.  By the 
prime directive, C always chooses from among voter i's top-ranked nominees, which 
means C violates non-dictatorship.  Thus, we have established that no social choice 
function satisfies all of Arrow's criteria when there are at least three alternatives.  Ω 

References

[1]  Arrow, Kenneth J. (1951, 1963).  Social Choice and Individual Values.  
New York: John Wiley and Sons. 

[2]  Condorcet (1785).  Essai sur l'application de l'analyse à la probabilité des 
décisions rendues à la pluralité des voix.  Paris.

[3]  Geanakoplos, John (2001).  Three Brief Proofs of Arrow's Impossibility Theorem.
Cowles Foundation discussion paper No.1123RRR.  Cowles Foundation for Research
in Economics, Yale University, New Haven, Connecticut.   (http://cowles.econ.yale.edu)

[4]  Tideman, T. N. (1987).  Independence of Clones as a Criterion for Voting Rules.
Social Choice and Welfare, 4: 185-206.