People playing games: the human face of experimental economics.
Eckel, Catherine C.
1. Introduction
Research in economics is focused primarily on the behavior of
groups of individuals aggregated into markets and economies. Economists
have paid less attention to understanding individual decision making,
which (by rights) is the realm of psychology. Much of economics appears
blind to the individual, and this results in the neglect of issues that
can be useful in understanding market phenomena such as the gender or
race wage gap. Combining a psychological interest in the characteristics
and propensities of individuals with an economist's perspective can
give insights into important economic outcomes. Experiments, both in the
lab and the field, examine the behavior of people in situations that
economists are interested in. My purpose is to show how experiments,
people playing games in the lab, can provide a window on individual
behavior, a measure of the nature and variability of the preferences
that underlie economic models, and a better understanding of puzzling
economic phenomena.
Altruism and trust are two aspects of human behavior that might
seem orthogonal to understanding economies. Yet they are very important
factors for the success of an economy. Altruism ensures that the people
in the economy who do not (or cannot) win under the market system are
taken care of. Altruists also increase the production of public goods.
Trust is said to grease the wheels of commerce by enhancing trade among
strangers and by making complete contracts unnecessary. Because of the
cost of writing and enforcing fully specified contracts, most economic
exchange must operate without such contracts, and can do so successfully
because of high levels of both trust and trustworthiness in the economy.
The research I present today focuses on these two important areas of
behavior, and the use of experiments to study them.
2. Economic Man in Residence
Economic man is a simple agent who single-mindedly maximizes his
utility, and who lives in our models. He is useful as a building block
in economic models, game theoretic among them. He arose, not as a way to
describe actual human motivation, but rather as a simplification (as
most models arise) intended to capture important aspects of behavior.
Much has been made in recent years of the systematic failure of
experimental subjects to behave like economic man. This challenge to the
vision of man as a rational being has led to the development of many
models that extend the range of predicted behaviors to include the kinds
of cooperation, trust, and reciprocity that we observe in the lab. Some,
indeed, have pronounced economic man dead and, dead or alive, apart from
a few especially earnest graduate students in economics, surely no one
really believes in him. Nevertheless, he is useful. Assumptions of
rational behavior do have handy formal properties, allowing the
(relatively) easy aggregation of "maximizing monads" (a term I
believe is due to Deirdre McCloskey) into markets and economies, and
extensions of predictions about individual rational behavior into
greater understanding of complex system-level phenomena.
Reliance on economic man in our models does not require that
citizens be selfish, but in practice most models assume that agents care
only about their own well-being. In the same vein, such reliance does
not require that agents behave identically, but in practice, individuals
are modeled as being identical. Just as no one really believes people
are completely selfish, no one really believes that all people are the
same. People are different from each other, and they treat each other
differently. The differences are not random: Heterogeneity is
systematic. For example, in experimental games, about a quarter of
experimental subjects really do behave like economic men (and women), no
matter what situation we put them into. Perhaps 20% are altruists, and
behave as if the welfare of others is very important to them. The rest
could be termed "conditional cooperators," cooperating or
behaving altruistically when the costs are low or the benefits high,
trusting when the potential gains to trust are high, reciprocating when
it is the right thing to do. In a way, these conditional cooperators
seem familiar to economists--they often exhibit a kind of economical
cooperation. But subjects may be contingent in other ways, as well. In
two settings with identical payoff structures, a subject may behave very
differently depending on the context. (1) And sometimes the
contingencies are things we do not approve of and teach our children not
to pay attention to, superficial things such as the mere appearance of a
counterpart.
Who are the economic men and women, and who the altruists? Who is
trusted and who not? An exploration of the systematic differences in
behavior across different categories of persons (gender, racial, etc.)
and of the systematic differences in how different categories are
treated by others can improve our understanding of market outcomes.
3. Measuring Altruism and Trust
To investigate heterogeneity in behavior requires reliable
measurement. Self-reported survey measures of altruism, trust, or
trustworthiness have several shortcomings. First, there is no incentive
to report correctly, and economists are naturally skeptical of such
"cheap-talk" claims. Second, there may be an incentive to
misreport. To take a simple example, suppose I ask if you are altruistic or trustworthy. You may exaggerate your virtues to impress me or someone
else who might be looking on, or to validate your own self-image. You
may not even be consciously aware of your exaggeration, believing you
would behave in a certain way. There is plenty of evidence that people
over-report socially desirable behavior, such as voting, volunteering,
or even exercising.
Laboratory experiments were developed for testing theory and for
teaching, but that is not all they are good for. (See Holt (2003) for
interesting uses of experiments.) In particular, experimental games can
also be designed explicitly to measure preferences. Measuring
preferences is important because so many economic models require
parameterization of a utility function in order to have empirical
content. Knowledge of the arguments that affect utility and their
sensitivity not only to price but also to other elements of the
environment can help make model predictions more precise.
Experiments are incentivized choices: Something (usually money) is
at stake, making misrepresentation costly. Decisions made in the lab are
real, not hypothetical. Two important elements of the
experimentalists' creed are: Thou shalt pay, that is, subjects are
really paid according to the decisions that they make in the experiment;
and thou shalt not deceive, that is, everything we tell subjects in an
economics experiment is true. An experiment might give a subject an
opportunity to exhibit altruism, trust, or trustworthiness, at some
cost. To say that you are altruistic in the lab means giving up some of
your money. Doing so provides a behavioral measure of a preference for
altruism.
To be a good measure, an experiment should have three characteristics. It
should be replicable, giving the same result when repeated. It should be
internally valid, measuring what it is supposed to measure. And it
should be externally valid, correlated with behavior outside the lab.
All these things are possible in lab experiments, where it is possible
to control the experimental environment and specify carefully the
variations or treatments we want to implement. Experiments are
replicable and internally valid (if they are designed well). External
validity can be tested by collecting information from subjects and by
following them outside the lab.
4. Heterogeneity across People: Altruism
To illustrate the ways in which experiments can be used as
measures, consider the question: Are women more altruistic than men? To
measure altruism we use the dictator game. This is not much of a game,
but rather an allocation task. One person, the dictator (though we
don't use that loaded word in our carefully neutral instructions),
is given an amount of money by the experimenter. He then is given the
opportunity to donate some of this money to an anonymous recipient, who
was recruited separately to the experiment and is in a different room,
never observed by the dictator. In this game, economic man would clearly
keep all the money for himself, since there is no reason to do
otherwise.
The dictator game was originally invented as a way to understand
anomalous (i.e., contrary to game theory) results in bargaining studies
(Forsythe et al. 1994); its use to measure altruism is more recent
(Eckel and Grossman 1996a, b). In our early experiments we used a
"double blind"
protocol developed by Hoffman et al. (1994), which guarantees anonymity
between subjects and between the experimenter and the subject, and so
removes one of the reasons a person might give away money--to impress
the experimenter or other subjects. This protocol was developed to
improve the internal validity of the experiment, that is, to make sure
it measured altruism.
The distribution of choices in a dictator game experiment with men
and women as dictators is shown in Figure 1 (Eckel and Grossman 1998).
Men on average donate $0.82 from a $10 endowment; women donate $1.60, a
statistically significant difference. More than half of the subjects in
this particular protocol give nothing, and more men than women fall into
this category. On the other hand, more women than men give $5, half of
their endowment, to an anonymous counterpart.
[FIGURE 1 OMITTED]
So by this measure, yes, women are more altruistic than men.
A variation on the dictator game matches subjects with a charity
recipient instead of an anonymous person. Charities are more deserving of support than random, anonymous individuals, so we might expect higher
levels of giving. (This is a rational response to the change in the
characteristic of the recipient: The benefits are greater.) Table 1
shows a selection of results from an experiment where the subject
chooses from a list of charities that her contribution will benefit
(Eckel and Grossman 2003). With a $6 endowment, men give on average
$2.63, or 44% of their endowment; women, 52%. At the higher $10
endowment we see a similar pattern, with men giving 45% and women giving
53%. In this study, we test a variety of endowments and several
different levels of subsidies for charitable giving (such as matching
amounts), and in all of them women give more than men. This provides
further evidence that women are more altruistic than men. (2) This
experiment is not restricted to gender differences: It could be (and has
been) conducted to compare any groups of people, and also can be adapted
to look at how people treat others.
5. Heterogeneity across Partners: Altruism
When the dictator game is conducted with subjects other than
university students, there is a strong tendency for subjects to divide
the endowment equally (e.g., Burks, Carpenter, and Verhoogen 2007; Whitt
and Wilson 2007). If a large fraction of the subjects do the same thing,
this is a problem. A good measure needs not only to capture altruism if
it is there, but also to vary with intensity of preference. Piling up on
a 50/50 split is not desirable. To get around this problem, we've
been exploring a variation on the game that we call the comparative
dictator game (CDG). This game introduces additional variation by giving
subjects a frame of reference. In the CDG, subjects make several
dictator game decisions, each with a different counterpart. One of the
decisions is then chosen at random and paid. (See Eckel, Johnson, and
Thomas 2006 for more detail.)
While we developed the game to enhance heterogeneity in giving to
an anonymous partner, it also illustrates one of the ways in which
people vary their decisions based on the characteristics of their
counterparts. In this case the counterparts vary in their social
distance, though they could be set up to vary in many different ways. We
match them with a stranger, a friend, and a family member. They play all
three games. After completing all three, one is selected randomly, and
the subject is paid. The counterpart is also paid: We put the money in
an envelope and either deliver or mail it to them. In this study we see
that subjects vary quite a bit, not only in their overall level of
giving, but also in how strongly they respond to the change in social
distance. In the lab, from a $20 initial endowment subjects give on
average $3.05 (15.2% of the endowment) to a stranger, $5.53 (27.7%) to a
friend, and $7.13 (35.6%) to a family member. Behavior not only deviates
from the selfish prediction of the naive rational actor model, but also
varies by the characteristics of the counterpart. People are altruistic,
and for most subjects, altruism is contingent, in the sense that it
varies with what they observe.
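The payment rule of the comparative dictator game can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function name, the dictionary of gifts, and the uniform random draw are hypothetical choices made here (the text says only that one decision is chosen at random), and the example amounts are round numbers, not data from the study.

```python
import random


def comparative_dictator_payment(endowment, gifts, rng=random):
    """Pay one randomly chosen dictator decision.

    gifts maps each counterpart label to the amount the dictator chose to
    give that counterpart. Assumption: each decision is equally likely to
    be the one that is paid.
    Returns (paid_counterpart, dictator_payoff, counterpart_payoff).
    """
    for counterpart, gift in gifts.items():
        if not 0 <= gift <= endowment:
            raise ValueError(f"gift to {counterpart} must be in [0, endowment]")
    # Sort the labels so that a seeded generator gives a reproducible draw.
    paid_counterpart = rng.choice(sorted(gifts))
    gift = gifts[paid_counterpart]
    return paid_counterpart, endowment - gift, gift


# Hypothetical decisions by one subject with a $20 endowment.
decisions = {"stranger": 3, "friend": 5, "family member": 7}
who, kept, given = comparative_dictator_payment(20, decisions, random.Random(1))
print(f"Paid decision: {who}; dictator keeps ${kept}, counterpart gets ${given}")
```

Because only one decision is paid, each choice is made with the full endowment at stake, which is what allows giving to be compared across counterparts.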
6. Heterogeneity across People and across Partners: Trust
Like altruism, trust typically is measured using answers to survey
questions. The most common measure is the question from the
General Social Survey, "Generally speaking, would you say that most
people can be trusted, or you can't be too careful in dealing with
people?" In contrast, trust can be measured experimentally by
putting subjects in a situation where there are gains (and risks) to
trust. The trust game provides behavioral measures of both trust and
trustworthiness. It is a simple, transparent game that examines trusting
and trustworthy behavior directly. It is incentivized, and therefore
misrepresentation is costly. In order to appear trusting a subject must
put her own earnings into the hands of someone else, and in order to
appear trustworthy, a subject must reciprocate trust by giving up some
earnings.
In the trust game, first studied by Berg, Dickhaut, and McCabe
(1995), two players are endowed with an amount of money, say $100. (This
is to get the attention of an audience of professional, income-earning
adults. Student subjects respond to lower stakes, say $10.) The first
mover must decide how much (if any) to send to a second mover. Any
amount sent is tripled by the experimenter. The second mover must decide
how much, if any, to send back to the first mover. (The returned amount
is not tripled.) Suppose the first mover sends $60. This would be
tripled to become $180 received by the second mover, who must then
decide how much to send back. The first move is "trust," the
second is "trustworthiness," and the multiplier produces the
potential gains from trust. Subjects engage in the game for real money,
and they keep what they earn. In the first study, and most of those that
followed, the decision makers were anonymous to each other. This was
accomplished by implementing a protocol very similar to the one in
Hoffman et al. (1994) developed for the dictator game. In this game, an
economic man in the second position would keep everything; knowing this,
the first mover would send nothing. However, when real subjects play
this game, first movers send on average about half of the endowment, and
second movers, on average, just reciprocate. Trust (barely) pays.
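The payoff arithmetic of the trust game can be made concrete with a minimal Python sketch. The function name and argument names are illustrative, and the restriction that the second mover returns at most the multiplied amount received is an assumption made here for simplicity, not a detail stated above.

```python
def trust_game_payoffs(endowment, sent, returned, multiplier=3):
    """Return (first_mover_payoff, second_mover_payoff) for one play.

    Both players start with the same endowment. The amount sent is
    multiplied by the experimenter; the amount returned is not.
    Assumption: the second mover can return at most what she received.
    """
    if not 0 <= sent <= endowment:
        raise ValueError("sent must lie between 0 and the endowment")
    received = multiplier * sent
    if not 0 <= returned <= received:
        raise ValueError("returned must lie between 0 and the amount received")
    first_mover = endowment - sent + returned
    second_mover = endowment + received - returned
    return first_mover, second_mover


# The example from the text: $100 endowments, $60 sent, $180 received.
print(trust_game_payoffs(100, 60, 60))  # second mover "just reciprocates"
print(trust_game_payoffs(100, 60, 90))  # second mover splits the $180
print(trust_game_payoffs(100, 0, 0))    # the economic-man outcome
```

When the second mover returns exactly the $60 sent, the first mover ends with his original $100 and the second mover holds $220; sending nothing leaves both with $100, so the gains from trust are real even when the first mover does not share in them.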
As with the dictator game, this game was invented to test
hypotheses about game theory, and only later came to be used to measure
and compare levels of trust and trustworthiness across types of people
or across societies. In these games frequently there are systematic
differences in trust across types--that is, people behave
differently--as well as across partners--that is, people discriminate.
For example, in a study of Harvard undergraduates, Glaeser et al. (2000)
find that members of different races paired together are less likely to
exhibit trustworthiness; less is returned by a second mover of a
different race. Fershtman and Gneezy (2001) find that Israeli men are
less likely to trust Eastern than Ashkenazic Jewish counterparts; and
Ashraf, Bohnet, and Piankov (2006) and Burns (2005) explore
discrimination among black, mixed-race, and white South Africans, and
find plenty.
Our trust game protocol differs in important ways from earlier
studies, and the changes allow us to examine individual interactions in
real time without compromising anonymity (Eckel and Wilson (2006)
provide details on the protocol). (3) In a series of experimental
studies, we match subjects at two different sites over the Internet to
play the game, and subjects see each other's photographs (framed
like a passport photo) while making their decisions. In the first phase
of the protocol, the first mover sees his counterpart's photo, then
decides how much of a $10 endowment to send. He then guesses how much
the second mover will return to him: If correct, he earns an additional
$1. The second mover first guesses what the first mover will send, again
earning $1 if correct, and then she receives the tripled amount sent by
her assigned counterpart. Her next task is the important one: to decide
how much to return. In the second phase, a separate group of subjects is
recruited whose purpose is only to rate the photos for a set of
characteristics. They are paid by the photo. We match things up so that
people at one site rate the photos that folks from that site were
matched with. I'm going to describe results from two of our
studies.
Study 1: Attractiveness
In Study 1, we examine the effect of attractiveness on trust and
reciprocity (Eckel and Wilson 2005; Wilson and Eckel 2006).
Attractiveness is interesting to us because it is consequential in labor
markets and other economically important situations. There is a
substantial beauty premium in labor markets: 14.4-16.8% of earnings for
men and 9.2-11.7% for women (Hamermesh and Biddle 1994). Attractiveness
is associated with greater electability (Ottati and Deiger 2002) and
with faster advancement (e.g., for attorneys, Biddle and Hamermesh
1998). In psychology, stranger attribution studies examine the
characteristics that people believe others embody. Here, all sorts of
positive attributes are ascribed to attractive persons. They are thought
to be more intelligent, competent, and skilled, and more is expected of
them (e.g., Webster and Driskell 1983). The beautiful also receive
better treatment from others, even with respect to judicial decisions:
Attractive persons get lower sentences (Stewart 1984)! (These and other
studies are discussed in Eckel and Wilson 2005.)
There is some evidence that attractive people are of higher
quality. Studies in evolutionary biology indicate that symmetric persons
are judged more attractive, and symmetry is an indication (and a
credible signal) of the quality of an organism. This symmetry is evident
in people's faces, and can be read by subjects who are viewing only
a facial photo (Rhodes et al. 1998; Zebrowitz and Rhodes 2004).
Psychologists point out that the superior treatment of attractive
persons, beginning in early childhood, can lead to the development of
higher quality. Whether by nature or nurture, attractive people may
indeed be superior. They score higher on intelligence (measured as IQ,
with all its attendant shortcomings), health, extraversion, confidence,
and social skills. While the stranger attribution studies are not
entirely wrong, perceptions exaggerate any inherent differences in
intelligence or productivity (Langlois et al. 2000).
In the workplace, trusting and trustworthy employees are thought to
be more productive; certainly we can well imagine the flip side--the
negative effects of suspicion and deceit. Trust and trustworthiness are
likely to be rewarded in the labor market, and if attractiveness signals
greater trust/trustworthiness, then this may partially justify the
higher earnings. Differential treatment of the attractive may make them
more trusting and trustworthy. The world is a safer, kinder place for
them, after all. On the other hand, a feeling of entitlement may make
them less trusting or less trustworthy. In the lab we can ask, are
attractive people more trusting/trustworthy, or do we just think they
are?
Attractiveness has effects on behavior that can be detected in the
lab. Solnick and Schweitzer (1999) find that more is sent to attractive
counterparts in the ultimatum game, for example. (In this game the first
mover proposes a division of a fixed endowment, and the second mover
must accept or reject it. If accepted, payoffs are as proposed, and if
rejected payoffs are zero for both players.) This behavior is an
indication either that first movers want to give more money to
attractive counterparts (taste-based discrimination) or that they expect
attractive persons to reject lower offers (which is not taste based but
rather based on expectations or, in a way, productivity). In the
prisoner's dilemma game, attractive people are more likely to be
selected as a partner and trusted (Mulford et al. 1998).
In one of my favorite attractiveness studies, Andreoni and Petrie
(2006) show that, while the presence of attractive players initially
increases contributions in the public goods game, when subjects find out
how much others have given, they quickly adjust their contributions
downward. This suggests that the subjects are biased in their estimates
of what attractive people will contribute to the pie, a suggestion we
are able to test directly with our protocol. Of course expectations
can't be biased in equilibrium, where expectations have to be on
average correct. The story we are telling, and the evidence we find for
it, may indeed be an out-of-equilibrium phenomenon, sustainable for any
length of time only in one-shot settings where feedback is nonexistent or infrequent. Repeated interaction provides the information needed to
reach equilibrium, but repetition creates a whole new set of candidate
equilibria. In our work we tend to focus on initial interactions, which
allow us to measure stereotypes and biases and their effect on earnings
in a simple setting, ignoring reputational considerations or
repeated-game strategies.
In the trust game, we are able to test whether attractive people
are more or less trusting or trustworthy (heterogeneity across people),
as well as whether attractive people are treated differently
(heterogeneity across partners). People may discriminate in favor of
attractive people either because they like being nice to attractive
people (or want to curry favor with them) or because they expect
attractive people to have superior performance. In our protocol we can
distinguish between those two motivations because we measure
expectations as well as observing behavior.
To see what the experimental environment is like, some of the
screens are displayed in the figures below. Figure 2 shows the first
mover decision screen. (4) While observing his counterpart, the first
mover decides how much to send to her. In a subsequent screen, he
guesses how much she will send back. Meanwhile, the second mover
observes the first mover, as shown in Figure 3, and is guessing how much
he will send her. In a subsequent screen she finds out how much he has
sent, and decides how much to send back. After the subjects are paid and
go home, we keep their photos and have them rated by a different set of
subjects. We show the raters screens like the one in Figure 4, and the
raters decide which of each word pair best describes the photo. The 15
word pairs used in the ratings include attractive/unattractive, and
those ratings are summarized in Figure 5, which shows the distribution
of attractiveness ratings. (5) (Women are more attractive than men, but
you knew that.) The data from the phase-I experiment (decisions and
expectations), and the phase-II ratings are then combined to analyze
behavior.
[FIGURES 2-5 OMITTED]
Table 2 summarizes sending and returning behavior. This table
contains data only for subjects in the top and bottom quartiles of the
attractiveness ratings. First movers exhibit a beauty premium, sending
more to attractive than unattractive second movers. Both attractive and
unattractive first movers display this behavior. Unattractive first
movers send $5.29 from a $10 initial amount to unattractive second
movers; they send $0.90 more to attractive counterparts. Attractive
first movers send less on average, but also are more generous with
attractive counterparts, sending $0.76 more. Based on previous research,
this is the pattern we expected. Table 3 gives some insight into why
this pattern is observed. Here data are pooled for all first movers, and
presented just by the characteristics of the second mover. We see that
more was sent to attractive second movers, and more was expected back
from them. This indicates that the superior treatment is due in part to
the higher expectations of attractive second movers--a
productivity-based explanation. The third column shows that attractive
and unattractive second movers do not differ in their behavior, so the
difference seen in expectations is incorrect. (Notice also that trust
pays for unattractive first movers in this environment, since percentage
returned exceeds 33%, but not for attractive first movers.)
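The 33% threshold follows directly from the tripling rule. A short derivation, using notation introduced here for illustration (the symbols E, s, m, and r do not appear in the original):

```latex
% E: first mover's endowment; s: amount sent (s > 0);
% m: multiplier applied by the experimenter (m = 3 in this study);
% r: fraction of the multiplied amount m s that the second mover returns.
\begin{align*}
\pi_1 &= E - s + r\,m\,s, \\
\pi_1 > E &\iff r\,m\,s > s \iff r > \frac{1}{m}.
\end{align*}
```

With m = 3, trust pays for the first mover only when the second mover returns more than one third (about 33%) of what she receives; with a multiplier of 2 the threshold would be 50%.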
The bottom part of Table 2 shows the percentage returned by and to
the most and least attractive. The data show an interesting pattern. To
our surprise, and in contrast to the results of previous research,
second movers exhibit a beauty penalty in their behavior. On average,
unattractive second movers return 35% to first movers who are relatively
unattractive and 29% to first movers who are relatively attractive. The
same pattern is seen for attractive second movers, who return 40% to
unattractive and 31% to attractive first movers. (6) What is the source
of this unusual behavior? We thought perhaps the result might be a
mirage. If second movers send back a lower percentage to less generous
first movers, then that could cause a pattern of results like this one.
Fortunately, we can control for this in our analysis. In the paper we
report regression analysis showing that this effect survives controls
for the amount sent. We are able to show that the result comes from the
effect of dashed expectations, as summarized in Table 4. This table
pools the data for all second movers, and shows their expectations,
amount received from, and amount returned to unattractive and attractive
first movers. The table indicates that attractiveness confounds
intuition: Second movers expected more from attractive than unattractive
first movers ($5.50 vs. $5.15), but they received less ($4.12 vs.
$5.70). Second movers then punish attractive first movers for failing to
live up to their expectations. Interestingly, this punishment is
inflicted only on attractive first movers. As shown in Figure 6, only
the beautiful are punished when expectations are dashed.
[FIGURE 6 OMITTED]
In sum, we find evidence of a beauty premium, in that relatively
attractive people are more likely to be trusted. We also see a beauty
penalty: Beautiful people are punished for failing to live up to the
biased expectations of their counterparts. Attractiveness confounds
intuition in the sense that expectations of attractive people are
systematically too high. When we published these results we were
contacted by many media outlets, who wrote about our results and
uncovered examples from the real world. (7)
Study 2: Skin Shade
Let's turn now to Study 2, an investigation of race,
ethnicity, and skin shade (Eckel and Wilson 2007). The persistent gap in
earnings between African Americans and whites in the United States and
elsewhere--about 20% after controlling for productivity-related
variables (Couch and Daly 2002)--again motivates our work in this area.
Many different methods have been used to investigate the gap, including
standard econometric studies (Couch and Daly 2002) and audit studies
(Bertrand and Mullainathan 2004). The ability to isolate elements of
decision making that might affect earnings is the reason for using lab
experiments. Our experiments focus on the relationship between skin
shade and trust/trustworthiness.
The design of this study is similar in most respects to Study 1,
with three differences. First, the decision is framed as a loan. While
experimentalists prefer neutral language, we have found that a small
amount of context can sometimes make the situation easier for the
subjects to understand, in effect lowering the cognitive load. Second,
instead of allowing the first mover to send any amount, the decision is
all-or-nothing: The first mover must decide whether to send the entire
$10 amount. We did this to avoid a possible confound in the data. Since
second movers tend to return a larger percentage for higher amounts
sent, we worried that some ethnic pairings might result in such low
amounts sent that we would be unable to disentangle the effect of the
amount from the effect of the pairing. Forcing all subjects to send the
same amount would avoid this. Third, the amount sent is doubled instead
of tripled. (8)
As before, the first mover observes the photo of his counterpart
and then decides whether to make the loan (trust). He then guesses how
much will be returned. The second mover guesses whether the loan will be
sent, then decides how much to return. Photos are evaluated by a
separate group of subjects recruited for that purpose. The subjects who
participated were recruited from three schools: Virginia Tech, Rice
University, and North Carolina A&T (a historically black engineering
school). We set the experiment up to gauge discrimination. Subjects were
matched randomly, but by recruiting an ethnically diverse set of
subjects, we could examine differences in behavior by ethnic pairing.
(9)
The screens are similar to those in Figures 2 and 3. Table 5
summarizes the decision to make the loan, separated out by the
characteristic of the second mover. In the table, we see significantly
lower levels of trust for nonwhite subjects. Whites are trusted with a
loan 82% of the time, and nonwhites 63%. In analyzing the data, we
initially pursued a modeling strategy that focused on ethnic pairings.
However, we discovered that replacing ethnic categories with skin shade
ratings effectively captured the variation in decisions. Interacting
skin shade with ethnic category did not provide any additional
explanatory power (see the regressions in Eckel and Wilson 2007).
(Perhaps with more data, we would be able to distinguish ethnic-specific
skin shade effects.) Figure 7 shows that skin shade is correlated with
ethnic category, but there is considerable variation within categories.
Table 6 shows loans, expectations, and returns by the skin shade of the
second mover. Paralleling Table 5, we see that the lightest quartile is
trusted 84% of the time, and the darkest 53.3% of the time. If the
differences are based on differences in expected return, then that would
support productivity-based discrimination. However, the average expected
return does not differ significantly between the two groups, suggesting
that discrimination in trust is taste based. Differences in average
return show lower return by darker skinned second movers. However, this
difference is small relative to the differences in trust.
[FIGURE 7 OMITTED]
Table 7 shows second mover expectations and returns. Lighter
skinned second movers expect less trust from darker skinned first
movers, but they are wrong. (Darker skinned second movers are correct in
their relative assessments, but at an absolute level they are trusted
less than they expect to be.) As with the attractiveness study, second
movers who were surprised reacted strongly. In this case, darker skinned
first movers are rewarded for making the loan when it wasn't
expected.
In sum, we find a skin shade penalty, and a skin shade premium.
First movers trust lighter skinned second movers more than those with
darker skin shades. Darker skinned second movers never get the
opportunity to show they are trustworthy. Lighter skinned second movers
expect less from darker skinned first movers, but they are wrong. Darker
skinned first movers trust more than is expected of them. Positively
surprised second movers reward them, creating a skin shade premium.
Photos, Expectations, and Trust
These two studies taken together allow us to draw several tentative
conclusions. The decision to trust is based on expectations, and
expectations can be wrong. Our design allows us to distinguish between
discrimination based on expectations (productivity) and discrimination
based on taste. Attractive people are trusted more, in part because more
is (incorrectly) expected of them. Darker skinned second movers are
trusted less, despite only a small difference in expected return,
indicating that the lack of trust stems from something other than
expected return. Expectations also play an important role for second
movers, who expect less trust from unattractive and darker skinned
counterparts. Both of these expectations are based on stereotypes and
are biased, resulting in a beauty penalty as attractive people are
punished for dashing expectations, and a skin tone premium, as darker
skinned people are rewarded for trusting more than expected.
We see that subjects condition their behavior on what they see--and
what they believe about what they see--in the photos of their
counterparts. Should they? In one study we compare the trust game with
information only (subjects are told gender, ethnicity, and a few other
things about their counterparts) and with photos. Correlation between
expected return and actual amount returned is 0.10, and not
significantly different from zero for the information condition. But in
the photo condition, the correlation coefficient is 0.22, and
significantly different from zero at p = 0.05. Seeing the photo improves
subjects' ability to forecast their counterparts' behavior.
This led us to ask whether subjects will pay to see the photos, and
how much. In Eckel and Petrie (2006) we report the results of experiments
designed to find out. Subjects play six trust games, each for $10 with
the amount sent tripled, and each with a different counterpart. One game
is selected at random for payment. The subjects are given the option to
purchase photos of their counterparts at prices ranging from $0.20 to
$8. About half of the subjects are willing to pay at least the minimum
price of $0.20 to see a counterpart's photo, and 7% will pay more than
$2. We find
that first movers--especially white first movers--are more likely to buy
the photos (about 70% of them do so), and then they discriminate based
on the information, trusting whites 50% more often than African
Americans. (African Americans also trust whites more, but the difference
is not statistically significant.) In addition, white second movers who
buy the photos send back less to their black counterparts, in contrast
to African American second movers. People will pay to learn about their
counterparts, and act according to the information they acquire.
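The payoff structure of a single game in this design ($10 at stake, amount sent tripled) can be sketched as follows. The function name `trust_game_payoffs` is hypothetical, and the sketch assumes the second mover has no endowment of his or her own and ignores the price of any photo purchased. Neither detail is specified in the text above.

```python
def trust_game_payoffs(sent, returned, endowment=10.0, multiplier=3.0):
    """Return (first mover payoff, second mover payoff) for one trust game.

    Assumes, for illustration only, that the second mover starts with
    nothing and that no photo purchase is deducted from earnings.
    """
    if not 0 <= sent <= endowment:
        raise ValueError("amount sent must lie between 0 and the endowment")
    received = multiplier * sent  # the amount sent is tripled in transit
    if not 0 <= returned <= received:
        raise ValueError("amount returned cannot exceed the amount received")
    first_mover = endowment - sent + returned
    second_mover = received - returned
    return first_mover, second_mover


# A first mover who sends $5 and gets $7 back ends with $12;
# the second mover keeps $8 of the $15 received.
print(trust_game_payoffs(5, 7))
```

Trust pays for the first mover only when more comes back than was sent, which is why information that helps forecast a counterpart's behavior has monetary value.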
7. Conclusion
Experiments can be used to measure preferences, and to assess the
magnitude and consequences of heterogeneity in behavior. People behave
differently: There are systematic differences in average behavior across
identifiable types or categories of people. For example, women are more
altruistic, and attractive people are less trusting. More importantly,
people discriminate. Experimental games can be used to measure
discrimination without calling attention to it. In contrast to the
typical survey approach, subjects make decisions naturally, without
knowing what the study is about. This is especially important for
behaviors that carry social disapproval, such as discrimination against
African Americans. If a subject is aware of the purpose of the study, it
is easy and costless to misrepresent preferences in a survey, but in an
experimental game such misrepresentation incurs a financial cost. The
experimental approach allows us to observe latent or unconscious
discrimination.
What about economic man? Three observations can be made from our
studies. First, people differ. The common modeling assumption of a
single representative agent is not accurate; we see considerable
heterogeneity across individuals along the dimensions we studied:
altruism, trust, and trustworthiness. Second, people treat each other
differently. Behavior is conditional on the setting. In our studies,
nearly all subjects vary their behavior depending on the situation they
find themselves in. Third, the factors that people condition on are not
just costs and benefits, but also include social elements. Our research
highlights the importance of social considerations in economic decision
making. Economic agents pay attention to social factors and condition
their decisions on everything they know about the decision, including
the characteristics of their counterparts. Behavior is conditional,
based on expectations that are in turn based on stereotypes and that are
biased, but better than chance at predicting the behavior of others.
Economic man would not exhibit systematic biases. Perhaps a more
accurate model of behavior can be developed by giving economic man a
social identity.
Thanks to the Association for the opportunity to stand up in front
of everyone and talk about my work. Thanks to the many people who
provoked and improved this research through their skepticism and their
encouragement: You know who you are! Special thanks for financial
support go to the John D. and Catherine T. MacArthur Foundation and
directors of the Network on Preferences and Norms, Herb Gintis and Rob
Boyd; and to the National Science Foundation and their wonderful program
officers, especially Dan Newlon, Lynn Pollnow, Jon Leland, Laura
Razzolini, Bob O'Connor, and Frank Scioli. Thanks most of all to
the best coauthors anyone could ask for, including but not limited to:
Sheryl Ball, Phil Grossman, Ragan Petrie, and Rick Wilson. Finally, to
Genia Toma and Kathy Hayes, my fellow steel magnolias.
References
Andreoni, James, and Ragan Petrie. 2006. Beauty, gender and
stereotypes: Evidence from laboratory experiments. Working Paper,
University of California, San Diego.
Andreoni, James, and Lise Vesterlund. 2001. Which is the fair sex?
Gender differences in altruism. Quarterly Journal of Economics 116:293-312.
Ashraf, Nava, Iris Bohnet, and Nikita Piankov. 2006. Decomposing
trust and trustworthiness. Experimental Economics 9(3):193-208.
Berg, Joyce E., John W. Dickhaut, and Kevin McCabe. 1995. Trust,
reciprocity, and social history. Games and Economic Behavior 10(1):122-42.
Bertrand, Marianne, and Sendhil Mullainathan. 2004. Are Emily and
Greg more employable than Lakisha and Jamal? A field experiment on labor
market discrimination. American Economic Review 94(4):991-1013.
Biddle, Jeff, and Daniel Hamermesh. 1998. Beauty, productivity and
discrimination: Lawyers' looks and lucre. Journal of Labor
Economics 16:172-201.
Burks, Stephen, Jeff Carpenter, and Eric Verhoogen. 2007. Fairness
and freight-handlers: A test of fair-wage theory in a trucking firm.
Industrial and Labor Relations Review. In press.
Burns, Justine. 2005. Race and trust in post-apartheid South
Africa. Working paper, Center for Social Science Research, University of
Cape Town.
Couch, Kenneth, and Mary C. Daly. 2002. Black-white wage inequality
in the 1990s: A decade of progress. Economic Inquiry 40(1):31-41.
Eckel, Catherine C., and Philip J. Grossman. 1996a. Altruism in
anonymous dictator games. Games and Economic Behavior 16:181-91.
Eckel, Catherine C., and Philip J. Grossman. 1996b. The relative price
of fairness: Gender differences in a punishment game. Journal of
Economic Behavior and Organization 30(2):143-58.
Eckel, Catherine C., and Philip J. Grossman. 1998. Are women less
selfish than men? Evidence from dictator games. The Economic Journal
108(448):726-35.
Eckel, Catherine C., and Philip J. Grossman. 2003. Rebates versus
matching: Does how we subsidize charitable contributions matter? Journal
of Public Economics 87(3-4):681-701.
Eckel, Catherine C., Philip J. Grossman, and M. Johnston. 2005. An
experimental test of the crowding out hypothesis. Journal of Public
Economics 89(8):1543-60.
Eckel, Catherine C., Cathleen A. Johnson, and Duncan Thomas. 2006.
Altruism and resource sharing within families and villages in Mexico.
Presented at the Economic Science Association Regional Meetings, Tucson,
AZ, October.
Eckel, Catherine C., and Ragan Petrie. 2006. Face value.
Unpublished paper, Georgia State University.
Eckel, Catherine C., and Rick K. Wilson. 2005. Detecting
trustworthiness: Does beauty confound intuition? Unpublished paper,
University of Texas at Dallas.
Eckel, Catherine C., and Rick K. Wilson. 2006. Internet cautions.
Experimental Economics 9(1):53-66.
Eckel, Catherine C., and Rick K. Wilson. 2007. Initiating trust:
The conditional effects of sex and race among strangers. Unpublished
paper, Rice University.
Fershtman, Chaim, and Uri A. Gneezy. 2001. Discrimination in a
segmented society: An experimental approach. Quarterly Journal of
Economics 116(1):351-77.
Forsythe, R., J. L. Horowitz, N. E. Savin, and M. Sefton. 1994.
Fairness in simple bargaining experiments. Games and Economic Behavior
6:347-69.
Glaeser, Edward L., David Laibson, Jose A. Scheinkman, and
Christine Soutter. 2000. Measuring trust. Quarterly Journal of Economics
115(3):811-46.
Hamermesh, Daniel, and Jeff Biddle. 1994. Beauty and the labor
market. American Economic Review 84:1174-94.
Harris, Misty. 2006. Here's a problem we'd all like to
have. Leader Post, 25 September, C1-2.
Hoffman, Elizabeth, Kevin McCabe, Keith Shachat, and Vernon L.
Smith. 1994. Preference, property rights and anonymity in bargaining
games. Games and Economic Behavior 7(3):346-80.
Holt, Charles A. 2003. Economic science: An experimental approach
for teaching and research. Southern Economic Journal 69(4):755-71.
Langlois, Judith H., Lisa Kalakanis, Adam J. Rubenstein, Andrea
Larson, Monica Hallam, and Monica Smoot. 2000. Maxims or myths of
beauty? A meta-analysis and theoretical review. Psychological Bulletin
126(3):390-423.
Mulford, Matthew, John Orbell, Catherine Shatto, and Jean Stockard.
1998. Physical attractiveness, opportunity, and success in everyday
exchange. American Journal of Sociology 103(6):1565-92.
Ottati, Victor C., and Megan Deiger. 2002. Visual cues and the
candidate evaluation process. In The social psychology of politics,
edited by V. C. Ottati et al. New York: Kluwer Academic/Plenum
Publishers, pp. 75-87.
Rhodes, Gillian, Fiona Proffitt, Jonathon M. Grady, and Alex
Sumich. 1998. Facial symmetry and the perception of beauty. Psychonomic
Bulletin and Review 5(4):659-69.
Solnick, Sara J., and Maurice E. Schweitzer. 1999. The influence of
physical attractiveness and gender on ultimatum game decisions.
Organizational Behavior and Human Decision Processes 79(3):199-215.
Stewart, J. E. 1984. Appearance and punishment: The
attraction-leniency effect in the courtroom. Journal of Social
Psychology 125:373-8.
Webster, Jr. M., and J. E. Driskell, Jr. 1983. Beauty as status.
American Journal of Sociology 89:140-65.
Whitt, Sam, and Rick K. Wilson. 2007. The dictator game, fairness
and ethnicity in postwar Bosnia. American Journal of Political Science 51(3). In press.
Wilson, Rick K., and Catherine C. Eckel. 2006. Judging a book by
its cover: Beauty and expectations in a trust game. Political Research
Quarterly 59(2):189-202.
Zebrowitz, Leslie A., and Gillian Rhodes. 2004. Sensitivity to bad
genes and the anomalous face overgeneralization effect: Cue validity,
cue utilization, and accuracy in judging intelligence and health.
Journal of Nonverbal Behavior 28(3):167-85.
(1) For example, Eckel, Grossman, and Johnston (2005) find very
large differences in crowding out with small differences in context.
(2) Not every study replicates this result. For example Andreoni
and Vesterlund (2001) vary the price of giving and find it is only true
for women at high prices. They find that lowering the price of giving
makes men but not women more generous.
(3) Anonymity is desirable. If anonymity is breached, we cannot be
certain that subjects are playing the game we intend them to play.
Instead, the possibility of postgame interaction, positive or negative,
may influence decisions.
(4) This photo and others in this paper are of family members of
the author.
(5) To get this distribution, we take each rater's ratings and
standardize them, centering the distribution at zero and creating
z-scores. For each photo, these are then averaged across raters to get a
photo score. The distribution across photos is what you are looking at
in the figure.
(6) Note these do not average to the percentages in Table 3 because
Table 3 includes all first movers, and Table 2 includes only the top and
bottom quartile pairings.
(7) Consider the beauty queen, Venessa Fisher, crowned Miss
Universe Canada 2004: "It's pretty difficult to live up to
everybody's standards. People want to criticize everything about
you, looking for the littlest things to tear apart," she says.
"They're either completely intimidated or have this completely
distorted image of who I am." Fisher, now studying broadcast
journalism in California and interning at the Dr. Phil show, describes a
constant battle to thump the superficial stereotypes people have about
beauty and its relationship to personality. "I grew up exactly the
same as everyone else ... I'm really normal," she says.
"But I guess that doesn't come through until people start to
know me." (Leader Post, Regina, Saskatchewan, 2006).
(8) As it turned out, this combination of choices was not ideal,
and we wouldn't repeat it. The combination of a loan, a single amount
that could be sent, and doubling meant that our second mover data have a
large spike at a return of $10, which represents both an equal split of
the amount received by the second mover and exact repayment of the loan.
(9) In practice, since most of our African American subjects were
at NCAT, most pairings that include African Americans are black/white
pairings. We have very few black/black pairings, a shortcoming we plan
to address in future studies. Therefore, this is largely a study of how
white people treat a diverse set of counterparts.
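The standardization described in note (5) can be sketched as follows. The function name `photo_scores`, the nested-dictionary layout, and the use of the population standard deviation are assumptions made for illustration. The example ratings are made up to show the mechanics and are not data from the study.

```python
from statistics import mean, pstdev


def photo_scores(ratings):
    """Map {rater: {photo: raw rating}} to {photo: mean z-score across raters}."""
    z_by_photo = {}
    for rated in ratings.values():
        values = list(rated.values())
        mu = mean(values)
        sigma = pstdev(values)  # assumption: population standard deviation
        if sigma == 0:
            continue  # a rater who scores every photo identically adds nothing
        for photo, raw in rated.items():
            # Center at the rater's own mean and scale by the rater's spread.
            z_by_photo.setdefault(photo, []).append((raw - mu) / sigma)
    # Average each photo's z-scores across the raters who rated it.
    return {photo: mean(zs) for photo, zs in z_by_photo.items()}


# Made-up ratings: rater_2 uses a wider scale than rater_1, but after
# standardization the two raters agree on every photo.
example = {
    "rater_1": {"photo_a": 1, "photo_b": 2, "photo_c": 3},
    "rater_2": {"photo_a": 2, "photo_b": 4, "photo_c": 6},
}
scores = photo_scores(example)
print(scores)
```

Standardizing within rater before averaging removes differences in how generously or how widely each rater uses the scale, so that a photo's score reflects its rank relative to the other photos that rater saw.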
Catherine C. Eckel *
* Catherine C. Eckel is Professor of Economics, School of Economic,
Political and Policy Sciences, University of Texas at Dallas. This paper
was delivered as her presidential address on November 19, 2006, in
Charleston, SC, at the annual meeting of the Southern Economic
Association.
Table 1. Dollar Contribution to Charities
Endowment    Men      Women    p-Value (Means Test)
$6.00        $2.63    $3.11    0.06
$10.00       $4.52    $5.34    0.05
Source: Eckel and Grossman (2003).
Table 2. Average Amounts Sent to and Percentage Returned by Players,
Contingent on the Attractiveness Ratings of the Decision Makers and
Counterparts (Upper and Lower Quartiles of Attractiveness Ratings)
                                  Unattractive                    Attractive
                                  Decision Maker                  Decision Maker
                                  (First Mover)                   (First Mover)
Amount sent to second mover who is
  Unattractive (lower quartile)   $5.29 (3.06)                    $3.79 (2.17)
  Attractive (upper quartile)     $6.12 (3.19)                    $4.46 (3.07)
  t-test                          t = 1.33, d.f. = 100, p = 0.09  t = 1.22, d.f. = 112, p = 0.11
                                  Unattractive                    Attractive
                                  Decision Maker                  Decision Maker
                                  (Second Mover)                  (Second Mover)
Percentage returned to first mover who is
  Unattractive (lower quartile)   35% (21.2)                      40% (19.9)
  Attractive (upper quartile)     29% (22.2)                      31% (19.2)
  t-test                          t = 0.89, d.f. = 98, p = 0.19   t = 2.24, d.f. = 95, p = 0.01
Source: Eckel and Wilson (2005). Standard deviation in parentheses.
Table 3. First Mover Trust, Expected Return, and Return Received
Conditional on the Attractiveness of the Second Mover
Second Mover                    Average         Average Expected   Average Percentage
Attractiveness                  Amount Sent     Return (a)         Returned (b)
Unattractive (lower quartile)   $4.49 (3.02)    $8.06 ($7.22)      34.52 (20.47)
Attractive (upper quartile)     $5.15 (3.28)    $9.5 ($6.59)       35.74 (17.52)
t-test                          t = 2.08        t = 2.06           t = 0.626
                                d.f. = 390      d.f. = 390         d.f. = 382
                                p = 0.04        p = 0.04           p = 0.53
(a) Includes all first mover expectations.
(b) Includes only second movers who received something from the first mover.
Table 4. Beauty Confounds Intuition: Second Mover Expectations,
Amount Received, and Percentage Returned, Conditional on the
Attractiveness of the First Mover
                       Second Mover
First Mover Who Was    Expected from    Got from    Returned to
Unattractive           5.15             5.70        35%
Attractive             5.50             4.12        30%
Source: Eckel and Wilson (2005).
Table 5. First Movers: Percentage Trusting Moves
by Counterpart Ethnicity
Counterpart Ethnicity    Percentage Trusting (n/N)
Caucasian                81.9 (68/83)
Total minority           63.0 (29/46)
  African American       60.9 (14/23)
  Asian                  61.5 (8/13)
  Hispanic/other         70.0 (7/10)
Source: Eckel and Wilson (2005).
Table 6. Loans and Returns by Skin Shade of Second Mover
Second Mover         Percentage       Average Expected   Average
Skin Shade           Loan Made        Return (a)         Return (b)
Lightest quartile    84.4% (27/32)    $7.66 ($3.96)      $9.72 ($4.34)
Darkest quartile     53.3% (16/30)    $8.00 ($5.51)      $7.97 ($3.79)
(a) Includes all first mover expectations. Standard deviations
in parentheses.
(b) Includes only second movers who received the loan. Standard
deviations in parentheses.
Table 7. Second Mover Expectations and Returns
                                  Second Mover Skin Tone
               Lighter (below Median Rating)    Darker (above Median Rating)
First Mover    % Expect         % Trusted       % Expect         % Trusted
Skin Tone      Trust (n/N)      (n/N)           Trust (n/N)      (n/N)
Lighter        77.1% (27/35)    82.9% (29/35)   76.7% (23/30)    66.7% (20/30)
Darker         65.5% (19/29)    86.2% (25/29)   71.4% (25/35)    65.7% (23/35)