  • Title: An experimental investigation of research tournaments.
  • Authors: Fullerton, Richard ; Linster, Bruce G. ; McKee, Michael
  • Journal: Economic Inquiry
  • Print ISSN: 0095-2583
  • Year of publication: 1999
  • Issue: October
  • Language: English
  • Publisher: Western Economic Association International
  • Keywords: Industrial research


Research tournaments have played an important role in the economic growth of nations since the earliest stages of the Industrial Revolution. For example, the golden age of steam locomotion was spawned by a research tournament sponsored by the Liverpool and Manchester Railway in 1829.(1) More recently, research tournaments have been used to create a variety of products ranging from fuel-efficient refrigerators (Langreth [1994]) and digital televisions (Economist [1993]), to high-tech fighter aircraft for the military (Schwartz et al. [1991]). Today, scientists and lawmakers are even considering the use of a research contest to propel the development of the first manned space mission to Mars.(2)

Despite the recurring popularity of research tournaments over the last two hundred years, the first theoretical model for evaluating their efficiency was not published until Taylor's [1995] seminal work. Taylor's model provides a theoretical basis for evaluating the effect the number of competitors and tournament duration have on the amount of effort expended by contestants in a research tournament. Taylor proved that, by limiting the number of competitors in a research tournament and charging each competitor an entry fee, research tournament sponsors can induce an efficient amount of innovative effort. Fullerton and McAfee [1999] extended research tournament theory to include competitions with heterogeneous contestants, showing for a large class of contests the optimal number of competitors is two and sponsors can induce the best qualified competitors to enter the tournament by holding specialized all-pay entry auctions.

Although the economic intuition behind these research tournaments is straightforward, the empirical calculations required to compute their equilibrium strategies are very complex, and it is an empirical question as to whether individuals are able to compute these strategies. Therefore, to investigate the predictive power of research tournament models, we conducted a series of laboratory experiments to test Taylor's seminal research tournament theory by examining whether subjects in a controlled economic laboratory setting can be induced to expend the predicted amount of research and development (R&D) effort in an essentially unregulated environment. Specifically, we investigated whether the effort expended by experimental subjects approximates the amount of effort predicted by the unique Nash equilibrium.

Despite the complexity of computing the Nash equilibrium research strategy, we find the average behavior of subjects in our experiments is remarkably close to the predictions of Taylor's model. The majority of the experimental subjects do appear to adopt stopping-rule research strategies, although they differ significantly in their individually chosen stopping values. Despite the wide variation in individual research effort, however, the overall level of research expended and the average value of the winning innovation for the various treatments are consistently within a few percentage points of the levels predicted by the Nash equilibrium. As a consequence, the R&D tournaments achieve very high levels of efficiency in the laboratory.


For explanatory ease, we retain Taylor's original notation, and readers may refer to his article for details of the model not discussed here. By assumption, there are M risk-neutral competitors who compete in a research tournament to win the prize offered by the tournament's sponsor. The tournament lasts T periods, and each period competitors have an opportunity to pay research cost C to obtain a single independent draw, x, from the distribution of innovations, F(x), on support [0, [Mathematical Expression Omitted]]. All competitors start the tournament with worthless innovations; x = 0. Each new innovation is drawn, with recall, from the distribution of innovations allowing each competitor to retain the best draw across all T periods of the tournament. At the end of T periods, competitors deliver their best draw to the tournament sponsor, who evaluates each innovation and awards the prize, P, to the competitor offering the best innovation.

Building on the results of search theory, Taylor proved the equilibrium strategy of a competitor in this research tournament is to draw a new innovation each period until drawing an innovation greater than or equal to some cutoff value, z, then stop. According to Taylor, the unique z-stop cutoff value for firms engaged in a research tournament is implicitly defined by the following equation:

(1) [Mathematical Expression Omitted],

where Φ is the date-zero cumulative distribution function (CDF) for the value of a firm's best innovation:

(2) [Mathematical Expression Omitted].

One can see from equations (1) and (2) that a competitor's effort level (z) is an increasing function of the prize, P, and a decreasing function of the cost of research draws, C. However, the equilibrium z-stop is also a function of the number of other competitors involved in the tournament, M-1, as well as the length (number of draws permitted) of the tournament, T.
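Equations (1) and (2) are omitted in this copy, so the Python sketch below rests on an assumed reconstruction from the verbal description rather than on Taylor's printed formulas. It assumes that a competitor holding exactly z is indifferent between stopping and paying C for one more draw, that all rivals use the same z-stop, and that draws are uniform on the unit interval (the experiments rescale this to 0 through 999). Under those assumptions Φ(x) = z^T + (x - z)(1 + z + ... + z^(T-1)) for x at or above z, and the indifference condition has a closed form. The function names and the bisection solver are illustrative choices, not taken from the article.

```python
def net_gain(z, M, T, P, C):
    """Expected net gain from one more draw when holding exactly z.

    Assumed form (not copied from the article): uniform draws on [0, 1],
    every rival plays the same z-stop, and for x >= z
        Phi(x) = z**T + (x - z) * S,   S = 1 + z + ... + z**(T - 1).
    """
    a = z ** T                           # Phi(z): all T draws fell below z
    S = sum(z ** t for t in range(T))    # slope of Phi above z
    # Integral over [z, 1] of Phi(x)**(M-1) - Phi(z)**(M-1), using Phi(1) = 1.
    integral = (1.0 - a ** M) / (M * S) - a ** (M - 1) * (1.0 - z)
    return P * integral - C


def solve_z_stop(M, T, P, C, tol=1e-9):
    """Bisection for the z at which net_gain crosses zero on [0, 1]."""
    lo, hi = 0.0, 1.0
    if net_gain(lo, M, T, P, C) <= 0:
        return 0.0                       # even the first draw is not worthwhile
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if net_gain(mid, M, T, P, C) > 0:
            lo = mid                     # still profitable to draw: z is higher
        else:
            hi = mid
    return 0.5 * (lo + hi)


if __name__ == "__main__":
    # Treatment 1 of the experiments: M = 2, T = 2, P = 120, C = 10.
    print(round(999 * solve_z_stop(2, 2, 120, 10)))
```

With the parameters of treatments 1 and 2, this assumed form appears to land within about a point of the predicted z-stops reported in the article's table of predicted and imputed z-stops. That agreement is a consistency check on the reconstruction, not a confirmation of it.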

Given the complexity of formulating this equilibrium z-stop strategy, it is an interesting empirical question as to whether economic agents will adopt Taylor's predicted strategies. For example, agents may instead employ simple "rule-of-thumb" strategies like taking a predetermined number of draws each tournament. Taylor's tournament was designed with the objective of maximizing efficiency. The sponsor is concerned with the value of the winning innovation, whereas the R&D firms will be concerned with the ratio of total research expenditures to prize payments since this reflects a contestant's expected payoff from entering the tournament. Further, it may not be in the sponsor's long-term interest to have R&D firms going bankrupt. Therefore, the statistics that are central to our investigation are the value of the winning innovation, the overall level of effort expended on research, and the level of market efficiency.


To test Taylor's model, we designed a series of experiments to determine whether subjects individually, or as a market, provide results similar to those predicted by the theory. The experiments were conducted at the University of New Mexico's computerized experimental economics laboratory with subjects recruited from undergraduate social science classes. Each subject was assigned a computer terminal, and the laboratory is designed to limit the subject's view to her own terminal. This helps to ensure that each subject's response is independently determined. Computerization of the experiments allowed for immediate feedback for the subjects, and this feedback should enhance the subjects' understanding of the payoff function. After each round, the subjects' computer screens displayed the results of the round and how their payoffs were calculated. Specifically, the subjects were told their maximum draw as well as that of their group. If hers was the highest draw, the subject was informed that she had won the prize and the round balance was calculated as the initial endowment minus the cost of the draws plus the prize. Otherwise the round balance was the initial endowment minus the cost of the research draws. At the end of the session, the subjects' scores and payments were displayed on their screens, and they were paid in cash.

As the experiment began, subjects received a set of written instructions explaining that they would be participating in a market where the task was to decide whether to pay for a draw of a random number in an effort to win a prize. At the start of each round, subjects were given an endowment of francs (the laboratory currency) sufficiently large to ensure they could take a draw every period of the round without exhausting their endowment. Each draw generated a value between 0 and 999, with each number equally likely. Subjects were told the maximum number of draws in each round that could be taken, the cost of taking a draw, and the number of competitors in their group. At the end of each round, the player in each group with the highest draw was awarded the specified prize. A subject's total payoff at the end of the experiment was equal to the sum of the prizes won in each round plus all unspent francs remaining from the endowment. The subjects did not know how many rounds would be conducted during the session. Finally, they were told that they would be assigned to a different group each round and that at the end of the session their francs would be converted to dollars at a stated exchange rate.

In the context of a research tournament, choosing to make a draw corresponds to conducting research at a constant cost per unit. Beforehand, the outcome of the research process is unknown, but the distribution from which the research results will be drawn is common knowledge in Taylor's model. Each draw corresponds to the realized level of research for that period, and the group high draw is the level of the winning innovation for that round. Again, a round is comprised of several periods in which research can be conducted, but each round is a separate, independent research tournament.
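The round structure described above can be sketched in Python as follows. This is a minimal illustration, assuming every player follows a z-stop rule, that draws are integers from 0 to 999 with equal probability, and that ties for the high draw go to the lowest-indexed player (the article does not say how ties were broken). The name `play_round` and its signature are hypothetical.

```python
import random


def play_round(z_stops, T, P, C, E, rng):
    """Simulate one round (one tournament) for a group of players.

    z_stops: one stopping value per player (an assumed behavioral rule).
    Returns (best_draws, draws_taken, round_balances).
    """
    best, taken = [], []
    for z in z_stops:
        high, n = 0, 0                   # everyone starts with a worthless innovation
        while n < T and high < z:        # keep drawing until the best draw reaches z
            high = max(high, rng.randint(0, 999))   # draws are made with recall
            n += 1
        best.append(high)
        taken.append(n)
    winner = best.index(max(best))       # assumption: ties go to the first player
    # Round balance: endowment minus cost of draws, plus the prize for the winner.
    balances = [E - C * n + (P if i == winner else 0)
                for i, n in enumerate(taken)]
    return best, taken, balances


if __name__ == "__main__":
    rng = random.Random(0)
    # Baseline treatment 3: M = 3, T = 4, P = 103, C = 10, E = 45.
    print(play_round([738, 738, 738], 4, 103, 10, 45, rng))
```

Summing the balances over a group gives the endowments minus total research spending plus one prize, which is the accounting that the research-to-prize ratio later in the article relies on.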

Treatment Parameters

Experimental sessions covering five treatments were conducted. The treatment structure is shown in Table I. A session refers to having the subjects in the laboratory, whereas a treatment refers to the specific parameters that subjects face in a given session. In Table I, the number in parentheses refers to the treatment, while the other values in the cells refer to the particular parameters of the session. A total of 103 subjects participated in these experiments and no subject participated in more than one session.

In addition to the number of competitors in each group and the maximum number of draws, the subjects were given a set of other parameters useful for refining their decisions. These treatment parameters are reported in Table I, where P denotes the prize, in francs, awarded to the competitor with the largest draw in each group for each round. C denotes the cost per draw, E represents the endowment of francs given subjects prior to each round, and R denotes the number of R&D tournaments (rounds) conducted with each treatment.

TABLE I
Treatment Parameters

Number of           Number of Draws Possible Per Round (T)
Competitors (M)     2              4              6

2                   (1)                           (2)
                    P = 120                       P = 120
                    C = 10                        C = 10
                    E = 25                        E = 65
                    R = 339                       R = 75

3                                  (3)
                                   P = 103
                                   C = 10
                                   E = 45
                                   R = 500

5                   (4)                           (5)
                    P = 120                       P = 120
                    C = 10                        C = 10
                    E = 25                        E = 65
                    R = 195                       R = 90

Each of the treatments shown above gives rise to theoretical predictions about the expected value of the tournament's winning innovation and the amount of research competitors will conduct. These predictions are shown in Table II. The baseline scenario is treatment 3, for which the predicted value of each group's winning innovation is 906. In this baseline treatment, M = 3 competitors conducted research at a cost of 10 francs per draw, with the opportunity to take up to T = 4 draws per round. Two sessions were run using this baseline treatment with a total of 30 subjects, each participating in 50 rounds. For treatment 3, this provided us with a data set consisting of 500 tournaments, 1,500 individual tournament performances, and 6,000 opportunities for the competitors to conduct a draw. In treatments 2 and 4, we altered the prize, the number of competitors (M = 2 or 5) and the number of periods (T = 2 or 6) in a manner that would generate virtually identical expected winning draws along the southwest to northeast diagonal of Table II. In treatments 1 and 5, we varied these parameters in order to generate steadily increasing expected winning innovations as one moves along the northwest to southeast diagonal of Table II. Thus, the five different treatments enable us to check for both consistency and trend in the model's theoretical predictions.
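As a rough check on how a common z-stop translates into an expected winning innovation, the following Python sketch integrates the distribution of the group's best draw. It assumes uniform draws rescaled from the unit interval to 0 through 999 and that all M players use the same z-stop; the function name is hypothetical and the closed form is derived under those assumptions, not quoted from the article.

```python
def expected_winning_innovation(z, M, T, top=999.0):
    """E[highest best draw among M players] under a common z-stop.

    z is on the unit interval. Uses E[max] = integral of 1 - Phi(x)**M, with
    Phi(x) = x**T below z and Phi(x) = z**T + (x - z) * S at or above z.
    """
    S = sum(z ** t for t in range(T))
    below = z ** (T * M + 1) / (T * M + 1)               # integral of x**(T*M) on [0, z]
    above = (1.0 - z ** (T * (M + 1))) / ((M + 1) * S)   # integral of Phi**M on [z, 1]
    return top * (1.0 - below - above)


if __name__ == "__main__":
    # Baseline treatment 3 with the predicted z-stop of 738.
    print(expected_winning_innovation(738 / 999, M=3, T=4))
```

For the baseline treatment this comes out close to the predicted value of 906 quoted above; small differences are to be expected because the sketch treats the draws as continuous.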

Another important concern is the total amount of money expended on research relative to the prize value during a tournament. Therefore, in Table II we have also listed the total amount of research dollars the theory predicts will be expended by all M competitors per prize dollar. For example, in treatment 3, Taylor's model predicts that the combined research expenditures of all three competitors will sum to just over 78 cents for each dollar of prize money awarded. In contrast, in treatment 5 the research-to-prize ratio is nearly equal to one, suggesting even a short run of unlucky draws by a firm could drive it out of business.

[TABULAR DATA FOR TABLE II OMITTED]
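The predicted research-to-prize ratio can be reproduced with a short Python calculation, assuming each competitor follows the same z-stop and that draws are uniform on 0 through 999, so that the probability of a draw falling below z is roughly z/999. The function name is hypothetical.

```python
def research_to_prize_ratio(z, M, T, P, C, top=999.0):
    """Total expected research spending by all M competitors per unit of prize."""
    q = z / top                                   # assumed F(z) for uniform draws
    expected_draws = sum(q ** t for t in range(T))   # 1 + q + ... + q**(T-1)
    return M * C * expected_draws / P


if __name__ == "__main__":
    print(research_to_prize_ratio(738, M=3, T=4, P=103, C=10))   # treatment 3
    print(research_to_prize_ratio(599, M=5, T=6, P=120, C=10))   # treatment 5
```

With the predicted z-stops of 738 and 599, the two ratios come out just over 0.78 and just under one, in line with the figures quoted in the paragraph above.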


In this section, we subject our data to various tests at the market and individual level. We find that most subjects do employ stopping-rule research strategies; however, their individually chosen stopping values often differ significantly from the symmetric Nash equilibrium stopping value predicted by Taylor. Some individuals choose z-stop values well below the predicted level, while others choose stopping values above the predicted level. However, we find the aggregate behavior in each tournament treatment is generally consistent with the predictions of Taylor's theory.

Aggregate Tests

In Table II, the actual means we observed in the experimental data are presented in italics, for ease of comparison with predicted values. Despite variances in individual behavior, in every cross-comparison of treatments the mean winning innovation and the mean research-to-prize ratio moved in the direction predicted by Taylor's theoretical model. Particularly notable are the data from treatments 2, 3, and 4. The theory predicts virtually identical levels of winning innovations and mean research-to-prize ratios that increase from treatment 2 to 3 but decrease from treatment 3 to 4 across the diagonal. This is precisely what we observed. At the market level, the data are qualitatively consistent with the predictions of Taylor's model.

In Figure 1, we have plotted the theoretical CDFs of the predicted winning innovations in each group of competitors for treatments 2, 3, and 4. We see that, as well as having virtually identical expectations for their winning innovations, these three treatments also have CDFs which, by visual inspection, are quite similar. Because the theoretical distributions are so closely matched, one would expect the experimental data from these treatments to also look very similar. To make this comparison with our data, we conducted Wilcoxon-Mann-Whitney rank tests and Kolmogorov-Smirnov tests on each combination of treatment pairs to check whether our experimental data also generate nearly identical distribution functions. For the Wilcoxon-Mann-Whitney rank test, we reject the null hypothesis (at 0.10) that the distributions are identical if our test statistic is greater than 1.29. For the Kolmogorov-Smirnov test, we reject the null if our test statistic is larger than 1.23. Our results are presented in Table III.
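The article does not state how its Kolmogorov-Smirnov statistic was scaled. One common convention, assumed here, multiplies the largest gap D between the two empirical CDFs by sqrt(n*m/(n+m)) and compares the result with the large-sample critical value sqrt(-ln(alpha/2)/2), which is about 1.22 at the 0.10 level and therefore close to the 1.23 cutoff quoted above. The Python sketch below implements that convention in plain standard-library code; the function names are hypothetical.

```python
import math


def ks_statistic(a, b):
    """Largest absolute gap between the empirical CDFs of samples a and b."""
    a, b = sorted(a), sorted(b)
    n, m = len(a), len(b)
    i = j = 0
    d = 0.0
    while i < n and j < m:
        x = min(a[i], b[j])
        while i < n and a[i] == x:   # step past every copy of x in a
            i += 1
        while j < m and b[j] == x:   # and in b, so ties are handled together
            j += 1
        d = max(d, abs(i / n - j / m))
    return d


def scaled_ks(a, b):
    """D * sqrt(n*m/(n+m)), the scaling assumed for the reported statistic."""
    n, m = len(a), len(b)
    return ks_statistic(a, b) * math.sqrt(n * m / (n + m))


def ks_critical(alpha):
    """Large-sample critical value for the scaled two-sample statistic."""
    return math.sqrt(-math.log(alpha / 2.0) / 2.0)
```

Applied to two lists of winning innovations, `scaled_ks(x, y) > ks_critical(0.10)` would correspond to rejecting the hypothesis of identical distributions under this assumed convention.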

Note that the only pairs of treatments for which we do not reject the null hypothesis are comparisons of treatments 2, 3, and 4. Thus, statistically, it appears that Taylor's model is internally consistent. By this we are implying that relative to other treatments, when the mean level of the winning innovation was predicted to rise as a function of changing one of the parameters, our experimental data are consistent with the prediction. Moreover, changes in the distribution of winning innovations across experimental treatments are not only in the proper direction, but they are also statistically significant.

In Figure 2 we offer a graphical representation of the evidence presented in the previous statistics. We have plotted the actual CDFs of our experimental observations for the winning innovations across all five treatments. From this graph, it is quite obvious the "diagonal" treatments 2, 3, and 4 all generated winning innovation distributions which were very similar since these three CDFs lie practically on top of each other in the graph. On the other hand, treatment 1 generated significantly smaller winning innovations and treatment 5 generated significantly larger winning innovations.

Taylor's theoretical predictions arise from the argument that the competitors adopt the z-stop strategies constituting the symmetric Nash equilibrium. Using our data, we can estimate the implicit stopping rule that is generated by the subjects' observed behavior. For example, if a subject uses a stopping strategy, the imputed z-stop may be estimated from the expression: expected number of draws = [1 - F(z)^T]/[1 - F(z)]. Therefore, by calculating the average number of draws in our experimental sessions we can estimate the z-stop that would generate the same number of experimental draws. These predicted and imputed average z-stops (averaged over subjects) are reported in Table IV.
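The imputation step can be sketched in Python by inverting the expected-draws expression numerically. The sketch assumes uniform draws, so that F(z) is approximately z/999, and uses bisection because the expected number of draws rises steadily with F(z). The function names are hypothetical, and the article does not describe its own numerical method.

```python
def expected_draws(q, T):
    """[1 - q**T] / [1 - q] written as a finite sum, where q = F(z)."""
    return sum(q ** t for t in range(T))


def impute_z_stop(avg_draws, T, top=999.0, tol=1e-10):
    """Find the z whose z-stop rule yields avg_draws per round on average."""
    if not 1.0 <= avg_draws <= T:
        raise ValueError("average draws must lie between 1 and T")
    lo, hi = 0.0, 1.0
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if expected_draws(mid, T) < avg_draws:
            lo = mid          # too few draws: the stopping value must be higher
        else:
            hi = mid
    return top * 0.5 * (lo + hi)


if __name__ == "__main__":
    # Hypothetical input: an average of 2.5 draws per round with T = 4.
    print(impute_z_stop(2.5, T=4))
```

Because the mapping is one-to-one on the unit interval, feeding the function the expected draws implied by a given z returns that z.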

As shown, the imputed z-stop strategies are reasonably close to the predicted levels for all treatments except number 5.(3) As for our parameter consistency check, the predicted z-stop values increase from treatment 1 to 2 and from 1 to 3, as do the observed values. Treatment 5 does not satisfy the theoretically predicted comparative statics results. This treatment has both a large number of competitors as well as a long tournament. As we shall see below, this combination of conditions exhibits more violations of the theory than do settings in which the tournament is short lived or in which there are fewer competitors.

In addition to predicting a stopping rule, the theory also predicts a level of draw activity. In Table V, we compare the actual number of draws taken with the theoretical prediction. In all treatments but one, the subjects made slightly fewer draws than the level predicted by Taylor's theory. This result supports the conjecture that a large tournament (many players invited) that is permitted to continue for several periods may lead to excessive expenditures on R&D.

TABLE III
Nonparametric Test Results

Treatments      W-M-W     K-S

(1) vs (2)       6.05    2.85
(1) vs (3)      10.77    5.17
(1) vs (4)       8.27    4.07
(1) vs (5)      10.47    4.97
(2) vs (3)       0.21    0.51
(2) vs (4)       0.28    0.85
(2) vs (5)       4.85    2.36
(3) vs (4)       0.68    0.98
(3) vs (5)       6.75    3.01
(4) vs (5)       5.56    2.45

Individual Behavior Tests

To this point, we have shown that our experimental data are generally consistent, in aggregate, with Taylor's predictions. To test whether subjects are individually employing stopping rule strategies, we counted the frequency that each subject exhibited the following behavior. If a subject drew a value X, then drew again getting a value Y (Y < X), and then stopped when a further draw was possible, this behavior was defined as inconsistent. A stopping-rule strategy implies that, if an additional search was justified given X, it would also be justified given Y. Such violations may be indicative of the use of a simple rule-of-thumb decision strategy (for example, always make two draws) rather than the use of a stopping rule.

This metric requires at least three draws be possible, since we must observe the pattern described above and the subject must still be able to take a draw. To test for such inconsistencies, we report the frequency of such behavior in treatment 3, where we count the number of violations of a stopping rule strategy and test whether the frequency of such behavior is statistically significant. The latter involves more than a simple count of the number of observed violations since we must control for the frequency of opportunity for such violation. In Table VI we report the actual violations in treatment 3 and whether the incidence is statistically significant.(4)
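The violation count can be illustrated with a short Python sketch. It takes each round as the ordered list of draws a subject made and flags the round when the final draw fell below the best earlier draw while at least one draw remained. Comparing against the best earlier draw, rather than only the immediately preceding one, is an assumption that coincides with the X, Y pattern described above when only two draws were taken. The function names are hypothetical, and the sketch does not attempt the significance test, whose treatment of opportunities is not spelled out in the text.

```python
def is_violation(draws, T):
    """True if the subject stopped right after a draw below an earlier draw.

    draws: values in the order they were taken within one round.
    T: maximum number of draws allowed in the round.
    """
    if len(draws) < 2 or len(draws) >= T:
        return False                    # needs an earlier draw and a draw left unused
    best_before = max(draws[:-1])       # the subject chose to continue holding this
    return draws[-1] < best_before      # then stopped still holding it


def count_violations(rounds, T):
    """Number of rounds in which a subject's behavior was inconsistent."""
    return sum(is_violation(draws, T) for draws in rounds)


if __name__ == "__main__":
    # Hypothetical rounds for one subject in a T = 4 treatment.
    print(count_violations([[800, 300], [900], [100, 50, 700]], T=4))
```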

The results in Table VI illustrate several points. First, the absolute number of violations is small and is typically concentrated among a few subjects. Second, it is important to correct for the frequency of the opportunity to commit a violation and not simply count the actual number of violations. Subjects with low absolute violation counts may still have a significantly high rate of violation, since they had few opportunities in which to commit a violation. For example, subjects 6 and 13 had fewer than 5 rounds in which they committed violations (out of 50 rounds) but their rate of violation was significantly greater than zero because they had few such opportunities during the session. In contrast, subject 15 had a large absolute number of violations but the rate was not statistically significant, because of the large number of rounds in which the subject could have violated a stopping rule strategy. Of the 30 subjects participating in treatment 3, only 8 committed statistically significant numbers of violations. Considering the stringency of the test applied, this is a low rate of inconsistent behavior and suggests widespread use of some stopping rule, although the rule used appears to be below that predicted by the theory.(5)

Finally, we examine a payoff measure for the subjects. The individual agents are interested in maximizing their return, while the sponsor wants to maximize the value of the winning innovation. The research-to-prize ratio (R/P) addresses important concerns of both parties since it contains both a profit estimate and a measure of research effort. What we observe is that the average R/P ratios in our experimental data are quite close to the theoretical predictions. The individual R/P ratios and the theoretical predictions are reported in Table VII. As before, the treatment that stands out as most significantly violating the theory is treatment 5 (M5T6). While the remaining treatments show considerable variance in individual behavior, the average R/P ratios are still close to the theoretical levels.

There was also a large variance across individual stopping strategies as some competitors engaged in more aggressive research than predicted by the model, while other competitors were more passive than predicted. Of course, the complexity of determining the equilibrium z-stop makes a variance in stopping strategies virtually inevitable. Moreover, if one competitor does engage in an excessive amount of research by employing too large a z-stop, the equilibrium strategy for the other competitors is actually to reduce their z-stops.(6) Since we randomly matched competitors in different groups for each new round, we believe the competitors who employed the large z-stop strategies simply overestimated the equilibrium stopping value as opposed to implementing some sort of bullying behavior to deter competition.

The payoff data for treatment 3 provide further evidence there was probably not much bullying behavior because choosing a larger than predicted z-stop did not result in larger-than-average payoffs to the aggressive competitors. While there was a substantial range in payoffs from $11 to $19, there was no significant correlation between the total number of draws and the payoff. Subjects employing excessively large z-stops did not appear to benefit from their aggressive research strategies. Because competitors were assigned to different groups for each successive tournament, individuals were unable to bully other players consistently into reducing their research effort.

TABLE IV
Predicted and Imputed z-Stops

Treatment (number)    Predicted z-Stop    Imputed z-Stop

M2T2 (1)              684                 584
M2T6 (2)              783                 746
M3T4 (3)              738                 733
M5T2 (4)              746                 734
M5T6 (5)              599                 849

One element not accounted for so far is each subject's "luck of the draw." Over the course of 50 rounds, our data for treatment 3 generated a wide variance in the research "luck" of individuals as measured by the average draw of each subject. Although the average draw across all competitors was 499.57, we observed substantial variation in the average draw across subjects. For example, the data in Table VIII show that subject number 16 drew 176 times and obtained an average draw of 547, while subject number 29 drew 103 times and had an average draw of only 443. Clearly, the payoffs of individual subjects were affected by their "luck of the draw" during the experiment. Individual payoffs must be a function of both a subject's research strategy and his or her luck, for even a subject that makes very few draws could win many tournament prizes if he were unusually lucky. Therefore, we felt it was important to test Taylor's optimal Nash equilibrium strategy against the actual play of our experimental subjects to determine whether luck was an overwhelmingly important factor in tournament success. To directly test the success of Taylor's predicted strategy against our experimental subjects, we generated more than 21,000 Monte Carlo simulations of the equilibrium strategy to compete against the high draws of every possible combination of subjects.

If the Nash equilibrium strategy enjoyed only average success, the Monte Carlo simulation should win 33% of the time. But, in fact, the Monte Carlo simulation won the prize more than 40% of the time and generated an average of 2,990 francs over the course of 50 rounds - a sum greater than that earned by 20 of the 30 experimental subjects. Over the course of more than 21,000 simulations, the Nash-equilibrium Monte Carlo player was neither unusually lucky nor unlucky. In contrast, nine of the ten winners in treatment 3 who earned more than 2,990 francs benefited from better-than-expected draws. Thus, since the biggest winners in our experiments also tended to be the "luckiest" contestants, a strong argument can be made that playing Taylor's theoretically predicted strategy is strategically advantageous, even when one is playing against a set of untrained opponents.(7)
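A simulation of this kind can be sketched in Python as follows. The article's subject-level high draws are not reproduced in this copy, so `opponent_highs` is a hypothetical stand-in: a list of tuples, each holding the high draws of the two rivals in one recorded round. The sketch samples one such tuple per simulated round, whereas the article describes matching against every possible combination of subjects. Ties are assumed to count as losses for the simulated player.

```python
import random


def simulate_entrant(opponent_highs, z, T, P, C, E, n_sims, rng):
    """Pit a z-stop player against recorded rival high draws.

    Returns (win_rate, average_round_balance). Draws are integers 0..999.
    """
    wins = 0
    total_balance = 0
    for _ in range(n_sims):
        rivals = rng.choice(opponent_highs)   # one recorded group of rivals
        high, n = 0, 0
        while n < T and high < z:             # the z-stop rule, with recall
            high = max(high, rng.randint(0, 999))
            n += 1
        won = high > max(rivals)              # assumption: a tie is a loss
        wins += won
        total_balance += E - C * n + (P if won else 0)
    return wins / n_sims, total_balance / n_sims


if __name__ == "__main__":
    rng = random.Random(42)
    # Hypothetical rival high draws; real input would come from the session data.
    sample = [(812, 640), (905, 455), (730, 730), (288, 961)]
    print(simulate_entrant(sample, z=738, T=4, P=103, C=10, E=45,
                           n_sims=21000, rng=rng))
```

Multiplying the average round balance by 50 gives a figure comparable to the franc totals over a 50-round session discussed above.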

To this point we have shown that, with the possible exception of treatment 5, the average behavior of our experimental subjects was very close to the Nash equilibrium predicted by Taylor's research tournament model. On the other hand, we observed wide variances in the research levels employed individually, which we believe can be largely ascribed to uncertainty on the part of individual competitors as to the equilibrium stopping value. Using our Monte Carlo simulation, we have also shown that if a subject actually knew the equilibrium stopping value, employing that strategy would probably have resulted in a larger payoff than two-thirds of all other subjects. The deviation from theory we have not accounted for to this point is the systematic bias, which shows up in our data as treatments 1-4 undershot the mean winning innovation, while treatment 5 overshot the mean winning innovation. However, this phenomenon can also be ascribed to individual uncertainty about the equilibrium stopping value.

[TABULAR DATA FOR TABLE V OMITTED]

TABLE VI
Individual Violations (Treatment 3)

Subject ID    Violations    Significant

 1             5            No
 2             0            No
 3             2            No
 4            20            Yes
 5             2            No
 6             1            Yes
 7             5            No
 8             0            No
 9            11            Yes
10             0            No
11             2            No
12             4            No
13             5            Yes
14             8            Yes
15             9            No
16             1            Yes
17             5            No
18             0            No
19             7            Yes
20            10            Yes
21             6            No
22             2            No
23             0            No
24             0            No
25             0            No
26             0            No
27             1            No
28             3            No
29            14            No
30             0            No


TABLE VIII
Subjects' Draw Experiences (Treatment 3)

Subject    # of Draws    Average Draw

 1         110           538.00
 2         163           481.65
 3         147           527.07
 4         134           532.52
 5         169           473.15
 6          54           544.81
 7         163           491.05
 8           4           386.75
 9         136           504.61
10         138           506.33
11          72           564.83
12         155           460.95
13         114           468.23
14          67           548.70
15         101           482.17
16         176           547.10
17         119           519.54
18         140           478.70
19         159           510.59
20         117           489.98
21          95           513.80
22         120           508.21
23         146           503.82
24         100           516.57
25          79           526.53
26          66           428.61
27         159           484.52
28          91           448.91
29         103           442.98
30         150           486.41

In Taylor's model, the unique symmetric Nash equilibrium requires all agents to employ the same stopping rule. Though our aggregate data support Taylor's predictions, given the complexity of computing the Nash equilibrium it is not surprising that we observed substantial variation in individual stopping behavior. To quantify this variation in individual behavior two other measures suggest themselves: the smallest high draw and the highest nonstopping draw. The smallest high draw is defined as the smallest draw value an individual obtains without continuing to draw when a draw is possible. The highest nonstopping draw is defined as the highest value an individual obtains and continues to draw. We computed these measures and report them for treatment 3 in Table IX. As can be seen, these measures vary significantly, and we are reluctant to draw too much inference from these results because in each case the metric applies to only one round of the 50 in the experiment. Within an experimental setting there is frequently some instance of the subjects "trying out" alternative strategies. In any case, the key measure for the current discussion is the standard deviation across subjects, which is especially large for the smallest high draw, indicating that there were considerable differences in behavior across subjects. Such differences may be characteristic of bimodality in subject behavior with some subjects employing more aggressive research strategies than others.

The variance in subject behavior is more likely to create problems for the R&D industry when the tournament is permitted to continue for several periods. With longer tournaments, the potential exists for subjects who aggressively overshoot the theoretical z-stop to overwhelm those who undershoot. This would result in excessive levels of aggregate research in lengthy tournaments with many competitors as well as excessively high research-to-prize ratios and larger than predicted winning draws as we observed in treatment 5. This overshooting phenomenon has some serious implications for the R&D industry, and thus for research tournament sponsors, if it bears out in real-world research tournaments. In particular, sponsors may risk driving some of their R&D firms to bankruptcy if they sponsor tournaments with "too long" a time horizon or "too many competitors." For instance, in treatment 5 with six periods for research and five competitors, the tournament yields research/prize ratios well above one, which simply cannot be sustained by the industry over the long term.


The focus of our experiments was to evaluate the fixed prize mechanisms as a means to obtain a given quality of research at as low a cost as possible under various market conditions. Overall, the results of our experiments appear to support the theory. At the market level, the winning research product and level of research effort tended to be close to the theoretical prediction. In addition, the majority of our subjects appear to employ stopping rule strategies rather than playing simple rule of thumb strategies, which does suggest a certain level of sophistication on their part. However, instead of observing a uniform level of research effort across all competitors, as the symmetric Nash equilibrium would predict, the research strategies we observed varied significantly across subjects.(8)

This variance tended to affect the aggregate results of our experiments. When there were only two periods for research, there tended to be less total research than predicted because there was not enough time for those who did the most research to make up for those who did very little. In the longer research tournaments with several competitors, we tended to see levels of research at or above the predicted amounts. Here, the high-effort competitors had ample research time to make up for the low-effort players and the result was higher levels of winning innovations and in some instances "excessive" levels of aggregate research which reduced the tournament efficiency.

The effect of additional participants and more research periods can potentially be substantial. The evidence supports the intuitive notion that if these parameters are increased arbitrarily, participants in long research tournaments may lose money because of excessive research competition. For example, without question the most prolific sponsor of research competitions is the federal government and, in particular, the Department of Defense (DOD). Each year DOD awards millions of dollars' worth of contracts to winners of competitive R&D competitions.(9) Recent General Accounting Office (GAO) studies have identified acquisition reform as one of the Pentagon's highest priorities (GAO [1997, 17]). One of the most common complaints about DOD acquisition efforts is the extensive time required to field new systems. Our experimental data suggest that lengthy research competitions, by themselves, may inadvertently induce contractors in these competitions to conduct excessive amounts of research, leading to cost overruns quite apart from the extra costs normally associated with schedule delays. In the long run, this kind of behavior would normally be self-correcting: either the competitors would adjust their levels of research effort, or they would eventually be driven out of the market and aggregate research would decline because of fewer competitors. However, since there is a clear national security incentive to prevent some defense contractors from exiting the industry, the effects of long research competitions on the cost of building weapons may be particularly detrimental in the defense industry. Therefore, judicious selection of the time horizon and the number of competitors seems to be indicated.

We are thankful to David Cooper for helpful comments on an earlier version of this paper. Shaul Ben-David provided programming assistance. William Neilson and two reviewers provided comments that led to substantial improvements in the analysis and exposition. Partial funding for this study was provided by the Defense Systems Management College and the Institute for National Security Studies.

1. The contest, known as the Rainhill Trials, was used to select an engine for the first-ever passenger railroad in Britain. The £500 first prize was won by George and Robert Stephenson, who built the Rocket, which attained a top speed of 46 km/hour. See Day [1971] for details about the evolution of steam locomotives.

2. The mission to Mars contest was worked up for a member of Congress by the executive chairman of the National Space Society, Robert Zubrin. The proposal is a series of contests with prizes in the $1 billion range, culminating in a $20 billion first prize. See Zubrin [1996].

3. For these tests, we eliminated the rounds in which individuals did not make any draws. The justification for this is that these are simultaneous move games. That is, when an individual chooses a strategy, it is based on the expectation that the group is of the announced size. Thus, the strategy choice is unaffected by whether one or more competitors has decided to drop out.

4. This can be characterized as a binomial process. After drawing Y, the subject either violates the stopping-rule strategy by stopping or does not violate the strategy by continuing to draw. Thus, the test is whether the frequency of violations is statistically greater than zero (at the 95% confidence level).

5. To test for significance, a binomial test was conducted against the random prediction that violations would occur one half of the time. The subjects were reported to have a statistically significant level of violations if the rate was significantly greater than the null prediction at the 0.10 level.
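The test described in note 5 can be sketched as a one-sided exact binomial test in Python. This is only an illustration under stated assumptions: the note does not say whether an exact test or a normal approximation was used, the function names are hypothetical, and the counts in the usage lines are made-up values rather than subject data.

```python
from math import comb


def binomial_upper_tail(k, n, p=0.5):
    """P(X >= k) for X ~ Binomial(n, p): the one-sided p-value."""
    return sum(comb(n, i) * p**i * (1 - p) ** (n - i) for i in range(k, n + 1))


def significant_violations(violations, opportunities, null_rate=0.5, alpha=0.10):
    """True if the violation rate is significantly above the null rate.

    null_rate=0.5 is the random prediction from note 5 and alpha=0.10
    is the significance level stated there.
    """
    return binomial_upper_tail(violations, opportunities, null_rate) < alpha


# Hypothetical subject: 8 violations in 10 decisions (not real data).
print(binomial_upper_tail(8, 10))
print(significant_violations(8, 10))
```

The upper tail is used because the note tests whether the violation rate is greater than the null prediction, not merely different from it.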

6. See Taylor [1995, Prop. 2 and Fig. 1] for a graph of the Best-Response Projections for any two contestants.

7. We also estimated an ordinary least squares regression of subjects' payoffs against their "luck of the draw" relative to their opponents (AvDiff) and the number of times their draw decisions deviated from the theoretical equilibrium z-stop strategy, relative to one's opponents (ErrorDiff):

$\text{Payoff (francs)} = \beta_0 + \beta_1\,\text{AvDiff} + \beta_2\,\text{ErrorDiff}$

One would expect the coefficient on AvDiff to be positive, reflecting a greater payoff for luckier average draws. The coefficient on ErrorDiff is predicted to be negative, reflecting lower payoffs for more deviations from the theoretically optimal strategy. The subject data yields the following results:

$\text{Payoff (francs)} = \underset{(52.077)}{2784.8} + \underset{(4.435)}{5.256}\,\text{AvDiff} - \underset{(3.138)}{14.457}\,\text{ErrorDiff}$

The intercept predicted by Taylor's model and our parameters (endowment income and expected earnings) is 2675, which is not statistically different from our estimate. The other coefficients are statistically significant with the predicted sign and reasonable magnitudes. The overall fit is quite strong ($R^2$ = 0.513).
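As a reading aid, the estimated equation in note 7 can be evaluated directly. The Python sketch below plugs in the three reported coefficients; the function name and the input values in the usage line are hypothetical, chosen only to show how the signs of the coefficients move the predicted payoff.

```python
# Coefficients as reported in note 7.
INTERCEPT = 2784.8
COEF_AVDIFF = 5.256       # positive: luckier average draws raise payoff
COEF_ERRORDIFF = -14.457  # negative: more deviations lower payoff


def predicted_payoff(av_diff, error_diff):
    """Fitted payoff in francs for given AvDiff and ErrorDiff values."""
    return INTERCEPT + COEF_AVDIFF * av_diff + COEF_ERRORDIFF * error_diff


# Hypothetical subject: average draw 10 above opponents, 5 more deviations.
print(predicted_payoff(10, 5))
```

With both regressors at zero the prediction equals the estimated intercept, which is the quantity the note compares with the theoretical value of 2675.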

8. Such variance in behavior has been observed in many other individual decision settings. Camerer [1995] reports several examples of experiments in which some subjects systematically overstated risk while others understated it. It is an interesting question whether markets correct such behavior or whether aggregating market observations merely masks it.

9. By law, federal agencies are required to conduct competitive procurements whenever practicable. See U.S. Code Annotated Title 41, Section 253. For example, in 1991 the Air Force held a "fly-off" competition to select the new Advanced Tactical Fighter. Lockheed won that competition with its F-22, earning a production contract estimated at the time to be worth more than $90 billion. See Schwartz et al. [1991] for details.


Camerer, Colin. "Individual Decision Making," in The Handbook of Experimental Economics, edited by J. Kagel and A. Roth. Princeton University Press, 1995, 587-703.

Day, John R. Trains. New York: Bantam Books, 1971.

Economist, "HDTV All Together Now," May 29, 1993, 74.

Fullerton, Richard L., and R. Preston McAfee. "Auctioning Entry into Tournaments," Journal of Political Economy, June 1999, 573-605.

Langreth, Robert. "The $30 Million Refrigerator," Popular Science, January 1994, 65-67, 87.

Schwartz, John, Douglas Waller, John Barry, and John Taliaferro. "The $93 Billion Dogfight," Newsweek, May 6, 1991, 46-47.

Taylor, Curtis R. "Digging for Golden Carrots: An Analysis of Research Tournaments." American Economic Review, September 1995, 872-90.

U.S. Code. 26 January 1998.

United States Department of Defense. Defense Acquisition Management Policies and Procedures, DOD Instruction 5000.2, 23 February 1991. Washington, D.C.: U.S. Government Printing Office, 1991.

United States General Accounting Office. Reports and Testimony: January 1997. GAO/OPA-97-4, Washington, D.C.: U.S. Government Printing Office, 1997.

Zubrin, Robert. "Mars on a Shoestring." Technology Review, November 1996, 20-31.

Richard Fullerton: Associate Professor of Economics, United States Air Force Academy, Colorado Springs, Phone 1-719-333-3080, Fax 1-719-333-2945

Bruce G. Linster: Professor of Economics, United States Air Force Academy, Colorado Springs, Phone 1-719-333-3080, Fax 1-719-333-2945

Michael McKee: Professor of Economics, University of New Mexico, Albuquerque, Phone 1-505-277-1960, Fax 1-505-277-9445

Stephen Slate: Associate Professor of Economics, United States Air Force Academy, Colorado Springs, Phone 1-719-333-3080, Fax 1-719-333-2945