Flight delays and passenger preferences: an axiomatic approach.
Bishop, John A.; Rupp, Nicholas G.; Zheng, Buhong
The U.S. Department of Transportation (DOT) defines a flight as
"delayed" if it arrives 15+ minutes late. The DOT "flight
counting" delay definition is used to rank airline/airport service
quality. An obvious caveat of counting flight delays is that the
duration of delay plays no role in the delay count. The purpose of this
article is to propose an aggregate delay measure that is sensitive to
the distribution of time delayed among passengers. The importance of
this work is that our derived delay measure reflects passenger
preferences rather than the arbitrary delay cutoff established by the
DOT. We model passengers' preference ordering using the criteria
that passengers prefer fewer, shorter, and more equal delay times.
JEL Classification: L93, R42
1. Introduction
Airline flight delays, like any other form of waiting for service,
may negatively affect customers (passengers) in many ways. Delays can
increase passengers' anger, uncertainty, and dissatisfaction with
the service provided (Taylor 1994). In addition, flight delays are
costly. A recent Joint Economic Committee report estimates that domestic
flight delays cost the airline industry and passengers $40.7 billion in
2007. (1) In December 2007, U.S. airline delays reached their highest
monthly level since the Bureau of Transportation Statistics began
tracking flight delays in 1995, as 32% of domestic flights arrived late.
Furthermore, in 2007, U.S. airline delays reached their highest annual
level since 1999, as 24% of all domestic flights arrived late. To
address this problem, the Federal Aviation Administration is imposing
financial penalties of up to $25,000 per violation on chronically
delayed flights. (2)
In ranking flight delays among airlines and airports, the sole (and
official) measure used by the U.S. Department of Transportation (DOT) is
the proportion of flights delayed (i.e., a flight is counted as
"delayed" if it arrives 15 or more minutes behind schedule).
This DOT "flight-counting" measure of delays has been adopted
by the industry and is widely reported by the media as the de facto
standard with which to measure on-time performance. In fact, the
DOT's Air Travel Consumer Report provides a monthly ranking of
airlines based on the percentage of on-time arrivals. (3) The purpose of
this article is to propose an alternative aggregate delay measure based
on passenger preferences rather than an arbitrary DOT delay definition.
There are several flaws with using the DOT standard to measure
airline service quality. Foremost is the arbitrariness in assigning 15
minutes as the delay threshold. Why not 10 minutes or 20 minutes?
Second, by counting the occurrence of delays, the duration of delay
plays no role in the calculation (e.g., no distinction is made between
flights delayed 16 minutes vs. 60 minutes). Third, a discrete
designation for each flight, either "on-time" or
"delayed," ignores the distribution of flight delays. Even
carriers with identical average minutes of delay are likely to be viewed
differently if they provide some passengers with severe delays. We
believe that extreme delays are viewed as particularly upsetting for
travelers (i.e., a one-hour delay is more painful for travelers than two
30-minute delays).
Airline researchers recognize the statistical shortcomings of the
15-minute delay standard; hence, various measures of flight delays have
been considered, including the following: counting the number of flight
delays (Brueckner 2002), calculating the minutes of travel time on a
route in excess of the monthly minimum (Mayer and Sinai 2003), and
determining the minutes of arrival (Mazzeo 2003) and departure delay
(Rupp 2009). Moreover, Bratu and Barnhart (2006) show that when factors
such as flight cancellations and missed connections are factored in,
actual passenger waiting times are nearly two-thirds higher than the
minutes of aircraft arrival delay (the DOT-reported measure). The unique
contribution of our article is that we derive a delay measure based on
passenger preferences, not simply based on a measure's statistical
properties or arbitrary delay standards. Of course, any measure of
airline delays must assert a passenger preference ordering; we model
passengers as preferring fewer, shorter, and more equal delay times.
The article is organized as follows. Section 2 provides the
axiomatic framework for measuring aggregate flight delays. We examine
the notion of flight delay and propose a set of axioms governing the
measurement of flight delays for a group of airline (or airport)
passengers. We then propose a class of decomposable measures of flight
delays as well as a partial dominance condition for the rankings of
flight delays. In section 3, we apply the proposed measures and
dominance condition to measure and rank flight delays of two major U.S.
airlines. Section 4 provides some extensions and discussion.
2. Measuring Aggregate Flight Delays
Consider a group of N passengers with possibly different delay
times, [x.sub.i], where i = 1, 2, ..., N. Here the group can be viewed as
all passengers of an airline or an airport. Clearly, not all passengers
have their flights delayed; some may even depart and arrive early. In
this sense, [x.sub.i] can be positive (delayed), negative (arrived
early), or zero (on time). For the group as a whole, we denote X =
([x.sub.1], [x.sub.2], ..., [x.sub.N]) as the flight-delay profile of the
group.
For the passengers as a group, we want to construct a summary
measure of delays so that comparisons and rankings among different
groups of passengers are feasible. To this end, we
define a measure of flight delays as a single-valued function,
D = D([x.sub.1], [x.sub.2], ..., [x.sub.N]), that reflects the aggregate
level of flight delays for the group as a whole. To characterize D(*),
we follow the axiomatic approach that Sen (1976) pioneered in poverty
measurement. The similarity between these two measurements indicates
that much of the calibration crafted to measure poverty can be applied
when measuring flight delays. (4) In this approach, we first lay out the
basic ideal properties that an index of flight delays should possess and
then generate satisfactory flight-delay measures within the boundaries
of the axioms.
Axioms on D(*)
We first require that the flight-delay index be a continuous
function of all flight-delay times.
CONTINUITY. D(*) is a continuous function of X = ([x.sub.1],
[x.sub.2], ..., [x.sub.N]).
The second axiom is the anonymity axiom, which states that the
identities of the passengers play no role in the computation of D(*): If
two populations have the same flight-delay profile, then the two groups
should have the same level of flight delays. Profiles X = ([x.sub.1],
[x.sub.2], ..., [x.sub.N]) and Y = ([y.sub.1], [y.sub.2], ...,
[y.sub.N]) have the same level of flight delay if Y = PX for some
permutation matrix P. A permutation matrix is a square matrix with
elements 0 and 1 where each row and column sums to 1. Formally, the
anonymity axiom is stated as follows:
ANONYMITY. D(Y) = D(X) if Y = PX for some permutation matrix P.
The next axiom is the focus axiom, which states that an index of
flight delays is concerned only with delays; hence, arriving early by 20
minutes or by two hours makes no difference for the calculation of D(*).
That is, recalling that early arrival means [x.sub.i] < 0, in the
following statement an increase in the early arrival time [x.sub.i] by
some [[epsilon].sub.i] to [y.sub.i] = [x.sub.i] - [[epsilon].sub.i] (and
thus [y.sub.i] < [x.sub.i]) has no effect on D(*).
FOCUS. D(Y) = D(X) if Y is obtained from X via [y.sub.i] =
[x.sub.i] for all [x.sub.i] > 0 and [y.sub.i] [less than or equal to]
[x.sub.i] for all [x.sub.i] [less than or equal to] 0.
In contrast to an early-arriving flight, if a flight has been delayed,
then any further delay will increase the level of aggregate delays. This
is the monotonicity axiom to which we alluded earlier in the
Introduction. In the following statement, a passenger's delay time
increases from [x.sub.i] to [y.sub.i] = [x.sub.i] + [[epsilon].sub.i].
MONOTONICITY. D(Y) > D(X) if Y is obtained from X via [y.sub.i]
= [x.sub.i] + [[epsilon].sub.i] for some [x.sub.i] > 0 with some
[[epsilon].sub.i] > 0 and [y.sub.i] = [x.sub.i] for all other
[x.sub.i] > 0.
While an index D(*) that satisfies the monotonicity axiom reflects
the length of a passenger's delay, it may not address the
distribution of delays among passengers. To put the necessity of this
concern into perspective, consider a total delay of one hour between two
flights with an equal number of passengers on a route. In one case,
every flight is delayed by 30 minutes, whereas in the other case the
outcome alternates between arriving on time and arriving one hour late.
Which case should be considered to have a higher level of passenger
flight delays?
A passenger may not mind a delay of 10, 20, or even 30 minutes, but
anger, anxiety, uncertainty, and boredom mount at an increasing rate as
a delay prolongs. In this sense, the overall problem of delays in the
first case may be considerably smaller compared to the problem in the
second case. For example, in February 2007, JetBlue Flight 751 was
stranded at JFK Airport for more than 10 hours. This flight delay would
never have become front-page news if JetBlue had evenly distributed 10
hours of delay over 10 JetBlue flights. Stranded passengers become
particularly unhappy when they have to make tight connections or, even
worse, when they miss their connecting flights.
The general idea that spreading the total delay time more evenly
across all passengers (or flights) leads to a lower level of aggregate
delay can be imposed as an axiom on D(*). In the following statement,
passenger s experiences a longer delay than passenger t ([x.sub.s] >
[x.sub.t] > 0), and from X to Y passenger s's delay is shortened
by [epsilon], while t's delay is prolonged by [epsilon] (all other
passengers' delays are not affected).
DISTRIBUTION SENSITIVITY. D(Y) < D(X) if Y is obtained from X
via (i) [y.sub.s] = [x.sub.s] - [epsilon] and [y.sub.t] = [x.sub.t] +
[epsilon] for some [x.sub.s] > [x.sub.t] > 0 and for some
[epsilon] > 0, such that [y.sub.s] [greater than or equal to]
[y.sub.t] > 0; and (ii) [y.sub.i] = [x.sub.i] for all i [not equal
to] s, t.
The next axiom that we will impose on D(*) enables the comparison
of flight delays between different airlines (or airports), where the
number of passengers may differ. The following axiom states that if an
airline expands through a simple replication, then the level of flight
delays remains unchanged.
REPLICATION INVARIANCE. D(Y) = D(X) if Y is obtained from X via a
simple replication [i.e., Y = (X, X, ..., X)].
Finally, we introduce a consistency requirement that enables the
ranking of flight delays to be independent of the measuring units of
time (e.g., minutes vs. hours).
UNIT CONSISTENCY. If D(Y) > D(X), then D([theta] Y) >
D([theta]X) for all [theta] > 0.
This last axiom says that if the flight-delay profile Y exhibits
more aggregate delay than X when time is measured in minutes, then the
conclusion (ranking) remains the same if time is measured in hours or
any other units.
The Implications of the Axioms and Some Examples of D(*)
The anonymity axiom implies that we can consider an ordered profile
of flight delays [i.e., for each X = ([x.sub.1], [x.sub.2], ...,
[x.sub.N]) we can assume that [x.sub.1] [greater than or equal to]
[x.sub.2] [greater than or equal to] ... [greater than or equal to] [x.sub.N]].
The focus axiom implies that for those passengers whose flights are not
delayed (i.e., [x.sub.i] [less than or equal to] 0), D(*) does not
depend upon the specific values of [x.sub.i]. It follows that we can set
all those negative values of [x.sub.i] to zero--D(*) does not
distinguish between those passengers who arrived early and those
arriving on time. For each profile X, the anonymity axiom and the focus
axiom together allow us to consider the censored profile
$\tilde{X} = (\tilde{x}_1, \tilde{x}_2, \ldots, \tilde{x}_N)$, which sets every negative
$x_i$ to zero [i.e., $\tilde{x}_i = \max(x_i, 0)$ for $i = 1, 2, \ldots, N$,
and $\tilde{x}_1 \geq \tilde{x}_2 \geq \cdots \geq \tilde{x}_N$].
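As an illustration only, the censoring and ordering step can be sketched in Python. The function name censored_profile and the sample delays are hypothetical and are not part of the original analysis.

```python
def censored_profile(delays):
    """Set early and on-time arrivals to zero, then sort from longest to shortest delay."""
    return sorted((max(x, 0.0) for x in delays), reverse=True)


# Hypothetical delays in minutes; negative values are early arrivals.
profile = [25.0, -10.0, 0.0, 90.0, -3.0]
censored = censored_profile(profile)
print(censored)
```

Under the focus axiom, any index D(*) evaluated on the raw profile and on this censored profile should give the same value.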
Using our notation, the official measure of aggregate flight delays
is
$$D_1(X) = \frac{1}{N} \sum_{i=1}^{N} I(x_i), \tag{2.1}$$
where I([x.sub.i]) is an indicator function that equals 1 if
[x.sub.i] > 0 and equals 0 otherwise. This
flight-counting index satisfies only anonymity, replication invariance,
and unit consistency. It violates continuity at the point [x.sub.i] = 0,
since any flight with a delay, no matter how slight (i.e., [x.sub.i]
is close to zero), is counted as 1 in [D.sub.1](X); however, if the
delay time is zero then the flight is counted as 0. This problem is
compounded by the ambiguity about what constitutes a
"delay" (i.e., how many minutes must the flight be late to be
considered "delayed"?).
More importantly, the flight-counting measure violates the
monotonicity axiom and the distribution sensitivity axiom. As mentioned
in the Introduction, the violation of monotonicity implies that once a
flight is deemed "delayed" the airline has no incentive to
shorten the delay as far as minimizing [D.sub.1](X) is concerned. In
fact, the airline may have an incentive to prolong the flight delay in
order to get other flights on time so that [D.sub.1](X) becomes smaller.
The violation of the distribution sensitivity means that whether the
total delay time is spread evenly among passengers (flights) or is
concentrated among a few passengers/flights matters little to the
picture that [D.sub.1](X) portrays.
A measure of flight delays that is a modest improvement over
[D.sub.1](X) would be the following average-time-delayed measure:
$$D_2(X) = \frac{1}{N} \sum_{i=1}^{N} x_i I(x_i) = \frac{1}{N} \sum_{i=1}^{N} \tilde{x}_i. \tag{2.2}$$
Compared with [D.sub.1](X), the (normalized) average-time-delayed
measure [D.sub.2](X) satisfies continuity, anonymity, monotonicity, and
replication invariance; however, it violates the distribution
sensitivity axiom. To allow any prolonged delay (i.e., the JetBlue JFK
case) to be weighted more than just another delay in the calculation of
aggregated delays, D(X) must reflect the axiom of distribution
sensitivity.
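The contrast between the two measures can be checked on the two-flight example discussed earlier (every flight 30 minutes late versus one flight on time and one an hour late). The sketch below is illustrative only: the function names d1 and d2 are labels introduced here, and it assumes that any positive delay counts as "delayed."

```python
def d1(delays):
    """Flight-counting measure: share of passengers with a positive delay (assumed cutoff of zero)."""
    return sum(1 for x in delays if x > 0) / len(delays)


def d2(delays):
    """Average-time-delayed measure: mean delay after early arrivals are set to zero."""
    return sum(max(x, 0.0) for x in delays) / len(delays)


even = [30.0, 30.0]     # both flights 30 minutes late
uneven = [0.0, 60.0]    # one flight on time, one an hour late

print(d1(even), d1(uneven))        # counting ranks the uneven profile as better
print(d2(even), d2(uneven))        # average delay cannot tell the two apart
print(d1([16.0]) == d1([60.0]))    # counting ignores how long a delay lasts
```

Neither measure ranks the uneven profile as worse, which is the ranking that the distribution sensitivity axiom requires.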
The Appendix provides an example to illustrate the differences
between the [D.sub.1] and [D.sub.2] measures. We rank the on-time
performance of 20 U.S. carriers from July 2005 using 15-minute delay
rates ([D.sub.1]) and delay gaps ([D.sub.2]). We find that substantial
re-ranking occurs when the intensity of delay, rather than the delay
rate, is considered. (5) Hence, these data support our contention that
delay rankings vary by the delay measure.
A measure that satisfies all aforementioned axioms is easy to
construct. In fact, we propose a class of such measures. (6) For a
continuous, increasing, and convex function [phi](x) with [phi](0) = 0,
a member of the class is
$$D_{\phi}(X) = \frac{1}{N} \sum_{i=1}^{N} \phi\left(x_i I(x_i)\right). \tag{2.3}$$
It is easy to verify that [D.sub.[phi]](X) satisfies all axioms
examined above except the unit consistency axiom. To satisfy unit
consistency, function [phi](x) must also be homogeneous (Zheng 2007). An
example of a satisfactory [phi]-function is [phi](x) =
[x.sup.[alpha]], with [alpha] > 1.
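A minimal sketch of one member of this class, assuming the power function with [alpha] = 2 (the value of [alpha] and the sample profiles are illustrative choices, not taken from the article). It checks that the measure ranks the uneven profile as worse and that the ranking is unchanged when minutes are converted to hours.

```python
def d_phi(delays, alpha=2.0):
    """Equation 2.3 with phi(x) = x**alpha, applied to delays censored at zero."""
    if alpha <= 1:
        raise ValueError("alpha must exceed 1 for the measure to be distribution sensitive")
    return sum(max(x, 0.0) ** alpha for x in delays) / len(delays)


even = [30.0, 30.0]     # minutes
uneven = [0.0, 60.0]    # minutes, same total delay

minutes_ranking = d_phi(even) < d_phi(uneven)
hours_ranking = d_phi([x / 60 for x in even]) < d_phi([x / 60 for x in uneven])

print(d_phi(even), d_phi(uneven))
print(minutes_ranking, hours_ranking)
```

Because the power function is homogeneous, rescaling time multiplies every index value by the same positive factor, so the ordering of profiles cannot change.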
[FIGURE 1 OMITTED]
The measures defined in Equation 2.3 are decomposable in the sense
that the overall level of flight delays can be written as a (weighted)
average of all subgroups' level of delays. (7) This decomposability
property is very useful in that it identifies the contribution of the
delay from each subgroup (an airline or an airport) to the overall delay
of the industry.
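The decomposition can be verified numerically. The sketch below assumes that the subgroup weights are passenger shares and uses [phi](x) = x^2; the two subgroups and their delays are hypothetical.

```python
import math


def d_phi(delays, alpha=2.0):
    """Equation 2.3 with phi(x) = x**alpha, applied to delays censored at zero."""
    return sum(max(x, 0.0) ** alpha for x in delays) / len(delays)


# Hypothetical subgroups (for example, two airlines), delays in minutes.
airline_a = [30.0, 0.0, 60.0]
airline_b = [10.0, 20.0]
combined = airline_a + airline_b
n = len(combined)

# Weighted average of subgroup indexes, with weights equal to passenger shares.
weighted = (len(airline_a) / n) * d_phi(airline_a) + (len(airline_b) / n) * d_phi(airline_b)

# Share of the overall index attributable to subgroup A.
share_a = (len(airline_a) / n) * d_phi(airline_a) / d_phi(combined)

print(d_phi(combined), weighted, share_a)
```

The weighted average reproduces the overall index, and each weighted subgroup term divided by the overall index gives that subgroup's contribution.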
Flight-Delay Dominance
For each [phi](x), we can calculate the corresponding flight-delay
measure for each airline or airport. Then we can compare these
flight-delay measures among airlines and airports to rank them from the
most to the least delayed services. Clearly, the choice of the function
[phi](x) is consequential: Different functions may lead to different
rankings. A natural and important question is under what conditions can
we rank one airline as having a higher level of flight delays than
another airline for all possible functions [phi](x)? In this section, we
establish a partial ordering condition and provide a device to enable
this unanimous comparison.
Recall that if all measures satisfy anonymity and the focus axiom,
then we can consider a censored and decreasingly ordered version of each
flight-delay profile. Relying on a censored and sorted flight-delay
profile $\tilde{X} = (\tilde{x}_1, \tilde{x}_2, \ldots, \tilde{x}_r, 0, \ldots, 0)$, where r is
the number of passengers delayed, we can construct a flight-delay curve
as follows. For each passenger i in the sorted profile, we first
calculate
$$C(X; i) = \frac{1}{N} \sum_{j=1}^{i} \tilde{x}_j. \tag{2.4}$$
That is, C(X;i) cumulates the i longest delays:
$C(X; 1) = \tilde{x}_1/N$, $C(X; 2) = (\tilde{x}_1 + \tilde{x}_2)/N$, and so on. Next, we plot the sequence
{C(X;i)} against the corresponding cumulative passenger proportion {i/N}
in a graph with i/N on the horizontal axis and C(X;i) on the vertical
axis. Figure 1 depicts such a curve, which is referred to as the
flight-delay curve. Our flight-delay curve has an earlier analog in
Jenkins and Lambert's (1998a) TIP curve of poverty. Here TIP stands for the
"three I's of poverty": incidence, intensity, and
inequality. Our flight-delay curve in Figure 1 is concave up to
the point {r/N, [D.sub.2](X)}, and then it becomes flat, because
the censored delay is zero for i > r. With the flight-delay curve, we can define
our partial flight-delay dominance relationship as follows: For two
flight-delay profiles X and Y with the same number of passengers N, X
flight-delay dominates Y if (8)
C(X; i) [less than or equal to] C(Y; i) (2.5)
for all i = 1, 2 ..., N, and the strict inequality holds for some
i. Graphically, Equation 2.5 says that the flight-delay curve of X lies
nowhere above that of Y and strictly below over some range.
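The curve ordinates and the dominance check can be sketched as follows. The function names delay_curve and dominates and the sample profiles are hypothetical; the check follows the convention above, in which X dominates Y when the curve of X lies nowhere above that of Y.

```python
from itertools import accumulate


def delay_curve(delays):
    """Return the ordinates C(X; i) for i = 1..N from delays censored at zero and sorted descending."""
    n = len(delays)
    censored = sorted((max(x, 0.0) for x in delays), reverse=True)
    return [c / n for c in accumulate(censored)]


def dominates(x, y):
    """True if the curve of x is nowhere above that of y and strictly below somewhere."""
    if len(x) != len(y):
        raise ValueError("profiles must have the same number of passengers")
    pairs = list(zip(delay_curve(x), delay_curve(y)))
    return all(a <= b for a, b in pairs) and any(a < b for a, b in pairs)


even = [30.0, 30.0, 0.0, 0.0]
uneven = [60.0, 0.0, 0.0, 0.0]
print(delay_curve(even), delay_curve(uneven))
print(dominates(even, uneven))

# Curves that cross: neither profile dominates the other.
p = [40.0, 0.0, 0.0, 0.0]
q = [20.0, 20.0, 20.0, 0.0]
print(dominates(p, q), dominates(q, p))
```

The second pair shows why the ordering is only partial: p has fewer delayed passengers, q has a shorter longest delay, and the curves cross.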
The important result of this section is the following equivalence
between the partial flight-delay dominance and the rankings by all
members of the flight-delay class of Equation 2.3.
PROPOSITION 1. For any two flight-delay profiles X and Y, the
following two conditions are equivalent:
(i) [D.sub.[phi]](X) [less than or equal to] [D.sub.[phi]](Y) for
all members of [D.sub.[phi]](*), and [D.sub.[phi]](X) <
[D.sub.[phi]](Y) for some members of [D.sub.[phi]](*); and
(ii) The flight-delay curve of X dominates that of Y.
PROOF. See Jenkins and Lambert (1998a), where the context is of
poverty gaps and the curve is known as the TIP curve.
This proposition also has an important implication for ranking
flight delays when different cutoffs are used to define what is
considered "being delayed." Up to this point in our
theoretical calibration of measurement, we have assumed that a flight is
delayed as long as it is later than scheduled. Now suppose that there
are two definitions of delay: One is s minutes behind schedule and the
other is t minutes behind schedule, with 0 < s < t. For example,
in our empirical illustration below we consider both five-minute and
15-minute delay cutoffs. An interesting question to ask is the
following: If one airline has less aggregate delay than another airline
when an s-minute delay cutoff is used, will the airline also have less
delay when a t-minute delay cutoff is used instead? The following
corollary provides a useful guideline for delay comparisons with
different delay cutoffs.
COROLLARY 1. For any two flight-delay cutoffs s and t with s <
t, and two pairs of flight-delay profiles ([X.sub.s], [Y.sub.s]) and
([X.sub.t], [Y.sub.t]), if the flight-delay curve of [X.sub.s] dominates
that of [Y.sub.s] then the flight-delay curve of [X.sub.t] dominates
that of [Y.sub.t].
PROOF. The proof of this result can also be found in poverty
ordering literature (again, see Jenkins and Lambert [1998a]). Note that
increasing the delay cutoff has the same effect as lowering the poverty
line in poverty measurement. It is a known result in poverty measurement
that if one distribution has less poverty than another distribution for
all poverty measures at a given poverty line, then the conclusion holds
for all lower poverty lines.
From this corollary, it follows that if JetBlue has less aggregate
delay than US Airways (i.e., the flight-delay curve of JetBlue lies
below that of US Airways) for the five-minute delay
cutoff, then we can be certain, without checking, that JetBlue will
also have less delay than US Airways for any higher delay cutoffs (10
minutes, 15 minutes, ...).
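The corollary can be spot-checked on a hypothetical pair of profiles. The sketch below assumes that, for a given cutoff, each passenger's delay is measured as the minutes beyond the cutoff (the analog of a poverty gap); this reading of the cutoff, the function names, and the data are assumptions made for illustration. It checks one example and is not a proof.

```python
from itertools import accumulate


def delay_curve(delays, cutoff):
    """Ordinates C(X; i), with each delay measured as minutes beyond the cutoff (an assumption)."""
    gaps = sorted((max(x - cutoff, 0.0) for x in delays), reverse=True)
    return [c / len(gaps) for c in accumulate(gaps)]


def dominates(x, y, cutoff):
    """True if the curve of x is nowhere above that of y and strictly below somewhere."""
    pairs = list(zip(delay_curve(x, cutoff), delay_curve(y, cutoff)))
    return all(a <= b for a, b in pairs) and any(a < b for a, b in pairs)


# Hypothetical arrival delays in minutes for two carriers with equal passenger counts.
carrier_x = [30.0, 30.0, 0.0, 0.0]
carrier_y = [60.0, 0.0, 0.0, 0.0]

results = {cutoff: dominates(carrier_x, carrier_y, cutoff) for cutoff in (5, 10, 15)}
print(results)
```

In this example, dominance at the five-minute cutoff carries over to the 10-minute and 15-minute cutoffs, as the corollary predicts.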
A Gini-Type Measure of Flight Delays
The flight-delay curve lends itself directly to a Gini-type measure of
flight delays. (9) The measure is simply equal to the area beneath the
flight-delay curve, which is
[MATHEMATICAL EXPRESSION NOT REPRODUCIBLE IN ASCII]. (2.6)
Note that this measure is not decomposable in the sense that we
defined above. The Gini-type measure reflects a unique passenger
preference about flight delays. In this measure, a passenger cares not
only about his/her time delayed but also about the relative position in
the delay profile (i.e., how many people have less delay time than the
passenger). See Lambert (2001, pp. 122-3) for more detailed discussion
on the Gini-type preference in social welfare measurement.
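Since the closed form of Equation 2.6 is not reproduced here, the sketch below approximates the area beneath the flight-delay curve with the trapezoid rule over the points (i/N, C(X;i)), starting from the origin. This is one way to compute the area and is an assumption, not necessarily the expression used in the article; the function name and the data are hypothetical.

```python
import math
from itertools import accumulate


def gini_type_area(delays):
    """Trapezoid-rule area under the piecewise-linear flight-delay curve on [0, 1]."""
    n = len(delays)
    censored = sorted((max(x, 0.0) for x in delays), reverse=True)
    heights = [0.0] + [c / n for c in accumulate(censored)]
    # Each segment has width 1/n; its area is the average of its two end heights times 1/n.
    return sum((heights[i] + heights[i + 1]) / 2 for i in range(n)) / n


even = [30.0, 30.0, 0.0, 0.0]
uneven = [60.0, 0.0, 0.0, 0.0]
print(gini_type_area(even), gini_type_area(uneven))
```

Both profiles have the same average delay, but the uneven profile has the larger area because its curve rises faster over the longest delays.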
3. An Illustration of the Flight-Delay Curve
In this section we apply the flight-delay curve developed above to
actual flight-delay data from July 2005. (10) To illustrate our approach
we use Bureau of Transportation Statistics on-time performance data for
every domestic flight for two carriers, JetBlue and US Airways, during
the first week of July 2005. (11) Both carriers operated predominantly
on the East Coast in 2005, with US Airways having hubs in Charlotte and
Philadelphia and JetBlue having its hub at JFK airport.
We begin by plotting the distribution of arrival delays for every
JetBlue and US Airways flight in July 2005 (see Figure 2a, b). These
figures reveal a wider distribution of arrival delays for JetBlue. The
three leading causes of flight delay in July 2005 were late-arriving
aircraft, weather, and air carrier delay. (12) External factors such as
bad weather may affect carriers differently, especially when bad weather
events occur at a carrier's hub airport.
Table 1 provides simple delay counts (standard errors and test
statistics) for the two carriers for two time periods in 2005 (July 1-7
and July 1-4) and six alternative delay cutoffs. We begin with the DOT
definition of a flight "delay" (i.e., flights arriving 15 or
more minutes later than scheduled). To address concerns that our
analysis relies on flight-level data rather than passenger-delay data,
we re-estimate Table 1 using data weighted by potential passengers
(i.e., seating capacity) and find little difference between flight
delays and passenger delays. (13) For the seven-day period we find that
JetBlue (29.37%) has significantly fewer official delays than US Airways
(31.84%) (z-score = 2.24). The delay rates for JetBlue and US Airways
are very representative, since across all carriers for July 2005, 29% of
all domestic scheduled flights were either delayed or canceled. For the
four-day sample we find no significant difference (30.86% vs. 29.96%) in
the official delay rate (z-score = 0.62).
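The article does not state which test produces the z-scores in Table 1. One common choice for comparing two delay rates is the pooled two-proportion z statistic, sketched below as an assumption. The flight counts are hypothetical, since the sample sizes are not reported in this section, so the result is not meant to reproduce any value in Table 1.

```python
import math


def two_proportion_z(delayed_a, flights_a, delayed_b, flights_b):
    """Pooled two-proportion z statistic for the difference in delay rates (a minus b)."""
    p_a = delayed_a / flights_a
    p_b = delayed_b / flights_b
    pooled = (delayed_a + delayed_b) / (flights_a + flights_b)
    std_err = math.sqrt(pooled * (1 - pooled) * (1 / flights_a + 1 / flights_b))
    return (p_a - p_b) / std_err


# Hypothetical counts: carrier A has 300 delays in 1,000 flights, carrier B has 400 in 1,200.
z = two_proportion_z(300, 1000, 400, 1200)
print(round(z, 2))
```

A negative value indicates that carrier A has the lower delay rate; its magnitude is compared with standard normal critical values.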
[FIGURE 2 OMITTED]
The natural question to ask is the following: Do these official
delay rates accurately describe the two carriers' delay
distributions? Our answers are "perhaps" and "not at
all." To arrive at these conclusions we must first examine the test
statistics at all possible delay times. In the seven-day case (see Table
1), US Airways has significantly higher delay rates than JetBlue for all
delays that exceed 10 minutes. We note that for five- and 10-minute
delay thresholds the two carriers have delay rates that are not
significantly different.
Figure 3 illustrates the July 1-7, 2005, delays, for which 10
minutes serves as the delay threshold. This figure provides the
flight-delay curves for JetBlue and US Airways. On the x-axis we plot
the cumulative proportion of passenger flight delays, beginning with the
longest delay. The incidence of delay is given by the length of the
flight-delay curve's non-horizontal section. As noted in Table 1,
using a 10-minute definition for flight delays, the delay rate for both
carriers is slightly over 36% during the first week of July 2005. After
this point, both curves in Figure 3 become horizontal, denoting an
on-time arrival using the 10-minute standard.
[FIGURE 3 OMITTED]
On the y-axis we plot the intensity of delay. The vertical
intercept at p = 1 is the aggregate delay gap, [D.sub.2](X), averaged
across all of a carrier's flights. The average delay gap would then
be equal to the slope of the ray from the origin to the point where the
flight-delay curve initially goes horizontal (here at 0.36). Figure 3
shows that JetBlue has a smaller aggregate (and average) delay gap
(0.047) than does US Airways (0.051) for the period July 1-7.
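The quantities read off the curve can also be computed directly from a delay profile. In the sketch below, the incidence is the point where the curve turns flat, the aggregate gap is the height of the curve at p = 1, and the average gap among delayed passengers is the slope of the ray from the origin. It assumes that delays are measured as minutes beyond the cutoff; the function name and the data are hypothetical.

```python
import math


def curve_summary(delays, cutoff=0.0):
    """Return (incidence, aggregate gap, average gap among the delayed) for a delay profile."""
    gaps = [max(x - cutoff, 0.0) for x in delays]
    n = len(gaps)
    incidence = sum(1 for g in gaps if g > 0) / n        # where the curve turns flat
    aggregate_gap = sum(gaps) / n                        # height of the curve at p = 1
    average_gap = aggregate_gap / incidence if incidence > 0 else 0.0  # slope of the ray
    return incidence, aggregate_gap, average_gap


# Hypothetical arrival delays in minutes, with a 10-minute cutoff.
incidence, aggregate_gap, average_gap = curve_summary([70.0, 40.0, 10.0, 5.0, -3.0], cutoff=10.0)
print(incidence, aggregate_gap, average_gap)
```

Two carriers can share the same incidence yet differ in aggregate gap, which is why the curve carries more information than the delay rate alone.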
The inequality dimension of flight delays is summarized by the
degree of concavity of the non-horizontal section of the flight-delay
curve. If there is equality of delays among the delayed flights (i.e.,
if the delay gaps were equal), then the ray from the origin would be a
straight line with slope equal to z (10 minutes, in this case) minus the
average delay time. As noted above, the flight-delay curve combines all
three elements: delay rate, delay gap, and delay inequality. Returning
to Figure 3 we see that the JetBlue flight-delay curve dominates US
Airways since its flight-delay curve (the solid line) lies everywhere
inside the equivalent curve for US Airways (the dashed line). Thus, in
this case the industry's 15-minute delay standard (US Airways =
31.84% vs. JetBlue = 29.37%) gives the correct ordinal delay ranking of
these two carriers for all delay measures above 10 minutes.
To further illustrate the usefulness of the flight-delay curve we
consider an alternative time frame for our sample of flights: July 1
through July 4. Recall that for the 15-minute delay standard we find no
significant difference in delay rates between JetBlue and US Airways.
Using a 10-minute delay threshold, however, we find that US Airways has
a smaller delay rate than JetBlue at the 10% significance level (z-score
= 1.83). Furthermore, for a five-minute delay threshold, US Airways has
a significantly lower delay rate (z-score = 3.02). In contrast, as the
delay window is expanded (beyond 20 minutes) we find that JetBlue now
has significantly lower delay rates. In sum, in the above case, the
15-minute standard reveals no difference between carriers and does not
adequately describe the distributions of flight delays.
Figure 4 presents the flight-delay curves for July 1-4 using five
minutes as the delay threshold. The first dimension of flight-delay
preferences, the delay rate, is shown on the horizontal axis. We observe
that the US Airways flight-delay curve (the dashed line) becomes
horizontal at a lower delay rate than does JetBlue's flight-delay
curve, which reflects US Airways' lower delay rate at five minutes.
[FIGURE 4 OMITTED]
The second dimension of flight-delay preferences, the intensity of
flight delays (i.e., the slope of the ray from the origin where the
flight-delay curve becomes horizontal), is shown on the vertical axis of
Figure 4. Here we see that JetBlue has the lower aggregate delay gap
(0.139 vs. 0.148). This example provides a clear conflict between the
preference for fewer versus shorter delays. The third dimension of delay
preferences, the inequality among flight delays, is reflected in the
concavity of the flight-delay curves. In this example the US
Airways flight-delay curve shows a larger degree of delay inequality
(i.e., greater concavity). In sum, any conflict between passenger
preferences (for fewer, shorter, and more equal delays) will result in
crossing flight-delay curves, as is clearly seen in Figure 4. Crossing
flight-delay curves prohibit an ordinal ranking of carrier flight
delays.
There are several possible solutions to the delay ambiguity shown
in Figure 4. The first approach is to propose a cardinal delay
preference function that specifies a trade-off between the number of
flight delays, the length of delays, and the equality of delays. An
example of a cardinal preference function is the well-known Gini index
of inequality described above. The Gini-type indexes, which reflect the
area under the flight-delay curves, are reported in Table 1 and the
figure notes. For Figure 4, the Gini-type indexes are 0.0519 for JetBlue
and 0.0505 for US Airways. Thus, passengers with Gini-type preferences
will prefer US Airways to JetBlue. (14) A second solution is to expand
the delay window and check for an ordinal ranking of carriers. Figure 5
illustrates the second option using a 10-minute (instead of a
five-minute) delay window. In this case, JetBlue's flight-delay
curve lies everywhere below US Airways' flight-delay curve,
implying that passengers will prefer JetBlue to US Airways. Finally, if
measures of flight delays are required to satisfy an additional axiom,
then a refined condition similar to that proposed in Jenkins and Lambert
(1998b) can be checked. The refined condition involves the comparison of
variance of flight delays for the entire flight-delay curve and up to
the crossing point of the curve (for details, see Jenkins and Lambert
[1998b]).
[FIGURE 5 OMITTED]
4. Conclusion
Airline economists are well aware of the caveats involved in using
15 minutes as a delay standard; hence, a variety of alternative
flight-delay measures have been used in the literature. The unique
contribution of our article is the derivation of a delay measure that is
based on passenger preferences, not an arbitrary cut-off decided by the
DOT. We propose a delay ordering based on three widely acceptable
preferences--passengers will prefer a carrier that provides fewer,
shorter, and more equal delay times. Based on these three preference
assumptions we propose the flight-delay curve and identify the
conditions under which an unambiguous ordering of carriers can be
identified. Given the generality of our preference assumptions, the
flight-delay curve provides only a partial ordering of carriers. In the
case of 'crossing' flight-delay curves, we offer several
possible solutions.
We illustrate the flight-delay curves using actual flight-delay
data for July 2005. One limitation of this research on passenger delay
preferences is that we employ aircraft-delay data, rather than the
preferred measure of actual passenger-delay data. For passengers
travelling on nonstop itineraries, these two delay measures are
equivalent. For passengers who make connections, however, an aircraft
delay can lead to a missed connection. We are limited to the publicly
available DOT data, which provide information only on aircraft delays,
rather than passenger delays. Thus, one avenue for future research is to
estimate flight-delay curves from actual passenger delays.
Our empirical findings indicate that for longer time frames (i.e.,
a week or a month) aggregate measures of flight delays like the DOT
delay definition (proportion of flights delayed by 15 minutes or more)
are fairly representative of on-time performance. When we examine
shorter time periods, however, we find that the DOT delay definition is
less representative of the distribution of flight delays, and therefore,
the flight-delay curves provide valuable information that reflects
passenger preferences.
Appendix
Arrival Delay Statistics--July 2005
Airline          Delay Count (a)   Rank by Count   Delay Gap (b)   Rank by Gap   Change in Rank
Hawaiian         0.038              1              0.473            3             2
Skywest          0.141              2              0.494            4             2
Frontier         0.189              3              0.457            2             1
Comair           0.193              4              0.542            8             4
ATA              0.217              5              0.619           19            14
America West     0.219              6              0.509            5             1
Southwest        0.235              7              0.339            1             6
United           0.254              8              0.595           13             5
American Eagle   0.255              9              0.588           10             1
Northwest        0.273             10              0.543            7             3
Expressjet       0.281             11              0.603           15             4
Continental      0.294             12              0.605           17             5
US Airways       0.298             13              0.593           11             2
Delta            0.303             14              0.594           12             2
Independence     0.310             15              0.600           14             1
American         0.313             16              0.607           18             2
ASA              0.314             17              0.602           15             2
Alaska           0.354             18              0.522            6            12
JetBlue          0.375             19              0.559            9            10
AirTran          0.387             20              0.665           20             0
(a) Delay count is the proportion of scheduled flights arriving 15+
minutes late.
(b) Delay gap is a normalized average-time-delayed measure that
reflects the intensity of delay (see Eqn. 2.2).
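The rank changes in the appendix table can be summarized with a few lines of Python. The sketch below copies the two rank columns from the table and takes the absolute difference for each carrier. Under that reading it agrees with the summary in footnote 5 (a maximum change of 14, a mean of 3.95, a median of 2, and six carriers moving at least five positions). The variable names are illustrative.

```python
from statistics import mean, median

# (airline, rank by delay count, rank by delay gap), copied from the
# appendix table for July 2005.
ranks = [
    ("Hawaiian", 1, 3), ("Skywest", 2, 4), ("Frontier", 3, 2),
    ("Comair", 4, 8), ("ATA", 5, 19), ("America West", 6, 5),
    ("Southwest", 7, 1), ("United", 8, 13), ("American Eagle", 9, 10),
    ("Northwest", 10, 7), ("Expressjet", 11, 15), ("Continental", 12, 17),
    ("US Airways", 13, 11), ("Delta", 14, 12), ("Independence", 15, 14),
    ("American", 16, 18), ("ASA", 17, 15), ("Alaska", 18, 6),
    ("JetBlue", 19, 9), ("AirTran", 20, 20),
]

# Absolute change in rank when moving from delay count to delay gap.
changes = [abs(count_rank - gap_rank) for _, count_rank, gap_rank in ranks]

max_change = max(changes)
mean_change = mean(changes)
median_change = median(changes)
big_movers = sum(1 for c in changes if c >= 5)

print("maximum change:", max_change)
print("mean change:", mean_change)
print("median change:", median_change)
print("carriers moving five or more positions:", big_movers)
```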
References
Bratu, Stephane, and Cynthia Barnhart. 2006. Flight operations
recovery: New approaches considering passenger recovery. Journal of
Scheduling 9:279-98.
Brueckner, Jan K. 2002. Airport congestion when carriers have
market power. American Economic Review 92:1357-75.
Jenkins, S., and P. Lambert. 1998a. Three I's of poverty
curves and poverty dominance: TIPs for poverty analysis. Research on
Economic Inequality 8:39-56.
Jenkins, S., and P. Lambert. 1998b. Ranking poverty gap
distributions: Further TIPs for poverty analysis. Research on Economic
Inequality 8:31-8.
Lambert, P. 2001. The distribution and redistribution of income.
3rd edition. New York: The Manchester University Press.
Mayer, Christopher, and Todd Sinai. 2003. Network effects,
congestion externalities, and air traffic delays: Or why all delays are
not evil. American Economic Review 93:1194-215.
Mazzeo, Michael J. 2003. Competition and service quality in the
U.S. airline industry. Review of Industrial Organization 22:275-96.
Rupp, Nicholas G. 2009. Do carriers internalize congestion costs?
Empirical evidence on the internalization question. Journal of Urban
Economics 65:24-37.
Sen, A. 1976. Poverty: An ordinal approach to measurement.
Econometrica 44:219-31.
Shorrocks, Anthony F. 1995. Revisiting the Sen poverty index.
Econometrica 63:1225-30.
Taylor, Shirley. 1994. Waiting for service: The relationship
between delays and evaluations of service. Journal of Marketing
58:56-69.
Zheng, Buhong. 1997. Aggregate poverty measures. Journal of
Economic Surveys 11:123-62.
Zheng, Buhong. 2007. Unit-consistent poverty indices. Economic
Theory 31:113-42.
(1) See the Joint Economic Committee report (http://jec.senate.gov), released on May 22, 2008.
(2) For more details, see DOT press release No. 123-07:
http://www.dot.gov/affairs/dot12307.htm.
(3) The Air Travel Consumer Report is available online at
http://airconsumer.ost.dot.gov/.
(4) A survey on poverty measurement can be found in Zheng (1997).
(5) Comparing the delay rate and delay gaps, we find a maximum
change of 14 positions, an average change of 3.95 positions, a median
change of two positions, and 6 out of 20 cases moved at least five
positions.
(6) This discussion of consumers' preferences implicitly
assumes that airlines operate a single-size aircraft at maximum
capacity. This assumption, however, could easily be relaxed by weighting
the data by aircraft capacity and/or load factors.
(7) Ideally, the weights used to implement $D_{\phi}(X)$ would
be based on underlying passenger preferences.
(8) Since C(X;i) satisfies the replication invariance axiom,
dominance relation (Eqn. 2.5) can be defined similarly for flight-delay
profiles with different numbers of passengers.
(9) In the poverty context, Jenkins and Lambert (1998a) note that
the preference trade-offs embodied in the TIP Gini (our flight-delay
Gini) are equivalent to the modified-Sen index proposed and discussed by
Shorrocks (1995).
(10) We select this month since it had the highest proportion of
flight delays in 2005.
(11) We exclude both diverted and canceled flights since the length
of flight delay is ambiguous. Just 2.2% of JetBlue and US Airways
domestic flights in July 2005 were diverted or canceled.
(12) Bureau of Transportation Statistics, Airline Service Quality
Performance, July 2005; www.transtats.bts.gov.
(13) For example, from July 1-7, 2005, JetBlue's 15-minute
flight-delay rate and potential passenger-delay rate were 0.2937 and
0.3044, respectively. US Airways' two rates were also nearly identical:
0.3184 (flight-delay rate) and 0.3089 (potential passenger-delay rate).
(14) Crossing flight-delay curves, however, implies that an
alternative index can be proposed that reverses this ranking.
John A. Bishop, Economics Department, East Carolina University,
Greenville, NC 27858, USA.
Nicholas G. Rupp, Economics Department, East Carolina University,
Greenville, NC 27858, USA; corresponding author.
Buhong Zheng, Economics Department, University of Colorado-Denver,
Denver, CO 80217-2264, USA; E-mail buhong.zheng@ucdenver.edu.
We are grateful to three anonymous referees, Volodymyr Bilotkach,
and Chia-Mei Liu, along with participants of the 2008 Southern Economic
Association Meeting.
Received December 2008; accepted December 2009.
Table 1. Proportion of Flights Delayed and Gini Coefficients
(Standard Deviation in Parentheses)

Data Period: July 1-7, 2005

Minutes Late      JetBlue     US Airways   z-Score   JetBlue Gini   US Airways Gini
5                 0.4541      0.4363         1.43      0.0505         0.0531
                 (0.0111)    (0.0056)
10                0.3636      0.3681        -0.38      0.0180         0.0209
                 (0.0104)    (0.0054)
15                0.2937      0.3184        -2.22      0.0089         0.0116
                 (0.0098)    (0.0053)
20                0.2326      0.2821        -4.75      0.0051         0.0073
                 (0.0091)    (0.0051)
30                0.1548      0.2250        -7.71      0.0022         0.0036
                 (0.0078)    (0.0047)
45                0.1026      0.1694        -9.37      0.0008         0.0016
                 (0.0066)    (0.0027)
No. of flights    2145        7791

Data Period: July 1-4, 2005

Minutes Late      JetBlue     US Airways   z-Score   JetBlue Gini   US Airways Gini
5                 0.4580      0.4091         3.02      0.0519         0.0505
                 (0.0142)    (0.0077)
10                0.3747      0.3460         1.83      0.0188         0.0199
                 (0.0138)    (0.0074)
15                0.3086      0.2996         0.62      0.0087         0.0117
                 (0.0132)    (0.0071)
20                0.2392      0.2609        -1.55      0.0055         0.0069
                 (0.0122)    (0.0068)
30                0.1657      0.2111        -3.67      0.0022         0.0036
                 (0.0106)    (0.0064)
45                0.1200      0.1591        -3.58      0.0008         0.0016
                 (0.0093)    (0.0057)
No. of flights    1225        4116
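
The z-scores in Table 1 appear consistent with a standard two-sample comparison of proportions. The Python sketch below assumes that the parenthesized values are the standard errors of each carrier's proportion and that the two samples are independent; this reading is an assumption, since the table does not state the formula. The function name z_score is illustrative, and the inputs are the July 1-7, 2005, rows as reported.

```python
from math import sqrt


def z_score(p1, se1, p2, se2):
    """Difference in two proportions divided by its standard error,
    assuming independent samples (an assumed reading of Table 1)."""
    return (p1 - p2) / sqrt(se1 ** 2 + se2 ** 2)


# (minutes late, JetBlue share, JetBlue SE, US Airways share,
#  US Airways SE, z-score reported in Table 1), July 1-7, 2005.
rows = [
    (5, 0.4541, 0.0111, 0.4363, 0.0056, 1.43),
    (10, 0.3636, 0.0104, 0.3681, 0.0054, -0.38),
    (15, 0.2937, 0.0098, 0.3184, 0.0053, -2.22),
    (20, 0.2326, 0.0091, 0.2821, 0.0051, -4.75),
    (30, 0.1548, 0.0078, 0.2250, 0.0047, -7.71),
    (45, 0.1026, 0.0066, 0.1694, 0.0027, -9.37),
]

# Print the recomputed z-score next to the reported one for each cutoff.
for minutes, p_jb, se_jb, p_us, se_us, reported in rows:
    computed = z_score(p_jb, se_jb, p_us, se_us)
    print(f"{minutes:>2} min: computed {computed:6.2f}, reported {reported:6.2f}")
```

A negative z-score indicates that JetBlue's share of delayed flights at that cutoff is lower than US Airways'. Small differences between the computed and reported values are to be expected, because the inputs are rounded to four decimal places.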