Comparing apples to apples: ensuring consistency of measurement with the balanced scorecard.
Sale, Martha Lair
ABSTRACT
Performance evaluation is one of the most complex measurement issues in management accounting. Despite the assertion that measurement of the performance of the individual should be divorced from measurement of the performance of the business unit, complex interrelations make such a measure extremely difficult. The task of evaluation is further complicated by questions and concerns about who should do the evaluating and what should be evaluated: outcomes, behaviors, or competency levels. The combination results in an overwhelmingly complex set of measurement criteria. The Balanced Scorecard (Kaplan & Norton, 1992) has introduced a systematic approach to the measurement of qualitative dimensions of business performance, but actually producing an internally consistent quantified score that is suitable for comparisons between periods and between subjects continues to be extremely difficult. This paper illustrates the use of the Analytic Hierarchy Process (Saaty, 1994) as a mechanism for dealing with this highly complex scoring problem.
INTRODUCTION
Performance evaluation is perhaps the most complex measurement
issue in management accounting. Despite the assertion that measurement of the performance of the individual should be divorced from measurement of the performance of the business unit, complex interrelations make such a measure extremely difficult. Add to this an environment in which teams are increasingly common. Personnel performance evaluations should focus
on individual contributions sufficiently to prevent social loafing, but
not to an extent that ignores the synergistic properties that make
groups work.
The task of evaluation is further complicated by various questions
and concerns about who should do the evaluating: supervisors, peers, customers, or subordinates. Add to this questions regarding what should be evaluated: outcomes, behaviors, or competency levels. All these
dimensions result in an amazingly complex performance evaluation milieu.
Although the Balanced Scorecard, or BSC (Kaplan & Norton, 1992), has introduced a systematic approach to the measurement of qualitative dimensions of performance, actually producing an internally consistent
quantified score that is suitable for comparisons between periods and
between subjects continues to be extremely difficult.
Fortunately, a mechanism for dealing with this type of highly
complex scoring problem already exists in a decision science technique
known as the Analytic Hierarchy Process or AHP (Saaty, 1994). AHP is a
widely acclaimed multicriteria decision-making technique that not only allows analysts to grasp a problem of this magnitude but also, through a mathematically rigorous process, offers a measure of internal consistency (Saaty, 1996). The AHP has become one of the most popular aids to decision-making and has been made widely available through "Expert Choice" software (Expert Choice, Inc., 2000; Forman et al., 1983). This paper presents a format for using
a Balanced Scorecard approach and AHP to produce an internally
consistent, comprehensive measure of personnel performance.
The first question that must be addressed in the development of an effective performance evaluation system is its purpose.
Evaluation systems are used to give employees useful feedback, coaching,
and guidance to help them improve their performance and develop job-related skills (Peiperl, 2001). Evaluations can be used as part of a
formal goal setting system (Scott & Einstein, 2001). They can also
be used for making staffing decisions such as which employees will
receive raises, or conversely, they may provide legal documentation
protecting an organization from suit in the event that an employee must
be dismissed. Depending on how the evaluation system is designed, it
will work better for some of these purposes than for others (Goldstein,
1998).
Increasingly, companies are using the Balanced Scorecard (BSC) as a
mechanism to link the performance of individuals in the company to
accomplishment of the strategic goals of the company (Kaplan &
Norton, 1992). The BSC is developed to provide an integrated set of performance criteria that measures how well each individual and sub-unit within an organization supports the goals associated with the Critical Success Factors (CSF) of the company. The
strength of the BSC is the use of a variety of measures to link the
performance of all the elements of the organization to the strategic
performance of the organization. The BSC incorporates both financial and
non-financial measures that may be either qualitative or quantitative and provide both outcome measures (lagging indicators) and performance drivers (leading indicators) (Kaplan & Norton, 1992).
Some version of the Balanced Scorecard is now in use in
approximately half the Fortune 1,000 companies in the United States and
about forty percent of their European counterparts (Gambus & Lyons,
2002). According to Gambus and Lyons (2002) Philips Electronics has
committed to the BSC as a tool that is both effective and enduring.
Philips has involved its 250,000 employees in more than 150 countries
around the world in developing a BSC with three levels. The current levels, the strategy review card, the operations review card, and the business unit card, are expected to be enhanced soon by a fourth level, the individual employee card. At Philips, the corporate quality department provided comprehensive guidelines for metric linkage between these different levels to assure coordination. These guidelines support the robustness of the measurement system by requiring that goals set at a lower level support the goals of the next higher level, to assure that it is possible to meet or exceed all Critical Success Factor goals (Gambus & Lyons, 2002).
Add to this emphasis on a variety of strategic measures applied consistently through different levels of the organization the difficulty of accurately measuring the performance of individuals in an environment that is turning increasingly to group activities, and the problem becomes almost intractable. For the BSC to assume its place as
an effective measure of the strategic performance of all elements of the
organization it must be practical to administer and the results must be
accepted and relied upon by decision makers. Research shows that
managers often have difficulty incorporating the results of subjective
measures into the strategic decision-making process because they
perceive such measures to be unreliable (Liberatore & Miller, 1998).
Some adopters of the BSC have already abandoned its use because managers
could not deal with the complexity and ambiguity inherent in the
measures (Gambus & Lyons, 2002). Despite the adoption of BSC by
increasing numbers of companies, little has been published about how
managers deal with the issues of complexity and ambiguity in the
measures.
To develop a Balanced Scorecard that effectively links the performance of all levels of the organization to the organizational strategy, a number of evaluation problems must be addressed. The
organization must develop an evaluation that consistently measures the
contribution of subjects who work in a variety of situations involving
various teams and individual efforts. In addition, this evaluation must
consider the degree to which the evaluator is able to judge the
performance of the subject and the weight that should be given to different evaluators' opinions of the subject's performance. Whether the
subject of evaluation is an individual, a team or an entire business
unit, the mechanics of the evaluation process may be the same; however, the choice of evaluators and the weights given the contributions of individuals would differ. When evaluating teams, such as work or service
teams, whose members all perform the same or similar tasks, the
contributions of each individual would receive roughly the same weight.
In network teams, where each individual is an expert in a different field, each expert's contributions would be given a higher weight for the criteria that fall within that expert's field. For
example, individuals who participate in multiple teams, or work alone and in teams, first would have their evaluators grouped into various "evaluation groups" based on their connection to the evaluation subject. Thus, three separate evaluation groups would assess an individual evaluation subject who belonged to Team A and Team B and who also performed as an individual. One group would consist of constituents of each team, and the third group would be made up of constituents of the subject's individual work. These evaluation groups would then be
subdivided into "evaluator classifications" based on their
relationship to the subject, such as manager, team leader, team member,
and coworker and the relative importance of the opinion of each
classification determined. Finally, each individual evaluator would
complete an evaluation, and these evaluations would be combined using weighting factors that recognize the relative importance of the individual evaluator's input. For example, perhaps the team
leader's opinion is more important than the opinion of a coworker.
Incorporation of these complex relationships results in a single
evaluation that can be looked at in terms of individuals or groups.
The same outcome, behavior, and competency level measures can be
used to evaluate various subjects, whether individuals or teams. However,
it should be pointed out that although what is good for two subjects may
be the same, what is important for the same two subjects might be very
different. For example, creative problem solving skills would probably
be considered a good thing for any person or team to possess without
regard to the type of team or individual job function. However, creative
problem solving is probably much more important to research and
development teams than it is to work or service teams that deal with
repetitive, routine tasks. Obviously, the use of a generic, one-size-fits-all performance evaluation that ignores important
differences between various individuals and teams is inappropriate, but
that does not mean that the same format cannot be used for all of the
evaluation systems (Scott & Einstein, 2001).
Use of a formal multiattribute decision analysis process makes it
possible to incorporate the myriad elements of a multi-level Balanced
Scorecard into a decision process that is capable of handling the
complexity of the interaction between the elements and provides the degree of internal validity requisite to assure that performance measures are consistent from one evaluation period to the next and from
one subject to the next. One such multiattribute decision analysis
process is the Analytic Hierarchy Process, or AHP (Saaty, 1994; Saaty,
1996).
AHP is a widely used method for analyzing complex decision-making
problems. It breaks the decision problem down into small, easily
understandable parts, organizes these component parts into a hierarchy
of levels, and provides a mechanism for evaluating the
interrelationships among the components of the hierarchy (Saaty, 1994).
The AHP is a process designed to facilitate the formalization of
multicriteria decision-making. It allows the decision maker to
incorporate both "hard data" and less quantifiable elements
such as judgments, feelings, and experiences. AHP has been widely used
in a variety of decision-making applications (Saaty, 1996).
Using AHP, the decision problem is structured hierarchically from
criteria to lower level subcriteria. The resulting model is called a
value tree or a hierarchy of criteria and objectives. Users of AHP make
a series of pair-wise comparisons of these criteria. If the criteria
being compared are objective, the numeric values for the criteria are
compared. If, however, the criteria are wholly or partially subjective,
then the comparisons are made on the basis of relative preference between the two on a scale of one to nine, where one indicates no preference and nine indicates an overwhelming preference. Once these
comparisons have been established for each criterion, an n x n matrix of comparisons, where n equals the number of criteria, is constructed. In this matrix, the elements are arrayed so that the Aij element is always the reciprocal of the Aji element. That is, if the first criterion is preferred over the second criterion by a factor of four, then the A12 element of the matrix is four and the A21 element is 0.25. The principal eigenvector of the matrix is then calculated and normalized.
This eigenvector represents the complete set of the relative importance
of the criteria. This results in a dependable, mathematically rigorous,
quantitative approach that overcomes the complexities and difficulties
inherent in measuring unlike elements and delivers a system that can be
trusted and relied upon by managers (Saaty, 1996). For a more complete
discussion of the process of AHP see Saaty (1994). Harker and Vargas
(1987) provide a discussion of the inherent theoretical strengths and
weaknesses of AHP.
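To make the mechanics concrete, the following minimal sketch in Python (using NumPy) builds a hypothetical three-criterion comparison matrix, extracts the normalized principal eigenvector as the weight vector, and computes Saaty's consistency index; the matrix values are illustrative assumptions, not data from the paper.

import numpy as np

# Hypothetical pairwise comparison matrix for three criteria.
# A[i, j] holds how many times criterion i is preferred to criterion j,
# so A[j, i] must be its reciprocal (a 4 here implies a 0.25 there).
A = np.array([
    [1.0, 4.0, 2.0],
    [1/4, 1.0, 1/2],
    [1/2, 2.0, 1.0],
])

# The principal (largest-eigenvalue) eigenvector, normalized to sum to
# one, gives the relative weights of the criteria.
eigenvalues, eigenvectors = np.linalg.eig(A)
k = np.argmax(eigenvalues.real)
weights = np.abs(eigenvectors[:, k].real)
weights /= weights.sum()

# Saaty's consistency index: lambda_max approaches n as the judgments
# approach perfect internal consistency.
n = A.shape[0]
consistency_index = (eigenvalues.real[k] - n) / (n - 1)

print(weights.round(3))             # [0.571 0.143 0.286]
print(round(consistency_index, 3))  # 0.0: perfectly consistent judgments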
METHODS
Using the AHP to structure the Balanced Scorecard system into a
single measure requires that the decision maker first structure the problem
as a hierarchy. The elements of that hierarchy are then prioritized by
responses to questions about the dominance, or importance, of one
element over another (Liberatore & Miller, 1998). The first, and
perhaps most creative, step in the AHP process is structuring the
problem as a hierarchy. A useful approach is to start with the goal and
decompose it into the most general and easily controlled factors at the
simplest or most basic level possible. The decision maker then works
back up through the hierarchy starting with the simplest sub-criteria
that must be met and combining the sub-criteria into generic higher
level criteria until the various measurements are linked in such a way
that comparisons between unlike elements are possible (Liberatore &
Miller, 1998).
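Before any weighting is done, the decomposition itself can be recorded very simply. The sketch below, with purely illustrative labels, shows one way to capture a goal-to-sub-criteria hierarchy of the kind described; every leaf is something that can be scored directly.

# Hypothetical decomposition of the goal into criteria and
# sub-criteria; labels are illustrative only, and every leaf is a
# directly scorable measure.
hierarchy = {
    "Overall performance": {
        "Outcomes": ["Work quality", "Work quantity"],
        "Behaviors": ["Conscientiousness", "Flexibility"],
        "Competencies": ["Collaboration", "Conflict resolution"],
    }
}

# The leaves are the elements that will later be compared pair-wise.
leaves = [leaf for subs in hierarchy["Overall performance"].values() for leaf in subs]
print(leaves)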
Many possible criteria can be measured or subjectively judged in
association with any position. Abernathy (2001) suggests the measurement
design process start with ideal criteria and then compromise from this
ideal to develop criteria that can be defined and rated. An example of
the types of criteria that may be used is provided in Table 1; it has
been divided according to outcomes, behaviors, and competencies.
Although the number of possibilities that may be easily considered
dramatically increases with use of an AHP software package, it is
suggested that the number of performance measures be limited for other
reasons. When the number of measures expands beyond those few essential
to measure the most important criteria, the added complexity tends to
cause the focus of the process to be misdirected from the importance of
the criteria being measured to the measurement process itself. The number of measures is best limited to fewer than twenty-five (Horngren, Foster & Datar, 2000, pp. 446-453). The criteria used should mirror
the real world as closely as possible, and should measure "as
though the participants are franchised or in business for
themselves." (Abernathy, 2001) Also, care should be taken only to
measure controllable outputs. This means that broad measures that are
affected by events out of the control of the subject should be avoided
(Abernathy, 2001). Many of the measurements listed in Table 1 can relate to inter-group interaction, but perhaps this is not enough. Care must be
taken so that subjects do not optimize their own output at the cost of
the organization as a whole. The addition of specific "linked" measures may aid cooperation. If the performances of two subjects
affect each other greatly, cooperation can be increased by including one
or more of one group's criteria as a measurement of the other
group. For instance, if Group A supplies parts to Group B, one of Group
A's criteria could be the percentage of the time Group B has enough
available stock on hand (Abernathy, 2001).
Not all subjects should be rated in every way by everyone who
could rate them. For instance, the outcome criteria of a work or service
team can be measured, but due to the "tight interdependencies among
team tasks" the individual members should only be measured
according to their behaviors and competencies (Scott & Einstein,
2001). In addition, undue focus on the individual can undermine the
performance of the group by encouraging finger pointing and discouraging
cooperation (Abernathy, 2001).
Although every subject could be measured for the same criteria, the
weight assigned to these criteria will be different between the
individuals and groups participating. If it is determined that the weight of a particular criterion for a given subject is near zero, that criterion should be left off the evaluation. A group of key stakeholders,
including managers, members of the personnel department, financial and
operational experts, information technology professionals, and employee
representatives must be involved in the design of the evaluation system.
Their expertise is needed to refine the hierarchy to match the job
requirements for each subject (Oliveira, 2001). Although an agreement must
be reached as to the exact weights for each subject's criteria,
some general guidelines are available in the existing literature.
Both subjective (qualitative) and objective (quantitative) criteria must be present and weighted appropriately to achieve balance. In other words, one
should not concentrate entirely on outcomes or behaviors; both
quantitative and qualitative criteria should be used where appropriate
(Abernathy, 2001). Remember that not all effects are under the control
of the subject. The weaker the link between effort and performance, the
more weight should be placed on qualitative criteria. Qualitative
criteria should also be more heavily weighted when there is a great need
for organizational citizenship and teamwork. If a team works primarily
independently of the rest of the organization, quantitative criteria that are more results oriented would be weighted more heavily (Peiperl,
2001).
The balance of this paper provides an example of the performance
evaluation of one hypothetical individual who functions as a member of
two teams, and works independently. The performance evaluation is
developed using AHP. What follows is a detailed example of use of AHP to
link the performance evaluation based on the Balanced Scorecard to the
overall mission and objectives of the organization. A panel brought
together for this purpose and consisting of various key stakeholders in
the organization would normally make the decisions described in this section. Throughout the example, these decisions concern one specific hypothetical subject and were made by the author for illustrative purposes only. The reader should keep in mind that although
one particular software program was used in this example, the same
process might be accomplished using other software, and that the
illustration would be equally valid.
At this point, several words of caution are in order. As the number of objects to be compared increases, the number of pair-wise comparisons necessary to rank them rapidly becomes unwieldy. In addition, it is not enough that the comparisons rate one of each pair of choices as more important than the other; a determination must also be made as to what degree one is more important. This may be accomplished in a number of ways,
including calculation with a hand-held calculator, spreadsheet software,
or math software. However, AHP software is available to facilitate the
comparison and ranking operations of the AHP as well as providing the
numeric solution. Calculations for this example were done using
Web-HIPRE, a publicly available Web-based software package provided by the Systems Analysis Laboratory of Helsinki University of Technology (Web-HIPRE, 1998). Other software packages are available that do the
same calculations. One such package is Expert Choice, a commercially available package that is used by many major companies (Expert Choice, Inc., 2000). These software packages make it possible to easily manipulate a large number of variables, keeping track
of comparisons, rankings, and weights. They also provide measures of the
consistency of the judgments and allow complete sensitivity analysis.
Using Web-HIPRE is relatively simple, and help is available for
most topics of concern. First, the various elements of the hierarchy are
placed in the working area with the leftmost column corresponding to the
first level of the hierarchy (Goal of the AHP) and the next column
consisting of the elements from the second level of the hierarchy
(Subject Functions), and the third consisting of the classifications of
evaluator and so on (Figure 2: Hierarchical Relationships). Once the
elements are located in the correct columns, they must be connected to
depict their relationships. All of the elements in the second column
representing functional areas for which the subject is being evaluated
should be connected to the element in the first column. Then the
elements in the third column representing the various evaluator
classifications should be connected to the appropriate elements in the
second column. For the individual subject in this example, the
evaluation by customers is only applicable to the subject's
function in Team A. In fact, all the evaluator groups in the third column are deemed important in varying degrees to the individual's responsibilities as a member of Team A. Management's evaluation of
the subject is considered important in all three of the subject's
functional areas, but the evaluation provided by team leaders is applicable to the subject's functions in Team A and Team B, not to the subject's function as an individual. The process continues for each
classification of evaluator. Next, the various elements in column four
that represent the individual evaluations will be attached to the
appropriate element in column three. Note that these elements might represent individuals who will fill out evaluations for the subject, or they might represent scores on an examination or a survey. In this example there
are two individual customer scores by which the subject will be rated.
These could represent actual evaluations performed by individual
customers, or they could represent the results of data collected by
observing the subject's interaction with customers. One manager score
will provide the evaluation for management. Again, this could represent
a single evaluation by the subject's supervisor, or it could be a
composite score derived from several sources. Leaders of each team
provide input into the team leader evaluation of the subject. Thus, all
the individual evaluations are linked to a class of evaluator. When this
process is complete, the three elements in column five representing
outcomes, behaviors, and competencies, will all be attached to the
elements in column four to which they are considered important. Not every individual evaluator will evaluate the subject on all three of outcomes, behaviors, and competencies. For instance, in this example,
the subject is evaluated by two co-workers; however, the co-worker evaluations cover only behaviors and competencies, not outcomes. Similarly, the leader of the dependent team does not evaluate
the individual on behaviors or competencies, only on outcomes. This recognizes that not all three types of criteria are important to all evaluators. Next, the elements in column six representing each
performance measure will be attached to the appropriate element in
column five.
[FIGURE 2 OMITTED]
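To make the connection step concrete, the following sketch records the function-to-evaluator-class links described above as a simple mapping; the Team A list follows the text, while the exact membership of the Team B and Individual lists is an assumption for illustration (the text states only that four classes feed Team B). The sketch also counts the pair-wise comparisons each grouping requires, which is the source of the unwieldiness cautioned about earlier.

# Function-to-evaluator-class connections from the example. Team A's
# six classes follow the text; the Team B and Individual memberships
# are assumptions for illustration.
connections = {
    "Team A": ["Manager", "Team Leader", "Team Leader, Other Team",
               "Team Member", "Co-worker", "Customer"],
    "Team B": ["Manager", "Team Leader", "Team Member", "Co-worker"],
    "Individual": ["Manager", "Co-worker"],
}

# Weighting the classes under a function takes n * (n - 1) / 2
# pair-wise comparisons, which grows quickly as classes are added.
for function, classes in connections.items():
    n = len(classes)
    print(function, n * (n - 1) // 2)  # Team A 15, Team B 6, Individual 1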
Once these relationships are established, the weighting procedure
is performed to compare the importance of each element in its relationship to the other elements to which it is attached. Beginning in the first column, the user is prompted by the menu to answer a series
of questions as to "how many times more important" one element
is as compared to another. This individual subject is being evaluated as
a member of two teams and as an individual. This subject's function
as a member of each of the teams and as an individual was compared in a
pair-wise comparison of the type, "Team A is how much more or less important than Team B?" Using a mechanical indicator to weigh the relative importance of each of the functions against the others, the software program leads the user through the process of comparing the three functions. The user is prompted to complete this comparison
process for each set of elements or criteria. When this process is
complete, the program automatically generates the matrices and
eigenvector needed to accomplish AHP. The weighting factors for each of
the sets of elements or criteria as generated by WebHIPRE are displayed
with the element or criterion in Figure 2. In this example, note that
the weight for the subject's function as a member of the two teams
and as an individual are 0.55, 0.24, and 0.21, respectively. This means that 55% of the evaluation of the subject is related to being a member of
Team A, 24% is related to being a member of Team B, and 21% is related
to functioning as an individual. Users of the program may choose to
enter these percentages directly instead of going through the pair-wise
comparison process, if they so desire. Continuing the illustration, the
55% of the subject's evaluation that is related to Team A is composed of evaluation material from all the evaluator classes in column three, but only select evaluator classes provide information for the portion of the evaluation that is related to Team B and to the subject's
work as an individual. At this level, the user could not simply enter
percentages because the importance of each class of evaluator is weighed
in its relationship to each function.
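The paper reports only the resulting weights, not the judgments behind them. The sketch below shows one hypothetical set of pair-wise judgments that reproduces the 0.55, 0.24, and 0.21 of Figure 2, using the geometric-mean row approximation of the principal eigenvector that is common for small AHP matrices.

import numpy as np

# Hypothetical judgments: Team A is twice as important as Team B and
# three times as important as Individual; Team B and Individual are
# judged equally important.
A = np.array([
    [1.0, 2.0, 3.0],   # Team A compared to (Team A, Team B, Individual)
    [1/2, 1.0, 1.0],   # Team B
    [1/3, 1.0, 1.0],   # Individual
])

# The geometric mean of each row, normalized to sum to one,
# approximates the principal eigenvector for a matrix this small.
w = np.prod(A, axis=1) ** (1 / A.shape[0])
w /= w.sum()
print(w.round(2))  # [0.55 0.24 0.21], the function weights of Figure 2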
The software program leads the user to compare individual elements
representing evaluator classes in relationship to the other evaluator
classes that provide input for each function. For instance, note that
each of the six classes of evaluator provides input to the performance
evaluation of the subject's function as a member of Team A. The
user is led to determine the importance of each evaluator class's contribution relative to every other. The program automatically offers each
pair-wise comparison for consideration. The user might begin by
indicating that the opinions of the customer and the manager are equally important and that the customer's input should be considered twice as important as that of the team leader, and so on until comparisons are made between each possible pair of the
six elements or criteria. The program will then calculate and assign
weights to each of the six as they relate to only Team A. These six
weights total to one and represent the percentage of weight given to
each evaluator class in the evaluation of the subject's performance as a
member of Team A. The user then moves on to the next element or
criterion, in this case Team B, and is led through the process of
comparing the importance of each of the four evaluator classes that
provide input into the evaluation of the subject's performance as a
member of Team B. Again, the user may eschew this comparison process and
enter the percentages directly if so desired. Using the matrix algebra process described above, the software program considers that each evaluator class provides different levels of input to multiple elements at the preceding level and assigns the relative weight that should be given the opinion of that evaluator class. The resulting score, such as .096 for Team Member, means that 9.6% of the weight of all the evaluator
classes should be assigned to the evaluation of team members. This
determination is made considering the relative importance of team member
evaluation to the subject's function as a member of Team A and Team
B and the relative importance of the subject's function in each
team to the subject's overall performance evaluation. This process
is repeated for each level down to the lowest level to which the
evaluation is being decomposed. In this case, the lowest level consists
of the eleven criteria in the final column of Figure 2. When the process
is complete, the software package provides the user with the weights for
this lowest level, which will then be used in scoring of the individual
subject's performance.
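The arithmetic behind a composite weight such as the 0.096 for Team Member is simply a weighted sum of the class's local weights under each function it serves; in the sketch below, the local weights are hypothetical values chosen to reproduce the published figure.

# Function weights from Figure 2.
function_weights = {"Team A": 0.55, "Team B": 0.24, "Individual": 0.21}

# Hypothetical local weights of the Team Member class under each
# function it feeds (it provides no input to the Individual function).
team_member_local = {"Team A": 0.12, "Team B": 0.125}

# Global weight = sum over parent functions of
# (function weight x local weight within that function).
global_weight = sum(function_weights[f] * w for f, w in team_member_local.items())
print(round(global_weight, 3))  # 0.096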
This process is, admittedly, time consuming. However, many parts of
the hierarchy will be similar for classes of individuals and could be
duplicated. Once this process is complete, a performance evaluation template can be constructed for each individual, and the process would not need to be repeated unless there were substantive changes in the subject's job description.
The weights as generated for each of the individual job performance
criteria (Figure 1 and Figure 2) have been incorporated into the weight
column in Table 2. After the weights have been determined, goal values
must be entered as a standard of measure. For instance, if the individual
evaluator is rating the subject on a scale of 1 to 10 where 10 is the
highest evaluation for this criterion, then the goal value would be 10.
On the other hand, for another criterion that has a quantitative
observable outcome, such as a defect rate or success rate, the goal
value might be 0 (number of defects) or 100 (percentage of good parts).
This goal would be compared with the actual results for that criterion.
In order to determine the index for each criterion, the difference between the achieved value and the goal value is determined. This is then divided by the goal value and one is added. This number times 100 is the index (the Raw Score in Table 2); multiplying the index by the criterion's weight gives the Weighted Score. The formula to accomplish this calculation is:
Index = 100 * ((actual - goal) / goal + 1)
In situations where the objective is a low goal value rather than a high goal value, the following alteration is necessary:
Index = 100 * (((100 - actual) - (100 - goal)) / (100 - goal) + 1)
The Weighted Scores for all criteria are then summed to provide the total score for the subject. If the subject accomplishes 100% of all set
goals, the score is 100.
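As a check on the formulas, the sketch below applies them to three rows of Table 2 and reproduces the published Raw and Weighted Scores; the low-goal variant and an optional cap at 100 (discussed below) are included for completeness.

# Each criterion maps to (goal, actual, weight); the weights are those
# produced by the AHP step and shown in Table 2.
criteria = {
    "Customer":          (100.0, 94.0, 0.073),
    "Work Quality":      (99.0, 97.0, 0.105),
    "Conscientiousness": (10.0, 8.5, 0.120),
}

def index(actual, goal, low_is_good=False, cap=False):
    # Raw score for one criterion; 100 means the goal was met exactly.
    if low_is_good:
        # Variant for criteria where a low number (e.g. a defect
        # percentage) is the objective; values are out of 100.
        raw = 100 * (((100 - actual) - (100 - goal)) / (100 - goal) + 1)
    else:
        raw = 100 * ((actual - goal) / goal + 1)
    return min(raw, 100.0) if cap else raw  # optional cap at 100%

for name, (goal, actual, weight) in criteria.items():
    raw = index(actual, goal)
    print(name, round(raw, 2), round(weight * raw, 2))
# Customer 94.0 6.86 / Work Quality 97.98 10.29 / Conscientiousness 85.0 10.2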
The labor-intensive task of constructing this goal template must be
done only one time for each subject. A spreadsheet program was used to
construct the template reproduced in Table 2 and subsequent evaluations
would require only changing the numbers that represent the actual
results.
In this example, the subject performed at a level slightly lower
than the goal, but perhaps more important than the overall ranking are
the individual indices. A glance at the various indices shows which areas need work and which areas are being performed at or above goal
levels. However, using the calculations described here, note that if scores above the goal level were entered, the result would be a score representing over 100% on those criteria. Users who wished to limit the
performance rating to 100% would limit the score to the goal level for
any criterion on which the subject performed at a level above goal.
RESULTS AND DISCUSSION
Although this example illustrated the performance evaluation for an individual, the same process can easily provide performance ratings for
groups or even the business unit as a whole. Use of AHP in the
evaluation process provides the requisite level of theoretical rigor and
internal consistency to assure reliability of the measure. In addition,
even though the process is somewhat involved, it is doubtful that it
would require more time or consideration than using a less meaningful
calculation.
Although this example was performed using the publicly available
Web-HIPRE, this particular software package would be appropriate for
testing the process only. An organization that implemented this type of
evaluation system would be better served purchasing AHP software.
Because of the way Web-HIPRE is administered, all calculations are
performed and the resulting data is stored on computers maintained by
the Systems Analysis Laboratory of Helsinki University of Technology.
Much of the information that is collected during the evaluation process is sensitive and should not be transmitted outside of the organization.
REFERENCES
Abernathy. (2001). Secrets to success with balanced scorecards. HR
Focus 78(10), October: 3.
Expert Choice, Inc. (2000). <http://www.expertchoice.com/>. Accessed 140203.
Forman, E., Saaty, T., Selly, M. & Waldron, R. (1983). Expert Choice, McLean, VA: Decision Support Software, Inc.
Gambus, A. & Lyons, B. (2002). The balanced scorecard at Philips Electronics, Strategic Finance, 84(5), 45-49.
Goldstein, J. (1998). The case for learning styles. Training and Development, 52(9), September: 36.
Harker, P. & Vargas, L. (1987). The theory of ratio scale
estimation: Saaty's analytic hierarchy process, Management Science
33(11), 1383-1403.
Horngren, C., Foster, G. & Datar, M. (2000). Cost Accounting: A
Managerial Emphasis, Upper Saddle River, NJ: Prentice Hall.
Kaplan, R. & Norton, D. (1992). The balanced
scorecard--Measures that drive performance. Harvard Business Review, 70,
January-February, 71-79.
Liberatore, M. & Miller T. (1998). A framework for integrating
activity-based costing and the balanced scorecard into the logistics
strategy development and monitoring process, Journal of Business
Logistics, 19(2), 131-55.
Oliveira, J. (2001). The balanced scorecard: an integrative approach to performance evaluation, Healthcare Financial Management, 55(5), May:
42-47.
Peiperl, M. (2001). Getting 360 degree feedback right, Harvard
Business Review, 79(1), January: 142-47.
Saaty, T. (1994). How to make a decision: the analytic hierarchy
process, Interfaces 24(6), November-December: 19-43.
Saaty, T. (1996). The analytic hierarchy process, Pittsburgh, PA:
RWS Publications.
Scott, S. & Einstein, W. (2001). Strategic performance
appraisal in team-based organizations: one size does not fit all. IEEE Engineering Management Review, 4th Quarter: 7-15.
Web-HIPRE. (1998). <http://www.hipre.hut.fi/>. Accessed
140203.
Martha Lair Sale, University of South Alabama
Table 1: Types of Criteria
OUTCOMES                  BEHAVIORS              COMPETENCIES
Accurate                  Accepts criticism      Adaptable
Continuous learning       Adaptable              Appraisal
Customer service          Altruistic             Collaboration
Job skill development     Big picture oriented   Communication
Neatness                  Conscientious          Conflict resolution
Process improvement       Cooperative            Coordination
Professional appearance   Courteous              Delegation
Punctuality               Cynical                Goal setting
Rework required           Dependable             Identify needs of others
Work quality              Detail oriented        Knows when to seek help
Work quantity             Good sport             Leadership
Work timeliness           Helpful                Long range planning
                          Listens                Organizational
                          Team player            Planning
                                                 Problem solving
                                                 Time management
Table 2: Individual Subject Performance Score
Criterion                  Measure                        Goal   Actual   Raw Score   Weight   Weighted Score
Customer                   Rating on Customer Survey      100%   94%      94.00       0.073    6.86
Work Quality               Percentage of Good Output      99%    97%      97.98       0.105    10.29
Work Quantity              Percent of On-time Completion  95%    92%      96.84       0.186    18.01
Job Skill                  Hours of training              50     50       100.00      0.151    15.10
Professional               Rating (1 to 10)               9      8        88.89       0.032    2.84
Conscientiousness          Rating (1 to 10)               10     8.5      85.00       0.120    10.20
Flexibility                Rating (1 to 10)               9      7        77.78       0.069    5.37
Consideration of Others    Rating (1 to 10)               9      9        100.00      0.053    5.30
Planning and Goal Setting  Rating (1 to 10)               9      9        100.00      0.067    6.70
Conflict Resolution        Rating (1 to 10)               9      8        88.89       0.081    7.20
Collaboration              Rating (1 to 10)               10     9.5      95.00       0.065    6.18
Total Score                                                                                    94.05
Figure 1: Hierarchical Levels
Hierarchical Level 1: Goal of the AHP
Determine the relative importance of various measures of the Critical
Success Factors in following the strategy of this organization.

Hierarchical Level 2: Subject Functions
Team A | Team B | Individual

Hierarchical Level 3: Evaluator Class-Groups
Manager | Team Leader | Team Leader, Other Team | Team Member |
Co-worker | Customer

Hierarchical Level 4: Individual Evaluator
Appraiser 1 | Appraiser 2 | Appraiser 3 | Et cetera

Hierarchical Level 5: Classifications of Measurement Criteria
Outcomes | Behaviors | Competencies

Hierarchical Level 6: Performance Measurement Criteria
Customer Orientation | Work Quality | Work Quantity |
Job Skill Development | Professional Development | Conscientiousness |
Flexibility | Consideration of Others | Planning and Goal Setting |
Conflict Resolution | Collaboration