What leads to better visitor outcomes in live interpretation?
Stern, Marc J.; Powell, Robert B.
Introduction
Live interpretive programs can serve multiple purposes (Ham, 2013).
These include enhancing the experiences and the enjoyment of visitors to
special places (Moscardo, 1999; Stern et al., 2011), increasing
visitors' knowledge and understanding of natural and cultural
resources and places (Ham, 1992; Tilden, 1957), fostering a sense of
appreciation or other attitudes toward those resources (Powell et al.,
2009), and promoting stewardship behaviors, both on-site and after
visitors leave the site of the interpretation (Ham, 2009).
While volumes have been published outlining what might be
considered best practices for producing such outcomes, a recent review
of the empirical literature suggests that the linkage between these best
practices and visitor outcomes has only circumstantial support, despite
strong theoretical grounding (Skibins et al., 2012). This is largely due
to a lack of comparative studies, which can empirically isolate which
practices are most likely to cause desired outcomes. Most
research studies have evaluated the outcomes of single programs rather
than mixtures of programs with varying characteristics. While findings
of positive outcomes across multiple studies suggest the broad efficacy
of interpretation in general, no study has yet isolated the influence of
different interpretive practices and approaches upon visitor outcomes.
This study aims to close this gap in the literature through a
comparative study of live interpretive programs across the National Park
Service (NPS), by identifying which practices and approaches most
consistently lead to more positive outcomes, including visitor
satisfaction, enhancement of visitor experience and appreciation of the
park unit and its resources, and intentions to change behaviors
resulting from program attendance.
Hypothesized best practices for interpretation
Skibins et al. (2012) identified consensus-based best practices of
the field in a recent review article. Many of these practices stem from
the six principles Freeman Tilden first articulated in 1957 (Tilden,
1957). The principles generally highlight the importance of making
communication relevant to the audience; of telling holistic stories; of
practicing the art of revelation based on information rather than
information dissemination; of provoking the audience to want to do
something, whether it be to reflect more deeply, learn more, or act upon
new information; and of tailoring interpretation to different audiences.
Many others have expanded upon those original best practices to provide
insights into how to best craft stories; how to organize content; how to
make interpretation relevant, engaging, and entertaining; and how to
achieve particular outcomes (see Skibins et al., 2012, for a summary of
this work). We drew upon this broad body of literature to develop many
of the key program characteristics of interest in this study (see Table
3 for full list).
The role of the interpreter
In addition to characteristics of programs, the characteristics of
the interpreters and their delivery styles also likely influence program
outcomes. Passion on the part of the interpreter, for example, has long
been recognized as an important element of successful interpretive
programs (e.g., Beck & Cable, 2002; Ham & Weiler, 2002; Ward &
Wilkinson, 2006). We supplement this concept with additional theories
from education and communication to further explore the impact of the
interpreter on visitor outcomes in addition to the content and format of
the program.
The concepts of immediacy, credibility, and clarity have been
studied extensively in the communications and education fields (Finn et
al., 2009). Immediacy behaviors are those that tend to enhance
familiarity and reduce the psychological distance between the communicator
and his or her audience (Mehrabian, 1969). Such behaviors might include
friendly physical gestures, small talk, calling people by name, or the
sharing of personal information (Myers et al., 1998). These behaviors
may also be related to "affinity-seeking," or the process
through which communicators attempt to get listeners to like them
(McCroskey et al., 1986). Studies suggest that such behaviors can
enhance the openness of audiences (most studies involve students and
their teachers) to the content of lessons (Finn et al., 2009). Others
have also assumed that general likeability may be an important factor in
audience response (Ward & Wilkinson, 2006).
Credibility refers to audience members' perceptions of the
believability or legitimacy of the communicator. Credibility has been
found to be important in predicting the responses of message recipients
in multiple fields (e.g., Ajzen, 1992; Rogers, 1995; Stern, 2008). Within
the education and communications fields, Finn and others (2009) suggest
that this credibility is composed of three dimensions: competence,
trustworthiness, and caring. Competence can be related to the apparent
knowledge, confidence, and eloquence of the communicator.
Trustworthiness can be based on multiple factors, including the
interpreter's appearance, performance, degree of comfort and/or
authority, title or position, and/or personal interactions with the
audience. Caring is primarily related to the sincerity with which the
interpreter communicates as well as his or her interactions with the
audience.
Clarity is not only related to eloquence, but also to the
consistency, or "fidelity," of the communicative experience
(Chesebro & Wanzer, 2006). Finn and others' review (2009) found
that lessons taught with any combination of these characteristics
(clarity, credibility, and immediacy) tend to be more effective for
learners than those exhibiting only one of them.
Interpreters also have the ability to assume particular roles as
communicators. These range from friend to authority figure to the
"walking encyclopedia" that Enos Mills warned future nature
guides against becoming nearly 100 years ago (Mills, 1920). Each of
these identities may be differentially appropriate in different
situations and with different audiences (Wallace & Gaudry, 2005).
Other items of interest include any apparent bias, misinformation, or
false assumptions about the audience made by the interpreter, which
could detrimentally influence audience responses.
Interpreters' planning processes and psychological states
might also influence the quality of their programs (see Stern et al.,
this issue). As noted above, interpretation can be used for many
purposes, ranging from teaching to entertainment to persuasion.
Interpreters' intentions may drive, at least to some extent,
audience responses to their programs (Ham, 2013).
Methods
Selection of sites
We aimed to select park units that reflected the diversity of
locations, types, and resources of the U.S. NPS system. Criteria for
selecting park units for the study included annual visitation numbers,
park location (region of the country and distance from population
centers), programming focus, number of programs offered to the public,
and willingness to participate in the study. In order to ensure adequate
visitor attendance at interpretive programs, we only considered parks
that received at least 35,000 annual recreation visits. Parks were
categorized as urban, urban-proximate, or remote based on their
proximity to metropolitan centers. Metropolitan areas were defined as
having an urban core of at least 50,000 residents. Urban parks were
located within the limits of these metropolitan areas. Urban-proximate
parks were located outside these cores, but within a 60-mile radius of
these areas. As such, they were typically in rural or suburban areas.
Remote parks were located at least 60 miles from any metropolitan area.
Parks were placed into one of three categories based on their primary
resource base: predominantly cultural, predominantly natural, or a mix
of the two. We aimed to have our selection of units mirror the makeup of
the NPS system and also allow us to observe at least 10 programs in each
park (or within nearby clusters of parks in cases such as Aztec Ruins
and Navajo National Monuments) in five days or less. Twenty-four park
units were selected for inclusion in the study (Table 1).
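For illustration, the three-way proximity rule can be expressed as a simple function. The sketch below is hypothetical (the study applied these criteria manually during site selection):

```python
def classify_park(in_metro_limits: bool, miles_to_metro: float) -> str:
    """Classify a park unit by proximity to a metropolitan area
    (an urban core of at least 50,000 residents)."""
    if in_metro_limits:
        return "urban"            # within the limits of a metropolitan area
    if miles_to_metro <= 60:
        return "urban-proximate"  # outside the core but within 60 miles
    return "remote"               # at least 60 miles from any metropolitan area
```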
We observed programs in 14 predominantly culturally focused park
units, seven predominantly nature-focused park units, and three park
units with a mixed focus. This roughly mirrors the distribution of these
different types of park units throughout the NPS, where roughly 30% of
park units are predominantly nature-focused and roughly 60% are
predominantly culturally focused. (1) We visited 11 remote park units,
five urban-proximate parks, and eight urban park units. This variability
provides a reasonable sample from which to make generalizations to the
broader population of live interpretive programs across the NPS. Park
units were organized for logistical purposes by geographic region into
six clusters. Teams of two researchers collected data from each park
unit. One team of researchers sampled Great Smoky Mountains National
Park and the mid-Atlantic, Washington D.C., and California locations.
The other team sampled the Southwest, Midwest, and South Dakota
locations.
Sampling and data collection
Individual live interpretive programs served as the unit of
analysis for this study. Programs were selected within each park based
on variability (with regard to subject matter--natural vs. cultural--and
types of delivery--guided walks vs. campfire programs vs. hands-on
activities, etc.) and on their times and locations, so as to maximize the
number of programs observed at each park unit. Regular programs were selected over
children's programs whenever possible, as adult respondents were
the targets of visitor surveys. We attempted to attend 488 scheduled
programs, of which only 376 occurred. From these 376 programs, we
collected 3,603 surveys from visitors (Table 2). Data from 312 programs
were used in the analyses contained within this paper (see
"Interpretive program sample development and data cleaning"
below for more detail).
Throughout the research, the same procedure was followed for
observing all programs. Upon arrival at the program site, a brief
interview was conducted with the interpreter. Interview questions
included interpreters' intended programmatic outcomes, questions
about program development, and others about the preparation and the
level of enthusiasm of the interpreter. The interviews also collected
basic background information about the interpreter, which included age,
gender, and interpretation experience. These interviews were conducted
for all but 15 programs. In those cases, time did not allow for the
interviews to take place. Basic information about the program itself was
recorded by the observer, including time, location, type, topic focus,
and size and age breakdown of the audience.
At the end of the program we asked visitors over the age of 15 to
complete a short survey regarding their opinions of the program and its
influence on them. For programs with fewer than 50 participants, we
attempted a census of all eligible attendees. In programs that were
particularly large (more than 50 attendees), the researchers employed
systematic sampling whenever possible--for example, selecting every nth
row to complete surveys at Ford's Theatre. In these cases, the
researchers chose the sampling interval in an attempt to target at least 20
respondents.
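For illustration, this sampling rule reduces to a short decision function. The sketch below is hypothetical and not instrumentation used in the study:

```python
def sampling_interval(eligible_attendees: int, target: int = 20) -> int:
    """Return the systematic sampling interval n (survey every nth person).

    Programs with 50 or fewer attendees were attempted as a census
    (interval of 1); larger programs used an interval chosen to yield
    at least the target number of respondents.
    """
    if eligible_attendees <= 50:
        return 1
    return max(1, eligible_attendees // target)
```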
During each program, researchers maintained an unobtrusive presence
within the group, acting simply as another member of the audience. The
researchers completed observation sheets during and immediately
following each program.
Throughout the duration of all field work, researchers would
periodically attend programs together to ensure reliability and
consistency in scoring each variable. Occasional check-ins were also
completed between team members to ensure that observation techniques
were consistent, to clarify questions about scoring certain variables,
and to add variables that were deemed relevant to the research. No new
variables were added after the first week of fieldwork.
Measurement
Dependent variables: outcomes
The dependent variables in the study were composed of retrospective
assessments provided by program attendees on surveys administered
immediately following their programs. (2) While interpretation may
produce multiple outcomes, we focused primarily on visitor satisfaction
and shifts in knowledge, attitudes, and behavioral intentions relevant
to the park experience.
Overall satisfaction with the program was measured on a scale from
0 to 10, with 0=Terrible and 10=Excellent. An additional battery of
survey items provided response prompts for the following question:
"To what degree did the program you just attended influence any of
the following for you?" Response categories were composed of a
five-point Likert-type scale, with answer choices: Not at all (1), A
little (2), Somewhat (3), A moderate amount (4), and A great deal (5).
The survey items included:
* Made me think deeply
* Made me reflect on my own life
* Enhanced my appreciation for this park
* Enhanced my appreciation for the National Park Service
* Made me more likely to avoid harming park resources
* Increased my knowledge about the program's topic
* Made my visit to this park more enjoyable
* Made my visit to this park more meaningful
* Changed the way I will behave while I'm in this park
* Changed the way I will behave after I leave this park
* Made me want to tell others about what I learned
* Made me care more about this park's resources
* Made me care more about protecting places like this
These items were developed based on key literature (e.g., Ham,
1992; Moscardo, 1999; Tilden, 1957; Ward & Wilkinson, 2006) and
extensive input from NPS staff. This input included interviews and focus
groups with the NPS National Education Council; a focus group and
associated surveys conducted with NPS interpreters at the National
Association for Interpretation (NAI) National Workshop in Las Vegas,
November 2010; and two surveys conducted in 2010 and 2011 with NPS
superintendents and supervisors of interpretation, respectively (see
Stern & Powell, 2011). The resulting responses were analyzed to
reduce the items into fewer latent factors reflecting the key outcomes
of programs for visitors (see Results section).
Independent variables: predictors
Our primary independent, or predictor, variables of interest
included both interpreter characteristics and the interpretive practices
employed during a program. These practices were primarily drawn from an
extensive literature review aimed at identifying best practices in the
field (Skibins et al., 2012) as well as characteristics identified by
interpretive experts within the NPS and ranked highly by interpretive
staff in surveys (Stern & Powell, 2011). Additional items emerged as
potentially important in pilot tests (e.g., consistency of tone and
quality throughout a program) and were also measured.
Program characteristics were based in theory found in key texts
within the interpretation literature (Table 3). A subset of these
characteristics, however, was based primarily in the field of
social psychology and relate to programs that explicitly aim to
influence the behavior of participants. In short, the Theory of Planned
Behavior (Ajzen, 1991) suggests that people base their behaviors upon
three types of evaluations they make about the likely outcomes of
performing that behavior: the benefits vs. the costs of the expected
outcomes of the behavior (behavioral beliefs), what they perceive their
peers might think about the behavior (normative beliefs), and the degree
of control and/or ability they feel with regard to carrying out the
behavior (control beliefs). We translated the theory into observable
characteristics that would theoretically address these evaluations (see
"Behavioral theory elements," Table 3).
Interpreter characteristics, meanwhile, focused upon the
appearance, identity, and overall styles of the interpreters themselves,
drawn largely from the communications and education literature, though
many of these factors are also referenced in the interpretation
literature (Table 4). Citations are provided where characteristics were
drawn from the literature. Additional insights and examples can be found
in a companion article in this same issue (Stern et al., this issue).
We also collected details pertaining to the experience level and
demographics of the interpreter, their intended outcomes for their
programs, and their level of excitement about the particular program
they were about to deliver. In addition, we tracked information on the
context for the program including location (e.g., indoors vs. outdoors),
type of program, its focus (natural vs. cultural/historical vs. both),
and other unexpected circumstances that could impact program outcomes
(e.g., weather). In addition, we estimated the number of attendees at
each program and the ratio of youth (ages 15 and under) to adults. Each
of these contextual variables is examined in another article within this
issue (Powell and Stern, this issue).
Pilot testing
Extensive pilot testing aided instrument development and refinement
and enhanced the reliability of measurement across the research team.
Prior to the field research, we observed video-recorded interpretive
programs from an undergraduate interpretation class. These programs were
used to develop consistent measurement of each relevant characteristic.
Programs were viewed repeatedly and scores were compared among team
members on each characteristic. These exercises were also used to refine
the scoring of several variables.
From this testing, a preliminary assessment sheet was developed.
These assessment sheets were further pilot tested at Great Smoky
Mountains National Park in May of 2011, where the research team observed
three live interpretive programs. Extensive discussion allowed us to
further refine definitions and observation techniques for each of the
characteristics under study. For each measure, we aimed to maximize the
number of points in each scale to differentiate practices/attributes and
enhance variability in the findings. However, existing definitions from
the literature and results of pilot-testing limited most scales to four
or fewer points. Pilot testing revealed that the middle-points on larger
scales for many variables were not easily differentiated in a consistent
manner by the research team. As a result, the scoring for each item
varies to maximize the potential range of scores while maintaining
inter-rater reliability. Binary scores were used in cases where the most
appropriate measure was to indicate presence or absence.
Reliability and calibration
We built a calibration phase into the research design to ensure
that each researcher's scores of each observed characteristic were
consistent and reliable and therefore could be interpreted similarly.
This involved three steps. First, immediately upon the completion of the
field research and data entry, we carefully examined differences in the
average scores of each variable between each member of the research team
using a one-way ANOVA with post hoc tests. We identified all
statistically significant differences between the mean scores for
observations by different members of the research team. Second, through
detailed examination of field notes and group discussions, we determined
whether any of these differences might be attributed to systematic
differences in observation techniques as opposed to differences in the
unique sets of programs observed by each researcher. Two types of
systematic differences emerged. In the first case, one researcher was
systematically higher or lower than the other three on a particular
measurement scale. In these cases, scoring procedures were reviewed,
consensus definitions were refined, and that one researcher re-coded the
variable based on these definitions and their qualitative program notes.
Variables that were re-coded in this manner included comfort of the
interpreter, passion, apparent knowledge, sincerity, provocation,
holistic story, and appropriateness for the audience. In the second
case, a researcher had misinterpreted the response scale (scoring
values) of the variable being coded. Again, a consensus definition was
clarified and re-coding of that variable took place. These variables
included cognitive engagement, clear theme, and central message. In one
case, a variable was removed due to inconsistent interpretation of its
definition in the field: place-based messaging.
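As a rough illustration of the first calibration step, the following sketch compares observers' mean scores with a one-way ANOVA and a Tukey post hoc test. The study conducted its analyses in SPSS; the data layout and column names here are assumptions:

```python
import pandas as pd
from scipy.stats import f_oneway
from statsmodels.stats.multicomp import pairwise_tukeyhsd

def check_observer_calibration(df: pd.DataFrame, variable: str,
                               alpha: float = 0.05):
    """Flag a variable whose mean scores differ significantly across observers."""
    data = df[["observer", variable]].dropna()
    groups = [g[variable].to_numpy() for _, g in data.groupby("observer")]
    f_stat, p_value = f_oneway(*groups)
    if p_value < alpha:
        # Post hoc comparisons identify which observer pairs differ
        print(pairwise_tukeyhsd(data[variable], data["observer"]))
    return f_stat, p_value
```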
Data entry and cleaning
Post-program surveys and program audits were coded and entered into
a Microsoft Access database and Microsoft Excel spreadsheets. Data were
then transferred to SPSS for screening and analysis. The
visitor survey data were first screened for missing values and any
surveys missing more than 50% of the items per factor were removed. A
total of 118 respondents were removed as a result. Data were then
screened for univariate and multivariate outliers on outcome variables
following Tabachnick and Fidell (2007) using Mahalanobis Distance (MAH)
and studentized deleted residuals (SDRESID). A total of 58 cases were
removed for exceeding +/- 3 standard deviations, or the criterion
Mahalanobis Distance value. This reduced our sample to 3,427 individual
surveys from 376 interpretive programs.
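A minimal sketch of this screening procedure follows, assuming the outcome items sit in columns of a pandas DataFrame (column names hypothetical), with the critical Mahalanobis value drawn from the chi-square distribution as a common criterion (see Tabachnick & Fidell, 2007):

```python
import numpy as np
import pandas as pd
from scipy.stats import chi2, zscore

def screen_outliers(df: pd.DataFrame, outcome_cols: list,
                    p_crit: float = 0.001) -> pd.DataFrame:
    """Drop univariate (+/- 3 SD) and multivariate (Mahalanobis) outliers."""
    X = df[outcome_cols].to_numpy(dtype=float)
    # Univariate screen: cases beyond +/- 3 standard deviations on any item
    uni_ok = (np.abs(zscore(X, axis=0)) <= 3).all(axis=1)
    # Multivariate screen: squared Mahalanobis distance vs. chi-square criterion
    centered = X - X.mean(axis=0)
    inv_cov = np.linalg.inv(np.cov(X, rowvar=False))
    mah_sq = np.einsum("ij,jk,ik->i", centered, inv_cov, centered)
    multi_ok = mah_sq <= chi2.ppf(1 - p_crit, df=len(outcome_cols))
    return df[uni_ok & multi_ok]
```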
Interpretive program sample development and data cleaning
Because the interpretive program is the unit of analysis in this
study, we aggregated individual data at the program level by calculating
the mean score of each visitor outcome for each program. To do so, we
first needed to determine how many completed surveys within a particular
program would serve as a viable reflection of the quality of that
program and its impacts on visitors. Prior research suggests that
programs with particularly small numbers of attendees may be inherently
different than programs with larger numbers of attendees (Forist, 2003;
McManus, 1987, 1988; Moscardo, 1999). In particular, programs with fewer
than five attendees may have a high likelihood of serving only a single
cohesive group (e.g., a single family). Meanwhile, programs with five or
more have a higher likelihood of being composed of multiple groups.
Moreover, a greater number of survey responses enhances the reliability
of the research findings. Based on this rationale, we separated programs
with fewer than five attendees from those with five or more attendees,
and analyzed them separately.
For programs with five or more attendees, we included in the analysis
all those with 10 or more survey respondents. We only
included those programs with fewer than 10 respondents if the number of
respondents represented at least half of the eligible respondents at the
program (those over the age of 15). This yielded a total of 272 programs
with five or more attendees for analysis.
For programs with fewer than five attendees (n = 45), we only
included those in which all eligible respondents (those over the age of
15) completed a survey. If a census was not achieved, the program was
dropped from further analysis. This resulted in the removal of five of
these smaller programs, leaving 40 in the sample for further analysis.
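Taken together, the inclusion rules for both size classes reduce to a simple decision function, sketched below (a hypothetical helper; "eligible" counts attendees over the age of 15):

```python
def include_program(attendees: int, eligible: int, respondents: int) -> bool:
    """Apply the program-level inclusion rules described above."""
    if attendees >= 5:
        # Larger programs: 10+ surveys, or at least half of eligible attendees
        return respondents >= 10 or respondents >= eligible / 2
    # Smaller programs: a census of all eligible attendees is required
    return respondents == eligible
```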
Results
Index development: Dependent variables
Before conducting further analyses, we conducted exploratory and
confirmatory factor analyses to explore the relationships between items
and to form factors, each made up of multiple items representing a concept. The
items that vary together as part of a factor can be combined to create
scales or composite indexes that represent coherent concepts for use in
subsequent analyses (DeVellis, 2003). Following procedures outlined by
DeVellis (2003) we conducted exploratory and confirmatory factor
analysis on dependent variables using the individual respondent data.
Exploratory factor analyses and reliability analyses revealed the
presence of two latent factors. Confirmatory factor analysis (CFA),
which is a form of structural equation modeling, further refined the
structure of these two factors. The resulting CFA model confirmed two
factors while also providing a more parsimonious solution. Model fit
statistics were all within the acceptable range (S-B χ² = 338.41;
CFI = .96; RMSEA = .08). We labeled the resulting factors
Visitor Experience and Appreciation and Behavioral Intentions (Table 5).
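For readers wishing to approximate the exploratory step, a rough Python analogue is sketched below. The study used SPSS and structural equation modeling software, so this is illustrative only; the file and column names are hypothetical:

```python
import pandas as pd
from sklearn.decomposition import FactorAnalysis

surveys = pd.read_csv("visitor_surveys.csv")  # hypothetical data file
items = [c for c in surveys.columns if c.startswith("outcome_")]

# Two-factor exploratory solution with varimax rotation
efa = FactorAnalysis(n_components=2, rotation="varimax")
efa.fit(surveys[items].dropna())

# Loadings show which items cluster on each of the two latent factors
loadings = pd.DataFrame(efa.components_.T, index=items,
                        columns=["visitor_experience", "behavioral_intentions"])
print(loadings.round(2))
```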
These factors form two of the three outcomes employed in this
study. The first factor reflects an overall assessment of the impact of
the program on the individual's experience, attitudes, and
knowledge. Taken as a whole, it may be the best reflection of the first
two elements of the classic statement from an old NPS manual quoted by
Tilden (1957), "Through interpretation, understanding; through
understanding, appreciation; through appreciation, protection." The
Behavioral intentions factor relates to the third part of the classic
quote, actually influencing the behavior of visitors in some way. The
third outcome, satisfaction, was measured through a single survey item:
"On a scale of 0 to 10, 10 being the best, please rate your overall
level of satisfaction with the program you just attended."
Composite indexes were created for each of the factors by equally
weighting each item and taking the average of all items within the
index. Table 6 shows the individual items that comprise each resulting
index, as well as Cronbach's alpha scores for each. Cronbach's
alpha is a measure of internal consistency of each index and can range
from 0 to 1. Cronbach's alpha scores above 0.7 are considered
acceptable for developing indexes (DeVellis, 2003). Higher
Cronbach's alpha scores indicate greater internal consistency of
the index. Both indexes were found to be highly reliable.
Index development: Independent variables
To explore the relationships between the individual program
characteristics, we conducted exploratory factor analyses and
reliability analyses on program observations. We did not conduct
confirmatory factor analysis in this case because program characteristics
are formative variables that are observed and represent a specific
practice or attribute thought to directly influence a dependent variable.
This contrasts with reflective indicators, which are thought to represent
a broader concept and are not directly observed (see Kline, 2005;
Diamantopoulos & Siguaw, 2006; Jarvis et al., 2003; Podsakoff et
al., 2003, for further explanation). Exploratory factor analyses and
reliability analyses on program level data revealed the presence of four
latent factors: two interpreter characteristics and two program
characteristics. We named the two interpreter characteristics factors
"confidence" and "authentic emotion and charisma," and the two program
characteristics factors "organization" and "connection." The items
making up each factor are
included in Table 6.
The confidence factor generally reflects the notion that the
interpreter appears in control of the program and is comfortable with
what they are presenting. We use the term authentic emotion and charisma
to denote a special sort of identity that the interpreter exudes to his
or her audience. Interpreters scoring high on this factor showed
apparent and obvious passion and care for what they were interpreting
and were generally likeable. Organization reflects many of the best
practices taught by the National Park Service's Interpretive
Development Program in addition to the writings of Sam Ham (e.g., Ham,
1992). Meanwhile, Connection strongly reflects the core elements of
Tilden's classic core principles (Tilden, 1957).
While the factor analyses revealed that confidence, authentic
emotion and charisma, organization, and connection are separate
constructs, they are also moderately correlated with each other (r
ranges from .357 to .623). This suggests that when an interpreter scores
highly on any one of these indexes, he or she is likely to score highly
on the others as well.
Visitor characteristics
All descriptive statistics reported below are calculated only from
the 312 programs that met our sampling criteria. More than half of the
respondents to the surveys were female (56.4%). The ages of respondents
ranged from 16 to 88, with a mean of 45 and a median of 46. Eighty-seven
percent of respondents described themselves as White and not of Hispanic
descent. Roughly 7% described themselves as Hispanic (3.6%) or Asian
(3.6%). Only 34 respondents (1.1%) described themselves as Black and not
of Hispanic descent; 15 respondents identified themselves as Native
American and 25 respondents identified themselves as "other."
Twenty-five respondents marked more than one category. Roughly 5% were
from a country other than the United States. For comparison, a 2009
survey of U.S. residents conducted by the National Park Service
estimated that roughly 78% of all visitors to National Park units were
White; roughly 9% were Hispanic; roughly 7% were African American;
roughly 3% were Asian; and roughly 1% were Native American (Taylor et
al. 2010). Less than 5% of survey respondents attended the program
alone. More than half (50.8%) were visiting with children. Most (59.1%)
had been in the park less than one full day when they attended the
program, and 37.4% had attended a ranger-led program in the same park
prior to the one they were attending on the day they were surveyed.
Descriptive statistics: Outcomes
Table 7 displays the means and standard deviations of each outcome
variable for programs with five or more attendees and for smaller
programs. While satisfaction and visitor experience and appreciation
consistently scored highly, items associated with behavioral intentions
were more evenly distributed. Visitor satisfaction scores ranged from 5
to 10 on the 0 to 10 scale and 95% of respondents scored above the
midpoint on the visitor experience and appreciation index. Meanwhile,
43% of respondents scored above the midpoint on the behavioral
change index. There were no statistically significant differences in
visitor outcome scores between larger programs and programs with fewer
than five attendees.
Descriptive statistics: Program types and attendees
We attempted to investigate 488 programs. Only 376 programs
actually occurred. Programs were cancelled for a range of reasons,
including weather, a lack of attendees, or failure of the interpreter
to appear. Data from 312 programs were used for analyses in this paper.
Advertised program lengths for these programs ranged from 15 minutes to
four hours. Actual program lengths ranged from 10 minutes to three and a
half hours. The average program length was just over 48 minutes.
One-hundred and ninety-eight (64%) of the programs focused primarily on
cultural heritage; 74 (24%) had a primary focus on the natural
environment. Thirty-three (11%) had a dual focus, and the remainder had
neither as a central focus (for example, general orientation talks). Programs
included guided tours, talks, demonstrations, hands-on activities, and
multi-media presentations. Guided tours and stationary talks made up
over 80% of the programs we observed. Seventy-two percent of programs
took place outdoors; 20% took place indoors; and others used both indoor
and outdoor settings. The breakdowns of program lengths and types were
roughly similar for programs in the two different size classes described
above.
The number of attendees at each program ranged from one person to
approximately 600 people. The median number of attendees was 17. Only
17% of the programs had no children in their audiences. Forty programs
(13%) ended with fewer attendees than they had begun with. Forty-eight
programs (15%) were at least 20% shorter than advertised; 53 programs
(17%) were at least 20% longer than advertised. Thirteen (4%) of the
programs experienced notably bad weather. No significant differences
were noted in program length or weather-related variables when comparing
small (fewer than five attendees) with larger programs.
Descriptive statistics: Interpreter characteristics
Two-hundred and seventy-one (87%) of the observed interpreters were
park rangers; 37 were volunteers, and five were concessionaires.
Sixty-four percent were male. Nineteen percent were under the age of 25;
23% were between the ages of 25 and 34; 24% were between the ages of 35
and 50; and 34% were over 50 years old. The interpreters averaged 9.6
years of experience in the NPS and 7.1 years in interpretation at their
current park unit. Nearly one quarter of the interpreters (24.7%) had
presented the program we observed at least 100 times before. More than
one-third (36.0%) had presented the program at least 50 times before.
Nearly one-third (32.6%) had presented the program 10 or fewer times.
For seven interpreters, this was their first time presenting the program
we observed.
We asked interpreters prior to their programs to indicate their
intended visitor outcomes for that program (Table 8). The most commonly
noted intended outcome was providing the audience with new knowledge.
Most (90%) noted more than one intended outcome. We also asked
interpreters how their programs were developed (Table 9). Most reported
developing their own programs with little guidance beyond a suggested
topic.
We asked a subset of interpreters (n = 188) about their level of
excitement about the program they were about to present. The level of
excitement averaged 7.81 on a 10-point scale, with responses ranging
from 2 to 10 on the scale. Seven percent ranked their level of
excitement below the midpoint (5) on the scale; 4% selected the
midpoint; and 89% rated their level of excitement above the midpoint.
Descriptive statistics: Interpreter delivery styles
Tables 10 and 11 display descriptive statistics of each of the
interpreter delivery styles observed in the study. Table 10 contains
ordinal variables (variables that are measured on an increasing scale).
Table 11 contains binary and categorical variables, or those in which
the presence or absence of the characteristics is the essential feature
being measured. Means comparisons, chi-square tests, and effect size
calculations revealed few meaningful differences between the two size
classes of programs. Interpreters typically scored slightly lower on the
confidence index in smaller groups (t = 2.0; p = 0.042; Cohen's
d = 0.38). We also more commonly observed the "friend" identity
in smaller groups (χ² = 8.0; p = 0.005).
Descriptive statistics: Program characteristics
Tables 12 and 13 display descriptive statistics for each of the
program characteristics observed in the study. Table 12 displays ordinal
variables, while Table 13 displays categorical variables. No
statistically significant differences were observed between the two size
classes of programs.
Which practices and approaches most consistently lead to more
positive outcomes for visitors?
Interpreter and program characteristics
Table 14 displays (in rank order) correlations between all ordinal
independent variables (program and interpreter characteristics) and
visitor outcomes for programs with five or more attendees. Statistical
significance is displayed in two ways within the table. A single
asterisk indicates that the correlation is statistically significant at
p < 0.05. A double asterisk indicates that the correlation is
statistically significant at p < 0.01. As such, the stronger
relationships are those with two asterisks. These are bolded and
italicized for ease of interpretation. Cells with no asterisks represent
no statistically significant relationships between the variables.
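A table of this form can be generated directly from program-level data. The sketch below (hypothetical column names) computes Pearson correlations, attaches significance stars, and ranks the results:

```python
import pandas as pd
from scipy.stats import pearsonr

def correlation_table(df: pd.DataFrame, predictors: list,
                      outcome: str) -> pd.DataFrame:
    rows = []
    for var in predictors:
        valid = df[[var, outcome]].dropna()
        r, p = pearsonr(valid[var], valid[outcome])
        stars = "**" if p < 0.01 else ("*" if p < 0.05 else "")
        rows.append({"predictor": var, "r": round(r, 3), "sig": stars})
    # Rank order, strongest positive correlations first
    return pd.DataFrame(rows).sort_values("r", ascending=False)
```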
Behavioral theory elements were observed in 42 programs overall,
including 31 with five or more attendees. Only one behavioral theory
element showed a statistically significant correlation with the behavior
change index, "costs of action" (r = .597, p < .001). This
suggests that programs that explicitly addressed the costs of
undertaking a potential behavior were generally more successful at
influencing behavior change intentions than others.
T-tests and ANOVAs were performed to examine the relationships of
categorical variables upon visitor outcomes. These variables included
fact-based messaging, unexpected positive and negative circumstances,
pace, bias, impatience, inequitable treatment of the audience,
questionable information, use of props, and interpreter identities.
Tables 15 and 16 summarize only the statistically significant
relationships observed in the data. To facilitate interpretation of the
t-tests, we calculated Cohen's d for each of the statistically
significant associations. Cohen's d is an effect size measure that
provides an assessment of the meaningfulness of the difference between
groups. Cohen (1988) suggested that even statistically significant
differences may not be meaningful in a practical sense. They may rather
be an artifact of large sample sizes. Cohen posited that meaningful
differences begin at d = 0.2. Differences near 0.2 may be considered
small, while those approaching 0.5 are considered medium and 0.8 large.
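Cohen's d for an independent-samples comparison is simple to compute; the sketch below uses the pooled standard deviation, one common formulation:

```python
import numpy as np

def cohens_d(a: np.ndarray, b: np.ndarray) -> float:
    """Effect size for two independent groups, using the pooled SD.
    Roughly: 0.2 = small, 0.5 = medium, 0.8 = large (Cohen, 1988)."""
    n_a, n_b = len(a), len(b)
    pooled_var = ((n_a - 1) * a.var(ddof=1) +
                  (n_b - 1) * b.var(ddof=1)) / (n_a + n_b - 2)
    return (a.mean() - b.mean()) / np.sqrt(pooled_var)
```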
Programs in which the interpreter outwardly expressed impatience
with the audience received lower satisfaction and visitor experience and
appreciation scores than others, as did programs with an unexpected
negative occurrence. Programs in which the interpreter employed the
"friend" identity manifested higher satisfaction scores than
others. Meanwhile, programs in which the interpreter employed the
"walking encyclopedia" identity yielded lower behavioral
intention scores than others. Paces that felt too fast or too slow
resulted in lower satisfaction scores. A too-slow pace was related to
lower visitor experience and appreciation scores, and a too-fast pace
was associated with weaker behavioral intentions. No statistically
significant differences were observed for smaller programs (fewer than
five attendees).
Program attrition and outcomes
Program attrition (people leaving a program before it was
completed) was related to both satisfaction and visitor experience and
appreciation for programs with five or more attendees (see Table 17),
suggesting that program attrition may serve as another reasonable
indicator of program quality. Thirty-six of the programs with five or more
attendees experienced attrition. The best predictors of program
attrition for programs with five or more attendees included
interpreters' lack of responsiveness to the audience, inaudibility,
false assumptions about the audience, the identity of the walking
encyclopedia, inappropriate logistics, the use of props, slow pace, lack
of interpreter confidence, a lack of organization of the program, and an
unexpected negative circumstance (see Tables 17 and 18). (3) No other
interpreter or program characteristics exhibited any statistically
significant relationship with program attrition at p < 0.05.
Relationship between interpreter and program characteristics and
outcomes in programs with fewer than five attendees
Fewer statistically significant correlations (p < 0.05) were
observed in programs with fewer than five attendees. In rank order, they
included:
Correlated with Satisfaction:
* Connection index: r = .492, p = .001
* Organization index: r = .420, p = .007
* Appropriate for the audience: r = .337, p = .033
* Humor quality: r = .323, p = .045
Correlated with Visitor experience and appreciation:
* Connection index: r = .438, p = .005
* Organization index: r = .368, p = .020
* Appropriate for the audience: r = .348, p = .028
Correlated with Behavioral intentions:
* Novelty: r = .408, p = .009
Thus, a subset of the variables that predicted positive outcomes in
larger programs predicted similar outcomes in smaller programs. Because
only four programs within this sample experienced attrition, no
additional analyses were conducted pertaining to attrition.
Interpreters' background, excitement, and intentions
For the smaller program sample (those with fewer than five
attendees), no statistically significant relationships were observed
between interpreter backgrounds, level of excitement, program origin, or
intended outcomes and visitor outcomes. Some differences were noted,
however, in the larger sample.
For larger group sizes (five or more attendees), program outcomes
were not related to the age, gender, or experience of interpreters, nor
their degree of autonomy in program development. The interpreters'
degree of excitement about the program was positively correlated with
visitor satisfaction (r = .186; p = 0.013) and visitor experience and
appreciation (r = .153; p = 0.041). Interpreters expressing higher
degrees of excitement also exhibited higher levels of confidence (r =
.324, p < .001) and authentic emotion and charisma (r = .475; p <
.001). Volunteers tended to achieve lower degrees of visitor
satisfaction than did park rangers (means: 8.70 vs. 8.98; t = -2.4; p =
.019; Cohen's d = 0.42).
We examined the relationships between interpreters' intended
outcomes and visitor-reported outcomes by conducting independent samples
t-tests, which compare the means of two groups. In these cases, groups
were defined by the presence of an intended outcome or not. Table 19
summarizes only the statistically significant relationships between
interpreters' intended outcomes and visitor survey responses.
Cohen's d statistics are also provided as effect size estimates.
Visitor experience and appreciation was the most sensitive to
interpreters' intended outcomes, with five different desired
outcomes related to more positive visitor responses. Satisfaction was
related to a subset of these items. Only one intention was negatively
related to visitor outcomes. Interpreters who were aiming to increase
visitors' knowledge as a primary outcome of their program generally
achieved lower visitor experience and appreciation scores. Two intended
outcomes were positively related to reported behavioral intentions by
visitors: increasing the audience's level of concern and changing
visitors' behaviors.
Discussion
The study sought to determine which practices and approaches most
consistently lead to more positive outcomes for live interpretive
programs' attendees. In this manuscript, we have limited our
analyses to bivariate relationships between practices and outcomes
rather than employing multivariate statistics. We did this for two
reasons. First, we wished to examine the individual relationship of each
observed practice and interpreter characteristic with visitor outcomes.
Second, multivariate analyses are used to provide the most parsimonious
statistical model of observed phenomena. In multivariate processes,
certain observed characteristics may be removed from the best
explanatory model if they explain a similar portion of the variance as
another variable, despite being an important part of influencing a
particular outcome (Byrne, 2006). As a result, the multivariate approach
may lead to misinterpretation of the importance (or lack thereof) of
particular practices and program characteristics. If one were to focus
only on the variables contained in the multivariate statistical model,
at the expense of others that covaried with those same variables, there
would be a danger of inappropriately assuming that practices not in the
model are unimportant. In a companion piece, we use structural equation
modeling to develop more parsimonious causal models (see Powell and
Stern, this issue). These multivariate analyses help to illuminate the
inter-relationships of different interpreter and program characteristics
and their roles in influencing outcomes. However, they do not negate the
bivariate relationships shared in this article.
Understanding outcomes
Live interpretive programs across the NPS generally seem to produce
consistently high levels of satisfaction in their attendees. Eighty-five
percent of the analyzed sample rated the program as an 8 or better on
the 0 to 10 satisfaction scale. Such satisfaction skewness is common in
customer satisfaction surveys, and the modal response is typically the
most positive response allowed by the scale (Peterson & Wilson,
1992). The mode in our case was a 9 out of 10. Prior research suggests
that satisfaction assessments may be influenced by social desirability
bias or acquiescence (Peterson & Wilson, 1992). In our case, such
social factors might include some degree of gratitude or sympathy toward
the interpreter regardless of the program quality, leading respondents
to check a positive response. High satisfaction scores might also be
attributed in part to what is known as assimilation effects (Sherif
& Hovland, 1961). In the context of tourism, this means that
expectations are often a stronger driver of satisfaction ratings than
the quality of the actual experience (del Bosque & San Martin,
2008). In other words, if visitors strongly expect an experience to be
positive, they have a high tendency to rate it as such regardless of its
specific qualities. This may of course be the case with visitors to
national parks. Still, the particularly high satisfaction values
observed in this study suggest that few visitors were dissatisfied with
their interpretive experiences. Visitor experience and appreciation also
showed similar trends.
Despite the skewness of the data, we observed significant
statistical relationships between certain program characteristics and
visitor outcomes. The positively skewed dependent variables, however,
suggest that our findings do not necessarily identify the practices that
separate good programs from bad programs. Rather, the findings
illuminate which characteristics most commonly move programs along a
scale from good to better from a visitor's standpoint (see Stern et
al., this issue).
The behavioral intentions outcome was centered closer to the
midpoint of the five-point scale. This is likely due to widely varying
baselines in terms of visitors' behaviors prior to programs (some
visitors wrote on the survey cards things like "I already respect
the parks"). For example, if a visitor is a major park supporter
and an environmentally sensitive visitor, we might expect them to report
no change, despite experiencing what may have been an outstanding
program. Meanwhile, an inexperienced visitor to the same program might
have reported a great deal of change. As such, we might expect muted
results regarding program and interpreter characteristics'
associations with the behavioral intentions outcome. This may in part
explain the smaller number of independent variables associated with
intentions to change behaviors. Other authors have also expressed
concern when measuring intentions and behavior change, especially in
nature-based settings (see Beaumont, 2001; Powell et al., 2008).
What leads to better outcomes?
Interpreters who expressed that a primary goal of their program was
to increase the knowledge of the audience about their program's
topic achieved lower visitor experience and appreciation scores than
others. Those aiming to change their audience's attitudes,
appreciation, understanding, and/or desire to learn achieved more
positive attitudinal outcomes. Interpreters who explicitly aimed to
increase their audience members' levels of concern or change their
behavior were more likely to achieve more positive post-program
behavioral intentions than others.
The best predictors of positive outcomes varied somewhat for
different outcomes. In programs with at least five attendees, the
outcomes satisfaction and visitor experience and appreciation were
correlated with a similar list of program and interpreter
characteristics, including: confidence, authentic emotion and charisma,
appropriateness for the audience, organization, connection, humor
quality, consistency, a clear message, responsiveness, verbal
engagement, audibility, and appropriate logistics and pace. Multisensory
engagement and fact-based messaging (negative relationship) were
additionally related to satisfaction.
Behavioral theory suggests that interpretation (and other
communication/ educational experiences) should not be expected to change
behavior unless a specific behavior is explicitly targeted and
communication is designed to address attitudes relevant to that behavior
(e.g., Ajzen, 1991; Ham et al., 2007). Programs in which the interpreter
explicitly targeted behavior change as an intended outcome (7%) were
more successful at doing so. Programs of this nature that explicitly
addressed the costs of taking that action were the most successful,
supporting Ajzen's (1991) emphasis on both ability and trade-offs
in predicting behavior. Moreover, confidence, authentic emotion and
charisma, a clear message, verbal engagement, and appropriate logistics
showed the strongest statistically significant correlations with the
behavioral intentions outcome. These items mirror theoretical constructs
from multiple disciplines known to be predictive of behavior change,
including credibility and trust in the communicator (Rogers, 1995;
Stern, 2008), empowerment of the message recipient and verbal engagement
(Ajzen, 1991; Stern, 2008), and the elimination of distraction and clear
orientation to place (Moscardo, 1999). For a broader discussion of
behavior change and interpretation see Ham et al. (2007) and Ham (2009).
Figure 1. Best practices for live interpretive programs observed in
the study.
1. Confidence
* Comfort, eloquence, apparent knowledge
2. Authentic emotion and charisma
* Passion, sincerity, charisma
3. Appropriateness for audience
4. Organization
* Quality of introduction, appropriate sequence, effective
transitions, holistic story, clear theme, link between introduction
and conclusion
5. Connection
* Links to intangibles and universal concepts, cognitive
engagement, relevance to audience, affective messaging, provocation
6. Consistency
7. Clear message
8. Responsiveness
9. Audibility
10. Appropriate logistics
11. Verbal engagement
12. Multisensory engagement
13. Appropriate pace
14. Avoid focusing on knowledge gain as the program's central goal
and communicating solely factual information
15. Avoid making uncertain assumptions about the audience
A smaller subset of interpreter and program characteristics was
correlated with outcomes for smaller programs (those with fewer than
five attendees). Connection, organization, and appropriateness for the
audience were each correlated with satisfaction and visitor experience
and appreciation. Humor quality was additionally correlated with
satisfaction. Only novelty was correlated with post-program behavioral
intentions for these smaller programs.
Implications for live interpretation
The study carries implications for both the practice of live
interpretation as well as future research pertaining to best practices.
Figure 1 provides a list of the program characteristics most strongly
associated with the outcomes measured in this study. These "best
practices" cut across multiple contexts (see Powell & Stern,
this issue) and constitute elements of interpretation that could inform
interpretive training both within the National Park Service and beyond.
While humor quality was also positively related to outcomes, we
do not list it as a best practice, as not all programs should
necessarily be funny.
Although each of the practices listed in Figure 1 was statistically
correlated with better outcomes, variability within the sample suggests
that the entire suite of best practices is not a necessary precursor to
a high-quality program. Rather, each of these practices in various
combinations was found to enhance outcomes across a majority of programs
in which they were practiced. A wide range of diverse approaches led to
positive visitor outcomes. As such, we recommend maintaining the freedom
for interpreters to be creative and innovative in their presentations.
This is further supported by correlations between interpreters' own
excitement about a program and positive visitor outcomes.
While many of the "best practices" in Figure 1 speak to
specific interpretive techniques, some, at first glance, appear to exist
outside of the famous "interpretive equation" used in NPS
trainings (Lacome, 2013). The interpretive equation is presented as a
"foundation" for NPS interpretive training and as a tool for
identifying "the elements of successful interpretation" and
the relationships between them. In its simplest form, the equation
states that an interpretive opportunity (IO: "one that provides a
favorable set of circumstances for a meaningful moment of connection
between audience and resource," p. 5) is brought about by knowledge
of the resource (KR), knowledge of the audience (KA), and appropriate
techniques (AT).
The Interpretive Equation: (KR + KA) x AT = IO
Many of the "best practices," in particular confidence,
authentic emotion and charisma, and avoiding a focus on knowledge gain,
do not clearly constitute "knowledge of the resource,"
"knowledge of the audience," or "appropriate
techniques" directly. They are rather the observable manifestations
of internal states specific to individual interpreters during their
programs. Their significance speaks to the importance of the appropriate
translation of the interpretive equation into action. While knowledge of
the resource is critical, it should not necessarily be the focus of
communications within an interpretive setting. Rather, knowledge of the
resource may play a more important role in enhancing the confidence of
the interpreter and allowing his or her own positive emotions and
connections to the resource to show through. Presenters who are more
familiar with their topics generally experience less anxiety (Daly et
al., 1989). When coupled with knowledge of the audience and appropriate
techniques, feelings of self-confidence and freedom to express oneself
might be instrumental in moving from good, or adequate, visitor outcomes
toward more powerful ones. This also suggests that the general
organizational culture in which the interpreter finds herself is likely
important as well. More supportive and empowering cultures may lead to
better performance (Pearce & Sims, 2002; Rafferty & Griffin,
2006). The particular roles of interpreter characteristics vs. program
characteristics are examined in greater detail in a companion article
within this issue (Powell & Stern, this issue).
Implications for future interpretive research
This research suggests that certain interpretive practices are
statistically linked to desired outcomes across a range of contexts.
Without the ability to compare a large sample of programs, this
identification would not have been possible. We thus urge others to
undertake similar forms of research and to learn from our shortcomings.
Even comparative research of just a few programs can shed additional
light on what practices and approaches are linked to more positive
visitor outcomes (see Ballantyne & Packer, 2002, for example).
Our limitations and shortcomings in this effort were many,
including both controllable and uncontrollable factors. Those most
relevant to future research involve the selection and measurement of the
key independent and dependent variables of the study. The treatment (an
interpretive program in a national park setting) is a complex phenomenon
that is influenced by an interaction between the resource and its
qualities, the social environment, including the makeup of social
groups, the characteristics of the interpreter and the individual
attendees, and the topic and characteristics of the program (Powell et
al., 2009). This research focuses on the relationships between visitor
outcomes and selected interpreter and program characteristics. As such,
other potential influences are not accounted for.
Our experience revealed that producing consistent and reliable
observational results required considerable and iterative training,
feedback, and adjustment across the team. This is a well-known challenge in any
research using a team of human observers, who have a tendency to cling
to their own personal biases or sometimes idiosyncratic interpretations
of similar events (Jacobs et al., 2012). In an ideal situation,
additional pilot testing and assimilation of the team toward consistent
definitions could take place and programs would be consistently observed
in pairs, rather than by individuals.
Selecting dependent variables was challenging due to the wide
diversity of program content and formats included in this study. Visitor
survey items were designed to be rather general in their content so as
to be appropriate and relevant to all programs. The general nature of
the outcome measures may also have contributed to a "ceiling effect,"
in which individuals (in this case, NPS visitors) enter an experience
already scoring near the top of the scale on the outcomes considered (in
this case, the specific attitudes and intentions measured in the study).
Such respondents would report little to no change on an outcome measure
because their attitudes or intentions were already at the high end of
the spectrum, and the survey items may not be sensitive enough to detect
the influence of a program.
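The ceiling effect can be illustrated numerically. The simulation below (with hypothetical parameters, not our data) shows how a real change in attitudes is truncated in observed responses when baseline scores already sit near the top of a 1-to-5 scale.

```python
# Illustrative simulation of a ceiling effect on a 1-5 response scale.
# All numbers are hypothetical assumptions for demonstration only.
import numpy as np

rng = np.random.default_rng(0)
baseline = np.clip(rng.normal(4.6, 0.4, 10_000), 1, 5)  # visitors start near the top
true_effect = 0.5                                       # actual attitude change
observed = np.clip(baseline + true_effect, 1, 5)        # responses cannot exceed 5

print(f"True mean change:     {true_effect:.2f}")
print(f"Observed mean change: {(observed - baseline).mean():.2f}")  # attenuated
```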
We urge future researchers to develop more sensitive dependent
variables and, if possible, to include a control group. In particular,
other researchers have found that multiple measures of satisfaction with
both positive and negative wording can produce more variability
(Peterson & Wilson, 1992). A rigorous approach to control group
sampling might involve a design similar to our own (see endnotes) with a
larger sample of non-participants. Alternatively, researchers might
consider comparison groups exposed to similar interpretation that
differs in only a few variables (or ideally one experimental variable)
at a time.
Conclusions
Overall, our analysis suggests that Tilden (1957), writing more than
50 years ago, was right about a great deal. Programs that are relevant
to the audience, tell holistic stories, provoke the audience to reflect,
and move beyond facts into the realm of revelation tend to produce
better visitor outcomes than programs that are fact-based and detached
from the audience's lives. The analysis also suggests that more recent
interpretive texts and training programs include numerous ideas that can
enhance the interpretive experience, including the passion of the
interpreter (e.g., Beck & Cable, 2002; Ward & Wilkinson, 2006),
the organization of the material (e.g., Ham, 1992; Larsen, 2003), the
importance of a central message (e.g., Ham, 1992; Jacobson, 1999), the
connection of tangible objects to intangible meanings and universal
concepts (NPS, 2003), and multiple forms of engagement and
responsiveness (Beck & Cable, 2002; Knudson et al., 2003; Lewis,
2005; Moscardo, 1999). The study also revealed some factors that appear
less regularly in existing training programs but are certainly not
surprising. In essence, the study revealed that the sincerity, passion,
confidence, and delivery style of individual interpreters matter as much
as the planning and content of the program itself.
We echo Tilden (1957) in believing that "interpretation is an art
... and that any art is in some degree teachable." We hope that the
results of this study can contribute to the learning process of the
committed individuals around the world who care deeply enough about our
world to call themselves "interpreters."
Literature Cited
Ajzen, I. (1991). The theory of planned behavior. Organizational
Behavior and Human Decision Processes, 50(2), 179-211.
Ajzen, I. (1992). Persuasive communication theory in social
psychology: A historical perspective. In M. Manfredo (Ed.), Influencing
human behavior: Theory and applications in recreation, tourism, and
natural resources management. Champaign, IL: Sagamore Publishing.
Babbie, E. (2007). The Practice of Social Research. Belmont, CA:
Wadsworth.
Ballantyne, R., & Packer, J. (2002). Nature-based excursions:
School students' perceptions of learning in natural environments.
International Research in Geographical and Environmental Education,
11(3), 218-236.
Beck, L., & Cable, T. T. (2002). Interpretation for the 21st
century: Fifteen guiding principles for interpreting nature and culture
(2nd ed.). Champaign: Sagamore.
Beaumont, N. (2001). Ecotourism and the conservation ethic:
Recruiting the uninitiated or preaching to the converted? Journal of
Sustainable Tourism, 9(4):317-341.
Brochu, L., & Merriman, T. (2002). Personal interpretation:
Connecting your audience to heritage resources. Fort Collins, CO:
InterpPress.
Chesebro, J. L., & Wanzer, M. B. (2006). Instructional message
variables. Handbook of instructional communication: Rhetorical and
relational perspectives (pp. 89-116). Boston: Allyn & Bacon.
Cohen, J. (1988). Statistical Power Analysis for the Behavioral
Sciences (second ed.). Hillsdale, NJ: Lawrence Erlbaum Associates.
Daly, J. A., Vangelisti, A. L., Neel, H. L., & Cavanaugh, P. D.
(1989). Pre-performance concerns associated with public speaking
anxiety. Communication Quarterly, 37(1), 39-53.
del Bosque, I. R., & San Martin, H. (2008). Tourist
satisfaction: A cognitive-affective model. Annals of Tourism Research,
35(2): 551-73.
DeVellis, R. F. (2003). Scale development: Theory and applications
(2nd ed.). Thousand Oaks, CA: Sage Publications.
Diamantopoulos, A., & Winklhofer, H. M. (2001). Index
construction with formative indicators: An alternative to scale
development. Journal of Marketing Research, 38(2), 269-277.
Eppley Institute for Parks and Public Lands. (2012). "About
interpretation." http://idp.eppley.org/about-interpretation. Last
accessed July 16, 2012.
Finn, A. N., Schrodt, P., Witt, P. L., Elledge, N., Jernberg, K.
A., & Larson, L. M. (2009). A meta-analytical review of teacher
credibility and its associations with teacher behaviors and student
outcomes. Communication Education, 58(4), 516-537.
Forist, B. (2003). Visitor use and evaluation of interpretive
media: A report on visitors to the National Park System. National Park
Service Visitor Services Project.
http://nature.nps.gov/socialscience/docs/Visitor_Use_and_Evaluation.pdf.
Accessed May 24, 2012.
Frauman, E., & Norman, W.C. (2003). Managing visitors via
"mindful" information services: One approach in addressing
sustainability. Journal of Park and Recreation Administration, 21(4):
87-104.
Ham, S. H. (1992). Environmental interpretation: A practical guide
for people with big ideas and small budgets. Golden: Fulcrum.
Ham, S. (2009). From interpretation to protection: Is there a
theoretical basis? Journal of Interpretation Research, 14(2), 49-57.
Ham S. H. (2013). Interpretation: Making a difference on purpose.
Golden, CO: Fulcrum.
Ham, S. H., Brown, T. J., Curtis, J., Weiler, B., Hughes, M., &
Poll, M. (2007). Promoting persuasion in protected areas: A guide for
managers. Developing strategic communication to influence visitor
behavior. Southport, Queensland, Australia: Sustainable Tourism
Cooperative Research Centre.
Ham, S. H., & Weiler, B. (2002). Tour guide training: A model
for sustainable capacity building in developing countries. Journal of
Sustainable Tourism, 10(1), 52-69.
Interpretive Development Program. (2008). National Park Service:
Professional Standards for Learning and Performance. U.S. Department of
the Interior.
Jacobs, W. J., Sisco, M., Hill, D., Malter, F., & Figueredo, A. J.
(2012). On the practice of theory-based evaluation: Information, norms,
and adherence. Evaluation and Program Planning, 35(3), 354-369.
Jacobson, S. K. (1999). Communication skills for conservation
professionals. Washington, DC: Island Press.
Jarvis, C. B., MacKenzie, S. B., & Podsakoff, P. M. (2003). A
critical review of construct indicators and measurement model
misspecification in marketing and consumer research. Journal of Consumer
Research, 30(2), 199-218.
Kline, R.B. (2005). Principles and practice of structural equation
modeling. New York: The Guilford Press.
Knapp, D., & Benton, G. M. (2004). Elements to successful
interpretation: A multiple case study of five national parks. Journal of
Interpretation Research, 9(2), 9-25.
Knapp, D. & Yang, L. (2002). A phenomenological analysis of
long-term recollections of an interpretive program. Journal of
Interpretation Research, 7(2), 7-17.
Knudson, D. M., Cable, T. T., & Beck, L. (2003). Interpretation
of cultural and natural resources (2nd ed.). State College: Venture
Publishing.
Lacome, B. (2013). The interpretive equation.
http://idp.eppley.org/IDP/sites/default/files/EquationEssay.pdf.
Accessed April 6, 2013.
Larsen, D. L. (2003). Meaningful interpretation: How to connect
hearts and minds to places, objects, and other resources. Fort
Washington, PA: Eastern National.
Lewis, W. J. (2005). Interpreting for park visitors (9th ed.). Fort
Washington: Eastern National.
Madin, E. M. P., & Fenton, D. M. (2004). Environmental
interpretation in the Great Barrier Reef Marine Park: An assessment of
programme effectiveness. Journal of Sustainable Tourism, 12(2), 121-137.
McCroskey, J.C., Richmond, V.P., & Stewart, R.A. (1986). One on
one: The foundations of interpersonal communication. Englewood Cliffs,
NJ: Prentice-Hall.
McManus, P. M. (1987). It's the company you keep ... The social
determination of learning-related behaviour in a science museum. The
International Journal of Museum Management and Curatorship, 6, 263-270.
McManus, P. M. (1988). Good companions: More on the social
determination of learning-related behaviour in a science museum. The
International Journal of Museum Management and Curatorship, 7, 37-44.
Mehrabian, A. (1969). Attitudes inferred from non-immediacy of
verbal communications. Journal of Verbal Learning and Verbal Behavior,
6, 294-295.
Mills, E. (1920). Adventures of a nature guide. New York: Doubleday,
Page & Company.
Moscardo, G. (1999). Making visitors mindful: Principles for
creating quality sustainable visitor experiences through effective
communication. Champaign: Sagamore.
National Association for Interpretation (NAI). (2012). Mission,
vision and core values.
http://www.interpnet.com/about_nai/mission.shtml. Accessed May 24, 2012.
National Park Service. (2003). Module 101: Fulfilling the NPS mission:
The process of interpretation. Washington, DC: National Park Service.
Pearce, C. L., & Sims, H. P. (2002). Vertical versus shared
leadership as predictors of the effectiveness of change management
teams: An examination of aversive, directive, transactional,
transformational, and empowering leader behaviors. Group Dynamics:
Theory, Research, and Practice, 6(2), 172-197.
Peterson, R. A., & Wilson, W. R. (1992). Measuring customer
satisfaction: Fact and artifact. Journal of the Academy of Marketing
Science, 20(1), 61-71.
Podsakoff, P. M., MacKenzie, S. B., Lee, J. Y., & Podsakoff, N. P.
(2003). Common method biases in behavioral research: A critical review
of the literature and recommended remedies. Journal of Applied
Psychology, 88(5), 879-903.
Powell, R.B., Skibins, J.C. & Stern, M.J. (2010) Linking
interpretation best practices with outcomes: A review of literature.
Clemson University and U.S. National Park Service, National Education
Council.
Powell, R.B., Kellert, S.R., & Ham, S.H. (2008). Antarctic
tourists: Ambassadors or consumers? Polar Record, 44(230): 233-241.
Powell, R. B., Kellert, S. R., & Ham, S. H. (2009).
Interactional theory and the sustainable nature-based tourism
experience. Society & Natural Resources, 22(8), 761-776.
Rafferty, A.E. & Griffin, M.A., (2006). Refining individualized
consideration: distinguishing developmental leadership and supportive
leadership. Journal of Occupational and Organizational Psychology,
79(1), 37-61.
Regnier, K., Gross, M., & Zimmerman, R. (1992). The
Interpreter's Guidebook: Techniques for Programs and Presentations.
Interpreter's Handbook Series. Stevens Point: UWSP Foundation
Press.
Rogers, E. M. (1995). Diffusion of innovations (4th ed.). New York:
Free Press.
Sharpe, G. W. (1976). Interpreting the environment. New York: John
Wiley & Sons.
Sherif, M., & Hovland, C. (1961). Social judgment: Assimilation
and contrast effects in communication and attitude change. New Haven:
Yale University Press.
Skibins, J. C., Powell, R. B., & Stern, M. J. (2012). Exploring
empirical support for interpretation's best practices. Journal of
Interpretation Research.
Stern, M.J. (2008). Coercion, voluntary compliance, and protest:
the role of trust and legitimacy in combating local opposition to
protected areas. Environmental Conservation, 35(3): 200-210.
Stern, M.J. & Powell, R.B. (2011). The Views of National Park
Service Superintendents and Interpretation and Education Supervisors on
Interpretation and Education in the National Park Service. Submitted to
the National Education Council of the U.S. National Park Service.
October, 2011.
Stern, M. J., Powell, R. B., & Hockett, K. S. (2011). Why do they
come? Understanding interpretive program attendance at Great Smoky
Mountains National Park. Journal of Interpretation Research, 16(2),
35-52.
Tabachnick, B. G., & Fidell, L. S. (2007). Using multivariate
statistics (5th ed.). Needham Heights, MA: Allyn & Bacon.
Taylor, P. A., Grandjean, B. D., & Gramann, J. H. (2011). National
Park Service comprehensive survey of the American public 2008-2009:
Racial and ethnic diversity of National Park System visitors and
non-visitors. Natural Resource Report NPS/NRSS/SSD/NRR-2011/432.
Laramie, WY: Wyoming Survey & Analysis Center, University of Wyoming.
http://nature.nps.gov/socialscience/docs/CompSurvey2008_2009RaceEthnicity.pdf.
Tilden, F. (1957). Interpreting our heritage (3rd ed.). Chapel
Hill: The University of North Carolina Press.
Veverka, J. A. (1998). Interpretive master planning: the essential
planning guide for interpretive centers, parks, self guided trails,
historic sites, zoos, exhibits and programs (2nd ed.). Tustin: Acorn
Naturalists.
Wallace, G. N., & Gaudry, C. J. (2002). An evaluation of the
"authority of the resource" interpretive technique by rangers
in eight wilderness/backcountry areas. Journal of Interpretation
Research, 7, 43-68.
Ward, C. W., & Wilkinson, A. E. (2006). Conducting meaningful
interpretation: A field guide for success. Golden, CO: Fulcrum.
Wearing, S., & Wearing, B. (2001). Conceptualizing the selves
of tourism. Leisure Studies, 7, 11-23.
Notes
(1.) Based on a review of web pages of all park units at the time
of the research (www.nps.gov).
(2.) Our original research design also included administering
shorter pre-experience surveys at different, but similar programs across
the parks in our sample. These surveys contained two batteries of survey
items that could be compared to the post-experience surveys to create a
control group against which to compare outcomes. Unfortunately, an
insufficient number of these surveys were administered at most parks to
create a reliable control group. As a result, we did not include these
data in further analyses.
(3.) Our field observations suggest that the association between
the use of props and increased attrition may be influenced by cases in
which not all visitors were able to engage with the prop(s). This may
have motivated their departure.
Marc J. Stern
Department of Forest Resources and Environmental Conservation,
Virginia Tech
Robert B. Powell
Department of Parks, Recreation and Tourism Management and School of
Agricultural, Forest and Environmental Sciences, Clemson University
Table 1. Park units included in the study.

Park Unit | Resource Focus | Park Location | Annual Recreation Visits (a)
Aztec Ruins National Monument | Cultural | Remote | 37,437
Badlands National Park | Natural | Remote | 977,778
Bryce Canyon National Park | Natural | Remote | 1,285,492
Chaco Culture National Historical Park | Cultural | Remote | 34,226
Ford's Theater National Historic Site | Cultural | Urban | 662,298
Fort McHenry National Monument and Historic Shrine | Cultural | Urban | 611,582
Gettysburg National Military Park | Cultural | Urban-Proximate | 1,031,554
Grand Canyon National Park | Natural | Remote | 4,388,386
Great Smoky Mountains National Park | Mix | Urban-Proximate | 9,463,538
Harpers Ferry National Historical Park | Cultural | Urban-Proximate | 268,822
Independence National Historical Park | Cultural | Urban | 3,751,007
Jefferson National Expansion Memorial | Cultural | Urban | 2,436,110
Jewel Cave National Monument | Natural | Remote | 103,462
Lincoln Home National Historic Site | Cultural | Urban | 354,125
Manassas National Battlefield Park | Cultural | Urban-Proximate | 612,490
Mesa Verde National Park | Mix | Remote | 559,712
Mount Rushmore National Memorial | Cultural | Remote | 2,331,237
National Mall | Cultural | Urban | 1,363,389
Navajo National Monument | Mix | Remote | 90,696
Point Reyes National Seashore | Natural | Urban-Proximate | 2,067,271
San Francisco Maritime National Historical Park | Cultural | Urban | 4,130,970
Ulysses S. Grant National Historic Site | Cultural | Urban | 39,967
Wind Cave National Park | Natural | Remote | 577,141
Yosemite National Park | Natural | Remote | 3,901,408

(a) Annual visitation from 2010 (http://www.nature.nps.gov/stats/)
Table 2. Programs observed and total number of surveys collected.

Park unit | Programs attempted | Programs observed | Surveys collected | Programs used in analyses | Surveys used in analyses
Aztec Ruins National Monument | 4 | 2 | 4 | 2 | 4
Badlands National Park | 22 | 19 | 157 | 14 | 118
Bryce Canyon National Park | 12 | 12 | 133 | 12 | 127
Chaco Culture National Historical Park | 9 | 8 | 85 | 7 | 70
Ford's Theater National Historic Site | 20 | 20 | 519 | 18 | 448
Fort McHenry National Monument and Historic Shrine | 23 | 14 | 133 | 11 | 113
Gettysburg National Military Park | 26 | 21 | 206 | 18 | 186
Grand Canyon National Park | 30 | 30 | 384 | 28 | 363
Great Smoky Mountains National Park | 19 | 14 | 96 | 12 | 86
Harpers Ferry National Historical Park | 21 | 15 | 100 | 12 | 79
Independence National Historical Park | 36 | 22 | 156 | 17 | 122
Jefferson National Expansion Memorial | 22 | 16 | 146 | 14 | 135
Jewel Cave National Monument | 20 | 20 | 190 | 18 | 177
Lincoln Home National Historic Site | 18 | 14 | 89 | 10 | 72
Manassas National Battlefield Park | 20 | 17 | 88 | 15 | 80
Mesa Verde National Park | 14 | 14 | 301 | 14 | 290
Mount Rushmore National Memorial | 23 | 19 | 171 | 9 | 101
National Mall | 47 | 22 | 65 | 16 | 49
Navajo National Monument | 8 | 3 | 23 | 3 | 23
Point Reyes National Seashore | 12 | 9 | 34 | 8 | 32
San Francisco Maritime National Historical Park | 20 | 16 | 69 | 14 | 64
Ulysses S. Grant National Historic Site | 15 | 9 | 40 | 8 | 36
Wind Cave National Park | 18 | 18 | 215 | 13 | 175
Yosemite National Park | 29 | 22 | 199 | 19 | 172
Totals | 488 | 376 | 3,603 | 312 | 3,122
Table 3. Program characteristics observed in the study, their
definitions, and operationalization.

Introduction quality (Brochu and Merriman, 2002; Ham, 1992; Jacobson, 1999)
Definition: Degree to which the introduction captured the audience's attention and oriented (or predisposed) the audience to the program's content and/or message.
Scoring: 3 = Oriented audience and captured attention; 2 = Minimally oriented audience, did not necessarily capture attention; 1 = Poorly executed.

Appropriate logistics (Jacobson, 1999; Knudson et al., 2003)
Definition: Degree to which basic audience and program needs were met (i.e., restrooms, weather, technology, accessibility, shade, etc.).
Scoring: 4 = Well planned and appropriate; 3 = Audience/program needs mostly addressed; 2 = Needs marginally addressed; 1 = Needs not met.

Appropriate for audience (Beck and Cable, 2002; Jacobson, 1999; Knudson et al., 2003)
Definition: Degree to which the program aligned with the audience's ages, cultures, and levels of knowledge, interest, and experience.
Scoring: 5 = Very appropriate; 4 = Appropriate; 3 = Moderately appropriate; 2 = Only slightly appropriate; 1 = Not appropriate.

Appropriate sequence (Beck and Cable, 2002; Ham, 1992; Jacobson, 1999; Larsen, 2003)
Definition: Degree to which the program followed a logical sequence.
Scoring: 4 = Enhanced messaging; 3 = Appropriate; 2 = Choppy; 1 = Detracted from messaging.

Transitions (Beck and Cable, 2002; Brochu and Merriman, 2002; Ham, 1992; Jacobson, 1999; Larsen, 2003)
Definition: Degree to which the program used appropriate transitions that kept the audience engaged and did not detract from the program's sequence.
Scoring: 4 = Enhanced messaging and were smooth; 3 = Appropriate; 2 = Forced or irrelevant; 1 = Detracted from messaging or not present.

Links to intangible meanings and universal concepts (NPS Module 101; Beck and Cable, 2002; Brochu and Merriman, 2002; Ham, 1992; Knudson et al., 2003; Larsen, 2003; Lewis, 2005; Moscardo, 1999; Tilden, 1957; Ward and Wilkinson, 2006)
Definition: Communication connected tangible resources to intangible meanings and universal concepts. Intangibles: stories, ideas, meanings, or significance that tangible resources represent. Universals: concepts that most audience members may identify with.
Scoring: 5 = Extensively developed, powerful concepts; 4 = Well developed; 3 = Present but weak; 2 = Difficult to detect or slightly used; 1 = Clearly not present.

Multisensory (Beck and Cable, 2002; Knudson et al., 2003; Lewis, 2005; Moscardo, 1999; Tilden, 1957; Veverka, 1998; Ward and Wilkinson, 2006)
Definition: Degree to which the program intentionally and actively engaged more than just sight and sound.
Scoring: 3 = Explicit/purposeful inclusion of two senses beyond basic sight and sound; 2 = Actively incorporated a sense beyond passive use of sight and sound, or actively focused upon either of these senses as a vehicle for conveying the message (e.g., "close your eyes and listen"); 1 = Primarily a talk in which the ranger did not explicitly use multiple senses beyond passive use of sight (scenery/objects) and sound (words).

Physical engagement (Beck and Cable, 2002; Knudson et al., 2003; Lewis, 2005; Moscardo, 1999; NPS Module 101; Sharpe, 1976; Tilden, 1957)
Definition: Degree to which the program physically engaged audience members in a participatory experience, i.e., through touching or interacting with the resource.
Scoring: 4 = Central programming element; 3 = Occurred multiple times; 2 = Minimal effort to engage; 1 = No efforts.

Verbal engagement (Knudson et al., 2003; Moscardo, 1999; Sharpe, 1976; Tilden, 1957; Veverka, 1998)
Definition: Degree to which the program verbally engaged audience members in a participatory experience, i.e., dialogue (a two-way discussion).
Scoring: 5 = Central programming element; 4 = Occurred multiple times; 3 = Modestly engaged; 2 = Minimal effort to engage; 1 = No efforts.

Cognitive engagement (Knudson et al., 2003; Moscardo, 1999; Sharpe, 1976; Tilden, 1957; Veverka, 1998)
Definition: Degree to which the program cognitively engaged audience members in a participatory experience beyond simply listening, i.e., calls to imagine something, reflect, etc.
Scoring: 5 = Central programming element; 4 = Occurred multiple times; 3 = Modestly engaged; 2 = Minimal effort to engage; 1 = No efforts.

Multiple activities (Knapp and Benton, 2004; Moscardo, 1999; Ward and Wilkinson, 2006)
Definition: Degree to which the program consisted of a variety of activities and opportunities for direct audience involvement (not including dialogue).
Scoring: 4 = 2+ primary activities included; 3 = 2+ secondary activities included; 2 = One secondary activity included; 1 = One activity only.

Props (Jacobson, 1999; Knapp and Benton, 2004; Ham, 1992; Ward and Wilkinson, 2006)
Definition: A visual aid beyond a screen-based slideshow.
Scoring: 1 = Prop(s) used; 0 = Not used.

Relevance to audience (Beck and Cable, 2002; Brochu and Merriman, 2002; Ham, 1992; Jacobson, 1999; Knapp and Benton, 2004; Lewis, 2005; Moscardo, 1999; NPS Module 101; Sharpe, 1976; Tilden, 1957; Veverka, 1998)
Definition: Degree to which the program explicitly communicated the relevance of the subject to the lives of the audience.
Scoring: 5 = Major focus of messaging; 4 = Well developed efforts; 3 = Moderate efforts; 2 = Minimal efforts; 1 = No efforts.

Affective messaging (Jacobson, 1999; Lewis, 2005; Madin and Fenton, 2004; Tilden, 1957; Ward and Wilkinson, 2006)
Definition: Degree to which the program communicated emotion (in terms of quantity, not quality).
Scoring: 5 = Central programming element; 4 = Frequent and repeated messages; 3 = Occasional messages; 2 = Minimal effort to include messages; 1 = Messages absent.

Fact-based messaging (Frauman and Norman, 2003; Jacobson, 1999; Lewis, 2005; Tilden, 1957; Ward and Wilkinson, 2006)
Definition: Degree to which the program communicated factual information.
Scoring: 1 = Messaging was solely fact-based; 0 = Messaging was not solely fact-based (incorporated affective messaging).

Surprise (Beck and Cable, 2002; Moscardo, 1999)
Definition: Degree to which the program used the element of surprise in communication. This could include "aha" moments or unexpected or contrasting messages.
Scoring: 3 = Major element; 2 = Minor element; 1 = Not used.

Novelty (Beck and Cable, 2002; Frauman and Norman, 2003; Knapp and Benton, 2004; Moscardo, 1999)
Definition: Degree to which the program presented novel ideas, techniques, or viewpoints as an element of communication, i.e., using a device not usually associated with or related to the resource.
Scoring: 3 = Major element; 2 = Minor element; 1 = Not used.

Provocation (Beck and Cable, 2002; Brochu and Merriman, 2002; Knudson et al., 2003; Tilden, 1957)
Definition: Degree to which the program explicitly provoked participants to personally reflect on content and its deeper meanings.
Scoring: 4 = Powerful and explicit inclusion; 3 = Occasional inclusion; 2 = Isolated or vague inclusion; 1 = No attempt made.

Multiple viewpoints (Beck and Cable, 2002; Brochu and Merriman, 2002; Tilden, 1957)
Definition: Degree to which the program explicitly acknowledged multiple perspectives or uncertainty within a theme or message. (Primarily for controversial messaging; when an argument is made, was a relevant counter-argument provided?)
Scoring: 3 = Multiple viewpoints developed, none given clear priority; 2 = Primarily one viewpoint, with some focus on others; 1 = No effort; NA = Not applicable.

Holistic storytelling (Beck and Cable, 2002; Larsen, 2003; Tilden, 1957)
Definition: Degree to which the program aimed to present a holistic story (with characters and a plot) as opposed to disconnected pieces of information.
Scoring: 5 = Holistic story used throughout, all messaging tied to story; 4 = Holistic story present, some information did not relate to story; 3 = Equal mix of storytelling and factual information, no single holistic story; 2 = Factual information primarily used, some stories used to create relevance; 1 = Facts and information primarily, no attempt at storytelling.

Place-based messaging (Beck and Cable, 2002; Knudson et al., 2003; Lewis, 2005; Moscardo, 1999; NPS Module 101; Sharpe, 1976)
Definition: Degree to which the program emphasized the connection between the visitor and the site/resource.
Scoring: 5 = Central focus of messaging; 4 = Well-developed connection through repetition and engagement; 3 = Moderately emphasized through repetition or engagement; 2 = Slightly developed verbally; 1 = Not developed.

Introduction and conclusion linkage (Beck and Cable, 2002; Brochu and Merriman, 2002; Larsen, 2003)
Definition: Degree to which the program connected the conclusion back to the introduction in an organized or cohesive way (i.e., the program "came full circle").
Scoring: 4 = Intro and conclusion were linked in a cohesive way that enhanced messaging; 3 = Intro and conclusion were linked, but didn't necessarily enhance messaging; 2 = Intro and conclusion were weakly linked; 1 = Intro and conclusion were disconnected from each other.

Clear theme (Beck and Cable, 2002; Brochu and Merriman, 2002; Ham, 1992; Jacobson, 1999; Knudson et al., 2003; Larsen, 2003; Lewis, 2005; Moscardo, 1999; Sharpe, 1976; Veverka, 1998; Ward and Wilkinson, 2006)
Definition: Degree to which the program had a clearly communicated theme(s). A theme is defined as a single sentence (not necessarily explicitly stated) that links tangibles, intangibles, and universals to organize and develop ideas.
Scoring: 4 = Theme is clearly developed and communicated; 3 = Easy to detect, but not well developed; 2 = Difficult to detect, present but at least somewhat ambiguous; 1 = Unclear/not present.

Central message (Beck and Cable, 2002; Brochu and Merriman, 2002; Ham, 1992; Jacobson, 1999)
Definition: Degree to which the program's message(s) was clearly communicated, i.e., the "so what?" element of the program.
Scoring: 4 = Clearly communicated and well developed; 3 = Easy to detect, but not well developed; 2 = Difficult to detect, ambiguous; 1 = Unclear/not present.

Consistency (Beck and Cable, 2002; Ham, 1992)
Definition: Degree to which the program's tone and quality were consistent throughout the program.
Scoring: 3 = Consistent; 2 = Some shift in either tone or quality during the program; 1 = Shift in both tone and quality.

Pace (Jacobson, 1999)
Definition: Degree to which the pace of the program allowed for clarity and did not detract from the program.
Scoring (categorical): Too fast; Too slow; Just fine.

Quality of the resource
Definition: Degree to which the resource where the program took place is awe-inspiring or particularly iconic.
Scoring: 3 = Contextually iconic or grandiose; 2 = Pleasant but not iconic; 1 = Unimpressive/generic.

Unexpected negative circumstance
Definition: Were there any unexpected interruptions or emergencies during the program, such as a sudden change in weather, medical emergency, technical difficulties, or hazardous conditions, that detracted from the quality of the program?
Scoring: 1 = Yes; 0 = No.

Unexpected positive circumstance
Definition: Was there an unexpected experience during the program, such as seeing charismatic wildlife or other unique phenomena, that added significantly to the quality of the experience?
Scoring: 1 = Yes; 0 = No.

Behavioral theory elements
The following were measured only for programs in which a behavioral change was expressed by the interpreter as a desired program outcome.

Benefits of action (Ajzen, 1991; Ham et al., 2007; Jacobson, 1999; Knudson et al., 2003; Moscardo, 1999; Peake et al., 2009)
Definition: Degree to which the program emphasized the potential benefits resulting from performing a particular action(s).
Scoring: 4 = Explicitly/purposefully emphasized; 3 = Mentioned a moderate amount; 2 = Explained a little; 1 = No mention; NA = Not applicable.

Costs of action (Ajzen, 1991; Ham et al., 2007; Jacobson, 1999; Knudson et al., 2003; Moscardo, 1999; Peake et al., 2009)
Definition: Degree to which the program emphasized the potential costs resulting from performing a particular action(s).
Scoring: 4 = Explicitly/purposefully emphasized; 3 = Mentioned a moderate amount; 2 = Explained a little; 1 = No mention; NA = Not applicable.

Norms of action (Ajzen, 1991; Ham et al., 2007; Jacobson, 1999; Knudson et al., 2003; Moscardo, 1999)
Definition: Degree to which the program emphasized the social acceptability of performing a particular behavior or desired action.
Scoring: 4 = Explicitly/purposefully emphasized; 3 = Mentioned a moderate amount; 2 = Explained a little; 1 = No mention; NA = Not applicable.

Ease of action (Ajzen, 1991; Ham et al., 2007; Jacobson, 1999; Knudson et al., 2003; Moscardo, 1999; Tilden, 1957)
Definition: Degree to which the program communicated the ease (or difficulty) of performing a particular behavior or desired action.
Scoring: 4 = Explicitly/purposefully emphasized; 3 = Mentioned a moderate amount; 2 = Explained a little; 1 = No mention; NA = Not applicable.

Demonstrates action (Ajzen, 1991; Beck and Cable, 2002; Knudson et al., 2003; Moscardo, 1999; Sharpe, 1976; Ward and Wilkinson, 2006)
Definition: Degree to which the program provided examples of, or opportunities for, performing a desired action.
Scoring: 4 = Majority of audience engaged; 3 = Demonstration by ranger or small proportion of audience; 2 = Verbal description; 1 = No mention/demonstration; NA = Not applicable.
Table 4. Interpreter characteristics observed in the study, their
definitions, and operationalization.

Professional appearance
Definition: The extent to which the interpreter appears properly dressed and groomed.
Scoring: 0 = Interpreter appears disheveled or unkempt and is not professionally dressed; 1 = Interpreter appears well-groomed and is professionally dressed.

Comfort of the interpreter (Lewis, 2008; Moscardo, 1999; Ward and Wilkinson, 2006)
Definition: Degree to which the interpreter presenting the program seems comfortable with the audience and capable of successfully presenting the program without apparent signs of nervousness or self-doubt.
Scoring: 1 = Interpreter seems scared, nervous, or unable to lead the program; 2 = Interpreter seems nervous and struggles with much of the program; 3 = Interpreter seems comfortable, but might become uncomfortable at times; 4 = Interpreter is not nervous and handles the program with ease.

Responsiveness (Jacobson, 1999; Knudson et al., 2003; Lewis, 2008)
Definition: The extent to which the interpreter interacts with the audience, collects information about their interests and backgrounds, and responds to their specific questions, requests, or non-verbal cues.
Scoring: NA = Not able to observe (e.g., large programs in dark theaters); 1 = Interpreter is aloof or averse to the visitors' presence; 2 = Interpreter is somewhat responsive to visitors' questions/body language; 3 = Interpreter was very responsive to the audience.

Inequity (Ham and Weiler, 2002)
Definition: The presence of unequal attention devoted to certain attendees and not others through greater interaction or attentiveness.
Scoring: 1 = Interpreter did not pay equal attention to all audience members; 0 = No inequity issues.

Humor quality (Ham and Weiler, 2002; Knapp and Yang, 2002; Regnier et al., 1992)
Definition: How funny is the interpreter overall? Does the audience react positively to the interpreter's use of humor and seem to enjoy it?
Scoring: 1 = Not funny at all; 2 = A little funny; 3 = Moderately funny; 4 = Hilarious.

Humor quantity
Definition: The extent to which the interpreter attempts to use humor, sarcasm, or jokes to share the topic with the visitor, regardless of their success.
Scoring: 1 = Interpreter attempts no humor throughout the presentation; 2 = Interpreter rarely uses humor; 3 = Interpreter uses an equal mix of humor and non-humor to convey the message; 4 = Interpreter is mostly trying to be humorous; 5 = Interpreter uses humor as the primary vehicle to convey their message.

Sarcasm
Definition: The degree to which the interpreter used sarcasm (the use of mocking, contemptuous, or ironic language or tone) or self-deprecation that was not meant to be serious, as a part of presenting their program.
Scoring: 1 = Not at all; 2 = Done to some extent; 3 = A central feature of the delivery style.

Charisma (Ward and Wilkinson, 2006)
Definition: A general sense of the overall likeability/charisma of the interpreter, commonly recognized by seemingly genuine interaction with the visitors, including smiling, looking people in the eye, and having an overall appealing presence.
Scoring: 1 = Not likeable/found interpreter irritating; 2 = Somewhat off-putting; 3 = Neither liked nor disliked interpreter; 4 = More or less liked interpreter; 5 = Found interpreter very likeable/charismatic.

Sincerity (Ham, 2009)
Definition: The degree to which the interpreter seems genuinely invested in the messages he or she is communicating, as opposed to reciting information, and seems sincere in the emotional connection they may exude to the message and/or the resource. In other words, the extent to which the interpretation was delivered through authentic emotive communication.
Scoring: 1 = Interpreter seemed to only be going through the motions, with no real emotional connection or sincerity; 2 = Interpreter seemed somewhat connected through the words they used, though their mannerisms or intonation didn't corroborate their words; 3 = Interpreter seemed mostly sincere, with authentic emotive communication for most of the program; 4 = Communication was clearly sincere and authentic throughout the program, as evidenced by words, gestures, intonation, or other mannerisms.

Passion (Beck and Cable, 2002; Ham and Weiler, 2002; Moscardo, 1999)
Definition: The interpreter's apparent level of enthusiasm for the material, as opposed to a bored or apathetic attitude toward it; the overall vigor with which the material is presented.
Scoring: 1 = Interpreter seems completely detached from/disinterested in the program; 2 = Low levels of passion; 3 = Interpreter shows moderate levels or sporadic instances of passion; 4 = Pretty high levels of passion overall; 5 = Interpreter seems extremely passionate about the program.

Personal sharing (Jacobson, 1999; Myers et al., 1998)
Definition: The degree to which the interpreter shared personal insights or experiences, answered questions about themselves for the audience, or provided their own opinion on topics or events relevant to the program.
Scoring: 1 = Interpreter did not share any personal information about themselves with the audience; 2 = Interpreter shared minimal personal information or viewpoints; 3 = Interpreter shared a large amount of personal information and perspective; 4 = Interpreter's personal life/point of view is explicitly the central focus of the experience (used themselves as the primary framework for the program).

Apparent knowledge (Ham and Weiler, 2002; Lewis, 2008; Ward and Wilkinson, 2006)
Definition: The degree to which the interpreter appears to know the information involved in the program and the answers to visitors' questions, and has local knowledge of the area and its resources.
Scoring: 1 = Interpreter seems not at all knowledgeable (unsure of facts or has a hard time recalling the information intended for the program); 2 = Interpreter seems somewhat knowledgeable, but appears to forget a few things or leave out important details; 3 = Interpreter appears more or less knowledgeable, without any major hiccups or uncertainty throughout the program; 4 = Interpreter's presentation of facts and information during the program is flawless.

Audibility
Definition: The extent to which the interpreter can clearly be heard and understood by the audience.
Scoring: 1 = Interpreter could not be heard by the audience during the majority of the program; 2 = Interpreter could be clearly heard for the majority of the program, but wasn't audible during some parts; 3 = Interpreter could be clearly heard throughout the entire program.

Eloquence (Lewis, 2008)
Definition: The extent to which the interpreter spoke clearly and articulately, and did not mumble or frequently use filler words such as "um" or "like."
Scoring: 1 = Interpreter stumbled on their speech throughout their entire program and was hard to understand; 2 = Interpreter had some minor issues with mumbling or unclear speech; 3 = Interpreter had no such issues during the program; 4 = Interpreter was exceptionally eloquent.

Impatience
Definition: Did the interpreter show any explicit impatience toward audience members?
Scoring: 1 = Interpreter was explicitly impatient with the audience; 0 = No issues noted.

Formality
Definition: The degree to which the interpreter was very formal and official vs. casual and laid back about the presentation.
Scoring: 1 = Interpreter was extremely casual; 2 = More casual than formal; 3 = Interpreter was neither explicitly casual nor formal; 4 = More formal than casual; 5 = Interpreter was entirely formal.

False assumption of the audience
Definition: At any point during the program, did the interpreter make assumptions of the audience's attitudes or knowledge that could have easily been false?
Scoring: 1 = No problem with false assumptions; 2 = Some minor false assumptions that likely did not detract from the quality of the program; 3 = Obvious false assumptions that made the experience less enjoyable or meaningful.

Character acting
Definition: The degree to which role playing or character acting is incorporated into the program, either to add authenticity or to help tell a story.
Scoring: 0 = Interpreter does no character role playing during the program (he/she is simply leading the program); 1 = Interpreter acts as one or more characters during parts of the program; 2 = Interpreter is in full costume or does not break character at any point during the program.

Primary identity (Ham and Weiler, 2002; Ham, 2002; Knapp and Yang, 2002; Larsen, 2003; Mills, 1920; Wallace and Gaudry, 2002)
Definition: The primary identity projected by the interpreter; each of the following was coded 1 = primary identity, 0 = not. Friend: outwardly friendly, casual, approachable, mingles informally. Authority figure: emphasizes own role as a park ranger and focuses on rules, regulations, and/or authority to communicate. Walking encyclopedia: focused on conveying a large volume of facts.

Questionable information
Definition: Obvious factual inaccuracy (incorrect or inaccurate information) or false attribution (unfounded claims about others, e.g., "the native people were happy to hand over their land so a National Park could be formed").
Scoring: 1 = Present; 0 = Not present.

Bias
Definition: Did the interpreter share any apparent bias or strong opinion with potential effects on relationships with audience members?
Scoring: 1 = Yes; 0 = No.
Table 5. Outcome indexes developed through confirmatory factor
analyses.
OUTCOME INDEXES
Program outcome: Visitor Experience and Appreciation (Cronbach's alpha = 0.89)
To what extent did the program you just attended influence any of the
following for you?
* Made my visit to this park more enjoyable
* Made my visit to this park more meaningful
* Enhanced my appreciation for this park
* Increased my knowledge about the program's topic
* Enhanced my appreciation for the National Park Service
Program outcome: Behavioral intentions (Cronbach's alpha = 0.94)
To what extent did the program you just attended influence any of
the following for you?
* Changed the way I will behave while I'm in this park
* Changed the way I will behave after I leave this park
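For readers wishing to replicate this style of index construction, the reliability coefficients reported above follow from the standard Cronbach's alpha formula. The sketch below is illustrative only; the respondents-by-items response matrix is simulated, not our survey data.

```python
# Minimal sketch of a Cronbach's alpha computation for a multi-item index.
import numpy as np

def cronbach_alpha(items: np.ndarray) -> float:
    """items: array of shape (n_respondents, n_items)."""
    k = items.shape[1]
    item_vars = items.var(axis=0, ddof=1)      # variance of each item
    total_var = items.sum(axis=1).var(ddof=1)  # variance of the summed scale
    return (k / (k - 1)) * (1 - item_vars.sum() / total_var)

# Simulated 5-item, 1-5 scale responses driven by a shared underlying trait.
rng = np.random.default_rng(1)
trait = rng.normal(size=(200, 1))
items = np.clip(np.rint(3 + trait + rng.normal(0, 0.6, (200, 5))), 1, 5)
print(f"alpha = {cronbach_alpha(items):.2f}")
```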
Table 6. Independent variable indexes developed through exploratory
factor analyses.
INDEPENDENT VARIABLE INDEXES
Interpreter characteristic: Confidence (Cronbach's alpha = 0.70)
* Comfort of the Interpreter
* Apparent knowledge
* Eloquence
Interpreter characteristic: Authentic emotion and charisma (Cronbach's alpha = 0.85)
* Passion
* Charisma
* Sincerity
Program characteristic: Organization (Cronbach's alpha = 0.82)
* Quality of the introduction
* Appropriate sequence
* Effective transitions
* Holistic story
* Clarity of theme
* Link between introduction and conclusion
Program characteristic: Connection (Cronbach's alpha = 0.88)
* Links to intangible meanings and universal concepts
* Cognitive engagement
* Relevance to audience
* Affective messaging
* Provocation
Table 7. Means and standard deviations of outcome variables measured
in visitor surveys. Values are means (with standard deviations).

Variable (Scale) | Five or more attendees | Fewer than five attendees
Satisfaction (0 to 10) | 8.96 (0.68) | 9.02 (0.89)
Visitor experience and appreciation (1 to 5) | 4.41 (0.32) | 4.57 (0.42)
* Made my visit to this park more enjoyable (1 to 5) | 4.55 (0.30) | 4.70 (0.43)
* Made my visit to this park more meaningful (1 to 5) | 4.49 (0.32) | 4.69 (0.45)
* Enhanced my appreciation for this park (1 to 5) | 4.36 (0.37) | 4.51 (0.51)
* Increased my knowledge about the program's topic (1 to 5) | 4.45 (0.34) | 4.62 (0.47)
* Enhanced my appreciation for the National Park Service (1 to 5) | 4.27 (0.36) | 4.38 (0.58)
Behavioral intentions (1 to 5) | 2.92 (0.64) | 3.02 (0.98)
* Changed the way I will behave while I'm in this park (1 to 5) | 2.92 (0.67) | 3.08 (0.97)
* Changed the way I will behave after I leave this park (1 to 5) | 2.92 (0.61) | 2.97 (1.04)
Table 8. Intended outcomes expressed by interpreters immediately
prior to their programs.

I want my audience to ... | Proportion expressing each outcome
Have an increased knowledge of the program topic | 79.5%
Have an increased appreciation for this park | 56.4%
Have an increased understanding of the park's resources | 39.1%
Want to learn more about the program topic | 24.8%
Be entertained | 15.6%
Have an increased appreciation of the NPS | 14.1%
Have an increased concern for a specific topic | 11.5%
Change their attitudes toward something | 10.6%
Change a certain behavior in the future | 7.0%
Develop and practice a new skill | 3.5%
Table 9. How interpretive programs were developed.

Program development | Proportion expressing each
Program provided for ranger with full script planned out | <1%
Program provided for ranger with some freedom to inject own style | 14%
Program topic provided, few restrictions on information or style to be presented | 20%
General topic suggested, but interpreter wrote own script and selected information | 53%
Interpreter selected and developed entire program free of restrictions | 13%
Table 10. Means and standard deviations of ordinal interpreter
delivery styles. Values are means (with standard deviations).

Variable (Scale) | Five or more attendees | Fewer than five attendees
Confidence index (1 to 4) | 3.28 (0.49) | 3.12 (0.41)
* Comfort of the interpreter (1 to 4) | 3.49 (0.60) | 3.25 (0.63)
* Apparent knowledge (1 to 4) | 3.45 (0.63) | 3.40 (0.59)
* Eloquence (1 to 4) | 2.99 (0.65) | 2.83 (0.50)
Authentic emotion and charisma index (1 to 5) | 3.57 (0.85) | 3.46 (0.70)
* Passion (1 to 5) | 3.23 (1.02) | 3.08 (1.04)
* Charisma (1 to 5) | 3.82 (0.86) | 3.68 (0.69)
* Sincerity (1 to 4) | 2.93 (0.77) | 2.88 (0.65)
Responsiveness (1 to 3) (a) | 2.81 (0.41) | 2.82 (0.45)
Humor quality (1 to 4) | 2.08 (0.73) | 1.92 (0.58)
Humor quantity (1 to 5) | 2.08 (0.72) | 1.85 (0.53)
Personal sharing (1 to 4) | 1.68 (0.72) | 1.79 (0.73)
Audibility (1 to 3) | 2.86 (0.36) | 2.85 (0.36)
Formality (1 to 5) | 3.21 (0.86) | 3.00 (0.68)
Sarcasm (1 to 3) | 1.23 (0.46) | 1.15 (0.36)
False assumptions of audience (1 to 3) | 1.17 (0.40) | 1.08 (0.27)

(a) Responsiveness was not observable in every case. For larger
programs, n = 245.
Table 11. Descriptive statistics of interpreter delivery styles
(categorical variables). Values are the percentage of programs in which
the delivery style occurred.

Interpreter delivery style | Five or more attendees | Fewer than five attendees
Professional appearance of the interpreter | 98.2 | 100.0
Inequitable treatment of audience | 2.9 | 2.5
Impatience | 1.8 | 2.5
Primary identity: Friend | 18.0 | 37.5
Primary identity: Authority | 4.4 | 2.5
Primary identity: Walking encyclopedia | 76.8 | 67.5
Character acting: partial | 2.6 | 2.5
Character acting: complete | 2.9 | 0.0
Interpreter bias | 3.3 | 7.5
Questionable information | 9.9 | 2.5
Table 12. Means and standard deviations of ordinal program
characteristics. Values are means (with standard deviations).

Variable (Scale) | Five or more attendees | Fewer than five attendees
Organization index (1 to 5) | 3.34 (0.71) | 3.14 (0.65)
* Quality of introduction (1 to 3) | 2.13 (0.45) | 1.93 (0.42)
* Appropriate sequence (1 to 4) | 2.79 (0.69) | 2.70 (0.69)
* Transitions (1 to 4) | 2.72 (0.76) | 2.55 (0.71)
* Holistic story (1 to 5) | 2.78 (1.01) | 2.78 (0.77)
* Conclusion linked to intro (1 to 4) | 2.63 (0.86) | 2.48 (0.75)
* Clear theme (1 to 4) | 2.82 (0.86) | 2.58 (0.90)
Connection index (1 to 5) | 2.77 (0.78) | 2.74 (0.55)
* Links to intangible meanings and universal concepts (1 to 5) | 2.88 (0.94) | 3.00 (0.80)
* Cognitive engagement (1 to 5) | 2.85 (0.94) | 2.78 (0.83)
* Relevance to audience (1 to 5) | 2.86 (0.86) | 2.70 (0.69)
* Affective messaging (1 to 5) | 2.43 (0.95) | 2.38 (0.71)
* Provocation (1 to 4) | 2.24 (0.72) | 2.25 (0.67)
Clear message (1 to 4) | 2.20 (0.94) | 2.00 (0.85)
Appropriate logistics (1 to 4) | 3.11 (0.93) | 3.15 (0.89)
Appropriate for the audience (1 to 5) | 3.93 (0.70) | 4.15 (0.83)
Multisensory (1 to 3) | 2.39 (0.51) | 2.35 (0.48)
Physical engagement (1 to 4) | 1.42 (0.69) | 1.50 (0.75)
Verbal engagement (1 to 5) | 2.51 (1.02) | 2.68 (0.80)
Surprise (1 to 3) | 1.10 (0.31) | 1.03 (0.16)
Novelty (1 to 3) | 1.18 (0.43) | 1.10 (0.30)
Consistency (1 to 3) | 2.88 (0.37) | 2.88 (0.34)
Resource quality (1 to 3) | 2.37 (0.70) | 2.13 (0.69)
Multiple viewpoints (1 to 3) (a) | 2.63 (0.51) | 2.61 (0.50)
Behavioral theory elements (b):
Benefits of action (1 to 4) | 2.52 (0.63) | 2.80 (0.45)
Costs of action (1 to 3) | 1.97 (0.75) | 2.40 (0.89)
Norms of action (1 to 3) | 1.48 (0.57) | 1.40 (0.55)
Ease of action (1 to 3) | 1.81 (0.65) | 1.20 (0.45)
Demonstrates action (1 to 4) | 2.13 (0.96) | 2.20 (1.30)

(a) Multiple viewpoints were not appropriate or relevant in every
case (e.g., a talk on butterfly life cycles). We only observed this
variable where it seemed potentially relevant (n = 94 for larger
programs; n = 22 for smaller programs).
(b) These variables are explicitly associated with behavioral
change theory. As such, they were only observed in a small subset
of cases within the sample where specific behaviors were discussed
by the interpreter (n = 31 for larger programs; n = 5 for smaller
programs).
Table 13. Descriptive statistics of program characteristics
(categorical variables). Values are the percentage of programs in which
the characteristic was observed.

Program characteristic | Five or more attendees | Fewer than five attendees
Fact-based messaging | 26.8% | 25.0%
Use of props | 30.5% | 27.5%
Pace too fast | 6.2% | 5.0%
Pace too slow | 9.2% | 5.0%
Pace just right | 84.6% | 90.0%
Unexpected positive circumstance | 1.8% | 2.5%
Unexpected negative circumstance | 15.8% | 10.0%
Table 14. Pearson correlations between ordinal independent variables
and visitor outcomes for programs with five or more attendees.

Variable | Satisfaction | Visitor experience and appreciation | Behavioral intentions
Interpreter style: Confidence index | .479** | .277** | .174**
Interpreter style: Authentic emotion and charisma index | .423** | .303** | .182**
Program characteristic: Appropriate for audience | .381** | .378** | .153*
Program characteristic: Organization index | .362** | .219** | .132*
Program characteristic: Connection index | .342** | .259** | .124*
Interpreter style: Humor quality | .288** | .233** | .155*
Program characteristic: Consistency | .271** | .281** | .034
Program characteristic: Clear message | .255** | .281** | .187**
Interpreter style: Responsiveness | .241** | .245** | .061
Program characteristic: Verbal engagement | .234** | .240** | .162**
Program characteristic: Multisensory engagement | .216** | .115 | .141*
Interpreter style: Audibility | .197** | .134* | .104
Interpreter style: False assumption of audience | -.172** | -.197** | -.088
Program characteristic: Appropriate logistics | .170** | .245** | .165**
Program characteristic: Surprise | .150* | .151* | .127*
Program characteristic: Novelty | .145* | .024 | .014
Interpreter style: Humor quantity | .144* | .097 | .062
Program characteristic: Physical engagement | .074 | .120* | .061
Interpreter style: Formality | -.069 | -.155* | -.023
Interpreter style: Sarcasm | .105 | .053 | -.114
Program characteristic: Quality of the resource | .077 | .068 | .065
Interpreter style: Personal sharing | .035 | .048 | .112
Program characteristic: Multiple points of view | .031 | .157 | .128
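The bivariate tests behind Table 14 are standard Pearson correlations between program-level characteristic scores and program-level outcome scores. A minimal sketch with hypothetical data rather than our observations:

```python
# Minimal sketch: Pearson correlation between an observed program/interpreter
# characteristic and a program-level outcome score. Data are simulated.
import numpy as np
from scipy import stats

rng = np.random.default_rng(2)
confidence = rng.uniform(1, 4, 272)                          # e.g., confidence index
satisfaction = 7 + 0.4 * confidence + rng.normal(0, 0.5, 272)

r, p = stats.pearsonr(confidence, satisfaction)
print(f"r = {r:.3f}, p = {p:.4f}")
```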
Table 15. Statistically significant t-test results, comparing the
means of visitor outcome scores for selected categorical variables for
programs with five or more attendees.

Observed category | Outcome | Mean diff. | t | p | Cohen's d
Impatience | Satisfaction | -0.36 | -2.2 | 0.031 | 0.68
"Friend" | Satisfaction | 0.23 | 2.3 | 0.023 | 0.36
Fact-based messaging | Satisfaction | -0.34 | -3.9 | <0.001 | 0.50
Unexpected neg. circumstance | Satisfaction | -0.29 | -2.8 | 0.006 | 0.45
Impatience | Visitor experience and appreciation | -0.47 | -3.3 | 0.001 | 1.28
Fact-based messaging | Visitor experience and appreciation | -0.12 | -2.6 | 0.011 | 0.36
Unexpected neg. circumstance | Visitor experience and appreciation | -0.19 | -3.6 | <0.001 | 0.60
"Walking encyclopedia" | Behavioral intentions | -0.20 | -2.2 | 0.031 | 0.32

The following categorical variables yielded no statistically
significant differences in visitor outcomes: inequitable treatment of
the audience, questionable information, "Authority" identity,
unexpected positive circumstances, use of props.
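The comparisons in Table 15 pair an independent-samples t-test with Cohen's d as an effect size. A minimal sketch with hypothetical group sizes and scores (not our sample):

```python
# Minimal sketch: t-test plus Cohen's d for a categorical program
# characteristic (e.g., solely fact-based messaging vs. not). Data simulated.
import numpy as np
from scipy import stats

rng = np.random.default_rng(3)
fact_based = rng.normal(8.7, 0.7, 73)   # outcome scores, fact-based programs
other = rng.normal(9.0, 0.7, 199)       # outcome scores, remaining programs

t, p = stats.ttest_ind(fact_based, other)

def cohens_d(a, b):
    # standardized mean difference using the pooled standard deviation
    na, nb = len(a), len(b)
    pooled = np.sqrt(((na - 1) * a.var(ddof=1) + (nb - 1) * b.var(ddof=1))
                     / (na + nb - 2))
    return (a.mean() - b.mean()) / pooled

print(f"t = {t:.2f}, p = {p:.3f}, d = {cohens_d(fact_based, other):.2f}")
```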
Table 16. One-way ANOVA comparing outcome variables for programs of
different pace with five or more attendees. Means not sharing the same
superscript letter are statistically different from one another.

Pace | Satisfaction | Visitor experience and appreciation | Behavioral intentions
Too fast | 8.62 (A) | 4.27 (AB) | 2.56 (A)
Too slow | 8.43 (A) | 4.23 (A) | 2.84 (AB)
Appropriate | 9.03 (B) | 4.44 (B) | 2.96 (B)
Statistics | F = 12.9, p < 0.001; Cohen's d (appropriate pace vs. others) = 0.78 | F = 6.9, p = 0.001; Cohen's d = 0.57 | F = 3.2, p = 0.042; Cohen's d = 0.34
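The pace comparison in Table 16 is a one-way ANOVA across the three observed pace categories. A minimal sketch with hypothetical data:

```python
# Minimal sketch: one-way ANOVA comparing an outcome across pace categories.
# Group sizes and scores are simulated for illustration.
import numpy as np
from scipy import stats

rng = np.random.default_rng(4)
too_fast = rng.normal(8.6, 0.7, 17)
too_slow = rng.normal(8.4, 0.7, 25)
appropriate = rng.normal(9.0, 0.7, 230)

f, p = stats.f_oneway(too_fast, too_slow, appropriate)
print(f"F = {f:.1f}, p = {p:.4f}")
# A post-hoc comparison (e.g., Tukey's HSD) identifies which pairs differ.
```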
Table 17. Independent samples t-tests comparing means of
characteristics for programs that experienced attrition (people left
the program early) vs. those that did not.

Characteristic | Mean (attrition) | Mean (no attrition) | t | p | Cohen's d
Responsiveness of the interpreter | 2.62 | 2.83 | -2.4 | 0.020 | 0.46
Audibility | 2.72 | 2.91 | -2.3 | 0.025 | 0.49
False assumption of the audience | 1.31 | 1.11 | 2.4 | 0.020 | 0.50
Appropriate logistics | 2.44 | 3.23 | -5.0 | <0.001 | 0.86
Confidence | 3.08 | 3.32 | -2.8 | 0.006 | 0.46
Organization | 3.09 | 3.36 | -2.2 | 0.031 | 0.32

Outcome | Mean (attrition) | Mean (no attrition) | t | p | Cohen's d
Satisfaction | 8.49 | 9.04 | -3.9 | <0.001 | 0.79
Visitor experience and appreciation | 4.26 | 4.44 | -2.6 | 0.014 | 0.51
Behavioral intentions | 2.73 | 2.95 | -1.8 | 0.070 | 0.34
Table 18. Chi-square tests comparing programs that experienced
attrition vs. those that did not.

Characteristic | Pearson chi-square statistic | p | Relation to attrition
Interpreter identity: walking encyclopedia | 3.6 | .058 | More attrition
Use of props | 12.4 | .001 | More attrition
Slow pace | 5.8 | .026 | More attrition
Unexpected negative occurrence | 8.9 | .006 | More attrition
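The attrition comparisons in Table 18 are chi-square tests on contingency tables of a categorical characteristic against whether attrition occurred. A minimal sketch with hypothetical counts:

```python
# Minimal sketch: chi-square test of independence for prop use vs. attrition.
# The 2x2 counts below are hypothetical, not our observed frequencies.
from scipy import stats

table = [[18, 65],    # props used:    [attrition, no attrition]
         [15, 174]]   # no props used: [attrition, no attrition]

chi2, p, dof, expected = stats.chi2_contingency(table)
print(f"chi-square = {chi2:.1f}, p = {p:.3f}")
```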
Table 19. Statistically significant t-test results, comparing the
means of visitor outcome scores for interpreters who expressed
different intended outcomes for their interpretive programs. (Rows are
shown only where differences were statistically significant.)

Intended outcome | Outcome measure | Mean diff. | t | p | Cohen's d
Increase desire to learn | Satisfaction | 0.20 | 2.2 | 0.029 | 0.30
Change attitude | Satisfaction | 0.18 | 2.0 | 0.048 | 0.31
Increase appreciation for park | Satisfaction | 0.22 | 2.7 | 0.007 | 0.34
Increased knowledge | Visitor experience and appreciation | -0.12 | 2.4 | 0.019 | 0.37
Increase desire to learn | Visitor experience and appreciation | 0.14 | 3.2 | 0.002 | 0.46
Change attitude | Visitor experience and appreciation | 0.16 | 4.3 | <0.001 | 0.45
Increase appreciation for park | Visitor experience and appreciation | 0.09 | 2.2 | 0.028 | 0.28
Increase understanding of resource | Visitor experience and appreciation | 0.08 | 2.1 | 0.040 | 0.26
Increase level of concern | Behavioral intentions | 0.27 | 2.2 | 0.032 | 0.41
Change visitor behavior | Behavioral intentions | 0.41 | 2.7 | 0.008 | 0.66