A multi-study investigation of self-efficacy measurement issues.
Spiller, Shane; Hatfield, Robert D.
ABSTRACT
This research explores the self-efficacy-performance relationship
in the classroom. Previous research done in this setting has typically
reported correlations that are approximately half of what is found in
other settings. This paper proposes that these lower correlations are
due to a failure to consider the specific nature of the efficacy
construct and a failure to construct efficacy measures in a manner
suggested by most researchers. To test these propositions, multiple
efficacy measures are developed, some in the style suggested by Bandura
and others in a more traditional Likert format. Additionally, a test
efficacy measure is designed to capture students' beliefs about their
test-taking capabilities. These measures are tested with two classes of
upper-level college students. Results indicate that the
Bandura-suggested measurement style does lead to greater predictive
ability, as does adding a test efficacy measure specific to the
assessment type used in the class.
INTRODUCTION
In 1977 Bandura developed a theoretical framework for learning and
motivation that highlights the role of self-referent thought. Critical
to the framework of Bandura's social cognitive theory is the role
of self-efficacy (Bandura, 1986). Self-efficacy is defined as one's
belief in one's ability to perform a given behavior (Bandura, 1977; Wood
& Bandura, 1989). Given an appropriate level of skill, self-efficacy is
hypothesized to be an important determinant of both action and
performance (Locke & Latham, 1990).
The construct of self-efficacy has presented researchers in many
academic disciplines with a significant predictor of performance. For
example, in the education literature Lent, Brown and Hackett (1994)
review the research on the relationship between self-efficacy and career
choice and academic interest, and Pajares and Miller (1995) attempt to
clarify a voluminous literature on self-efficacy and mathematics
performance. In the organizational behavior and psychology literature
Locke and Latham (1990) included self-efficacy as an important moderator of performance in their model of goal setting. Researchers have found
many useful applications for the construct proposed by Bandura,
especially in education. Much work has been done relating efficacy to
academic performance. For example, Wood and Locke (1987) found
significant correlations between self-efficacy and course performance,
as did Woodruff and Cashman (1993). These are just a few of the dozens
of studies investigating the relationship between self-efficacy and
student performance, alongside a parallel literature on teacher efficacy
in the classroom.
The results of the classroom study by Wood and Locke (1987) point
to some of the problems in the literature to date. The correlation
between efficacy and performance was half of that which is typically
found in laboratory studies. This problem was highlighted by Pajares and
Miller (1995), who examined the field of mathematics self-efficacy and
concluded that much of the confusion in the area over the predictive
validity of self-efficacy could be traced to measures that were not task
specific. They noted that the mismatch between self-efficacy measures
and the criterial tasks being assessed is a recurring problem in
educational research. Because Bandura (1986) defined self-efficacy as
highly task specific, measures that do not address this specificity
should exhibit lower predictive validity.
In the classroom there are multiple tasks that affect performance.
One task, typically the one measured by researchers, is the ability to
understand the material. However, limiting the measure to understanding
ignores other abilities that could also influence performance. This study
attempts to bridge the gap between student understanding and student
performance by studying the student's perceived efficacy for
test-taking. The efficacy for classroom ability in a particular class
might be separate from the efficacy for taking tests. This is often
demonstrated in the classroom by those students "who are just good
at taking a multiple choice test." Conversely, we might hear students
profess a preference for one test format, essay for example, over
multiple choice.
An additional area of concern in efficacy measurement in
educational research can be found in the structure of the efficacy
measures used. Bandura (1986) suggested a very specific method for
measuring self-efficacy. He suggested that researchers ask subjects
whether they can perform at specific levels on a specific task
(responses are either yes or no) and ask for the degree of confidence in
that endorsement (rated on a scale from total uncertainty to total
certainty, or 0-100) at each specific performance level. One
self-efficacy measure, self-efficacy magnitude, is formed by counting
the questions answered "yes". A second self-efficacy measure, strength,
is formed by summing the confidence ratings across all performance
levels. A third measure, a composite recommended in Lee and Bobko's
(1994) assessment of the validity of each measure, is formed by summing
the confidence ratings for only those questions on which the subject
indicated they could perform the task. This method has been used by the
majority of researchers in the psychology and organizational behavior
disciplines (though there are exceptions; see, for example, Saks' (1994)
study of stress and anxiety). Yet within the education literature,
measures of unvalidated types persist. For example, Pintrich and
De Groot (1990) use a nine question measure to assess learning
efficacy. Similarly, Lopez and Lent
(1991) and Woolfolk and Hoy (1990) used self-efficacy measures of
different types in their studies.
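To make the scoring concrete, the three scores can be computed as in the
following minimal Python sketch; the data layout (one can-do/confidence
pair per performance level) is assumed for illustration and is not drawn
from any particular instrument.

    def efficacy_scores(responses):
        """responses: list of (can_do, confidence) pairs, one per level;
        can_do is True/False, confidence is rated 0-100."""
        # Magnitude: count of performance levels endorsed "yes".
        magnitude = sum(1 for can_do, _ in responses if can_do)
        # Strength: confidence summed across all performance levels.
        strength = sum(conf for _, conf in responses)
        # Composite (Lee & Bobko, 1994): confidence summed over
        # endorsed levels only.
        composite = sum(conf for can_do, conf in responses if can_do)
        return magnitude, strength, composite

    # Example: a respondent endorsing the first three of five levels.
    print(efficacy_scores([(True, 90), (True, 80), (True, 60),
                           (False, 30), (False, 10)]))   # -> (3, 270, 230)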
This study set out to measure classroom efficacy both in the way
suggested by Lee and Bobko (1994) and with a measure developed in the
fashion of more standard questionnaires. This was done by surveying
two separate large lecture classes. Additionally, a test-efficacy
measure was developed in the same style as each class efficacy measure
and also administered to the subjects. The goals of this study were thus
to (a) contrast the Bandura-suggested measurement style with the
standard questionnaire style and (b) demonstrate that a better prediction of
classroom performance can be achieved by including test taking efficacy
as a predictor. Factor analysis was used for scale development, and to
assure that the classroom efficacy and test efficacy items were
measuring separate constructs. The test efficacy measure was designed to
capture two separate dimensions of test efficacy, multiple choice
efficacy and essay test efficacy. Additionally, other behavioral measures were used to assess the validity of each efficacy scale.
SURVEY 1
The study was done using 530 students taking a junior level
management course. Performance data were available in the students'
four test grades as well as their final averages in the course. Of the 530
students, 490 completed some portion of the survey. Complete surveys were
available from 439 of the students. An examination of the
non-respondents and incomplete respondents revealed no patterns by final
grade.
Item Generation for Standard Questionnaire
The structure and focus of the questions within this survey are
based on those that have appeared in other scales to measure other
efficacy dimensions. For the purposes of this scale 35 questions were
originally written. These 35 questions were examined by four research
colleagues familiar with efficacy research. Thirteen of the items were
identified as redundant or unclear and were deleted. The resulting
survey contained 22 items; of these, 12 were reverse-keyed, some through
bipolar wording but most through simple negation. This survey was
administered to two classes of students, yielding a pre-test sample of
64. A factor analysis of this data yielded three distinct factors, one
for class efficacy, and one each for multiple choice efficacy and essay
efficacy. The questions used in this scale are included in Table 1.
Data Analysis
Because the subjects' commitment to the surveys was open to question,
various procedures were used to check their responses for patterns
indicative of careless responding (Greenleaf, 1992a, 1992b; Schmitt &
Stults, 1985).
Difference analysis
This analysis examines the differences in responses between the
mean positive item response and the mean negative item response. This
was examined across both the entire measure and across the three
expected dimensions. One would not expect the difference between these
two indices to be large, assuming that the only difference between the
questions is the manner in which they are worded and that the
respondents are reading the questions and answering them truthfully.
These differences were analyzed using a difference of two as the
decision rule for identifying careless respondents; this rule identified
36 respondents who reached that threshold on at least one of the
dimensions.
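A minimal sketch of this check, assuming a 5-point Likert scale and that
negatively keyed items are reverse-scored before the comparison; the
item columns and responses below are illustrative, while the threshold
of two follows the decision rule above.

    import numpy as np

    LIKERT_MAX = 5

    def careless_by_difference(row, pos_idx, neg_idx, threshold=2.0):
        """Flag a respondent whose positive-item mean and (reverse-scored)
        negative-item mean differ by the threshold or more."""
        pos_mean = row[pos_idx].mean()
        neg_mean = (LIKERT_MAX + 1 - row[neg_idx]).mean()  # reverse-score
        return abs(pos_mean - neg_mean) >= threshold

    # One consistent respondent and one who agrees with everything.
    rows = np.array([[5, 4, 2, 5, 1, 2],
                     [5, 5, 5, 5, 5, 5]])
    pos, neg = np.array([0, 1, 3]), np.array([2, 4, 5])
    print([careless_by_difference(r, pos, neg) for r in rows])  # [False, True]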
Factor analysis
The factor analysis approach looks for individuals who respond in a
significantly different way than other respondents. This method computes
an index that indicates the consistency of each respondent. This index
is then compared to a critical chi-square value to determine whether the
respondent answered consistently (Schmitt & Stults, 1985).
Separate analyses were run for each of the test efficacy dimensions and
for the class efficacy dimension. The critical chi-square for these
analyses is 2.706 (1 d.f., p=.10). This value was compared to the
consistency index for each survey respondent. Additionally, the entire
survey was analyzed in this fashion; the critical value for this
analysis is 6.251 (3 d.f., p=.10). This approach
pointed to 73 careless respondents on the class efficacy scale, 30 on
the multiple choice scale, 44 on the essay scale, and 63 on the entire
measure.
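The critical values quoted above are standard upper-tail chi-square
cutoffs at p=.10 and can be reproduced directly; the per-respondent
consistency index itself follows Schmitt and Stults (1985) and is not
sketched here.

    from scipy.stats import chi2

    print(chi2.ppf(0.90, df=1))   # ~2.706, the single-dimension cutoff
    print(chi2.ppf(0.90, df=3))   # ~6.251, the full-survey cutoff

    def is_careless(consistency_index, df):
        # A respondent is flagged when the index exceeds the cutoff.
        return consistency_index > chi2.ppf(0.90, df)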
Subject exclusion
The careless respondent methods (difference and factor analysis)
were not consistent in targeting individuals for exclusion; therefore,
their findings were integrated for a more complete analysis of the
careless respondents. The subjects identified as careless by each
approach were entered into tables in a word processor, one column per
analysis. The subjects' identifiers were then sorted and compared across
columns. Subjects appearing careless in more than one analysis were
targeted for deletion. This procedure accounted for any inconsistencies
between the two identification methods and retained those subjects who
appeared careless on only one scale. Using this approach 51 subjects were
targeted for deletion, yielding a final sample of 388 subjects for
further analysis.
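The sort-and-compare step amounts to simple set logic: a subject is
excluded only when flagged careless by more than one analysis. A sketch
with illustrative identifiers:

    from collections import Counter

    flags = {
        "difference":  {101, 102, 205, 310},
        "fa_class":    {101, 150, 205},
        "fa_multiple": {102, 150},
        "fa_essay":    {205, 310, 400},
    }
    counts = Counter(sid for ids in flags.values() for sid in ids)
    to_delete = sorted(sid for sid, n in counts.items() if n > 1)
    print(to_delete)   # flagged by more than one analysis -> excluded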
The remaining data were analyzed for any normality problems; none
were found.
Results
The data were first analyzed using principal factor analysis, with the
communality estimates (shared variance) set equal to the squared
multiple correlations, followed by a varimax rotation. The number of
factors to extract was set at three, matching the expected structure. As
hoped, the questions associated with essay test taking and multiple
choice test taking separated into two distinct factors. An examination
of the eigenvalues shows the third to be less than 1 (.8353); however,
retaining this third factor raised the cumulative variance accounted for
to 95%. The resulting factor structure appears in Table 2. A few
questions appear to cross-load, such as questions 1, 2, 10, 14, and 22,
although most of these items exhibit one loading that is much higher
than the other.
Most of the cross loading appears to be between the first two
factors, class efficacy and multiple choice test efficacy. The
possibility that these three scales may actually be correlated was
examined through factor analysis with a promax rotation. Promax first
rotates the factors using a varimax rotation and then allows the rotated
axes to correlate. The rotated factor pattern resulting from this
analysis is presented in Table 3.
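Both rotations can be reproduced with the open-source factor_analyzer
package standing in for the original software; in this sketch X is
assumed to be the (subjects x 22) matrix of item responses, with random
placeholder data used here.

    import numpy as np
    from factor_analyzer import FactorAnalyzer

    X = np.random.default_rng(0).normal(size=(388, 22))  # placeholder data

    # Orthogonal solution (Table 2): three factors, varimax rotation.
    fa_varimax = FactorAnalyzer(n_factors=3, rotation="varimax")
    fa_varimax.fit(X)
    print(fa_varimax.loadings_.round(3))

    # Oblique solution (Tables 3 and 4): promax starts from varimax,
    # then lets the rotated axes correlate.
    fa_promax = FactorAnalyzer(n_factors=3, rotation="promax")
    fa_promax.fit(X)
    print(fa_promax.loadings_.round(3))   # rotated factor pattern
    print(fa_promax.phi_.round(3))        # interfactor correlations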
The results from the promax rotation yield the same structure as
the varimax rotation. The difference is that the promax rotation
eliminates the cross-loading problem evident in the varimax rotation.
The interfactor correlations presented in Table 4 show a strong
correlation between the first two factors. One item that appears to be
problematic is question 22. Upon reflection it is easy to see why this
item would load on both test dimensions, positively for multiple choice
and negatively for essay. This question should be deleted and a
replacement item written for essay efficacy.
Alpha factor analysis was used to assess the reliabilities of the
three scales. The factor loadings obtained were approximately equal to
those in Table 2, with reported reliabilities of .987 for class
efficacy, .893 for multiple choice efficacy, and .546 for essay
efficacy.
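The reliabilities reported are coefficient alpha; a minimal sketch of
the standard computation for a single scale, assuming reverse-keyed
items have already been re-scored:

    import numpy as np

    def cronbach_alpha(items):
        """items: (subjects x k) response matrix for one scale.
        alpha = k/(k-1) * (1 - sum of item variances / variance of totals)."""
        items = np.asarray(items, dtype=float)
        k = items.shape[1]
        item_vars = items.var(axis=0, ddof=1).sum()
        total_var = items.sum(axis=1).var(ddof=1)
        return k / (k - 1) * (1 - item_vars / total_var)

    # Toy example: three highly consistent items yield a high alpha.
    print(cronbach_alpha([[5, 5, 4], [4, 4, 4], [2, 2, 3], [1, 2, 1]]))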
Of the three scales, the essay scale obviously needs the most work. Too
many items were deleted from this scale in the effort to shorten the
overall survey. In the class in which the survey was administered this
was not a major concern, since there was no essay component to the
grade.
Given that the first two factors were strongly correlated, LISREL[R]
was used to test the hypothesis that these two factors represented
separate constructs. This was done by constraining the parameters for
these factors to be equal in a structural equation as recommended by
Joreskog and Sorbom (1993). The first run examines the fit with the
equality constraint; the second run removes this constraint and looks at
the change in the resulting goodness of fit index. The results of this
analysis are presented in Table 5. The difference in chi-square values
between the constrained model and the model in which the parameters are
allowed to be unequal is significant, indicating that the two factors
are not measuring the same construct.
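The logic is a chi-square difference test: the drop in chi-square
between the constrained and unconstrained models is itself chi-square
distributed, with degrees of freedom equal to the difference in model
degrees of freedom. Using the Table 5 values:

    from scipy.stats import chi2

    chi_constrained, df_constrained = 129.75, 3   # equal-loading model
    chi_free, df_free = 7.61, 2                   # parameters free to differ

    diff = chi_constrained - chi_free             # 122.14
    df_diff = df_constrained - df_free            # 1
    print(chi2.sf(diff, df_diff))                 # p << .0001: constructs differ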
Table 6 illustrates the questions grouped by factor and named.
Validity
As mentioned earlier, many different measures were available to assist
in establishing the construct validity of these scales. The class
efficacy scale correlated with a general efficacy scale (r=.2632,
p=.0001) and with a measure of student study skills (r=.42, p=.001),
establishing convergent validity, and did not correlate with a measure
of social efficacy (r=-.07, p=.09), establishing discriminant validity.
The multiple choice efficacy score also correlated with general efficacy
(r=.176, p=.0009) and the study skill measure (r=.38, p=.0001), and
failed to correlate with social efficacy (r=-.09, p=.08). The essay
efficacy score did not correlate significantly with the general
efficacy, study skills, or social efficacy measures, indicating that
this scale needs more work.
Predictive validity is seen in that final average correlated with
class efficacy (r=.485, p=.0001) and with multiple choice efficacy
(r=.54, p=.0001).
To test the association between the three factors and performance,
multiple regression was used. The resulting [R.sup.2] for all three
factors was .60515, with an adjusted [R.sup.2] of .5988. Of more
interest are the parameter estimates and probabilities. The t-values for
factor 1, class efficacy, and factor 2, multiple choice efficacy, were
both significant at p=.0001. The p-value for factor 3, essay efficacy,
was .7001. This is to be expected, because in the class used in this
research there was no essay component to the grade.
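A sketch of this regression using the statsmodels package; the factor
scores and final averages below are random placeholders standing in for
the actual data.

    import numpy as np
    import statsmodels.api as sm

    rng = np.random.default_rng(1)
    scores = rng.normal(size=(388, 3))   # class, multiple choice, essay factors
    final_avg = 75 + 5*scores[:, 0] + 4*scores[:, 1] + rng.normal(size=388)

    model = sm.OLS(final_avg, sm.add_constant(scores)).fit()
    print(model.rsquared, model.rsquared_adj)   # cf. .605 and .599 in the text
    print(model.pvalues)                        # essay factor non-significant here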
SURVEY 2
The second study was done using students taking the same course in
the semester following the first study. This study used 518 students
taking a junior level management course. The same performance measures
were available for study. Of the 518 students 478 completed some portion
of the survey. Complete surveys were available from 463 of the students.
In addition to the new Bandura-type measures used in this study, the
measures developed for the first study were also used. An examination of
the non-respondents and incomplete respondents revealed no patterns by
final grade.
Bandura-type Measure
For the Bandura-based efficacy measure the items used were similar
in form to those used by Wood & Locke (1987) in their study of class
performance and efficacy. Their instrument served as the measure of
class performance efficacy. Table 7 contains a complete sample item, along
with the question component of the additional items. It is important to
note that this scale does include one item for exam concentration,
giving the possibility of overlap with any exam efficacy measure. The
multiple choice efficacy measure and essay test efficacy measure were
constructed by the same four researchers used in generating the items in
the standard questionnaire. Table 8 contains the sample items for each
measure.
Survey 2 Results
Analysis of the results from the second survey was limited to
validity checks. As with the first survey, the class efficacy measure correlated
with a general efficacy scale (r=.3342, p=.0001), a measure of student
study skills (r=.49, p=.0001) establishing convergent validity, and did
not correlate with a measure of social efficacy (r=.04, p=.13)
establishing discriminant validity. The multiple choice efficacy scale
correlated with general efficacy (r=.28, p=.0001) and the study skill
measure (r=.39, p=.0001), and failed to correlate with social efficacy.
In contrast to the first set of measures the essay test efficacy measure
did correlate with general efficacy score (r=.33, p=.0001), and the
study skills measure (r=.41, p=.0001), and failed to correlate with the
social efficacy measure.
The interrelationships among the measures indicate a greater
degree of association between the three measures than in the first
study, as class efficacy correlated with multiple choice efficacy
(r=.58, p=.0001), and essay efficacy (r=.38, p=.0001), and the multiple
choice efficacy and essay efficacy measures correlated (r=.205, p=.001).
Predictive validity was seen in that final average correlated with
class efficacy (r=.785, p=.0001) and with multiple choice efficacy
(r=.72247, p=.0001). The two measures together in a multiple regression
equation yielded an [R.sup.2] of .821, with an adjusted [R.sup.2] of
.817. Once again essay efficacy did not correlate with performance.
As in the first study, LISREL[R] was used to test the hypothesis
that these multiple choice measures and the class efficacy measure
measured distinct constructs. This was done by constraining the
parameters for these factors to be equal in a structural equation as
recommended by Joreskog and Sorbom (1993). The result of this analysis
is presented in Table 9. The resulting difference in chi-square values
for the constrained model versus the model where the parameters are
allowed to be unequal is significant, indicating these constructs are
not measuring the same thing. Further, the questionnaire class efficacy
measure from the first study was compared to the Bandura-type class
efficacy measure in a similar fashion, as were the two multiple choice
efficacy measures. These results are also contained in Table 9. The
chi-square differences approach but do not reach significance (p=.101
and p=.051), indicating that both types of measures seem to be measuring
the same constructs.
CONCLUSION
The results obtained here point to the multi-dimensionality of the class
efficacy construct. The most important of these dimensions is test
efficacy. Specifically, students may have very different perceptions of
their ability for class performance and for test performance. Adding
this component into previous research may help explain some of the lower
correlations found in classroom experiments. Conceptualizing the
construct in this way does align it more closely with the original ideas
as proposed by Bandura. Additionally, the Bandura-suggested measurement
style was contrasted with the more familiar questionnaire style. Results
indicate that the two styles appear to measure the same constructs;
however, the Bandura-type measures demonstrated better ability to
predict final performance.
REFERENCES
Bandura, A. (1977). Social Learning Theory. Englewood Cliffs, NJ:
Prentice-Hall.
Bandura, A. (1986). Social foundations of thought and action.
Englewood Cliffs, NJ: Prentice-Hall Inc.
Greenleaf, E. A. (1992a). Improving rating scale measures by
detecting and correcting bias components in some response styles.
Journal of Marketing Research, 29, 176-188.
Greenleaf, E. A. (1992b). Measuring extreme response style. Public
Opinion Quarterly, 56, 328-351.
Joreskog, K. G. & Sorbom, D. (1993). LISREL[R] User's
Reference Guide. Chicago: Scientific Software International, Inc.
Lee, C. & Bobko, P. (1994). Self-efficacy beliefs: Comparison
of five measures. Journal of Applied Psychology, 79, 364-369.
Lent, R. W., Brown, S. D. & Hackett, G. (1994). Toward a
unifying social cognitive theory of career and academic interest,
choice, and performance. Journal of Vocational Behavior, 45, 79-122.
Locke, E. A. & Latham, G. P. (1990). A Theory of Goal Setting
and Task Performance. Englewood Cliffs, NJ: Prentice-Hall Inc.
Lopez, F. G. & Lent, R. W. (1991). Efficacy-based predictors of
relationship adjustment and persistence among college students. Journal
of College Student Development, 32, 223-229.
Pajares, F. & Miller, M. D. (1995). Mathematics self-efficacy
and mathematics performances: The need for specificity of assessment.
Journal of Counseling Psychology, 42, 190-198.
Pintrich, P. R. & De Groot, E. V. (1990). Motivational and
self-regulated learning components of classroom academic performance.
Journal of Educational Psychology, 82, 33-40.
Schmitt, N. & Stults, D. M. (1985). Factors defined by
negatively keyed items: The result of careless respondents? Applied
Psychological Measurement, 9, 367-373.
Wood, R. E. & Locke, E.A. (1987). The relation of self-efficacy
and grade goals to academic performance. Educational and Psychological
Measurement, 47, 1013-1024.
Woodruff, S. L. & Cashman, J. F. (1993). Task, domain and
general efficacy: a reexamination of the self-efficacy scale.
Psychological Reports, 72, 423-432.
Woolfolk, A. E. & Hoy, W. K. (1990). Prospective teachers'
sense of efficacy and beliefs about control. Journal of Educational
Psychology, 82, 81-91.
Shane Spiller, Western Kentucky University Robert D. Hatfield,
Western Kentucky University
Table 1: Combined Questionnaire
1 I would say that I am an excellent student.
2 I understand this material well enough to use it in the workplace.
3 I am not very good at taking multiple choice tests.
4 When I take multiple choice tests if I do not know the answer I
usually can guess the correct choice.
5 I have a hard time remembering all of the important points to
write on an essay test.
6 I do not feel that multiple choice tests serve as an accurate
indicator of my understanding of the material.
7 Essay tests are hard to study for.
8 I don't think that I would make a very good manager.
9 The material in this class takes a lot of time to understand.
10 I cannot understand the material in this class.
11 Multiple choice tests confuse me.
12 I could not describe this material from this class to someone
else in the workplace who needed it.
13 I understand the material taught in this class.
14 If I don't know the correct answer right away on a multiple
choice test I can usually narrow the choices down to a couple
of answers.
15 On multiple choice tests I have a hard time distinguishing
between choices.
16 I would rate my ability as a student in this class as excellent.
17 It is hard for me to find the important points in the assigned
chapters.
18 I could teach the material in this class to someone else.
19 If I was in a management position I could apply some of the
material from this class.
20 I can perform well on essay tests.
21 I cannot tell the important points during class that I should
take notes on.
22 I would rather take a multiple choice test than an essay test.
Table 2: Loadings For Three Factor Solution With Varimax Rotation
Question Factor 1 Factor 2 Factor 3
1 .70185 .40800 -.01381
2 .70229 .39264 .04355
3 .26499 .70673 -.03046
4 .33107 .63578 -.06815
5 .25220 .21084 .59989
6 .20158 .57573 -.21474
7 -.12753 -.12868 .61773
8 .78005 .21329 .04969
9 .64312 .31830 .21007
10 .72448 .41963 .14222
11 .36542 .79628 -.01676
12 .74632 .31190 .14419
13 .75952 .29642 .14273
14 .39946 .69999 -.02169
15 .32702 .76682 .07301
16 .59395 .33539 .05067
17 .62587 .30896 .16034
18 .77972 .33389 .03754
19 .77794 .29813 .02906
20 .26366 .02077 .56069
21 .60568 .31531 .17053
22 .10343 .50286 -.48184
Table 3: Loadings For Three Factor Solution With Promax Rotation
Question Factor 1 Factor 2 Factor 3
1 .51691 .33909 -.04899
2 .70075 .14488 -.00984
3 -.01781 .76633 -.01484
4 .23577 .58587 -.07632
5 .27400 .12418 .48321
6 -.00351 .60697 -.21374
7 -.14963 -.03875 .63062
8 .89236 -.12468 -.02407
9 .63305 .11941 .03798
10 .68176 .24178 .09241
11 .06467 .83062 -.00652
12 .76493 .09255 .08494
13 .77200 .11409 .05323
14 .11859 .53229 -.05720
15 .02283 .82145 .08667
16 .55214 .22681 -.02936
17 .47512 .24826 .14704
18 .83116 .02895 -.02848
19 .84789 -.01681 -.03918
20 .28291 -.05480 .53844
21 .54499 .16784 .16004
22 .10244 .58929 -.30011
Table 4: InterFactor Correlations
Factor 1 Factor 2 Factor 3
Factor 1 1.0
Factor 2 .481 1.0
Factor 3 .095 -.059 1.0
Table 5: LISREL[R] Results
Model                      df   chi-square   difference (df, chi-square)   p-value
Constructs are Same         3      129.75
Constructs are Different    2        7.61        1, 122.14                  .00001
Table 6: Named Scales
Factor 1 Class-Efficacy
1 I would say that I am an excellent student.
2 I understand this material well enough to use it in the
workplace.
8 I don't think that I would make a very good manager.
9 The material in this class takes a lot of time to understand.
10 I cannot understand the material in this class.
12 I could not describe this material from this class to someone
else in the workplace who needed it.
13 I understand the material taught in this class.
16 I would rate my ability as a student in this class as
excellent.
17 It is hard for me to find the important points in the
assigned chapters.
18 I could teach the material in this class to someone else.
19 If I was in a management position I could apply some of the
material from this class.
21 I cannot tell the important points during class that I
should take notes on.
Factor 2 Multiple Choice Efficacy
3 I am not very good at taking multiple choice tests.
4 When I take multiple choice tests if I do not know the answer
I usually can guess the correct choice.
6 I do not feel that multiple choice tests serve as an accurate
indicator of my understanding of the material.
11 Multiple choice tests confuse me.
14 If I don't know the correct answer right away on a multiple
choice test I can usually narrow the choices down to a couple
of answers.
15 On multiple choice tests I have a hard time distinguishing
between choices.
Factor 3 Essay Test Efficacy
5 I have a hard time remembering all of the important points to
write on an essay test.
7 Essay tests are hard to study for.
20 I can perform well on essay tests.
22 I would rather take a multiple choice test than an
essay test.
Table 7: Complete sample item, and item questions for class efficacy
scale
Can Do Confidence
(Y=Yes, N=No) (0 to 100%)
I could memorize 60% of the
facts & concepts.
I could memorize 70% of the
facts & concepts.
I could memorize 80% of the
facts & concepts.
I could memorize 90% of the
facts & concepts.
I could memorize 100% of the
facts & concepts.
Memorization The proportion of facts and concepts
covered in this course that you feel you
will be able to memorize and recall on
demand.
Discriminating Concepts The proportion of time that you feel that
you will be able to discriminate between
the important and not so important facts
concepts and arguments covered in this
class.
Explaining Concepts The proportion of facts, concepts, and
arguments covered in the course that you
feel you could explain clearly to
others in your own words.
Understanding The proportion of facts, concepts, and
arguments covered in the course that you
feel you can understand.
Class Concentration The proportion of the class periods for
which you feel you are able to concentrate
and stay fully focused on the materials
being presented.
Note-Taking The proportion of the time that you feel
you are able to make understandable
course notes which emphasize, clarify
and relate key facts, concepts and
arguments as they are presented in
lectures, tutorials or course materials.
Exam Concentration The proportion of the time during exams
for which you feel you are able to focus
exclusively on understanding and
answering questions and avoid breaks in
concentration.
Table 8: Test Efficacy items
Multiple Choice Efficacy
Fairness The percentage of the time that you feel that
multiple choice tests serve as an accurate
indicator of your understanding of the course
material.
Reasoning Ability (I) If you do not know the answer to a multiple
choice question, the percentage of the time
that you feel you can reason through the
available choices and pick the correct one.
Reasoning Ability (II) The proportion of the time on a multiple
choice test that you are able to distinguish
between the available choices.
Concentration (I) The proportion of the time that you are able
to maintain concentration on a multiple
choice test.
Concentration (II) The proportion of the time that you are able
to answer questions without being confused
by the other answer alternatives.
General Ability The proportion of the time that you feel you
are able to perform well on multiple choice
tests, even if you are not completely prepared
for the material.
Multiple Choice Anxiety The proportion of the time that you feel you
can maintain control and not panic while
taking a multiple choice test.
Essay Efficacy
Ability (I) The proportion of the important facts that
you feel you could remember and write
about on an essay question.
Ability (II) The proportion of the time that you feel you
are able to perform well on essay tests.
Concentration The proportion of the time that you feel you
are able to concentrate on essay questions.
Essay Anxiety The proportion of the time that you feel you
can maintain control and not panic on an
essay test.
* Each of these questions was posed in the same style as the class
efficacy scale example in Table 7.
Table 9: LISREL[R] Results
Model                                      df   chi-square   difference (df, chi-square)   p-value
Class efficacy and MC efficacy are same     3      142.12
Constructs are Different                    2        3.26        1, 138.86                  .00001
Class efficacy measures are same            3       12.65
Measures represent different constructs     2        9.82        1, 2.83                    .101
Multiple choice measures are same           3       13.45
Measures represent different constructs     2        9.75        1, 3.7                     .051