Reflection is a staple of contemporary writing pedagogy and writing assessment. Although the power of reflective writing has long been understood in writing studies, the field has made little progress in articulating how to assess that reflective work. Developed at the crossroads of research on reflection and metacognition, the Index for Metacognitive Knowledge (IMK) is designed to help writing researchers, teachers, and students articulate what is being rewarded in the assessment of reflection and to clarify the role of metacognitive knowledge in critical reflective writing.
Ratto Parks, Amy. (2023). What Do We Reward in Reflection? Assessing Reflective Writing with the Index for Metacognitive Knowledge. Journal of Writing Assessment, 16(1). DOI: 10.5070/W4jwa.1570
Measuring metacognition, or the awareness of one's thoughts, is no easy task. Self-report may be limited, and we may overestimate the frequency with which we use that information to self-regulate. However, in my quest to assess metacognition, I found the Metacognitive Awareness Inventory, or MAI (Schraw & Dennison, 1994). The MAI is, comparatively speaking, one of the most widely used instruments. Schraw and Dennison found an alpha coefficient of .91 on each factor of the MAI and .95 for the entire MAI, which indicates strong reliability. Pintrich (2000) agrees that the MAI has external validity, given that MAI scores and students' academic achievement are highly correlated.
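For readers who want to run the same kind of internal-consistency check on their own item-level data, here is a minimal sketch of a Cronbach's alpha calculation. The function is generic, and the simulated true/false responses are purely illustrative; random data will, of course, yield a far lower alpha than the real MAI.

```python
import numpy as np

def cronbach_alpha(items):
    """Cronbach's alpha for an (n_respondents x n_items) matrix of item scores."""
    items = np.asarray(items, dtype=float)
    k = items.shape[1]
    item_variances = items.var(axis=0, ddof=1)
    total_variance = items.sum(axis=1).var(ddof=1)
    return (k / (k - 1)) * (1 - item_variances.sum() / total_variance)

# Illustrative only: 200 simulated respondents answering 52 true/false items
rng = np.random.default_rng(0)
responses = rng.integers(0, 2, size=(200, 52))
print(round(cronbach_alpha(responses), 2))
```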
The Problem
Despite the wide use and application of the MAI, I found the survey's measurement scale unfitting and constrictive. The survey consists of 52 questions with true-or-false response options. Some of the behaviors and cognitions measured on the MAI include "I consider several alternatives to a problem before I answer," "I understand my intellectual strengths and weaknesses," "I have control over how well I learn," and "I change strategies when I fail to understand," to name a few (see https://services.viu.ca/sites/default/files/metacognitive-awareness-inventory.pdf).
Though these questions are valid, being forced to respond dichotomously with an extreme "true" (as in, I always do this) or "false" (as in, I never do this) is problematic. Yes-no responses also make for difficult quantitative analysis: all-or-nothing responses make hypothesis testing (non-parametric testing) challenging. I felt that if the scale were changed to a Likert-type format, participants could more accurately self-report how often they exhibit these behaviors or cognitions, and we could more readily assess variability and change.
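As a rough illustration of the analytic payoff, the sketch below uses made-up five-point ratings to run a Wilcoxon signed-rank test on paired pre/post scores. The data and the "improvement" are invented; the point is only that graded responses leave variability to test, which binary responses often do not.

```python
import numpy as np
from scipy.stats import wilcoxon

rng = np.random.default_rng(1)

# Hypothetical item ratings: 1 = "I never do this" ... 5 = "I do this always"
pre = rng.integers(1, 6, size=30).astype(float)
post = np.clip(pre + rng.integers(0, 3, size=30), 1, 5)  # simulated improvement

# Non-parametric test on the paired pre/post Likert-type scores
stat, p = wilcoxon(pre, post)
print(f"W = {stat:.1f}, p = {p:.3f}")
```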
The Revised MAI
Thus, I revised the MAI to use a five-point Likert-type rating scale, ranging from "I never do this" to "I do this always" (see Figure 1). Five points also allows a middle rating with two extremes on either side (always/never). It is important to note that the original content of the survey questions has not been altered.
My recent findings (Terlecki & McMahon, 2018; Terlecki & Oluwademilade, in preparation) show the revised MAI to be effective as a pre- and post-test measure for assessing growth due to metacognitive instruction in college students, compared to controls receiving varying levels of instruction.
Figure 1. Revised MAI Likert-type scale (Terlecki & McMahon, 2018). Response scale adapted from Schraw and Dennison (1994) with permission from Sperling (Dennison).
In our longitudinal sample of roughly 500 students, results showed that students exposed to direct metacognitive instruction (across a one-semester term) yielded the greatest improvements on the revised MAI compared to controls, although maturation (age and level in school) had a moderating effect. Thus, we concluded that students who were deliberately taught metacognitive strategies did exhibit an increase in their cognitive awareness, as measured by the revised MAI, regardless of initial levels of self-awareness. In other words, the older one is, the greater the likelihood one is self-aware; however, explicit metacognitive intervention still yields improvements.
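For anyone wishing to model similar pre/post data, a minimal sketch of one possible analysis follows. The file name, the column names (pre, post, group, year_in_school), and the simple regression-with-interaction approach are all illustrative assumptions, not the analysis we reported.

```python
import pandas as pd
import statsmodels.formula.api as smf

# Hypothetical data: one row per student with columns
#   pre, post        - mean revised-MAI scores before and after the semester
#   group            - "instruction" or "control"
#   year_in_school   - 1 (first-year) through 4 (senior), a maturation proxy
df = pd.read_csv("revised_mai_scores.csv")  # assumed file

# Post score predicted from pre score and condition, with year in school
# entered as a potential moderator (interaction term)
model = smf.ols("post ~ pre + C(group) * year_in_school", data=df).fit()
print(model.summary())
```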
These changes might not have been elucidated using the original, dichotomous true/false response options. The revised MAI is a useful tool for measuring such metacognitive behaviors and for detecting whether changes in frequency occur over time or with intervention. Likewise, anecdotal evidence from my participants, as well as from other researchers, supports the ease of reporting with this Likert-type scale, in contrast to the frustration of using the two-point bifurcation. Still, use of the revised MAI in more studies will be required to validate it.
Suggestions for Future Usage of the MAI & Call for Collaboration
The Metacognitive Awareness Inventory (MAI) is a common assessment used to measure metacognition. Quantifying metacognition proves challenging, yet this revised instrument appears promising and has already provided evidence that metacognition can grow over time. The addition of a wider range of response options should be more useful in drilling down to the frequency with which metacognitive behaviors and thinking are used.
Validation studies on the revised scoring have yet to be conducted; thus, if other researchers and/or authors are interested in piloting the revised MAI, please contact me (see contact information below). It would be great to collaborate and collect more data using the Likert form, as well as to build a larger sample that would allow us to run more advanced statistics on the reliability and validity of the new scaling.
References
Pintrich, P. R. (2000). Issues in self-regulation theory and research. Journal of Mind and Behavior, 21, 213-220.
Schraw, G., & Dennison, R. S. (1994). Assessing metacognitive awareness. Contemporary Educational Psychology, 19(4), 460-475.
Terlecki, M., & McMahon, A. (2018). A call for metacognitive intervention: Improvements due to curricular programming and training. Journal of Leadership Education, 17(4). doi:10.12806/V17/I4/R8
Terlecki, M., & Oluwademilade, A. (2020). The effects of instruction and maturity on metacognition. Manuscript in preparation.
*Contact: If you are looking to collaborate or to validate the revised instrument, please contact Melissa Terlecki at mst723@cabrini.edu.
When you read, do you ask yourself whether the material is contributing to your knowledge of the subject, whether you should revise your prior knowledge, or how you might use the new knowledge you are acquiring? Do you highlight information or make notes in the margins to better remember and find information later on? Prior research by Pressley and colleagues (e.g., Pressley & Afflerbach, 1995) suggested that the kinds of metacognition prompted by reading strategies like these are critical for effective reading comprehension.
Inspired by that research, Taraban et al. (2000) conducted a study involving 340 undergraduates and 35 reading strategies like those suggested by Pressley and colleagues, and found that self-reports of strategy use were significantly associated with grade-point averages (GPA). Specifically, students who reported higher use of reading strategies also had higher GPAs. Additionally, responses to open-ended questions showed that students who could name more reading strategies and reading goals also had significantly higher GPAs.
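The basic association reported here is a simple bivariate correlation. As a hedged sketch with invented numbers, something like the following is all it takes to compute the strategy-use/GPA relationship for one's own class.

```python
from scipy.stats import pearsonr

# Invented paired values: mean self-reported strategy use (1-5) and GPA per student
strategy_use = [3.2, 4.1, 2.8, 3.9, 4.5, 2.5, 3.0, 4.0]
gpa = [2.9, 3.6, 2.7, 3.4, 3.8, 2.4, 3.1, 3.5]

r, p = pearsonr(strategy_use, gpa)
print(f"r = {r:.2f}, p = {p:.3f}")
```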
The data in Taraban et al. (2000) overwhelmingly suggested a strong positive relationship between students' knowledge and use of reading goals and strategies and their academic performance. More generally, data from Taraban et al. and others suggest that effective reading depends on metacognitive processing, that is, on directed cognitive effort to guide and regulate comprehension. Skilled readers know multiple strategies and when to apply them. In the remainder of this post, I review subsequent developments associated with metacognitive reading strategies, including cross-cultural comparisons, and raise a question about the relevance of these strategies to present-day text processing and comprehension given widespread technological developments.
Analytic vs. Pragmatic Reading Strategies
In 2004, my students and I created a questionnaire, dubbed the Metacognitive Reading Strategies Questionnaire (MRSQ) (Taraban et al., 2004). The questionnaire drew on the strategies tested earlier in Taraban et al. (2000) and organized the strategies into two subscales through factor-analytic methods: analytic strategies and pragmatic strategies. The analytic scale relates to cognitive strategies like making inferences and evaluating the text (e.g., "After I read the text, I consider other possible interpretations to determine whether I understood the text."). The pragmatic scale relates to practical methods for finding and remembering information from the text (e.g., "I try to underline when reading in order to remember the information."). Students respond to these statements using a five-point Likert-type scale: Never Use, Rarely Use, Sometimes Use, Often Use, Always Use.
Initial applications of the MRSQ suggested that the two-factor model could aid in better understanding students' use of metacognitive comprehension strategies. Specifically, students' self-reported expected GPA for the coming academic year correlated significantly and positively with analytic strategy use but not with pragmatic strategy use, suggesting that students who reported higher use of analytic strategies also anticipated doing well academically in the coming year.
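To make the two-subscale idea concrete, here is a minimal scoring sketch. The column names (a1, a2, p1, p2) and the three respondents are invented, and only a few items are shown; the point is simply that each student receives an analytic mean and a pragmatic mean, which can then be related to an outcome such as expected GPA.

```python
import pandas as pd

# Invented responses, coded 1 (Never Use) to 5 (Always Use); only a few of the
# analytic ("a") and pragmatic ("p") items are shown for brevity
responses = pd.DataFrame({
    "a1": [4, 2, 5], "a2": [3, 2, 4],   # analytic items
    "p1": [5, 4, 3], "p2": [4, 5, 2],   # pragmatic items
})

analytic_cols = [c for c in responses.columns if c.startswith("a")]
pragmatic_cols = [c for c in responses.columns if c.startswith("p")]

responses["analytic_mean"] = responses[analytic_cols].mean(axis=1)
responses["pragmatic_mean"] = responses[pragmatic_cols].mean(axis=1)
print(responses[["analytic_mean", "pragmatic_mean"]])
```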
Cross-Cultural Explorations of Reading Strategies
Vianty (2007) used the MRSQ to explore differences in students' use of metacognitive reading strategies in their native language, Bahasa Indonesia, and their second language, English. Participants were students in a teacher education program who completed the MRSQ in both English and Bahasa Indonesia. Vianty found that students processed language differently in their native language compared to a non-native language.
In comparing mean use of analytic strategies when reading in the native language versus English, Vianty found that nearly all means were higher for Bahasa Indonesia. T-tests showed significant differences favoring Bahasa Indonesia for eight of the sixteen analytic strategies. Conversely, four of the six pragmatic strategies were favored when reading English; however, only one difference ("I take notes when reading in order to remember the information") was significant on a t-test. Vianty concluded that students used analytic strategies significantly more in Bahasa Indonesia than in English. Conversely, use of pragmatic strategies was higher when reading in English, but the effect was weak.
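Because each participant completed the MRSQ in both languages, the item-by-item comparisons reported here are paired tests. As a hedged sketch with invented ratings, a paired t-test for a single strategy item might look like this.

```python
from scipy.stats import ttest_rel

# Invented ratings (1-5) from the same eight students for one analytic item,
# once for reading in Bahasa Indonesia and once for reading in English
bahasa_indonesia = [5, 4, 4, 3, 5, 4, 3, 5]
english = [3, 4, 2, 3, 4, 3, 2, 4]

t, p = ttest_rel(bahasa_indonesia, english)
print(f"t = {t:.2f}, p = {p:.3f}")
```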
Taraban et al. (2013) compared US and Indian engineering undergraduates on their application of analytic and pragmatic strategies. The language of instruction in Indian universities is English; however, English is typically not the students' native language (mother tongue). Reasoning from the findings in Vianty (2007), the researchers therefore predicted lower use of analytic strategies and higher use of pragmatic strategies among Indian students compared to US students. The latter but not the former prediction was supported. In fact, Indian students applied analytic strategies significantly more frequently than US students. Pragmatic strategy use was significantly lower than analytic strategy use for US students but not for Indian students, who applied analytic and pragmatic strategies equally often. Contrary to the findings in Vianty (2007), these findings suggest that students can make significant use of both analytic and pragmatic strategies in a non-native language.
The most comprehensive cross-linguistic comparison was conducted recently by Gavora et al. (2019), who compared analytic and pragmatic strategy use, measured by variants of the MRSQ, among 2,692 students from Poland, Hungary, Slovakia, and the Czech Republic who were enrolled in education programs, primarily teacher education and counseling. Students in Hungary, Slovakia, and the Czech Republic reported significantly higher use of pragmatic than analytic strategies. Students in Poland showed the converse preference, reporting significantly more frequent use of analytic strategies. Quite striking in the results were the significant correlations between pragmatic strategy use and GPA, and between analytic strategy use and GPA, in all four countries. Specifically, the correlations showed that more frequent use of both pragmatic and analytic strategies was associated with more successful academic performance.
Gavora et al. (2019) suggest that "In order to succeed academically, students direct their reading processes not towards comprehension but to remembering information, which is the core component of the pragmatic strategy" (p. 12). Their recommendation that "educators' attention should be focused on developing especially analytic strategies in students" is strongly reminiscent of the ardor with which Pressley and colleagues promoted metacognitive reading strategies beginning in the elementary grades. However, given the significant correlations of both analytic and pragmatic strategy use with GPA, it may be that the predominance of analytic strategies is not what matters, but rather whether the application of either type of strategy, analytic or pragmatic, aids students in their academic achievement. The data from Vianty (2007) may be informative in this regard, specifically the finding that those students applied pragmatic strategies more frequently than analytic strategies when the context (reading outside their native language) dictated a more pragmatic approach to reading and comprehension.
A relevant point made by Gavora et al. relates to the samples that have been tested to date and to the relevance of context to strategy use. They point out that in fields like engineering (e.g., Taraban et al., 2013), the context may support more analytic thinking and analytic strategy use. The Gavora et al. sample consisted of humanities students, which, on their argument, may have resulted in an overwhelming affirmation of pragmatic strategies. Further comparisons across students in different programs are certainly warranted.
Changing Times: The Possible Influence of Technology on Reading
An additional question comes to mind: the effect of widespread technology in instructional settings. When I, like others, am uncertain about a definition, algorithm, theory, etc., I find it very easy to simply Google the point or look for a YouTube video that I can read or watch for an explanation. This personal observation suggests that the strategies probed in the MRSQ may, at this point, be incomplete and, in some instances, somewhat irrelevant. The next step should be to ask current students what strategies they use to aid comprehension. Their responses may lead to new insights into the contemporary metacognitions that assist students in learning.
In conclusion, there is no doubt that metacognitive strategies are essential to effective information processing. However, there may be room to reconsider and update the strategies that students employ when reasoning and searching for information and insights to guide and expand comprehension and learning. It may be that current technology has made students more pragmatic, and a promising goal for further research would be to uncover the ways in which that pragmatism is being expressed through new search strategies.
In the fourth post of "The Evolution of Metacognition in Biological Sciences" guest series, Dr. Lindsay Doukopoulos describes the goals and outcomes of the spring meetings of the Biology curriculum committee that were led by the Biggio Center and the Office of Academic Assessment. Namely, they sought to create a questionnaire, to be given to all graduating students, that would allow them to reflect on their learning over the course of their academic career, and to create a rubric to measure the quality of metacognition in their responses.
by Dr. Lindsay Doukopoulos, Assistant Director of the Biggio Center for the Enhancement of Teaching and Learning. In collaboration with the Office of Academic Assessment on the Learning Improvement Initiative, she leads faculty development initiatives designed to connect faculty with strategies, resources, and partners to support their teaching.
In Spring 2019, Katie Boyd and I led four meetings with Biology's curriculum committee with the two-part goal of producing a set of metacognitive-reflection questions to be completed by every graduating student in their capstone course and a rubric to assess the quality of metacognition evidenced in the reflections.
Developing the Rubric
In the first workshop, our goal was to help faculty unpack the definition of metacognition into categories and decide how many levels or standards to maintain within the rubric. In other words, we were hoping to fill in the x and y axes of the Metacognition rubric.
To facilitate this discussion, we brought two rubrics designed to measure metacognition. One came from the General Learning Outcome site of Cal State University-San Bernardino (CSUSB). The other came from the AAC&U VALUE rubric on Lifelong Learning, specifically the two elements called Transfer and Reflection. Both rubrics appeared to offer valuable ways of assessing the metacognition evident in a written reflection. Rather than choose one or the other, we combined the two categories (rows) of the AAC&U VALUE rubric and the three categories (rows) of the CSUSB rubric. We also decided to use four standards of quality (columns) and discussed terminology, resulting in: Beginning or N/A, Emerging/Developing, Mastery, and Exceeding Expectations.
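As a purely illustrative way to picture the combined grid before any descriptors were written, the structure can be sketched as a five-row by four-column matrix. The CSUSB row labels below are placeholders, since only the AAC&U elements (Transfer and Reflection) are named in this post.

```python
# Illustrative sketch of the combined rubric grid: five criteria (rows) crossed
# with four performance levels (columns). Cell descriptors are placeholders.
performance_levels = [
    "Beginning or N/A",
    "Emerging/Developing",
    "Mastery",
    "Exceeding Expectations",
]
criteria = [
    "Transfer",            # AAC&U Lifelong Learning element
    "Reflection",          # AAC&U Lifelong Learning element
    "CSUSB criterion 1",   # placeholder names; the CSUSB rows are not listed here
    "CSUSB criterion 2",
    "CSUSB criterion 3",
]

rubric = {
    criterion: {level: "<descriptor to be drafted>" for level in performance_levels}
    for criterion in criteria
}
print(rubric["Transfer"]["Mastery"])
```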
In the second workshop, our goal was to fill in the performance criteria, or behavioral anchors, that would make up the rubric. After much discussion, we again decided to keep the full, combined rubric and to pilot it in our next meeting to determine whether the AAC&U elements or the CSUSB elements would be preferable.
In our third workshop, we piloted the rubric by scoring a packet of student reflections that had come out of the Biology undergraduate research capstone course the previous year. In practice, the faculty found the two elements of the AAC&U rubric easier to apply and more valuable for differentiating the quality of metacognition in the student responses. Thus, we reduced the final rubric to those two elements.
Developing the Reflection Questions
In the final workshop, our goal was to draft and finalize questions that would be given to sophomores and seniors in the program. These questions would parallel those already being used in the undergraduate research capstone course. These are the questions the committee created:
What has been your favorite learning moment in your major? Please describe it in detail and explain why it was your favorite.
What were the most useful skills you learned in your major and why?
Regarding the skills you listed in question 2: how do you know you learned them? Please provide specific examples.
How do you plan to apply these skills in future courses or your career?
As a student, what could you have done to learn more? If you could go back in time and give yourself advice, what would you say?
Evaluate your capacity to design an experiment and generate hypotheses. Please provide specific examples of aspects of the scientific process you’re most and least confident about.
Reflect on your view of science. How has your participation in your Biological Sciences major changed your view of science, if at all? Please provide specific examples.
Reflecting on your learning journey, what do you value most about your major curriculum (i.e. the courses you took and the order you took them in)?
This question-writing process concluded the initial phase of the Learning Improvement Initiative as it led to the creation of the instrument the department will use to gather baseline data on the metacognition SLO. Moving forward, all students (roughly 75 majors per year) will complete the questionnaire during their capstone course and the curriculum committee will lead assessment using the rubric we created.
The goal is to have every student scored on the rubric every year, beginning with baseline data collection in spring 2020 with students who have not experienced the "treatment" conditions, i.e., courses redesigned by faculty to improve metacognition. Over time, we expect that the faculty development workshops around transparent assignment design, reflective writing assignments, and ePortfolio pedagogy will result in graduates who are more metacognitive and in data that reflect the learning improvement.
Ed Nuhfer, Retired Professor of Geology and Director of Faculty Development and Director of Educational Assessment, enuhfer@earthlink.net, 208-241-5029
Early this year, Lauren Scharff directed us to what might be one of the most influential reports on the quantification of metacognition, Kruger and Dunning's 1999 "Unskilled and Unaware of It: How Difficulties in Recognizing One's Own Incompetence Lead to Inflated Self-Assessments." In the 16 years that have since elapsed, a popular belief sprang from that paper and became known as the "Dunning-Kruger effect." Wikipedia describes the effect as a cognitive bias in which relatively unskilled individuals suffer from illusory superiority, mistakenly assessing their ability to be much higher than it really is. Wikipedia thus describes a true metacognitive handicap: a lack of ability to self-assess. I consider Kruger and Dunning (1999) seminal because it represents what may be the first attempt to establish a way to quantify metacognitive self-assessment. Yet, as time passes, we always learn ways to improve on any good idea.
At first, quantifying the ability to self-assess seems simple. It appears that comparing a direct measure of confidence to perform, taken through one instrument, with a direct measure of demonstrated competence, taken through another instrument, should do the job nicely. For people skillful in self-assessment, the scores on the self-assessment and performance measures should be about equal. Seriously large differences can indicate underconfidence on one hand or "illusory superiority" on the other.
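In its simplest form, this comparison is just a difference score, as in the hedged sketch below. The scores are invented, and real analyses require far more care about noise, as discussed next.

```python
import numpy as np

# Invented paired scores on a 0-100 scale: self-assessed competence from one
# instrument and demonstrated competence from another
self_assessed = np.array([85, 60, 72, 90, 55])
demonstrated = np.array([70, 62, 75, 68, 58])

# Positive gaps suggest overconfidence, negative gaps suggest underconfidence,
# and gaps near zero suggest accurate self-assessment
gap = self_assessed - demonstrated
print(gap)
```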
The Signal and the Noise
In practice, measuring self-assessment accuracy is not nearly so simple. The instruments of social science yield data consisting of the signal that expresses the relationship between our actual competency and our self-assessed feelings of competency, plus significant noise generated by our human error and inconsistency.
By analogy, consider the signal as your favorite music on a radio station, the measuring instrument as your radio receiver, the noise as the static that intrudes on your favorite music, and the data as the actual mix of signal and noise that you hear. The radio signal may truly exist, but unless we construct suitable instruments to detect it, we will not be able to generate convincing evidence that the signal even exists. Failures here can lead to the conclusion that metacognitive self-assessment is no better than random guessing.
Your personal metacognitive skill is analogous to an ability to tune to the clearest signal possible. In this case, you are "tuning in" to yourself, to your "internal radio station," rather than tuning the instruments that can measure this signal externally. In developing self-assessment skill, you are working to attune your personal feelings of competence to reflect the clearest and most accurate self-assessment of your actual competence. Feedback from the instruments has value because it helps us see how well we have achieved the ability to self-assess accurately.
Instruments and the Data They Yield
General, global questions such as "How would you rate your ability in math?" "How well can you express your ideas in writing?" or "How well do you understand science?" may prove to be crude, blunt self-assessment instruments. Instead of single general questions, more granular instruments, like knowledge surveys that elicit multiple measures of specific information, seem to be needed.
Because the true signal is harder to detect than often supposed, researchers need a critical mass of data to confirm it. Pressure to publish in academia can cause researchers to rush to publish results from small databases obtainable in a brief time rather than spending the time, sometimes years, needed to build a database of sufficient size to provide reproducible results.
Understanding Graphical Depictions of Data
Some graphical conventions that have become almost standard in the self-assessment literature depict ordered patterns that can arise from random noise. These patterns invite researchers to interpret the order as produced by the self-assessment signal. Graphing nonsense data generated from random numbers in the same graphical formats can reveal what pure randomness looks like when depicted in any given convention. Knowing the patterns of randomness builds the numeracy needed to understand self-assessment measurements.
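Here is a hedged sketch of that exercise, under the assumption that self-assessment and performance are completely independent random numbers. Binning by performance quartile, as many published graphs do, still produces the familiar ordered pattern (the bottom quartile appears wildly overconfident and the top quartile underconfident) even though there is no signal at all in the data.

```python
import numpy as np

rng = np.random.default_rng(42)
n = 1000

# Pure noise: self-assessment and performance are independent random scores
performance = rng.uniform(0, 100, n)
self_assessment = rng.uniform(0, 100, n)

# Bin participants by performance quartile, then compare mean performance with
# mean self-assessment within each quartile (the common graphical convention)
quartile = np.digitize(performance, np.percentile(performance, [25, 50, 75]))
for q in range(4):
    mask = quartile == q
    print(f"Quartile {q + 1}: mean performance = {performance[mask].mean():5.1f}, "
          f"mean self-assessment = {self_assessment[mask].mean():5.1f}")
```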
Some obvious questions I am anticipating follow: (1) How do I know if my instruments are capturing mainly noise or signal? (2) How can I tell when a database (either my own or one described in a peer-reviewed publication) is of sufficient size to be reproducible? (3) What are some alternatives to single global questions? (4) What kinds of graphs portray random noise as a legitimate self-assessment signal? (5) When I see a graph in a publication, how can I tell if it is mainly noise or mainly signal? (6) What kind of correlations are reasonable to expect between self-assessed competency and actual competency?
This article “provides an overview of the conceptual and methodological issues involved in developing and evaluating measures of metacognition and self-regulated learning.” Sections in this article discuss the components of metacognition and self-regulated learning as well as the assessment of metacognition.
Pintrich, P. R., Wolters, C. A., & Baxter, G. P. (2000). Assessing metacognition and self-regulated learning. In Issues in the Measurement of Metacognition (Paper 3).