
Since I write to understand what I think, I have decided to focus this post on the different categories of assessments. My thinking has been motivated by helping teachers navigate ongoing education reforms that have increased demands to measure student performance in the classroom. I recently organized a survey asking teachers about a variety of assessments: formative, interim, and summative. In determining which is which, I have witnessed their assessment separation anxieties.

Therefore, I am using this “spectrum of assessment” graphic to help explain:

[Image: the spectrum of assessment graphic]

The “bands” between formative and interim assessments, and between interim and summative assessments, blur in measuring student progress.

At one end of the grading spectrum (right) lie the high-stakes summative assessments that are given at the conclusion of a unit, quarter, or semester. In a survey given to teachers in my school this past spring, 100% of teachers understood these assessments to be the final measure of student progress, and the list of examples was much more uniform:

  • a comprehensive test
  • a final project
  • a paper
  • a recital/performance

At the other end lie the low-stakes formative assessments (left) that provide feedback to the teacher to inform instruction. Formative assessments are timely, allowing teachers to modify lessons as they teach. Formative assessments may not be graded, but if they are, they contribute few points toward a student’s GPA.

In our survey, 60% of teachers generally understood formative assessments to be those small assessments or “checks for understanding” that tell them whether to move on through a lesson or unit. In developing a list of examples, teachers suggested a wide range of formative assessments they used in their daily practice across multiple disciplines, including:

  • draw a concept map
  • determine prior knowledge (K-W-L)
  • pre-test
  • student proposal of project or paper for early feedback
  • homework
  • entrance/exit slips
  • discussion/group work peer ratings
  • behavior rating with rubric
  • task completion
  • notebook checks
  • tweet a response
  • comment on a blog

But there was anxiety in trying to disaggregate the variety of formative assessments from other assessments in the multicolored band in the middle of the grading spectrum, the area given to interim assessments. This school year, the term “interim assessment” is new, and its introduction has caused the most confusion among members of my faculty. In the survey, teachers were first provided a definition:

An interim assessment is a form of assessment that educators use to (1) evaluate where students are in their learning progress and (2) determine whether they are on track to performing well on future assessments, such as standardized tests or end-of-course exams. (Ed Glossary)

Yet, one teacher responding to this definition on the survey noted, “sounds an awful lot like formative.” Others added small comments in response to the question, “Interim assessments do what?”

  • Interim assessments occur at key points during the marking period.
  • Interim assessments measure when a teacher can move to the next step in the learning sequence.
  • Interim assessments are worth less than a summative assessment.
  • Interim assessments are given after a major concept or skill has been taught and practiced.

Many teachers also noted how interim assessments should be used to measure student progress on standards such as the Common Core State Standards (CCSS) or on standardized tests. Since Connecticut is a member of the Smarter Balanced Assessment Consortium (SBAC), nearly all teachers placed practice for this assessment clearly in the interim band.

But finding a list of generic or even discipline-specific examples of other interim assessments has proved more elusive. Furthermore, many teachers questioned how many interim assessments were necessary to measure student understanding. While there are multiple formative assessments contrasted with a minimal number of summative assessments, there is little guidance on the frequency of interim assessments. So it was no surprise that 25% of our faculty was still confused when developing the following list of examples of interim assessments:

  • content- or skill-based quizzes
  • mid-tests or partial tests
  • SBAC practice assessments
  • common or benchmark assessments for the CCSS

Most teachers believed that the examples blurred on the spectrum of assessment, from formative to interim and from interim to summative. A summative assessment that went horribly wrong could be repurposed as an interim assessment, or a formative assessment that was particularly successful could move up to become an interim assessment. We agreed that the outcome, or the results, determined how the assessment could be used.

Part of the teachers’ consternation was the result of assigning category weights for each assessment so that there would be a common grading procedure using common language for all stakeholders: students, teachers, administrators, and parents. Ultimately, the recommendation was to set category weights to 30% summative, 10% formative, and 60% interim in the PowerSchool grade book for next year.
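To make those weights concrete, here is a minimal sketch, in Python, of how a PowerSchool-style weighted grade book would combine category averages into a final grade. The 30/10/60 split comes from our recommendation; the student scores below are invented for illustration.

```python
# Minimal sketch of the recommended category weighting.
# The 30/10/60 split is from the post; the student's averages are hypothetical.

CATEGORY_WEIGHTS = {
    "summative": 0.30,
    "formative": 0.10,
    "interim": 0.60,
}

def final_grade(category_averages: dict) -> float:
    """Combine per-category averages (0-100) into a weighted final grade."""
    return sum(
        CATEGORY_WEIGHTS[category] * average
        for category, average in category_averages.items()
    )

# Example: strong formative work, weaker interim scores.
student = {"summative": 82.0, "formative": 95.0, "interim": 74.0}
print(f"Final grade: {final_grade(student):.1f}")
# 0.30 * 82 + 0.10 * 95 + 0.60 * 74 = 78.5
```

Note how heavily the interim band counts under this scheme: a student’s interim average moves the final grade six times as much as the formative average does.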

In organizing the discussion, and this post, I did come across several explanations of the rationale or “why” for separating out interim assessments. Educator Rick DuFour emphasized how the interim assessment responds to the question, “What will we do when some of them [students] don’t learn it [content]?” He argues that the data gained from interim assessments can help a teacher prevent failure in a summative assessment given later.

Another helpful explanation came from a 2007 study titled “The Role of Interim Assessments in a Comprehensive Assessment System,” by the National Center for the Improvement of Educational Assessment and the Aspen Institute. This study suggested three reasons to use interim assessments: for instruction, for evaluation, and for prediction. It did not use a color spectrum as a graphic, choosing instead a right triangle to indicate the frequency of the interim assessment for instructing, evaluating, and predicting student understanding.

I also predict that our teachers will become more comfortable with separating out the interim assessments as a means to measure student progress once they see them as part of a larger continuum that can, on occasion, be a little fuzzy. Like the bands on a color spectrum, the separation of assessments may blur, but they are all necessary to give the complete (and colorful) picture of student progress.

If I had a choice of vanity license plates, I might consider one that marked my recent experience as a volunteer on an educational accreditation team.

[Image: NEASC license plate]

Educational accreditation is the “quality assurance process during which services and operations of schools are evaluated by an external body to determine if applicable standards are met.”

I served as a volunteer on a panel for the New England Association of Schools and Colleges (NEASC), an agency that provides accreditation services, Pre-K through university, for more than 2,000 public and private institutions in the six-state region. NEASC panels are composed of experienced chairpersons and volunteer teachers, administrators, and support staff who visit schools according to a set schedule. According to its website:

In preparation for a NEASC evaluation, all member schools must undertake an exhaustive self-study involving the participation of faculty, administrators, staff, students, community members, and board members.

The key word here? Exhaustive.

Exhaustive in preparation for a NEASC visit. Exhaustive in hosting a NEASC visit. Exhaustive in being a member of the NEASC team that visits.

But first, a little background. In order to serve as a volunteer, I had to leave several lessons on Hamlet, my favorite unit, with my substitute. So, when I understood the level of professional discretion required for a NEASC visit, I felt a curious connection to the Ghost, Hamlet’s father, who likewise abides by an oath. On the ramparts of Elsinore, he tells Hamlet:

But that I am forbid
To tell the secrets of my prison-house
I could a tale unfold whose lightest word
Would harrow up thy soul, freeze thy young blood (1.5.749-752)

I may not say what school I visited, nor may I discuss any part of the actual accreditation discussion by members of my team. So this post will speak only as a self-reflection on the process and a few moments of recognition of how accreditation works.

List, list, O, list! (1.5.758)

Sunday morning at 9:30 AM, the team members were already hard at work organizing piles of documents prepared for our visit. We were organized into pairs, two members working on each of the seven standards: 14 members of the team plus two chairpeople.

There was a working lunch before the entire team went to the school for a prepared presentation. This presentation was the high school’s opportunity to quickly familiarize us with the school’s culture and to present the strengths and needs it had determined in the (exhaustive) self-study.

Madam, how like you this play? (3.2.222)

When we returned to our hotel, the lodgings provided by our hosting school, the work began in earnest. We looked through bins of student work to see whether it met the standards set by NEASC. We looked at all forms of assessments, lesson plans, and student responses. We recorded our findings well into the night and finally left the work room at 10 PM.

…to sleep;/To sleep: perchance to dream (3.1.65-66)

On both Monday and Tuesday, the team was up early to return to the school (7:00 AM), and the team split up, individually or in groups, to spend the school day conducting interviews with faculty, staff, and students. Facility tours, lunches shared with students in the cafeteria, and opportunities to “pop into” classes were available. There simply was no “unobligated time” as we worked steadily in the work room at the school, where we would record our findings before returning to the school hallways.

Were you not sent for? Is it
your own inclining? Is it a free visitation? Come,
deal justly (2.2.275-276)

Both Monday and Tuesday evening sessions were long as team members furiously documented their findings in a report that would still need editing and revision. We worked from 6:00 AM to 10:30 PM, with time allotted for meals and an hour’s respite to call home or check my own school’s e-mail. Closing my eyes, I thought how much,

My spirits grow dull, and fain I would beguile
The tedious day with sleep. (3.2.226-227)

An early Wednesday morning work session let us polish the report and present our final conclusions to the other members of the team. Finally, the votes on whether the team would recommend accreditation for the school were tallied, and we marched into the school library to meet the faculty and staff a final time. We were leaving a report for them to:

suit the action to the word, the word
to the action; (3.2.17-18)

The chair gave a short speech indicating the tone but not the contents of our report, and then, according to protocol, we left as a team, not speaking to anyone from the school, nor to each other. Staying silent, I thought,

Farewell, and let your haste commend your duty. (1.2.39)

The experience provided me with insights into the strengths and weaknesses of the educational program at my own school, and I am eager to share ways to improve instruction with my fellow faculty members. Our school is scheduled for a visit by a NEASC accreditation team in the spring of 2014.

As professional development, the experience was positive but physically demanding and intellectually challenging. The chairs’ use of technology (Google Docs, LiveBinders, Linot) allowed for efficient sharing of information on the seven standards: Core Values and Beliefs, Curriculum, Instruction, Assessment, School Culture and Leadership, School Resources, and Community Resources. Awash in papers and digital materials for 16 hours a day, I wondered how any previous teams using only hard copies had collaborated successfully.

Additionally, as I looked at the various standards of instruction, I found myself wondering about the consequences of implementing the Common Core State Standards (CCSS) and the growing reliance on standardized testing in evaluating teachers and assessing student understanding. Will the current form of regional accreditation adjust to measurements that will be implemented nationally? The United States is broken into five regional accreditation districts; if students meet national standards, however, how will these regional accreditation panels be used?

Finally, our four-day “snapshot,” coupled with the school’s own exhaustive self-study, could not address all of the arbitrary elements outside a school’s control, but the process is far more informative and meaningful than any standardized test results that could be offered by the CCSS. Consider also that the financing of a school seriously impacts, for good or for ill, all standards of measuring a school’s success. The intangible “culture” surrounding a school and the fluid landscape of 21st-century technology are other arbitrary factors that impact all standards. We even encountered a “snow-delayed” opening, as if to remind us that a capricious Mother Nature refuses to allow for standardized measurement!

I only hope that my experience in informing another school’s efforts to improve its educational program will prove beneficial. I know that when the team comes in the spring of 2014, they will do as I have tried to do:

report me and my cause aright… (5.2.339)

The rest I now need requires silence.

E is a beautiful 16-year-old who blithely drifted in and out of my English II classroom this year without any materials. She seemed surprised to find herself in the class every day. She is pleasant, friendly, and well-liked by her peers; we have a cordial relationship. Unfortunately, E achieved a 31% in English for the first quarter, which seriously damaged her GPA for the remainder of the 2011-2012 school year. Over the course of eight months, E continued to leave assignments incomplete and did little classwork, choosing instead to text or to socialize with the students sitting around her. She lost study guides, lost materials, and lost interest in editing and revising her work. She once sent me an e-mail telling me she “could not get online to see the assignment.”

This week, I will enter her final grade. After four quarters of assigning, collecting, correcting, and returning, I am looking at a failing grade (just below 60%). Her grade must be a reflection of her academic ability… or is it?

I am in the Groundhog Day of academics: every June I experience this exact philosophical dilemma. Do I pass a student who understands the material but has not completed the assigned work, or do I enter a failing grade? Over the course of the year, I am careful that the work I do assign is critical to assessing student understanding. Assigned work should be meaningful and assessed accurately, a process that should result in plenty of data (tests, projects, quizzes) to determine student progress. However, and perhaps more importantly, there is also anecdotal information to consider; classroom performance is the “third leg” of the footstool of data collection.

While class was in session, and E was engaged, she made contributions. I recently overheard her explain the complicated allegorical ending of Life of Pi to a fellow student (“The author is saying you have to decide which story is the true story…”). In March she made connections to the Kony 2012 campaign after we watched Hotel Rwanda as part of our Night unit. She casually suggested that over time Lady Macbeth “developed insecurities and should have taken a little Valium to settle her nerves.” She equitably included fellow students in “tossing” the plush witch doll when the class was reviewing important lines from the play, and she decided that the witches should be assigned 70% of the responsibility for Duncan’s death but only 20% of the responsibility for Banquo’s death. She noted that Macbeth was deteriorating as a “human” as his guilt increased. She empathized with Oliver Twist (“If I was an orphan, I might have been a pickpocket too…”) and suggested that the speaker of “An Irish Airman Foresees His Death” had a “need for speed.” She understood an author’s purpose, tone, and use of literary devices. I anticipate she will have a passing grade on the state-mandated assessment that she took in February.

On the rare occasion when E turned in work, she demonstrated that she was capable of writing on grade level. Numerous common assessments taken in class indicated that her reading comprehension was also on grade level. She remained blissfully unconcerned as I cajoled, teased, chided, scolded, and threatened her into completing work. Calls home were unproductive, and other teachers indicated that English was not the only cause for academic concern. The school year was maddening.

Now, as the grades are totaled in June, I wonder: do I hold her accountable for work left incomplete? Can she be exempted from the assignments that all her classmates completed? What is the minimum number of assignments most important to determining student performance? If I exempt her from less important assignments, am I reinforcing her lack of responsibility? Finally, is passing her fair to the students who did complete the work assigned?

I have been teaching for over twenty years, and I still wrestle with the emphasis placed on grades. Do grades really reflect student ability? There are students in the class who have completed all of the work I assigned. Does their “B” grade mean they really understand 85% of the material? Does E’s failing grade mean she understands less than 60% of the material in grade 10 English? Will another year in 10th-grade English bear a different result? Is she prepared or unprepared to meet the rigors of grade 11 English?

These philosophical questions become more complicated as education is increasingly driven by data. Student performance is quickly aggregated and evaluated using collective (vs. class) and individual (vs. self) bits of data. Mean scores and t-tests are recorded, spreadsheets are created, and reports are generated to create “SMART goals” that target instruction. Ultimately, assessment data will be used to evaluate teacher performance. Unfortunately, E’s overall 10th-grade performance in English has been measured by a lack of data.

Ultimately, I need to make the decision that relegates E to summer school, requires her to repeat Sophomore English, or allows her to move on to Junior English. Every year I am in the same philosophical dilemma with a student who defies the conventions of assessment. This year it is E; last year it was J. Every year I wonder how I can make this objective, data-driven decision when the subjective experience in the classroom informs me so differently. My professional experience as an educator encourages me to see E as more than a unit to be measured. Finally, while I am painfully aware that the decisions she has made directly impact the decisions I now must make, she remains characteristically, blithely unaware.

To pass or not to pass? That is the question.
