In colleges and universities nationwide, science faculty are changing how they teach to include more student-active approaches. It is difficult to know how widespread or consequential the changes are, but it appears that funding for course reforms from PKAL (Project Kaleidoscope), NSF, and others has paid off. Walk into a random college classroom and the science professor will probably be lecturing, but walk into the class next door and it would not be surprising to see students working on open-ended questions in groups or engaged in lively discussion, even if the class is large. What you would be seeing are examples of students taking responsibility for their own learning.
Those of us who have taken the leap into student-active teaching are now being urged by our funding sources or workshop leaders to take the next logical step and evaluate our progress. We are being encouraged to ask "How do I know whether my changes are working?" or, in the language of education, to do evaluation.
Evaluation of our efforts makes a lot of sense. After all, we are scientists who pose questions that we continually re-examine with data. A problem facing us, though, is the same problem that made changing our views about teaching and learning so difficult in the first place. Most of us have no formal training in pedagogy or theories of cognition and learning, and we have no training in the principles and methods of evaluation either. We are being urged to do something even more foreign to us than experimenting with cooperative groupwork or nontraditional testing.
In my experience most faculty catch on quickly to the concepts and approaches of evaluation if they are given the fundamentals, such as vocabulary definitions, and also specific examples developed and modified by other science teachers. With the basic tools of evaluation I believe most of us can progress quickly towards designing our own evaluation procedures because doing evaluation is in essence like doing science. Evaluators make observations, ask questions, design experiments, get data, and revisit initial questions. Science faculty know how to do this and many enjoy such work.
In this article I will give Ecology 101 readers the same key information about evaluation that I and my colleagues give faculty in our teaching workshops. With this information and these resources I hope that some of you will become interested in asking "how do I know if it's working?" and more confident about how you can find out.
Educators, like any other professionals, have their own language, and the first hurdle for would-be evaluators is understanding the jargon and concepts of evaluation. I start with "evaluation" vs. "assessment" because this distinction was the most confusing to me at first.
In my first NSF Division of Undergraduate Education training session, I learned that we evaluate projects or programs and assess student progress (e.g., give them tests). This is similar to the glossary definition of assessment in NSF's evaluation publication (1993): "Assessment is often used as a synonym for evaluation. The term [assessment] is sometimes recommended for restriction to processes that are focused on quantitative and/or testing approaches". To avoid confusion, be aware that the use of the two terms is sometimes reversed. The classic Classroom Assessment Techniques (Angelo and Cross 1993) focuses on ongoing evaluation as a way for teachers to ascertain what and how well their students are learning.
NSF clearly distinguishes between evaluation of programs (a coordinated collection of projects) and projects (a particular activity; NSF 1993). This article is about project evaluation.
Understanding the difference between formative and summative evaluation helped me appreciate what evaluation was really about. Formative evaluation looks at the project (course) all along the way; its purpose is to give ongoing diagnosis and feedback so that professors can change their teaching if needed. Summative evaluation is what we are all familiar with when we give students tests. Or, as evaluator Bob Stake said: "When the cook tastes the soup, that's formative; when the guest tastes the soup, that's summative" (NSF 1993).
Angelo and Cross (1993) give a good overview of formative evaluation with their seven basic assumptions of classroom assessment: (1) the quality of student learning is directly related to the quality of teaching; (2) the first step in getting useful feedback about course goals is to make those goals explicit; (3) students need focused feedback early and often, and they should be taught how to assess their own learning; (4) the most effective assessment addresses problem-directed questions that faculty ask themselves; (5) course assessment is an intellectual challenge and therefore motivating for faculty; (6) assessment does not require special training; and (7) collaboration with colleagues and students improves learning and is satisfying.
A website that also nicely explains formative and summative evaluation has been developed by Doug Eder, a biologist at Southern Illinois University (www.siue.edu/assessment/; click on "classroom assessment techniques"). Doug emphasizes that formative evaluation is non-judgmental, partly because the focus is on learning as influenced by many factors such as teaching approaches, students' background knowledge, and student motivation. Formative assessments are usually private and often anonymous, whereas the full weight of a grade is placed on the student alone, who is therefore identified with it, for better or for worse. The following table, modified from Eder's website, details the differences between formative assessment and summative assessment (graded tests).
| Formative | Summative (Grades) |
|---|---|
| Diagnostic | Final |
| Non-judgmental | Evaluative |
| Private | Administrative |
| Often Anonymous | Identified |
| Specific | Holistic |
| Usually Goal-Directed | Usually Content-Driven |
A good way to begin to understand the process of course evaluation is to simply look at a range of ways that science professors do it. The following is a list of approaches from Eder's site plus Angelo and Cross (1993).
Minute Paper — popular because it is a quick diagnostic that helps students reflect on the class and gives the teacher immediate feedback. Questions for a minute paper at the end of a session might be "What was the main point of today's class?", "What points were most confusing?", and "What points were most interesting?" Faculty who use this come up with their own ways to collect the responses efficiently (e.g., students pick up index cards on the way in and drop them in boxes at the back of the room on the way out). Even with large classes, a professor can quickly scan through the cards to get the overall response to the questions. An important point with all formative evaluations is that faculty should bring common or interesting student responses to the next class, because students will be much more likely to take the evaluation seriously if they see that the professor respects what they have to say. Minute papers can be used often or infrequently.
Muddiest Point — a modification of the minute paper that allows students to describe ideas or concepts that are most unclear to them.
Transfer and Apply — a way for students to learn how to apply what they have learned to new situations. Application is one of the more difficult critical thinking skills (along with analysis and comparison) that students need to practice. In this evaluation students are asked to list ideas or strategies from the class and then apply them to other situations.
Student-active learning usually involves students working collaboratively in groups on questions or projects in and out of class. In the workshops I have attended or led, "groupwork" is second only to "coverage" as a controversial and difficult aspect of course reform. In this short article I will describe only one way for students to evaluate how well their group is doing. This example illustrates how and why students, as well as teachers, do formative evaluation.
Most of us have very mixed feelings about asking students to evaluate their own and their colleagues' performance and effectiveness in groups. I have done this only a few times, and the results were fuzzy, probably because I did not prepare the students well enough. If you want to try this, you can use one of the forms available on the web (see sites below) or make up your own. In this assessment students are asked to rate their responses to questions such as "How many group members participate actively in your group most of the time?", "How effectively did your group accomplish this task?", and "How would you judge your own effectiveness in this group?", or to address open-ended questions about uneven participation or how the group could work better together. An important aspect of this evaluation is that it helps students become more reflective about group process and what can be done to improve it. Another critical point is that the teacher must allow class time for discussion of the purpose and ethics of this evaluation.
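One nice thing about ratings like these is that even a quick tally makes patterns across groups visible. The sketch below, in Python, is purely illustrative: the group names, items, and numbers are invented placeholders, not data from an actual class.

```python
# Illustrative only: tally hypothetical group self-evaluation ratings (1 = low, 5 = high).
# The group names, items, and numbers are made up for the sketch.
from statistics import mean

responses = {
    "Group A": {"participation": [4, 5, 3, 4], "task effectiveness": [5, 4, 4, 4]},
    "Group B": {"participation": [2, 3, 2, 4], "task effectiveness": [3, 3, 2, 3]},
}

for group, items in responses.items():
    print(group)
    for item, ratings in items.items():
        # A low mean or a wide spread flags a group worth talking with in person.
        print(f"  {item}: mean = {mean(ratings):.1f}, range = {min(ratings)}-{max(ratings)}")
```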
Reciprocal Classroom Interview — a formal technique in which two colleagues who know and trust each other interview students in each other's courses. This requires a fair amount of time and includes a meeting before the selected class to outline the focus and questions, a debriefing after class, a clear explanation to the students about the purpose of the evaluation (e.g., that it is anonymous and private), and enough class time.
Educators frequently talk about rubrics, but most faculty have never heard of them. A rubric is a formal way to tell your students explicitly how you are going to grade or otherwise evaluate them on a test, paper, oral presentation, or poster. When I first read about rubrics I immediately understood their utility and (again) how my ignorance about education has affected my teaching. How could I expect my undergraduates to write good primary papers if I didn't tell them in real detail what I meant by "put your question or hypothesis in context", "compare your results with others'", or "describe your findings in the results section, interpret them in the discussion"?
Writing good rubrics may be one of the most important things you can do for your students and for your own teaching. Creating the rubric clarifies your thinking about what you consider essential for your students to know and be able to do. For example, faculty in workshops often list "improving critical thinking skills" as an important course goal, but they do not explicitly explain, discuss, or practice what they mean by "critical thinking" in their particular course. Writing a rubric that operationally defines critical thinking helps faculty restructure their teaching to focus more directly on this sophisticated aspect of learning.
Eder's site contains a good example of a rubric for assessment and evaluation of student writing. Listed are aspects he uses in grading such as "uses disciplinary facts correctly" and "provides adequate supporting arguments with reasons, evidence, and examples". These are ranked from excellent to poor. I prefer to develop my own rubrics and I often do this with students because they become much more invested in the goals as a result. Some of these class discussions about good and poor development of arguments, data description, and the like have been invaluable to my students' understanding of these higher level skills.
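For readers who want to see the bones of a rubric laid out concretely before drafting their own, here is a minimal sketch. The criterion labels echo the two aspects quoted from Eder's site above, but the level descriptors, the point scale, and the Python form itself are hypothetical illustrations, not anyone's published rubric.

```python
# A hypothetical writing rubric: each criterion is scored on the same 4-point scale.
# Level descriptors and the scale are illustrative, not taken from a published rubric.
rubric = {
    "uses disciplinary facts correctly": {
        4: "facts accurate and well chosen",
        3: "mostly accurate, with minor errors",
        2: "several errors or irrelevant facts",
        1: "facts largely missing or wrong",
    },
    "provides supporting arguments with reasons, evidence, and examples": {
        4: "claims consistently backed by evidence and examples",
        3: "most claims supported",
        2: "support thin or uneven",
        1: "assertions made without support",
    },
}

def total_score(ratings):
    """Sum the level chosen for each criterion into one overall rubric score."""
    return sum(ratings[criterion] for criterion in rubric)

# Example: one student's paper, rated criterion by criterion.
paper = {
    "uses disciplinary facts correctly": 3,
    "provides supporting arguments with reasons, evidence, and examples": 4,
}
print(total_score(paper))  # 7
```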
Action research is a type of formative investigative evaluation done by teachers on their own classes and institutions. According to Elliot (1991) action research is "the study of a social situation with a view to improving the quality of action within it". Essential aspects of this research are that it is reflective, useful, focused on pragmatic issues or questions (that you can do something about), and structured. Action research has stimulated K-12 teachers in particular to professionalize and communicate their reform efforts and has empowered them to change situations in their classrooms and schools.
Action research could also be a powerful tool for college science teachers, but very few do it. Again, the components of this research are quite familiar to scientists — focusing and shaping an issue or question, then collecting and reflecting on data. What I have found most foreign about this type of evaluation is that the data include fuzzy, qualitative information such as student behavior or interviews. What I have most appreciated is the potential immediate utility of the findings.
As an example from my own teaching, I was especially interested to know whether students in a freshman ecology course recognized the importance of a goal we discussed numerous times over the semester. The objective was for the students to recognize that science is a reiterative process and that messy or unexpected data are not "wrong". To evaluate my students' appreciation of this goal, I asked them to write self-evaluations focused on the objectives for the course, and I looked for wording that would indicate their maturity about this aspect of the process of science.
About half of the students wrote things like "I did learn a valuable lesson that even mistakes made in research are useful..." and "Our experiment ... didn't really work as we wanted it to, but I learned a lot about setting up an experiment, looking for all the variables, and identifying problems". While I was pleased with these comments, I was surprised that more students did not make them. This finding has forced me to think more carefully about how I discuss this aspect of science in the class. Although I thought I was quite explicit, perhaps I was not.
Another example comes from an introductory oceanography class taught by Richard Yuretich, a geologist at the University of Massachusetts, Amherst. Richard teaches in the most challenging of situations — in a big lecture hall to 300 students who use this course to fulfill a requirement at a state university. Over the past few years Richard has made major changes in this class, including cooperative group exams and frequent groupwork in class on open-ended questions (e.g., "think-pair-share"). One way that he has attempted to assess the effect of these changes on student performance is to compare final exam results in 1996 (before the changes) and 1998 (after the changes; Yuretich and Leckie 2000). He found that the mean exam score was substantially higher in 1998 and that students in the redesigned course did better on 37 of 38 identical questions. (The topic of question 38 was not covered in 1998.) More specifically, fewer students received a "D" or "F" in the second year, indicating that the changes may have helped those with the greatest academic difficulty. The questions assessed a range of abilities: recall, calculation, interpretation, and deduction. Additional evidence about improved student attitudes in this class came from end-of-semester evaluations: many more students in the reformed class showed interest in oceanography, and they acknowledged benefits from the new teaching approaches. (For a similar but more thorough study of a large course in biology, see Ebert-May et al. 1997.)
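For anyone who wants to try a similar before-and-after comparison with their own exams, the sketch below shows one straightforward way to set it up in Python. The scores, grade cutoff, and item data are invented placeholders, not Richard's records; a real analysis would use the full class data.

```python
# Illustrative comparison of final exam scores before and after a course redesign.
# All numbers below are invented placeholders, not actual class data.
from statistics import mean

before = [62, 71, 55, 48, 80, 67, 59, 73, 66, 52]   # scores from the pre-reform year
after  = [70, 78, 64, 58, 85, 74, 69, 81, 72, 61]   # scores from the post-reform year

def fraction_below(scores, cutoff=60):
    """Fraction of students scoring below a D/F cutoff (the cutoff is a hypothetical choice)."""
    return sum(s < cutoff for s in scores) / len(scores)

print(f"mean before: {mean(before):.1f}, mean after: {mean(after):.1f}")
print(f"below cutoff before: {fraction_below(before):.0%}, after: {fraction_below(after):.0%}")

# Question-by-question comparison: fraction of students answering each identical item correctly.
# Rows are students, columns are the questions asked in both years (1 = correct, 0 = incorrect).
before_items = [[1, 0, 1], [1, 1, 0], [0, 1, 1]]
after_items  = [[1, 1, 1], [1, 1, 0], [1, 1, 1]]

for q in range(len(before_items[0])):
    p_before = mean(row[q] for row in before_items)
    p_after  = mean(row[q] for row in after_items)
    print(f"question {q + 1}: {p_before:.0%} correct before vs {p_after:.0%} after")
```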
Reflecting on data like these has been important to Richard. Redesigning this class has been extremely time-consuming. He and I have had many conversations about time investment in teaching vs. research, so positive feedback that the efforts are really worth it is crucial for him. In addition, he now has evidence from their own course that he can show students when he explains the pedagogical philosophy of the class and talks about how people learn best. Finally, the research provides the basis for further improvements. Written comments from students at the end of the semester emphasized a common problem with groupwork — students who "go along for the ride" and do not participate in discussion. In the future he hopes to use older students as roving monitors in class to help address this issue, and these written comments from students may help him get the extra funds to support this.
If you decide to try doing formative evaluation in a course, start small. Pick an appropriate method for one class session, tell your students what you are up to and why, and then report back to the class, including explaining any adjustments you make.
To learn more about formative evaluation of your own redesigned courses, look through the websites and other resources listed below. Also, I have just completed a commercially produced video (funded by NSF's Division of Undergraduate Education) called How Change Happens: Breaking the Teach As You Were Taught Cycle in Science and Math that features Richard Yuretich and other faculty from a range of college and university settings. How Change Happens follows these teachers into their classrooms as they improve their teaching and reflect on their progress, what keeps them going, and how their students have become more reflective, better thinkers. Email me if you would like a free copy of this video. Finally, sign up for the teaching research workshop offered by Diane Ebert-May at the Snowbird 2000 and future ESA meetings.