Grading Student Work
What Purposes Do Grades Serve?
Barbara Walvoord and Virginia Anderson identify the multiple roles that grades serve:
- as an evaluation of student work;
- as a means of communicating to students, parents, graduate schools, professional schools, and future employers about a student’s performance in college and potential for further success;
- as a source of motivation to students for continued learning and improvement;
- as a means of organizing a lesson, a unit, or a semester in that grades mark transitions in a course and bring closure to it.
Additionally, grading provides students with feedback on their own learning, clarifying for them what they understand, what they don’t understand, and where they can improve. Grading also provides feedback to instructors on their students’ learning, information that can inform future teaching decisions.
Why is grading often a challenge?
Because grades are used as evaluations of student work, it’s important that grades accurately reflect the quality of student work and that student work is graded fairly. Grading with accuracy and fairness can take a lot of time, which is often in short supply for college instructors. Students who aren’t satisfied with their grades can sometimes protest their grades in ways that cause headaches for instructors. Also, some instructors find that their students’ focus or even their own focus on assigning numbers to student work gets in the way of promoting actual learning.
Given all that grades do and represent, it’s no surprise that they are a source of anxiety for students and that grading is often a stressful process for instructors.
Incorporating the strategies below will not eliminate the stress of grading for instructors, but it will decrease that stress and make the process of grading seem less arbitrary — to instructors and students alike.
Source: Walvoord, B. & V. Anderson (1998). Effective Grading: A Tool for Learning and Assessment. San Francisco: Jossey-Bass.
Developing Grading Criteria
- Consider the different kinds of work you’ll ask students to do for your course. This work might include: quizzes, examinations, lab reports, essays, class participation, and oral presentations.
- For the work that’s most significant to you and/or will carry the most weight, identify what’s most important to you. Is it clarity? Creativity? Rigor? Thoroughness? Precision? Demonstration of knowledge? Critical inquiry?
- Transform the characteristics you’ve identified into grading criteria for the work most significant to you, distinguishing excellent work (A-level) from very good (B-level), fair to good (C-level), poor (D-level), and unacceptable work.
Developing criteria may seem like a lot of work, but having clear criteria can:
- save time in the grading process
- make that process more consistent and fair
- communicate your expectations to students
- help you to decide what and how to teach
- help students understand how their work is graded
Making Grading More Efficient
- Create assignments that have clear goals and criteria for assessment. The better students understand what you’re asking them to do, the more likely they’ll do it!
- Use different grading scales for different assignments. Grading scales include:
  - letter grades with pluses and minuses (for papers, essays, essay exams, etc.)
  - 100-point numerical scale (for exams, certain types of projects, etc.)
  - check +, check, check- (for quizzes, homework, response papers, quick reports or presentations, etc.)
  - pass-fail or credit-no-credit (for preparatory work)
- Limit your comments or notations to those your students can use for further learning or improvement.
- Spend more time on guiding students in the process of doing work than on grading it.
- For each significant assignment, establish a grading schedule and stick to it.
Light Grading – Bear in mind that not every piece of student work may need your full attention. Sometimes it’s sufficient to grade student work on a simplified scale (minus / check / check-plus or even zero points / one point) to motivate them to engage in the work you want them to do. In particular, if you have students do some small assignment before class, you might not need to give them much feedback on that assignment if you’re going to discuss it in class.
Multiple-Choice Questions – These are easy to grade but can be challenging to write. Look for common student misconceptions and misunderstandings you can use to construct answer choices for your multiple-choice questions, perhaps by looking for patterns in student responses to past open-ended questions. And while multiple-choice questions are great for assessing recall of factual information, they can also work well to assess conceptual understanding and applications.
Test Corrections – Giving students points back for test corrections motivates them to learn from their mistakes, which can be critical in a course in which the material on one test is important for understanding material later in the term. Moreover, test corrections can actually save time grading, since grading the test the first time requires less feedback to students and grading the corrections often goes quickly because the student responses are mostly correct.
Spreadsheets – Many instructors use spreadsheets (e.g., Excel) to keep track of student grades. A spreadsheet program can automate most or all of the calculations you might need to perform to compute student grades. A grading spreadsheet can also reveal informative patterns in student grades. To learn a few tips and tricks for using Excel as a gradebook, take a look at this sample Excel gradebook.
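The kind of calculation a gradebook spreadsheet automates can be sketched in a few lines of Python. This is only an illustration: the category names, the weights, and the 90/80/70/60 letter cutoffs (with "F" standing in for unacceptable work) are assumptions, not values taken from any particular course.

```python
# Hypothetical category weights for one course; they must sum to 1.0.
WEIGHTS = {"quizzes": 0.20, "essays": 0.30, "midterm": 0.20, "final": 0.30}


def course_percentage(scores, weights=WEIGHTS):
    """Return the weighted average of category percentages (each 0-100)."""
    if abs(sum(weights.values()) - 1.0) > 1e-9:
        raise ValueError("weights must sum to 1.0")
    missing = set(weights) - set(scores)
    if missing:
        raise ValueError(f"missing scores for: {sorted(missing)}")
    return sum(weights[category] * scores[category] for category in weights)


def letter_grade(percentage):
    """Map a percentage to a letter using assumed 90/80/70/60 cutoffs."""
    for cutoff, letter in ((90, "A"), (80, "B"), (70, "C"), (60, "D")):
        if percentage >= cutoff:
            return letter
    return "F"


# Invented scores for one student, as percentages per category.
student = {"quizzes": 85, "essays": 92, "midterm": 78, "final": 88}
total = course_percentage(student)
print(f"{total:.1f} -> {letter_grade(total)}")
```

Keeping the weights in one place, as a spreadsheet would in a header row, means a change to the grading scheme only has to be made once.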
Providing Meaningful Feedback to Students
- Use your comments to teach rather than to justify your grade, focusing on what you’d most like students to address in future work.
- Link your comments and feedback to the goals for an assignment.
- Comment primarily on patterns — representative strengths and weaknesses.
- Avoid over-commenting or “picking apart” students’ work.
- In your final comments, ask questions that will guide further inquiry by students rather than provide answers for them.
Maintaining Grading Consistency in Multi-sectioned Courses (for course heads)
- Communicate your grading policies, standards, and criteria to teaching assistants, graders, and students in your course.
- Discuss your expectations about all facets of grading (criteria, timeliness, consistency, grade disputes, etc.) with your teaching assistants and graders.
- Encourage teaching assistants and graders to share grading concerns and questions with you.
- Use an appropriate group grading strategy:
  - have teaching assistants grade assignments for students not in their section or lab to curb favoritism (N.B. this strategy puts the emphasis on the evaluative, rather than the teaching, function of grading);
  - have each section of an exam graded by only one teaching assistant or grader to ensure consistency across the board;
  - have teaching assistants and graders grade student work at the same time in the same place so they can compare their grades on certain sections and arrive at consensus.
Minimizing Student Complaints about Grading
- Include your grading policies, procedures, and standards in your syllabus.
- Avoid modifying your policies, including those on late work, once you’ve communicated them to students.
- Distribute your grading criteria to students at the beginning of the term and remind them of the relevant criteria when assigning and returning work.
- Keep in-class discussion of grades to a minimum, focusing rather on course learning goals.
For a comprehensive look at grading, see the chapter “Grading Practices” from Barbara Gross Davis’s Tools for Teaching.
A central problem in data-driven assessment is overfitting: a model trained on a small dataset ends up fitting the quirks of those particular samples rather than patterns that hold for new ones. The grading software must compare essays, work out which parts are strong and which are weak, and then condense this into a single number, the grade, which in turn must be comparable with the grade for a different essay on a totally different topic. Sounds hard, doesn’t it? That’s because it is. Very hard. But still, not impossible. Google uses similar tactics when deciding which texts and images best match different search terms. The difference is that Google uses millions of data samples for its approximations. A single school could, at best, input a few thousand essays. This is like trying to solve a 1000-piece puzzle with just 50 pieces. Sure, some pieces can end up in the right place, but it’s mostly guesswork. Until there is a huge database of millions and millions of essays, this problem will most likely be hard to work around.
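In the usual machine-learning sense, overfitting means a model matches its few training samples very closely yet predicts poorly on anything new. A minimal sketch of that effect follows, using invented toy numbers rather than essay data: a flexible polynomial that passes exactly through six noisy points is compared with a plain straight line on one unseen point. The function names are illustrative only.

```python
def lagrange_predict(xs, ys, x):
    """Evaluate the polynomial that passes exactly through every (xs, ys) point."""
    total = 0.0
    for i, (xi, yi) in enumerate(zip(xs, ys)):
        term = yi
        for j, xj in enumerate(xs):
            if j != i:
                term *= (x - xj) / (xi - xj)
        total += term
    return total


def fit_line(xs, ys):
    """Ordinary least-squares straight line; returns (slope, intercept)."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


# Toy data (invented): the true trend is y = 2x, plus small alternating noise.
train_x = [0, 1, 2, 3, 4, 5]
train_y = [0.5, 1.5, 4.5, 5.5, 8.5, 9.5]
new_x, new_y = 6, 12.0  # an unseen point that lies on the true trend

slope, intercept = fit_line(train_x, train_y)
line_error = abs(slope * new_x + intercept - new_y)
poly_error = abs(lagrange_predict(train_x, train_y, new_x) - new_y)
print(f"straight-line error on the unseen point: {line_error:.2f}")
print(f"exact-fit polynomial error on the unseen point: {poly_error:.2f}")
```

The polynomial reproduces every training point perfectly but misses the unseen point by a wide margin, while the simpler line stays close. With more training points than the model has freedom to memorize, the gap narrows, which is the intuition behind wanting a far larger essay database.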
The only plausible solution to overfitting is to give the computer a specific set of rules for judging whether a text makes sense, since computers can’t read. This approach has worked in many other applications. Right now, auto-grading vendors are throwing everything they have at devising these rules, but it is very hard to come up with a rule that captures the quality of creative work such as essays. Computers tend to solve problems the way they usually do: by counting.
In auto-grading, the grade predictors could be, for example, sentence length, the number of words, the number of verbs, the number of complex words, and so on. Do these rules make for a sensible assessment? Not according to Perelman, at least. He says that the prediction rules are often set in a very rigid and limited way, which restricts the quality of these assessments. For example, he has found that:
- A longer essay is considered better than a short one (a coincidence, according to auto-grading advocate and professor Mark D. Shermis)
- Specific words associated with complex thinking, such as ‘moreover’ and ‘however’, lead to better grades
- Lofty words such as ‘avarice’ earn more points than simple ones such as ‘greed’
In other instances he found rules that were poorly applied or not applied at all. The software could not, for example, determine whether facts were true or false. In one published and automatically graded essay, the task was to discuss the main reasons why a college education is so expensive. Perelman argued that the explanation lies with greedy teaching assistants, who earn six times the salary of a college president and regularly use their complimentary private jets for South Sea vacations.
The essay was awarded the highest grade possible: 6/6.
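To make the counting approach concrete, here is a rough sketch of the kind of surface predictors mentioned above: word count, sentence length, long words, and connective words. It is not any vendor's actual method. The word list, the seven-letter threshold for a "long" word, and the function name are assumptions made purely for illustration.

```python
import re

# Hypothetical shortlist of connectives associated with "complex thinking".
TRANSITION_WORDS = {"moreover", "however", "furthermore", "therefore"}


def surface_features(text):
    """Count surface traits of an essay; none of them checks meaning or truth."""
    words = re.findall(r"[A-Za-z']+", text.lower())
    sentences = [s for s in re.split(r"[.!?]+", text) if s.strip()]
    return {
        "word_count": len(words),
        "avg_sentence_length": len(words) / len(sentences) if sentences else 0.0,
        # Assumed proxy for "complex" vocabulary: words of 7+ letters.
        "long_words": sum(1 for w in words if len(w) >= 7),
        "transition_words": sum(1 for w in words if w in TRANSITION_WORDS),
    }


plain = "Greed drives up costs. Colleges spend too much."
ornate = ("Moreover, avarice relentlessly escalates expenditures. "
          "However, institutions overspend.")

for label, essay in (("plain", plain), ("ornate", ornate)):
    print(label, surface_features(essay))
```

Both sample texts make roughly the same claim, yet the ornate one scores far higher on long words and connectives. Nothing in these counts can tell whether a statement is true, which is the weakness Perelman's essay exposed.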
To avoid the examining eye of Perelman and his peers, most vendors have restricted use of their software while development is still ongoing. Perelman hasn’t yet gotten his hands on the most prominent systems and admits that so far he has only been able to fool a couple of systems.
If we are to believe Perelman’s claims, automatic grading of college-level essays still has a long way to go. But remember that essays in lower grades are already being graded by computers today. Granted, this happens under meticulous human supervision, but technological progress can move fast. Considering how much effort is being put into perfecting automatic grading, we will likely see a rapid expansion in the not-too-distant future.
About the author: Hubert.ai is a young edtech company based in Stockholm, Sweden. We are working to disrupt teacher feedback by using AI to hold a conversational dialog with each student individually. The feedback is then analyzed and condensed into a few recommendations on how you as a teacher can improve your skills and methods. Are you a teacher who would like to help us with development? Please sign up as a beta tester on our website.