Thursday, January 8, 2015

Data Doesn't Have to Be a Four-Letter Word

I am not one to sing the praises of data collection. Specifically, I actively hate what legislators and the powers that be have done under the excuse of "data collection." I have written before about how I feel about test-driven class curriculum, and have complained very loudly on Facebook (and written to some legislators, Arne Duncan, and President Obama) about these things.

But I believe in useful data. Data that focuses on what I need to improve not only my teaching, but my current students' understanding so they are more successful in my class. This class, this year.

Last semester, we covered (I'll say "covered" until I can feel confident that "learned" is the correct word) 123 vocabulary words. This is a low number for a traditional textbook, but it is still a lot of words my students are juggling around in their heads, with a lot of potential for forgetting and confusion. I need to know which words have been forgotten, both in general and since the two-week break we just completed. I need to know which words I need to review so students can comfortably function in the language in their second semester. I need data.

I actually came up with a really simple way to collect that data, and I'm making students help me collect the data because I don't want to spend hours outside of class doing it myself.

Collecting the data

I put together two quizzes, each a little over 60 words (123 seemed like a lot for one day), for students to take. I asked them not to write their names on the quizzes, so I could have accurate data, and to do their best, with the warning that the more words we needed to review, the more work it would be for them.

In class each day I assigned a quiz, explained the purpose and the need for accurate data, and students wrote in the words they knew. Then I took up the quizzes, passed them back out at random, and had students grade them as I quickly went through the list. Lastly, I called out the words on the list, one by one, and had students raise their hands if the paper they graded missed the word I called. This is where the anonymity and exchange of papers were really important. No one in the class (except probably the student who wrote the paper) knew who had missed what word, so it prevented embarrassment (as much as possible) and encouraged honesty. I recorded the results on a data sheet I had previously created (see the bottom of this post for how to create one yourself if you aren't too familiar with worksheet programs). As you can see in the image below, it offers me the total number of students who missed each word in each class, the total for all classes, and the percentage of my students who did not know a word.


The percentage was really what I was looking for. In a dream world, I would hold every word to a 90% rule--it is only good if 90% of my students know it. Or even a 100% standard. But it's unrealistic. I am currently standing at 186 students. With 186 students of varying personalities and learning strengths, it would be impossible to teach everyone everything perfectly. 

So I'm looking for 80% understanding. It's from the traditional TPRS standard "80% of 80%" which translates to "at least 80% of your students understand at least 80% of the language you're using." 

I went through and color-coded the words with that standard in mind. If 20%-30% of my students didn't know a word, I colored it yellow. For 30%-40%, orange. And if more than 40% of my students did not know a word, it went into the red zone, as a word that I need to cover first and with the most repetitions.
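For readers who would rather script this step than color cells by hand, the bands above could be sketched in Python roughly as follows. The function name `color_for` is made up for illustration, and the handling of the edges is an assumption, since the bands as written overlap at 30% and 40%: here exactly 30% and exactly 40% both count as orange, and only values above 40% count as red.

```python
def color_for(percent_missed):
    """Return the review color for a word, given the percent of students who missed it."""
    # Assumption: boundary values are not specified in the post.
    # Here only values strictly above 40 are red; 30 through 40 are orange.
    if percent_missed > 40:
        return "red"
    if percent_missed >= 30:
        return "orange"
    if percent_missed >= 20:
        return "yellow"
    # Fewer than 20% missed the word, so it meets the 80% standard.
    return None
```

A word missed by 45% of students would come back as "red", and one missed by 10% would get no color at all.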

Trends I saw

I found a few trends that reinforced my expectations and understanding of language learning. Generally the words that were red were words we had not spent as much time on. There was a concentration of them at the beginning of the semester and at the end of the semester, representing words we began with but didn't really repeat again and words that we ended with and had maybe a week or two to work into repetitions.

The first day, we collected data on the first half of the vocabulary for the semester, and the great news is that after a semester of repetition, over 80% of students knew 70% of the words. That's honestly a very good number, and better than any year I've taught previously.

The second half of the vocabulary did not fare as well, and that is most likely due to fewer repetitions and more words competing for repetition in stories. For that section itself, I found that 80% or more of students knew only 31% of the words (although if I slackened it to 70% of my students knowing, it becomes less dire). For all the words over both days, 77% of the words were generally known (over 60% of students knew them). Specifically, 60% or more of students knew 77% of the words, 70% or more of students knew 62% of the words, and 80% or more of students knew 51% of the words. It's not perfect, but it's a starting point, and useful data to have in my hands.
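The "X% or more of students knew Y% of the words" figures above can be recomputed from the tally sheet with a short script. This is only a sketch under stated assumptions: the function name `share_of_words_known` is hypothetical, and the sample counts are invented for illustration (only "surgit" is a word from the post, and its count here is not real data).

```python
def share_of_words_known(missed_counts, total_students, threshold):
    """Percent of words that at least `threshold` percent of students knew."""
    known_enough = 0
    for missed in missed_counts.values():
        # Students who knew the word are everyone who did not miss it.
        percent_known = 100 * (total_students - missed) / total_students
        if percent_known >= threshold:
            known_enough += 1
    return 100 * known_enough / len(missed_counts)

# Made-up counts of students who missed each word; not the real class data.
sample = {"surgit": 20, "word_b": 60, "word_c": 90, "word_d": 10}
for threshold in (60, 70, 80):
    print(threshold, share_of_words_known(sample, 186, threshold))
```

Running it with thresholds of 60, 70, and 80 gives the same three-level breakdown as in the paragraph above, so it is easy to see how much the picture changes when the standard is relaxed.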

What I am doing with the data

The whole point of collecting this data is to use it. I am using the red words first; I have a list of words that fewer than 60% of my students know, and that list is a major focus for my review. After making sure we have a decent number of repetitions of those words (using any combination of the activities Miriam and I have discussed in our different posts), I'm going to add the orange words, and finally the yellow. This review might take a week, or it might take two. One of the wonderful things about teaching without a textbook is that I set my own deadlines. I am focusing on vocabulary students need to know for a county-wide test, as I've mentioned in a previous post, and I can take time to make sure my kids really know it. This is a rare luxury, and I know that well.

If I weren't free to spend two weeks reviewing past vocabulary, I would work to fold these words into my upcoming lessons, at a rate of two per day. 

But I'm lucky, so I'm going to take advantage of that luck.

How to set up a spreadsheet to do math for you

It's actually not too hard. I've used spreadsheets for this sort of thing before, and "programmed" one to calculate percentages for me so I can quickly look up the grade for 13 out of 15 without a calculator.

Basically, I wanted something that would add up the total for each word, then divide that total by the number of students I have, and give me the percentage of students who missed each word.

I started by laying out what I was looking for:

Then I highlighted across the periods so I could set up the spreadsheet to add the total number of students who did not correctly translate "surgit." I clicked the sum symbol in my program (I use Google Spreadsheet, but all worksheet programs have this capability).


Once I had my total set up, I set up my percentage. I clicked the space next to the total and typed in the code "=H2/186" to stand for the specific square I wanted to divide (for me the total, which was in H2--squares are designated on a Battleship sort of system) and the number of students I have in total (186).


Lastly, I added in some numbers so you could see the math work. I selected the square for the total, pressed ctrl+C (you can also right-click and select "Copy"), selected the squares underneath, and pressed ctrl+V (you can also right-click and select "Paste"). The program will automatically adjust the formula so that each square shows the total for its own row.



Do the same for the percentage column and you have a worksheet that will automatically figure out your percentages for you.
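If a script is handier than a worksheet, the same two columns can be sketched in Python. This is a rough equivalent under stated assumptions, not the spreadsheet itself: the names `TOTAL_STUDENTS` and `summarize` are made up, the example assumes six class periods, and the counts are invented for illustration.

```python
TOTAL_STUDENTS = 186  # the enrollment figure used in the "=H2/186" formula

def summarize(missed_by_period):
    """Mimic the total and percentage columns for one word."""
    total = sum(missed_by_period)        # same job as the sum symbol
    fraction = total / TOTAL_STUDENTS    # same job as =H2/186
    # Convert the fraction to a percentage rounded to one decimal place.
    return total, round(fraction * 100, 1)

# Made-up counts of students who missed one word in each of six periods.
total, percent = summarize([3, 5, 2, 4, 6, 1])
print(total, percent)
```

Calling `summarize` once per word plays the role of copying the formulas down the column.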