M&M’s pt. 2

So the M&M’s distribution activity went fairly well (now a week and a half later…). I used Fathom to aggregate the students’ data into a large table as they tabulated the counts of each color. Color was our main variable, but we also tracked the total count and the mass of each bag.

I went around with a tablet and took pictures of the pie charts they made with their M&M’s, which automatically uploaded to my computer through Dropbox’s camera upload feature! We scrolled through roughly 15 pie charts and compared them while the students mulled over the task of looking for a pattern. Turns out they thought they saw a lot of orange and green but not a lot of brown.

Later we used Fathom to summarize the distribution of everyone’s color counts together. We found that orange was the most common color, with blue and green roughly tied for second.
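For anyone without Fathom, the pooling step can be sketched in a few lines of Python. This is only an illustration of the idea, not what Fathom does internally: the function name `pooled_color_proportions` and the two example bags are made up for the sketch and are not the class data.

```python
from collections import Counter


def pooled_color_proportions(bags):
    """Sum color counts over all bags and return each color's share of the total."""
    totals = Counter()
    for bag in bags:
        totals.update(bag)  # adds this bag's counts to the running totals
    grand_total = sum(totals.values())
    return {color: count / grand_total for color, count in totals.items()}


# Hypothetical counts for two bags (illustration only, NOT the class data).
example_bags = [
    {"orange": 5, "blue": 3, "green": 2},
    {"orange": 3, "blue": 3, "green": 4},
]
proportions = pooled_color_proportions(example_bags)
for color, share in sorted(proportions.items(), key=lambda item: -item[1]):
    print(f"{color}: {share:.0%}")
```

Pooling the counts first and then dividing weights every candy equally, so a bigger bag counts for more than a smaller one.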

Later we compared our results to the “true” distribution numbers another AP Statistics teacher had obtained from the company. We were able to discuss the causes of variation and whether the company had potentially changed its distribution since sending out those numbers.

We also began to look at displays of a quantitative variable because the students were interested in the number of M&M’s per bag and the mean mass of the bags.

The students had a good time with it and we had a good chance to talk about what we could infer from the data.


AP Stat students will love me tomorrow. Their third period teachers? Not so much…


Got the idea for using M&M’s to illustrate distributions of categorical variables and pie charts from my AP summer institute. Look out for some follow-ups after tomorrow.



Story in the (Caffeinated) Data

Inspired by Dan Meyer’s 2012 Annual Report, I decided to start tracking data on myself in February. Thanks to the Keep Track Pro Android app on my phone, I have been tallying each and every coffee I have drunk since February 18. I’ve had a lot of fun self-analyzing through this activity and am trying to figure out a way to get students doing something similar in AP Statistics this year.

On to some graphs!

Distribution of Counts of Coffees in a Day



First, I decided to do a bar chart on the counts of coffees I drank in a day. I’d never gone above three cups in a day, but somehow survived the 18 days I had no coffee.

With this preliminary information in mind, I decided to think about the Law of Large Numbers and a moving average. The gist of the theorem is that as the number of observations grows, the average of an experimental outcome approaches its “true” value. I wanted to infer what my “true” average coffee intake per day was. Using some Microsoft Excel formula wizardry, I calculated an average after each day of the tracker, and decided to plot this cumulative average over time.
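The same cumulative average can be sketched outside of Excel. This is a rough Python equivalent, not the spreadsheet formula I actually used; the name `cumulative_average` and the week of counts below are made up for illustration.

```python
from itertools import accumulate


def cumulative_average(daily_counts):
    """Return the average coffees per day after each day of tracking."""
    running_totals = accumulate(daily_counts)  # total coffees through each day
    return [total / day for day, total in enumerate(running_totals, start=1)]


# Hypothetical week of tallies (illustration only, NOT my tracker data).
example_counts = [1, 0, 2, 1, 1, 3, 0]
averages = cumulative_average(example_counts)
for day, average in enumerate(averages, start=1):
    print(f"day {day}: {average:.2f} coffees per day so far")
```

Each value divides the total so far by the number of days so far, so early days swing the average a lot and later days barely move it.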

Cumulative Average of Coffees per Day


I noticed an interesting pattern here. As the Law of Large Numbers suggested, the cumulative average began to settle by May at around 1 coffee per day. Thinking about my routine during the school year, I would normally have a coffee at home before work, and every once in a while, maybe once or twice a week, I would have another at home after work. Then I noticed that the graph became unsettled again after school ended and revealed an upward trend. At this point, I wanted to examine the reasons for the instability in a little more detail. Instead of looking at the cumulative average, more Excel wizardry enabled me to look at a running average covering only the prior 14-day period. This gave a higher-resolution view of my caffeination habits.
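Here is a rough Python sketch of that kind of running average. It makes two assumptions that may not match my spreadsheet exactly: the window counts the current day plus the days before it, and the first few days simply average over however many days exist so far. The name `running_average` and the example counts are made up for illustration.

```python
def running_average(daily_counts, window=14):
    """Average each day's count with the days before it, up to `window` days total."""
    averages = []
    for i in range(len(daily_counts)):
        start = max(0, i - window + 1)  # assumption: shorter window at the start
        recent = daily_counts[start:i + 1]
        averages.append(sum(recent) / len(recent))
    return averages


# Hypothetical tallies with a jump halfway through (NOT my tracker data).
# A 4-day window keeps the example short; the graph in the post uses 14 days.
example_counts = [1, 1, 1, 1, 2, 2, 2, 2]
smoothed = running_average(example_counts, window=4)
print(smoothed)
```

Because old days drop out of the window, the running average catches up to a change in habits within one window length, while the cumulative average keeps getting pulled back by everything that came before.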

Cumulative average (blue) and 14 day running average (red)


I noticed a few regions of interest in this graph. First, around late March to early April, the running average reveals a drop in my coffee drinking. I roast my own coffee, and I bet this period corresponds to a time when I had forgotten to re-order beans to roast and had to wait for them to come in. Second, about 2 weeks after school ends, my running average starts to increase drastically. The red trendline begins to settle around 2 cups per day. This corresponds to what happens when I am at home without much structure. I either make several cups of coffee during the day or hang out at the coffee shop. The instability in the cumulative average (blue) makes sense once you look at the running average (red).