Over time, our online pursuits generate a rich set of data points.  The websites we visit, the articles we read, and the things we buy all reflect our personality.  It can feel creepy when someone else analyzes this information, but when we explore it on our own, we get to know ourselves a little better.

Me, I recently uncovered some trends in my e-mail habits.  It turns out my mail client stores some message metadata — recipients, dates, and so on — in a local database.  By mixing a little SQL and some quality time with R’s charting tools, I learned just how much nonsense I had sent out over the past four years.

Let’s start with the basics.  This chart shows the total e-mails I sent out in that period, based on the time of day:

total e-mails sent, per hour

Here we see a ramp-up in the morning, followed by a brief dip around 6PM, then another quick peak around 8PM before it tails off for the night.  (Those rare e-mails sent during the wee hours, I chalk those up to my travels through different time zones.)  The lack of a midday dip hints at a person who typically works through lunch.  Yes, that seems to fit me very well.
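For readers who want to try this on their own mail, here is a rough sketch of how an hourly tally like this might be pulled out of a metadata database. It is not the query behind the chart above. It assumes the store is SQLite and uses a hypothetical table `sent_messages` with a text column `sent_at`; a real mail client's schema will almost certainly differ. Python stands in here for the SQL-plus-R workflow.

```python
import sqlite3


def count_by_hour(conn):
    """Return a 24-element list: messages sent in each hour of the day."""
    counts = [0] * 24
    rows = conn.execute(
        # strftime('%H', ...) yields the hour as '00'..'23' in SQLite.
        "SELECT CAST(strftime('%H', sent_at) AS INTEGER) AS hr, COUNT(*) "
        "FROM sent_messages GROUP BY hr"
    )
    for hr, n in rows:
        if hr is not None:  # skip rows whose timestamp could not be parsed
            counts[hr] = n
    return counts


# Tiny in-memory stand-in for the real database (table and column names
# are hypothetical, and the rows are invented for illustration).
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE sent_messages (sent_at TEXT)")
conn.executemany(
    "INSERT INTO sent_messages VALUES (?)",
    [
        ("2001-01-08 09:15:00",),
        ("2001-01-08 09:40:00",),
        ("2001-01-08 20:05:00",),
    ],
)

hourly = count_by_hour(conn)
for hr, n in enumerate(hourly):
    if n:
        print(f"{hr:02d}:00  {n}")
```

The timestamps are taken as stored, so mail sent while traveling lands in whatever hour the database recorded, which is one way those wee-hours messages can show up.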

But we’re dealing with a lot of information here, so these statements may be too broad.  It may help to slice the data by day of the week to get a clearer picture of my habits:

e-mails sent per hour, broken down by day

The weekdays exhibit the same pattern of one large hump followed by a smaller, late-day peak.  We also see some new details that were hidden in the other chart:

The thick magenta line represents Thursdays, and the dip around noon indicates this was my day to step away from my desk for a proper meal.  While I took it easy on the weekends, I sent a relatively large number of messages after Sunday dinner.  (Check out the peak at 8PM.)  Was I getting a head start on the work week, or raving about some new restaurant I had tried that night?  Hmm.
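The per-day slice can be sketched the same way, by grouping on the day of the week as well as the hour. As before, this is only an illustration under assumptions: a SQLite store, a hypothetical `sent_messages` table with a `sent_at` column, and invented sample rows.

```python
import sqlite3

# SQLite's strftime('%w', ...) numbers the days 0-6 starting from Sunday.
DAYS = ["Sunday", "Monday", "Tuesday", "Wednesday",
        "Thursday", "Friday", "Saturday"]


def count_by_day_and_hour(conn):
    """Return {day name: 24-element list of per-hour message counts}."""
    table = {day: [0] * 24 for day in DAYS}
    rows = conn.execute(
        "SELECT CAST(strftime('%w', sent_at) AS INTEGER) AS dow, "
        "       CAST(strftime('%H', sent_at) AS INTEGER) AS hr, "
        "       COUNT(*) "
        "FROM sent_messages GROUP BY dow, hr"
    )
    for dow, hr, n in rows:
        if dow is not None and hr is not None:  # skip unparseable dates
            table[DAYS[dow]][hr] = n
    return table


# In-memory stand-in; names and rows are made up for the example.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE sent_messages (sent_at TEXT)")
conn.executemany(
    "INSERT INTO sent_messages VALUES (?)",
    [
        ("2001-01-07 20:10:00",),  # a Sunday evening
        ("2001-01-07 20:45:00",),
        ("2001-01-11 11:30:00",),  # a Thursday morning
    ],
)

by_day = count_by_day_and_hour(conn)
print("Sunday 8PM:", by_day["Sunday"][20])
print("Thursday noon:", by_day["Thursday"][12])
```

Each day's list can then be drawn as its own line on a shared hour axis, which is the shape of the chart above.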

Slicing the data yet another way reveals even more details.  The real eye-opener was the number of messages I sent per month:

total e-mails sent each month

See that spike there, between the two red lines?  The one between December 2005 and January 2006?

That marks when I acquired my first BlackBerry.

Fine, it’s time to come clean: I’m hooked.  I’m a connectivity addict, and I now see why they’re affectionately called “crackberry” phones.

Granted, this is just a quick skim over a lot of data.  I might see other trends if I were to separate professional and personal communication, and look at conversation (mail thread) counts instead of raw message counts.  Additionally, the charts alone don’t tell us the whole story behind that spike in January 2006.  (I may have seen the need ahead of time and bought the crackberry to keep up.  At least that’s what I’ll say until proven otherwise.)  Still, were I a sleuth, these charts would give me an idea of where to dig for more details.
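A sleuth could also let the data point to the spike rather than eyeballing the chart. The sketch below counts messages per month and reports the largest rise from one month to the next. It assumes a SQLite store and a hypothetical `sent_messages` table with a `sent_at` column, and the sample rows are invented, not my actual numbers.

```python
import sqlite3


def monthly_totals(conn):
    """Return [(YYYY-MM, count), ...] in chronological order."""
    rows = conn.execute(
        "SELECT strftime('%Y-%m', sent_at) AS ym, COUNT(*) "
        "FROM sent_messages GROUP BY ym ORDER BY ym"
    )
    return [(ym, n) for ym, n in rows if ym is not None]


def largest_jump(totals):
    """Return (earlier month, later month, increase) for the biggest rise.

    Compares adjacent rows only; a month with no mail at all has no row,
    so adjacent rows are not guaranteed to be adjacent calendar months.
    """
    best = None
    for (prev_ym, prev_n), (ym, n) in zip(totals, totals[1:]):
        if best is None or n - prev_n > best[2]:
            best = (prev_ym, ym, n - prev_n)
    return best


# In-memory stand-in with made-up rows: 2, 3, then 7 messages per month.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE sent_messages (sent_at TEXT)")
stamps = (["2001-01-05 10:00:00"] * 2
          + ["2001-02-05 10:00:00"] * 3
          + ["2001-03-05 10:00:00"] * 7)
conn.executemany("INSERT INTO sent_messages VALUES (?)",
                 [(s,) for s in stamps])

totals = monthly_totals(conn)
print(totals)
print(largest_jump(totals))
```

A jump found this way says when the volume changed, not why, so it only narrows down where to dig.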


Have some interesting data you’d like us to check out? Need our help making sense of your company’s data? Please drop us a line. Thanks for reading.
