During the final three months of writing my dissertation I tried tracking my productivity by recording, in half-hour increments, what I was working on and whether or not I was being productive. I kept all this data in a Google Sheet for ease of access no matter where I was working. Recording this data actually helped keep me on track: returning to the spreadsheet after 30 minutes spent browsing the web and seeing that the past couple of entries were also “nonproductive” often got me out of a rut and back to work. I also thought that it would be a fun data set to play with when I had more time, i.e. when the dissertation was done.
Well, now I find myself revising the final draft, after successfully defending it in early December, and desperate for something to distract me from the onerous task. So I’ve taken to working with the data, using the pandas library for Python. I completed a Coursera course on R a while back, but chose pandas because I was curious about data analysis in Python and already familiar with the language. What I’ve done so far is only the most basic interpreting and plotting of the data, but it still reveals some interesting patterns.
First, a word about the data. I tried to keep it as simple as possible, recording only the date, the time (in half-hour increments), the general category of work, whether or not I felt I was productive, and finally the task in simple terms. The biggest issue is obviously the subjective nature of “productivity”, but since this was a tool to help me get work done, abusing that categorization would have made the whole idea moot. Here’s a link to the spreadsheet.
Exporting the whole thing to a CSV file allowed me to import it into pandas and start playing with the data. The first thing I wanted to know was when I was most productive. That was relatively simple to figure out: separate out the productive column along with the time, count the productive entries for each time slot, and then sort by time. That produced a graph like this:
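For anyone who wants to try something similar, here is a minimal sketch of that counting step. It is not the exact code I ran: the column names (date, time, category, productive, task), the "yes"/"no" values, and the sample rows are all made up for illustration, and the inline text stands in for the exported CSV file.

```python
import io

import pandas as pd

# Hypothetical stand-in for the exported CSV. The column names, the
# "yes"/"no" values, and every row here are assumptions for illustration.
csv_text = """date,time,category,productive,task
2000-01-03,09:00,writing,yes,draft chapter
2000-01-03,09:30,writing,yes,draft chapter
2000-01-03,13:00,writing,no,browsing
2000-01-04,09:00,analysis,yes,run models
2000-01-04,13:00,writing,yes,edit intro
"""

df = pd.read_csv(io.StringIO(csv_text))

# Keep only the half-hour slots marked productive.
productive = df[df["productive"] == "yes"]

# Count productive entries per time slot, ordered by time of day.
# Zero-padded "HH:MM" strings sort correctly as plain text.
productive_counts = productive.groupby("time").size().sort_index()

print(productive_counts)
```

With a real export you would pass the file path to pd.read_csv instead of the inline text, and calling productive_counts.plot() should draw the curve, assuming matplotlib is installed.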
You can see two obvious peaks in my productivity: one right after I got to the office in the morning, when I was feeling optimistic about the day’s tasks, and a shallower peak in the afternoon, post-lunch, when I slowly returned to a productive state. I’m guessing the slower rise in the afternoon is due to the varying times I would get lunch, and the inevitable post-lunch tiredness.
I was then able to extract the nonproductive time and add it to the same graph, producing this:
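One way to get both series into a single table, ready to plot together, is to group by time and by the productive flag at once. Again this is only a sketch under assumptions: the column names, the "yes"/"no" values, and the rows are invented for the example.

```python
import pandas as pd

# Made-up rows for illustration; the column names and the "yes"/"no"
# values are assumptions, not the real spreadsheet contents.
df = pd.DataFrame({
    "time": ["09:00", "09:00", "09:30", "13:00", "13:00", "13:30"],
    "productive": ["yes", "yes", "yes", "no", "yes", "no"],
})

# Count entries for each (time, productive) pair, then pivot the
# productive flag into columns. Slots with no entries become 0.
counts = df.groupby(["time", "productive"]).size().unstack(fill_value=0)

# Give the columns readable names for a plot legend.
counts = counts.rename(columns={"yes": "productive", "no": "nonproductive"})

print(counts)
```

Calling counts.plot() on the result should draw one line per column on the same axes, assuming matplotlib is installed.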
Now there’s an interesting point to consider here. I only recorded nonproductive time when I thought I should be working. For example, first thing in the morning I didn’t mark the time as nonproductive if I was eating breakfast and getting dressed. So nonproductive time in this case represents time when I was sitting at my desk hoping to get work done. With that in mind, it makes sense that the majority of nonproductive time occurs either side of lunch and in the afternoon, when I was intending to work but distracted.
There’s obviously a lot more to do with this data. Next it would be interesting to look at which days of the week I worked best, and how the length of a lunch break correlated with post-lunch productivity, but now I’ve got to get back to work.