Reading by the Numbers, or what I learned about myself from 7 years’ worth of data

Kelly Delahanty is the Communications Coordinator for the Illinois Program for Research in the Humanities. You can see her visual design portfolio at kellydelahanty.com and chat with her on Twitter @kelldel.


I joined Goodreads, a “social cataloging” site for readers, in the summer of 2009 after I graduated from high school. I had an ever-growing list of books I wanted to read and the post-its scattered around my room were no longer going to cut it. So like any good millennial, I turned to the internet to solve my problem. As of this November, I’ve added 1400 books to my Goodreads account.

One of the more recent books I’ve added is Dear Data by Giorgia Lupi and Stefanie Posavec. Dear Data is the documentation of “a year-long, analog drawing project”—every week for one year, Lupi and Posavec collected personal data and then visualized the data through a hand-drawn postcard that they then sent to each other across the Atlantic. In addition to being a charmingly gorgeous art book, Dear Data is an argument for “small data”—looking at what personal data tracking can tell you about yourself, or as Dear Data states: “spending time with your data is spending time with yourself.”


Giorgia Lupi’s postcard from week 46, visualizing all the books she has at home. Source: Dear-data.com

More than a few of the books on my Goodreads account are about data visualization and information design. My favorite work as a designer has always been about using design to facilitate understanding, whether it’s understanding how supercomputers work or what the impact of the National Endowment for the Humanities on the University of Illinois’s research is. As I read Dear Data, I began to wonder what my personal data would say about me. I use a number of apps to track my habits, so I had a few different data sets I could choose from, but since this is Reading Matters, I decided to look at my Goodreads account to see what I could learn about my own reading habits.

So what kind of questions did I have about my reading habits and what did I learn? Well…

Question 1: How much do I read and how much do I want to read?

Let’s start with the basics: As of November 10, 2016, there are 1400 books on my Goodreads account. I’ve read about 30% of those. In comparison, 68% of the books are on my “to read” list, 1.7% of the books have been started and then “abandoned,” and I am currently reading 0.3% of the books on my list.

Pie chart showing the status of books on my Goodreads account. 68% are to read, 30% are read, 1.7% are abandoned, and 0.3% are currently being read.

Figure 1: Status of Books
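For anyone who wants to try the same breakdown, here is a minimal sketch of the arithmetic behind Figure 1. The shelf counts are an assumption: they are back-computed from the rounded percentages and the 1400-book total, not pulled from the actual account, and the names `shelf_counts` and `shelf_percentages` are made up for this example.

```python
from collections import Counter

# Hypothetical shelf counts, back-computed from the rounded percentages
# in the post and the 1400-book total (not real account data).
shelf_counts = Counter({
    "to read": 952,
    "read": 420,
    "abandoned": 24,
    "currently reading": 4,
})


def shelf_percentages(counts):
    """Return each shelf's share of all books, as a percentage rounded to one decimal."""
    total = sum(counts.values())
    return {shelf: round(100 * n / total, 1) for shelf, n in counts.items()}


percentages = shelf_percentages(shelf_counts)
for shelf, pct in percentages.items():
    print(f"{shelf}: {pct}%")
```

Rounding to one decimal place is what keeps the two smallest slices (abandoned and currently reading) from both collapsing to zero.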

So I’m certainly ambitious in how much I want to read and I rarely give up on a book. I also read multiple books at once, something I only started doing in the last year or so. But really, this information doesn’t tell me much I didn’t already know. It just lays the groundwork for more interesting queries, like…

Question 2: When did I read these books?

This gets a little complicated because when I signed up for Goodreads I attempted to backdate some of the books I had read prior to getting my account. Obviously this involved a lot of guesswork (Did I read the first Harry Potter book the year it came out or the year after?) and loss of information (What was the name of that book about time travel and Stonehenge that I read in middle school?). The books I’ve catalogued as read prior to 2009 are hardly a comprehensive list, but rather serve as a showcase of the books I read during childhood that had a long-lasting impact on me. Or books I had to write school papers about, but you know…same difference.

Post-2009 is another story. The most interesting thing in this data is that I barely read at all my senior year of college, and that my reading habits have picked up considerably since graduating.

Line chart showing how many books I've read per year. It goes up over time, takes a sharp dip in 2013 and then goes up sharply again.

Figure 2: Books read per year

This is most likely due to my being especially stressed and busy during senior year. This conclusion is further supported when I break down the last few years into how many books I read by month. Looking at this data, I can link the times when I read the fewest books to times when I was particularly stressed or busy (which is apparently August, every single year, for some reason).

Multiple line charts showing how many books I read each month for the past 4 years. It typically starts higher in the beginning of the year, goes down in the summer, and then goes back up in the fall.

Figure 3: Books read by month
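The month-by-month breakdown behind Figure 3 is just a tally of finish dates. Below is a rough sketch of that tally; the dates are invented purely for illustration, and `finished` and `books_per_month` are hypothetical names rather than anything Goodreads provides.

```python
from collections import Counter
from datetime import date

# Made-up finish dates, for illustration only (not real reading data).
finished = [
    date(2015, 1, 4),
    date(2015, 1, 20),
    date(2015, 2, 11),
    date(2015, 8, 30),
    date(2016, 1, 9),
]


def books_per_month(dates):
    """Count finished books for each (year, month) pair."""
    return Counter((d.year, d.month) for d in dates)


monthly = books_per_month(finished)

# One row per year, twelve counts per row; months with no books show as 0.
for year in sorted({d.year for d in finished}):
    row = [monthly[(year, month)] for month in range(1, 13)]
    print(year, row)
```

Each row could then be drawn as one line in a chart, which makes a recurring slump in a particular month easy to spot.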

It’s also possible that I just read more in the winter months than the summer because it’s cold out and I refuse to leave the warm comforts of my bed for entertainment. If I had access to more people’s Goodreads data, this is something I’d love to look into. Do all readers cuddle up with a good book when it gets chilly or are most readers apathetic to the changing of the seasons? Do people read more in the wake of New Year’s resolutions, the same way new members flood the gym on January 1st?

On a personal level, I want to know what the heck had me so busy during March 2015?

Question 3: What do I read?

There are two ways to interpret this question: by format (prose, poetry, comics, etc.) or by genre/subject matter (fantasy, romance, biographies, etc.).

Looking at the formats of books across all my lists reveals, unsurprisingly, that most of the books I read or want to read are prose and that most of those are fiction. About 12% of all the books on my Goodreads are fiction comics (also unsurprising to anyone who knows me), followed by 3.5% being poetry (not surprising to me, maybe a tad bit surprising to others), and 1.7% non-fiction comics (not surprising for the simple fact that non-fiction comics are not exactly flooding the shelves). Trailing in last place are a few plays and art books.

Pie chart showing the formats of the books I read. 45% are fiction prose, 36.6% are non-fiction prose, 12% are fiction comics, 3.5% are poetry, 1.7% are non-fiction comics, 0.5% are art books, and 0.5% are plays.

Figure 4: Format of books

I’m not sure that this information provides any particular insight into me as a person, but it does lay a bit of groundwork for understanding the data.

Breaking down the books on my “read” and “to read” lists by genre, I think it becomes obvious pretty darn fast what my favorite book genre is.

Bar chart showing how many books of specific genres are on my read and to read lists. The fantasy bar is much longer than the others. The design, social science, and humanities and arts bars are also fairly long.

Figure 5: Genre of books

So…I read a lot of fantasy, which is not surprising to me or anyone who knows me. I also have a lot of books about design (broadly defined to include graphic design, web design, video game design, motion graphic design, and probably some other forms of design as well). I also, apparently, like non-fiction books about “Humanities and Arts” and “Social Science.”

But here’s where it gets complicated, because Goodreads doesn’t define a book’s genre—users do. I’m the one that decided to categorize something as a particular genre, and how can I decide what to categorize a book as when I haven’t read it yet? Goodreads does suggest genres for each book, but it’s based on what other users have categorized a book as and who knows how trustworthy that is. If only twenty people out of a thousand have labeled a book as science fiction, is it really science fiction or do twenty people not understand the difference between science fiction and fantasy with technology in it?

And because I love making my own life difficult, I decided to limit each book to one genre or subject matter for the purposes of this article. Which means that I had to make some executive decisions on books that technically fit more than one genre, which in turn brought up a number of issues. Do books with time travel count as historical fiction? What about books that at first glance appear to be fantasy but (spoiler alert) they’ve been in a coma the whole time? How much can a biography look at the larger historical context of a person’s life before it becomes a general history book? Should a book that’s about writing marketing copy fall under “writing” or “business”? And don’t even get me started on sorting out whether or not a book should fall into “Humanities and Arts,” “Social Science,” or “STEM.”

Ultimately I had to make some basic rules. Part of the reason fantasy is so overwhelmingly high is not necessarily because I read fantasy to the exclusion of all other genres, but because I decided to label nearly every book that could be classified as either fantasy or something else as fantasy. I figure that even if there are bodices ripping in horse-drawn carriages, the fact that the occupants of said carriages are time-traveling vampire wizards is probably more important to the story.* Science fiction trumped everything after fantasy was no longer on the table for a particular book, then historical fiction, then romance, and so on, until all that was left was “realistic fiction.”
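The fiction rule described above boils down to a priority list: walk the genres in order and take the first one that applies. Here is a minimal sketch of that idea. The names `GENRE_PRIORITY`, `FALLBACK`, and `pick_genre` are hypothetical, and the list contains only the genres named above; whatever falls under “and so on” is deliberately left out rather than guessed at.

```python
# Priority order as described in the post. The post does not spell out the
# genres covered by "and so on", so only the named ones appear here.
GENRE_PRIORITY = ["fantasy", "science fiction", "historical fiction", "romance"]

# What is left when none of the prioritized genres applies.
FALLBACK = "realistic fiction"


def pick_genre(candidate_genres):
    """Return the highest-priority genre among the candidates, else the fallback."""
    candidates = {genre.lower() for genre in candidate_genres}
    for genre in GENRE_PRIORITY:
        if genre in candidates:
            return genre
    return FALLBACK


# A bodice-ripper starring vampire wizards still counts as fantasy.
print(pick_genre(["Romance", "Fantasy"]))
# With fantasy off the table, science fiction wins.
print(pick_genre(["Historical Fiction", "Science Fiction"]))
# Nothing on the list applies, so the fallback is used.
print(pick_genre(["Coming of Age"]))
```

A fixed order like this is blunt, but it guarantees that every book lands in exactly one bucket, which is what a single bar chart needs.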

Non-fiction was even more complicated. I’ve broken my non-fiction books down into a few broad categories, but for the most part this is an oversimplification of how I and other Goodreads users label books. They don’t label something as “Social Science”; they label it as philosophy, as politics, as pop culture, as psychology, or as other words that start with “p.” Trying to break down my reading habits by very specific subject matter was unhelpful in extracting any real insight, so I decided I needed to group these books into broader categories. And the lines between my three main non-fiction categories (Humanities and Arts, Social Science, and STEM) were not always clear. For example, if a book is about feminism, is that a humanities book or a social science book? If it’s about the history of a specific technology, is that STEM or humanities?

Ultimately it came down to a lot of gut feelings and guesswork. I guess I’ll see how well I did when I finally get around to reading those books.

Conclusions

So what did I learn? Well, I don’t read when I’m stressed, and I read more when I’m cold. I read more poetry than art books, and more romance novels than mysteries. I love fantasy, way more than I even realized, and I’m maybe a tad bit obsessed with understanding the boundaries of genres.

There are a lot more questions I want answered. Do I read more standalone novels or books from series? How soon after a book is published do I typically read it? Do I read more books written by women or men? What about books written by people of color? How do my ratings on books compare to other Goodreads users’ ratings? And how do my reading habits compare to others’?

But despite all the unanswered questions, I really enjoyed this project. This seems like something I’d want to revisit, perhaps once a year. Who knows, maybe I’ll find a Dear Data-like penpal to do it with me.

Anyone interested?

*As far as I know there are no novels about time-traveling vampire wizards who ride around in horse-drawn carriages and rip their bodices open. If you know of one, please tell me. And if there are any aspiring authors out there looking for an idea for the next big bestseller…you’re welcome.


About iprh

The Illinois Program for Research in the Humanities at the University of Illinois at Urbana-Champaign was established in 1997 to promote interdisciplinary study in the humanities, arts, and social sciences.
This entry was posted in Reading Matters@IPRH.
