Ever since I learned that analytics could be applied to television, I’ve known that’s what I wanted to do, but I quickly learned that interesting data about television is hard to acquire for a nonprofessional television and analytics enthusiast. A few months ago I was put on a project at work pulling and analyzing tweets for a company, around the same time that speculation started about who was going to be the next Bachelor. It finally occurred to me that I had access to a huge repository of data about television: social media, and Twitter in particular.

I wanted to know what people were actually saying about the next Bachelor and break it down a little more than the one-sided view I was getting on the internet. At that time people were mostly tweeting about Bachelor in Paradise with the #TheBachelor hashtag, and given the Twitter API limits, if I had just used the standard hashtag I would have gotten a very small number of tweets about the next Bachelor. Fortunately, Bachelor creator Mike Fleiss started tweeting clues about the next Bachelor, so I pulled all of the tweets in response to his clues, did some simple sentiment analysis, and came up with this:



Not exactly earth-shattering stuff (I didn’t even standardize the axes), but I had a lot of fun doing it. I currently work in a consulting-type role, so the data I analyze comes from a variety of industries, and I never get the opportunity to be a subject matter expert in any of the data that I work with. I would consider myself a Bachelor franchise subject matter expert, and it was exciting to actually go into a data set and know what I was looking at right away. I loved it and I wanted to do more of it, this time with bigger data sets and more analysis.

That’s where the idea of Data-Driven Drama came from. The plan is to collect Bachelor tweets every week, look for interesting patterns and insights, and write about them on this blog. What’s more Bachelor than using the Bachelor to promote my personal musings? Historical tweets are expensive, so I’ve been anxiously awaiting January 2nd to start collecting Bachelor tweets. I’ve done a lot of text analytics projects in the last few months in preparation (and also because my job required it), but all of my work assignments were just practice for analyzing viewer responses to the Most Dramatic Season in Bachelor History.

NOTE: Full disclosure, I work for the software company I’m going to be using to do most of the analytics. I wanted to use my software (Aster) for a few reasons:

  • By using Aster, all of the hours I spend analyzing Bachelor data will have the added benefit of making me better at my job, and I won’t feel like a total weirdo who is just really into the Bachelor (though anyone who knows me in real life knows this to be true).
  • I like using Aster. I’ve used it every day for a few years, and I am very familiar with its text analytics library and capabilities.
  • Aster is pretty easy to download and use, and I’m going to publish all of my scripts, so it would be possible for someone with very little analytics experience to start doing their own analysis on Bachelor tweets. If you’re interested in learning more about Aster, check out the Aster Community.