Twitter Excitement Index & Aussie TV Premieres

Welcome to another week of Telemetrics updates. In this blog we’ll take a quick look back at the last set of ‘predictions’, explore the Twitter Excitement Index and finally take a look at this Monday’s Australian TV premieres, The Block & My Kitchen Rules, both alone and in the context of other recent Australian Reality TV ‘events’.

A Quick Look Back

So, as we left things, we put out a few predictions for the week beginning January 20, as follows:

[Figure: Results of our predictions for the week beginning 20 January]

So, some hits and misses here. The most concerning of these are Pretty Little Liars and Ravenswood, so let's have a closer look at those, starting with Ravenswood. Essentially, I think this was an over-prediction influenced by the second half premiere, which rated at 181,200 tweets. We have now added an adjustment for these types of shows, which will highlight second half premieres in our predictor, with the option to manually exclude them where they rate significantly differently. More concerning was Pretty Little Liars, which wasn't a second half premiere but a regular episode. Here, I think we had the reverse problem. Because there weren't enough episodes in the past two months (only one after the second half premiere), our prediction algorithm defaulted to including episodes from the previous series, where 300-400k tweets was the norm. Combined with the 488,000 tweets from last week, this seemed a reasonable estimate, but the show actually nearly doubled that performance; it will be interesting to see how Pretty Little Liars fares over the coming weeks.

The other errors of over 10% were both reality shows, and here I'm tempted to put the predictions on hold until we've established a measure of season context. Just as Big Brother follows a weekly cycle of Daily Shows, Nomination Shows and Eviction Shows, shows like American Idol and The Voice follow a somewhat standard season format through auditions, performances, eliminations and so forth. Modelling this is on my to-do list, but it comes after the establishment of the Twitter Excitement Index and indeed the modelling of regular season shows with their premieres, finales and other formats. Time and other priorities didn't allow us to get predictions out for this week, and I won't do so retrospectively, but we'll get some out either late this week or early next.

Twitter Excitement Index

One thing we did finish last week was the establishment of our Twitter Excitement Index, which is a measure of the volatility of conversation on Twitter based on the principles of Brian Burke’s Excitement Index at Advanced NFL Stats. Over a few weeks Katie Prowd and I went through a few variations on this approach, looking at different measures to see which best captured the patterns in Twitter conversation, and I also owe thanks to Patrik Wikstrom for his help in tweaking our statistical approach. Essentially though, the theory here is that if you have two shows, they may both average 100 tweets per minute, but have very different engagement levels, as seen in this dummy data:

[Figure: Dummy data comparing a 'spiky' show with a steady show, both averaging 100 tweets per minute]

In the first graph, the activity is ‘spiky’, that is, every other minute people are prompted to tweet, while in the second there is an underlying level of 100 tweets, with minimal variation around that average, suggesting a constant stream of conversation but no particular moments which provoke users to tweet. These should be at opposite ends of the spectrum, and with our Twitter Excitement Index, the top graph would see a TEI of 9.9, while the second gets a TEI of 0.5. The scale here has been calculated to vary between 0 and 10 for presentation purposes, although in practice from our test data it would seem very unusual for a show to achieve a TEI of over 5.
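
To make this concrete, here is a minimal Python sketch of a TEI-style score. The exact statistical formulation and scaling are ours and not published here, so the `tei` function below is an illustrative assumption rather than our actual method; it simply rewards minute-to-minute volatility relative to the average level and clamps the result to the 0-10 presentation scale described above:

```python
import numpy as np

def tei(tweets_per_minute, scale=10.0):
    """Toy Twitter Excitement Index: minute-to-minute volatility,
    normalised by the average level so raw audience size drops out.
    Illustrative only; this is NOT the actual TEI formulation."""
    counts = np.asarray(tweets_per_minute, dtype=float)
    if counts.mean() == 0:
        return 0.0
    volatility = np.abs(np.diff(counts)).mean() / counts.mean()
    return float(min(scale, scale * volatility))

# Two shows, both averaging ~100 tweets per minute
spiky = [5, 195] * 30                      # provoked every other minute
steady = 100 + np.random.normal(0, 3, 60)  # constant hum of conversation
print(tei(spiky), tei(steady))             # high score vs. score near zero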

The Australian Premieres

We are currently working on adjusting our metrics to work with Australian television, and this week's premieres of The Block and My Kitchen Rules gave a good first test for this approach. But first, let's cover the basics and look at how these shows performed on Twitter:

[Figure: Tweet volume over time for My Kitchen Rules, The Block and The Biggest Loser]

As you can see here, My Kitchen Rules clearly won the night, both in terms of total tweets and audience peaks, from The Block in second place. The third line here represents The Biggest Loser, which wasn't premiering but did air an episode in competition with the other two reality shows. The Twitter audience for The Biggest Loser has fallen off a bit since its premiere, but this performance must be considered low by any standards. A similar picture was seen in the TV ratings, with My Kitchen Rules reaching 2.4m viewers, The Block 1.1m, and The Biggest Loser just 560k. However, it's interesting to compare these to other recent reality 'events':

[Figure: Tweet totals for recent Australian reality TV 'events']

What is evident here is that Big Brother still owns the crown for Australian reality television, at least on Twitter, with both the Premiere and Finale easily outperforming the launch episode of My Kitchen Rules. For Channel 9, the performance of The Block suggests that the reality show success they achieved with Big Brother is not easily transferable to other shows, and for Channel 10 the performances of both Masterchef and The Biggest Loser must be a cause for some concern… A similar pattern can be seen in the unique audience for Monday's shows (n.b. the percentages add up to over 100% because of viewers who tweeted about multiple shows):

[Figure: Unique Twitter audience for Monday's shows]

Finally, the launch of these shows also gave us an opportunity to test some of our new metrics, and after a slight hiccup caused by time zone confusion (the TEI read the drop-off from NSW-level discussion to QLD-level discussion as a good thing for the show, treating the NSW levels as large spikes), we ended up giving a slight win to MKR, although both ranked ahead of the season launch of The Biggest Loser. The TEI does not take into account tweet volume, only measuring the volatility of conversation among the audience that *did* watch the show, and so should generally be read in combination with the audience size to properly understand the conversation around a show.

[Figure: TEI scores for Monday's shows]

Mumbrella also covered some of this, and you can view their article here.

And so wraps another week in Telemetrics. Until next time!

Incremental Updates & Tweets Above Replacement: The Week in Telemetrics

It's Friday here in Brisbane, and so time to reflect on the past week (or 10 days) and offer some predictions for the week ahead. In general, the last 10 days have primarily been concerned with adding additional fields and detail to our data store, as the first stage in a process to investigate the relative importance of each of the factors impacting on tweets, so that we can better account for them in both our historic analysis and future predictions. We have pulled data from a number of sources, and so a significant amount of time over the past week has been spent scraping and formatting the data, establishing a key system so that the sources become relational, and spot-checking (with the help of Excel formulae) to verify the integrity and accuracy of the data:

[Figure: A snapshot of the expanded data store]

Evaluating our predictions

But let's start this recap with a look back at last week's predictions, and the lessons we can learn from them. In total, 9 of the shows we predicted were present in the Nielsen daily Top 5 during the week:

[Figure: Last week's predictions vs. actual results]

Immediately obvious (and hinted at last week) is that the system copes badly with second half premieres. We had deliberately excluded season premieres and finales from our predictive data set, but it now appears clear that second half premieres (i.e. when a show returns from a winter hiatus within the same season) also receive a significant boost in tweets. With our new data (scraped from EpGuides & Wikipedia), we are now able to systematically identify these episodes, and plan to evaluate whether we can make a simple adjustment to account for the boost such shows receive. Overall, our average error for the week is 33.19%; however, if we exclude the second half premieres that falls to 9.21%, which is within the realms of what we were expecting, and easily outperforms the simple average of the last 10 or 4 shows:

[Figure: Prediction errors compared with simple 10- and 4-episode averages]

Here, taking a simple average of the last 10 episodes for each of the shows above, we see an overall error of 35%, which falls to 25.5% once the second half premieres are excluded, still a much larger figure than our 9.21%. Of course, the sample sizes here are small; one of the major challenges of these predictions is guessing which shows Nielsen's daily Top 5 will actually contain, particularly given that Nielsen rank the top 5 by number of 'impressions' rather than by tweets.
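
For the record, these error figures are straightforward mean absolute percentage errors. A minimal sketch, with illustrative row values and an assumed premiere flag:

```python
def mape(rows, exclude_premieres=False):
    """Mean absolute percentage error over (predicted, actual, is_premiere) rows."""
    kept = [(p, a) for p, a, premiere in rows if not (exclude_premieres and premiere)]
    return 100 * sum(abs(p - a) / a for p, a in kept) / len(kept)

rows = [(79010, 181200, True),   # illustrative second half premiere blow-out
        (10245, 9150, False)]    # illustrative regular episode
print(mape(rows), mape(rows, exclude_premieres=True))
```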


Filling out the last two months: Replacement Value & Tweets Above Replacement (TAR)

One significant issue we had been experiencing with our preferred metric, the average weighted tweets over the past two months (of episodes), was small sample size. While I am reasonably confident that we can demonstrate the last 8 shows are a better estimator than the weighted average of all shows in our database, for shows we are not manually tracking there can be many holes in this data. A show which barely scratches the Nielsen top 5, and isn't in our tracking system, may be missing data for 75% of the last two months' episodes, as seen here:

[Figure: Missing tweet data for The Carrie Diaries over the past two months]

If, as we have previously, we simply don't count shows for which we don't have tweet figures, we are heavily biasing the estimates. As you can see in our prediction data above, we predicted 10,245 tweets last week for "The Carrie Diaries", which turned out to be a 12% over-estimate. A better solution here is to adapt a concept from sabermetrics called "replacement value". Essentially, in baseball and other sports, replacement level is defined as the production you could expect from signing a "free agent", that is, the next best player who is not under contract to any major league team. This allows analysts, for example, to predict the impact of an injury, trade or contract expiry on a team's production, but also to measure other players against this level, for example through Wins Above Replacement. I plan to return to the latter concept, or in this case Tweets Above Replacement (TAR), in the future, as a means of measuring how well a show does in comparison to throwing a "replacement level" show on in that timeslot.

For our current purposes, and the Carrie Diaries example, we are more concerned with estimating how many tweets a particular episode may have received when the final statistics are not (freely) available. We do, however, know what the bottom-ranked show on a given Friday received, which, for example, was 3,700 tweets on 15 November, 1,600 tweets on 22 November, and 6,600 tweets on 6 December. Here too, though, we are slightly thrown by Nielsen's method of reporting: the Tonight Show was ranked fourth with 1,600 tweets, ahead of BBC America's "An Adventure in Space and Time", which ranked fifth with 9,200 tweets, because the rankings are based on number of impressions.

So, for now, let's call the replacement value for Friday nights 5,000 (we're currently calculating replacement values for each Day/Month combination). If we plug 5,000 in for those missing episodes, instead of ignoring them, our weighted tweets formula would have predicted 8,552 tweets, an error of just 5.2% compared to the previous 12.2%.
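
A minimal sketch of the imputation step. Only the 5,000 replacement value comes from above; the weighting scheme (more recent episodes weighted more heavily) and the episode figures are assumptions for illustration:

```python
FRIDAY_REPLACEMENT = 5000  # assumed replacement value for Friday nights

def weighted_average(episodes, replacement=FRIDAY_REPLACEMENT):
    """Average tweets over the last two months of episodes, filling
    gaps (None) with the replacement value rather than dropping them.
    The linear recency weighting here is illustrative only."""
    filled = [t if t is not None else replacement for t in episodes]
    weights = range(1, len(filled) + 1)  # oldest first, newest weighted highest
    return sum(w * t for w, t in zip(weights, filled)) / sum(weights)

# hypothetical two months of Friday episodes, None = no published figure
episodes = [None, 9800, None, None, 11200, None, 10400, None]
print(round(weighted_average(episodes)))
```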

We’re probably a few days away from properly integrating this into the prediction algorithm, but we do hold high hopes for this improvement.


Sometimes you can be too precise, or the problems of small samples

Another, perhaps short-lived, tweak over the past 10 days was experimenting with weekly indexes rather than monthly ones. The idea here was that such a system would better account for weeks such as 'sweeps', when networks tend to put on their strongest programming, as well as for the effect of major events such as the Super Bowl, the Oscars, the Golden Globes and so on. Particularly for the wrestling shows, such a method would also account for the cycle around events such as WrestleMania and SummerSlam, which anecdotally appear to signal increased tweeting. For much of our older data, where we have 50-100 shows in a given week, weekly indexes appear to offer small incremental advantages in forward predictions over the monthly indexes utilised in the above predictions.

However, when predicting the future (where weekly indexes can be more volatile), and for the last few months of data, when we often have only 25-50 shows in a week, the volatility of this index negated such advantages and actually decreased prediction accuracy. In particular, weeks 52 and 53 of 2013 and week 1 of 2014 (i.e. over the Christmas break) saw such low weekly indexes that any show with mild success during those weeks was suddenly forecast to hit season highs in its next episode, when in actuality the index was primarily dragged down by a few extremely low days over the Christmas holiday itself (particularly once we exclude sporting events, i.e. the NBA and College Football bowls, which dominate American broadcasting over the Christmas period).

It may be that eventually we can recover those incremental gains, but for current purposes we plan to return to the monthly indexes. This also means we can more easily predict shows for the current month, where the index has already been somewhat established, as opposed to ‘predicting’ the monthly index itself based on historical data and trends.
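
For concreteness, here is a minimal sketch of how an index of this kind might be derived. The formulation below (a period's average tweets over the overall average) is my assumption, and the figures are hypothetical; dividing a raw count by a very low holiday index is exactly what produced the inflated forecasts described above:

```python
from collections import defaultdict

def monthly_indexes(episodes):
    """episodes: list of (month, tweets) pairs, e.g. ("2013-12", 41000).
    Index = that month's average tweets / the overall average.
    This exact formulation is an assumption, not our published method."""
    by_month = defaultdict(list)
    for month, tweets in episodes:
        by_month[month].append(tweets)
    overall = sum(t for _, t in episodes) / len(episodes)
    return {m: (sum(v) / len(v)) / overall for m, v in by_month.items()}

episodes = [("2013-11", 120000), ("2013-12", 40000),
            ("2013-12", 35000), ("2014-01", 110000)]  # hypothetical figures
print(monthly_indexes(episodes))  # the Christmas month indexes well below 1.0
```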


Some predictions for the coming week

So, with all that said, let’s head back to the spreadsheets and give our top 10 predictions for the coming week:

[Figure: Our top 10 predictions for the coming week, 17 January]

Twitter Excitement Index (TEI)

Finally, in the tradition of all good TV shows, we'll leave you with a teaser for the coming week. Based on work by Brian Burke at Advanced NFL Stats, we have developed a measure we're calling the "Twitter Excitement Index", which essentially measures the peaks and troughs in Twitter conversation to establish a measure of excitement. That is, a show may be averaging 3,000 tweets per minute, but have a time series graph which hovers only slightly above and below that average over the course of an episode; in this case, it has attracted a large Twitter audience, but that audience doesn't seem to be particularly provoked by the content of the show to tweet. By contrast, a show averaging 300 tweets per minute may be spiking continuously throughout the episode, showing that people were reacting to events as they happened (as we have seen, for example, with Big Brother 15 in our previous work). By again stripping out those factors related to audience size, we are producing a metric which analyses the reaction of Twitter users to the content of the show. But, more on this next week…

Weighted Tweet Index (WTI) — Some Predictions for the coming week

The Weighted Tweet Index (WTI) is the first metric we have created as part of the Telemetrics project. Essentially, the goal of this metric is to break raw volume numbers down into their constituent parts. For the moment, I don't intend to go into too much depth on the methodology, in large part because this is an iterative process and it is subject to (substantial) change. Our data covers the period of 18 April 2012 through to the current day (well, 5 January as I write this), and comes from a variety of sources. One significant challenge of the methodology was to make data from these different sources comparable, as discussed briefly in my previous post.

Analysing the past

So, perhaps the best way to illustrate this is by way of an example:

The 1750th ranked show by pure volume, excluding specials and sporting events (more on this in a later post) happens to be an airing of the film “Space Jam” on Cartoon Network from Friday 6 July 2012. The data (in this case from Nielsen SocialGuide) records a total of 25,033 tweets attributable to the show, making it the 3rd ranked broadcast of the day. But what if that had aired on an average day, in an average month, and on an average network (taken in this case to be an average broadcast network – e.g. ABC, CBS, FOX, NBC)?

According to our current metrics, it turns out that a Cartoon Network show can expect to see about 16% of the Twitter activity of an average broadcast network. Similarly, Fridays can expect to receive around 52% of the Twitter activity of an average day (nothing new here for TV execs; Friday is historically a dumping ground for non-performing shows), and July, being in the offseason for TV, receives 65% of an average month. Note that there's a lot of refinement possible here: a kids' movie during the school holidays (I assume) should probably be expected to do better than average, and we currently don't account for the time of day nor what was scheduled against it. But, for now, let's stick with our numbers, so:

Weighted Tweets = (Old Tweets) / (Network Factor * Month Factor * Day Factor)

Therefore, weighted tweets = 25,033 / (0.16 * 0.52 * 0.65) = 25,033 / 0.0541 = 462,717 tweets if shown on an average broadcast network, on an average day, in an average month. With the greater degree of precision in the model (i.e. not rounding the factors to 2 decimal places), our actual figure here is 466,805. As it turns out, once ranked by weighted tweets, Space Jam moves from being the 1,750th ranked show in this period to the 317th.
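
As a quick sketch, the whole calculation is effectively a one-liner (the factor values are from the Space Jam example above; the function name is mine):

```python
# Factors from the Space Jam example above (rounded to 2 d.p. for display)
NETWORK, DAY, MONTH = 0.16, 0.52, 0.65

def weighted_tweets(raw_tweets, network, day, month):
    """Strip the context from a raw tweet count: the volume an episode
    could have expected on an average broadcast network, day and month."""
    return raw_tweets / (network * day * month)

print(round(weighted_tweets(25033, NETWORK, DAY, MONTH)))
# ~462,900 with these rounded factors; the model's unrounded factors give 466,805
```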

Predicting the Future

Much of the utility of this approach comes not necessarily from analysing the past, though we'll certainly do a lot of that over the coming weeks, but from the ability to predict the future. Because our 'weighted tweets' figure has been stripped of much of its context, we can take this figure (across all episodes that we have data for within a particular series) and apply to it the factors that will apply when the show next airs. Here are two examples:

Here you can see one interface to our model. Once a show is selected at the top, historic data is pulled from the data store, and a number of metrics are calculated. Which of these metrics is the best estimator varies depending on the volume, source and age of the data we hold. In this case, we have a number of recent episodes, and so the weighted tweets for the last month were selected. We entered the day and month the show was airing, as well as the network, and (for now) set the growth factor to 1. Growth factor is a variable that we are still experimenting with, although currently it is largely contained within the month factor for current/future dates. Ultimately, we predicted 157,511 tweets, and when the Nielsen data became available earlier today, we saw the results:

Not too shabby. An error of 7,000 on a total of around 150,000 is 4.80%; for the first version of the model we'll certainly take that. Looking at the last 10 shows, a pure average of the raw tweet totals over that time would have given 99,591 tweets, and the last four shows 113,100 tweets, so our prediction certainly appears to be outperforming simple averages.

Both Teen Wolf & The Bachelor were essentially premieres (Teen Wolf being a second half premiere), and Wolf Watch appears to be a companion show to Teen Wolf, and so we are currently unable to predict any of those three; let's move on to WWE Monday Night Raw.

Here, again, we have a lot of data, with this being a weekly show, and so the monthly average was a sensible choice. Again we added in the new variables (Monday, Jan 2014 on USA), left the growth factor alone, and predicted 215,507 tweets, and the results were again quite pleasing:

We were off by 10,594 tweets, a 4.69% error, although this time the other way. It’s interesting to note that simply taking the average of the previous 10 RAW shows would have predicted 153,398 tweets, and an average of the last four shows would have predicted 165,275 tweets, so again we do seem to be on the right track.

Note here that we’re not necessarily predicting what it will get, but what it should expect to get. In part that’s because there is still a wide range of factors we don’t account for, but even as we add more refinements to the model over the coming months, there’s still one we can never really account for: content. Just as we saw in the Big Brother work I’ve published here previously (and more formal journal versions of that are still in the works), there are some things you cannot account for, such as the racism scandal that engulfed Big Brother 15 and saw a rapid increase in tweet volume.

However, in some ways knowing what a show should expect to get can be more useful, both in providing a barometer of success for networks, producers and social media strategists, and in providing industry and researchers with a list of shows which either exceed or fail to reach these levels, thus allowing an analysis of what may have contributed to the success or failure of a particular episode or series on social media.

All that said, however, we’re still very eager to see how close our predictions for the week ahead match reality.

Some predictions for the coming week

To generate these predictions, Katie Prowd has predicted the top few shows on US TV for the next week (excluding Sunday, which appears to be full of repeats), based both on the TV listings and our historic data for which shows have received the most tweets in recent weeks. Specifically excluded from this list are any shows that are either a premiere or finale, as these do not fit within the standard model we are testing at this stage (although we are currently working on refinements to allow for these).

DAY        PROGRAM                                NETWORK   PREDICTION
Tuesday    Bad Girls All Stars Battle             Oxygen        85,882
Tuesday    Ravenswood                             ABCF          79,010 *
Tuesday    Snooki & JWOWW                         MTV           15,881
Wednesday  American Horror Story: Coven           FX           132,516
Wednesday  Modern Family                          ABC           22,054
Thursday   Vampire Diaries                        CW            92,005
Thursday   Mob Wives                              VH1           22,029 *
Friday     WWE Friday Night Smackdown!            SyFy          34,084
Friday     Shark Tank                             ABC           10,045
Friday     Late Night with Jimmy Fallon           NBC            4,387
Friday     The Carrie Diaries                     CW            10,245
Friday     Say Yes to the Dress Atlanta           TLC            4,150
Saturday   Saturday Night Live                    NBC           44,745
Saturday   Iyanla, Fix My Life                    OWN            6,071
Saturday   Sam & Cat                              NIK            4,798
Saturday   Huckabee                               FOXNEWS          438 *
Saturday   Orange County Choppers                 CMTV             698 *
Saturday   Houston Beauty                         OWN            3,722 *
Monday     WWE Monday Night Raw                   USA          215,507
Monday     Love & Hip Hop                         VH1          157,512
Monday     How I Met Your Mother                  CBS           17,293
Monday     The Real Housewives of Beverly Hills   Bravo         14,554

Caveat: Those with a * are generated from a small sample, and we have a low(er) confidence in their accuracy.

This list was updated on 9 January to reflect TV scheduling. No alterations were made to predictions.

We’ll check back periodically to see how these are doing over the next week, and continue to work on refinements. Until then…

Telemetrics: From First Principles

Over at my blog a few weeks ago I highlighted a new approach I wanted to take to television metrics, which I have termed "telemetrics" as a nod to the field of sabermetrics in the analysis of sport, from which I take much of my inspiration. Working with a team including Katie Prowd, Portia Vann and Kate Guy, the last few weeks have seen a significant first push towards (i) proving our capture methodology, and (ii) examining differences in the relationship between tweets and television ratings across exemplar shows, genres and countries. In particular, the work below would not be possible without the data manipulation process which Katie completed to get both our internal and more recent external data into a form we can work with.

Our work to date shows that we are able to match the Nielsen SocialGuide capture technology fairly accurately, at least for shows which do not exceed the 1% Twitter API restrictions (as I have discussed previously in relation to Scandal). In terms of correlations between tweets and television ratings, we have observed substantial variation across genre, format and country. In particular, early results suggest a significant difference between reality shows and more standard television fare such as sitcoms and dramas. Additionally, 'specials' (which include events such as the 2012 US Election debates and award shows, as well as potentially premieres and finales of series) do not appear comparable with standard episodes of series. Finally, for now, we have also noted substantial differences between similar formats in Australia as compared to the United States, which is particularly significant since much of the literature, and examples of best practice, come from the work of researchers and organisations such as Nielsen (and their SocialGuide subsidiary) which have focused on the US market. All of this means that drawing a fixed correlation between traditional television ratings and Twitter use does not seem a sensible approach.

Big Brother 15: Nielsen vs. CCI Data

The first thing we had to do was verify our collection methodology. Luckily, we have long been collecting tweets around television, as part of work discussed previously here on Big Brother 15, which expanded to comparisons between Big Brother in Australia and the United States, and subsequently reality TV shows as a whole. With the start of the 2013 TV season, we added a range of new terms, including popular sitcoms, dramas and sci-fi shows, in order to broaden the number of exemplars available to us. However, returning to the Big Brother data made sense for verifying our methods, as both we and Nielsen SocialGuide had recorded the number of tweets and unique users for a large portion of the season, and Nielsen had published their figures on their website, as visualised below:

[Figure: Big Brother 15 tweets and unique users, our capture vs. Nielsen SocialGuide]

From our perspective, that was pretty good. Tweets matched almost exactly, and the only major difference, on 23 August (22 August in the United States), was easily attributable: that was the night the Head of Household competition continued after the show, resulting in us recording an oversampling of tweets (compared to Nielsen) attributable to the live feed rather than the network broadcast. That said, we were very happy with these numbers, which meant we could be confident our methodology was producing accurate results. Thus, excluding known data outages, we were happy to move forward with the data we had collected. It is worth noting that while our unique user counts differed slightly from Nielsen's, they did so by a relatively consistent amount. We're still not sure of the cause; it seems unlikely that we were counting just enough tweets from a group of repeat users to produce the same overall volumes from slightly different tracking terms, and with Nielsen's methodology essentially a black box, the mystery will remain.

Not simply replicating Nielsen

It's important to note, however, that while we used the Nielsen SocialGuide data to verify our methodology, we do not merely want to replicate their results in an Australian context. As I mentioned when I introduced the term telemetrics, much of my inspiration comes from sabermetrics. As a field, sports analytics has long been focused on separating out aspects of a game so that you can quantify them. For example, in baseball, traditional statistics such as Earned Run Average (ERA) have been used to measure the performance of pitchers. ERA was lauded over simply counting the number of runs because it excluded the effect of errors by the fielders (that is, runs scored after a fielder made an error were simply not included in the total); however, the work of sabermetricians has shown that the quality of the fielders behind a pitcher actually plays a much larger role than is quantified by simply removing errors, and so a whole new range of statistics such as FIP (Fielding Independent Pitching) were born. In similar ways, Nielsen's (publicly available) approach may also be too simplistic. Indeed, their measure of 'reach' is highly questionable, given that a large proportion of that potential 'audience' may not be reading Twitter at the time a tweet was made, and the figure may include a myriad of inauthentic and inactive accounts.

Accordingly, our work targets generating more authentic metrics such as the Weighted Tweet Index (which I will discuss in my next post), which enables us to draw preliminary conclusions about the success of a show on social media, independent of factors such as network and the time the show airs, along with further metrics which we are currently developing to isolate the 'social' aspect of a show's Twitter conversation. This is an ongoing and iterative process, which we will document on this blog, in addition to publishing some public tests of our current methodology by predicting tweet volume for upcoming shows. Finally, we also intend to investigate the role that social media strategy, hashtag promotion and indeed content play in the number of tweets around a particular show, and to expand on current work looking at correlations between viewer counts and tweets on a per-show (and, data allowing, more frequent) basis.

That seems like a good place to leave this first post. Later today or early tomorrow you can expect a post outlining the Weighted Tweet Index, with some predictions for the coming days' shows, and shortly after that some preliminary results from the past 18 months of television shows as seen through the Weighted Tweet Index.

Getting Organized & TVMetrics.net

In preparation for a series of posts that will be published here (and cross-posted to the usual venues) beginning on Wednesday 9 January, I've been getting things organised: deleting and throwing out a bunch of material I no longer need from previous projects, filing material for those that are in progress, and streamlining my work process by moving material from Apple's Notes (which tends to be my go-to location for things I think of while commuting, watching TV and so on) into Wunderlist, Evernote and WorkFlowy as necessary. I've also organised *all* of the data, especially that linked to the TV project, so that I can run it through some preliminary metrics I've been working on early next week.

The process wasn't helped by an internet outage, which necessitated a 30 December visit to a nearly deserted office, but I'm more or less ready to go. I have also just registered http://www.tvmetrics.net, which (for now) points to the Telemetrics section of this blog; the corresponding Twitter, Facebook & Google+ accounts are also set up, just in case… Apparently a domain registration service has snapped up Telemetrics.net, so I'll continue to use Telemetrics to describe the suite of statistics and methods, but TVMetrics.net seems an apt location for the results and analysis. I look forward to posting some of the preliminary work over the coming weeks.

Spotting Fake Followers: Aussie Band “The Contagious”

A few days ago, in the Wall Street Journal, Jeff Elder described bought followers on Twitter, and those who create and purchase them. One of the examples he cited was an Australian Rock Band, The Contagious, noting that “In September, Mr. Vidmar used software to follow more than 100,000 Twitter users in a week for the Australian rock band The Contagious; that boosted the band’s following by 20,000.” With such a public admission, and the means to examine it (the accession curve methodology described previously), we couldn’t resist taking a look. So, what do The Contagious’s followers look like?

[Figure: Follower accession curve for The Contagious]

In short? Suspicious. As we previously noted with Tony Abbott during the Australian Election campaign, those periods which have substantial white space represent periods of time when virtually all of the new followers of a particular user registered their account over the same period of time.

Not only do The Contagious have a suspicious first 20,000 followers, but also further batches of 5,000 between approximately 23,000 and 28,000 and between 54,000 and 59,000, as well as about 30,000 between 60,000 and 90,000. Even after this, 10,000 between around 110,000 and 120,000 appear suspicious, and there is a final burst of suspicious-looking followers, from around 190,000 to 200,000, to get the band over the 200,000-follower mark. All told, around 80,000 of the band's 320,000-plus followers, or 25%, appear to have been acquired illegitimately. We can't say definitively that they were purchased, but they certainly do not appear natural.
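
For anyone wanting to reproduce this, a minimal sketch of the accession curve plot follows. How you collect the follower list and account-creation dates depends on your tooling, so that step is omitted; the function below only does the plotting:

```python
from datetime import datetime
import matplotlib.pyplot as plt

def plot_accession_curve(followers):
    """followers: list of (position, account_created_at) pairs, ordered by
    when each account followed the target. Organic followers scatter across
    the full range of creation dates; blocks of accounts all registered in
    a narrow window appear as dense bands with white space elsewhere."""
    positions = [pos for pos, _ in followers]
    created = [dt for _, dt in followers]
    plt.scatter(positions, created, s=1)
    plt.xlabel("Follower position (order in which they followed)")
    plt.ylabel("Follower account creation date")
    plt.show()

# e.g. plot_accession_curve([(1, datetime(2009, 3, 1)), (2, datetime(2013, 9, 14))])
```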

I suspect the band, and/or their management, are none too impressed that they were outed in the Wall Street Journal, but this example shows how analysis of the followers of any Twitter user will identify suspicious patterns where they exist; something Twitter certainly have the capacity to do, but would need to automate in order to catch these accounts on a large scale.

Finally, in a brief throwback to my old research in game studies, there's a series of articles at Eve News 24 which describe botting and real money trading in Eve Online; you'll note a lot of similarities! I also talked about these in the games context in my DiGRA paper. Interestingly, during the course of my research, CCP, the developers of Eve Online, did begin cracking down on these accounts; let's see if Twitter follows suit.