Sunday, August 14, 2011

How Favstar rewards and enables cheaters to steal the best Tweets

Twitter has benefited from users who treat follower counts as a game. Many users spend a lot of effort attempting to gain followers, increasing their engagement with Twitter and keeping them active on the network. An ecosystem of third-party vendors has sprung up to address this preoccupation with follower count.

Favstar as an online game further gamifies Twitter by counting statistics for individual tweets. Data are collected for favorites, retweets, and its own Tweet of the Day awards.

Favstar is most closely focused on tracking a tweet's favorites, data which isn't readily available using Twitter's API. Favstar is currently the sole aggregator of favorite data and serves as the only public source. Thus, Favstar is vital to its users, who are at the forefront of using Twitter's "favorite" option as a "like" button for a tweet. Favstar lets its users know both a tweet's "score" and who awarded the "points."

Some features of Favstar:
  • Favorites, retweets, and Tweet of the Day awards are tracked for each tweet. In addition to counts, the users who awarded them are tracked and displayed.
  • Web pages display the "best" and most recent tweets from each user.
  • Notifications, where Favstar automatically retweets a tweet once it has been favorited a certain number of times. Subscribers can increase the number of retweets by defining more notification thresholds.
  • Tweet of the Day award, available only to Favstar subscribers, who may designate one award per day. Sometimes awarded strategically in the hopes of an acknowledgement including a recommendation to follow the awarder.
  • The leaderboard, a set of webpages featuring tweets that have reached favorites counts of 10, 30, 50, or 100.
Most of these features increase the visibility of a tweet (and thus a Twitter account). While possibly resulting in more favorites being awarded, the extra exposure of a tweet is also valuable to anyone looking to increase the number of followers of an account.

Many of Favstar's users treat it as an online game.


Years ago Matthew Pritchard offered game developers two rules about online cheating:
Rule #1: If you build it, they will come -- to hack and cheat.

Rule #2: hacking attempts increase with the success of your game.
Indeed, in online games (e.g., World of Warcraft, the Call of Duty franchise) the temptation to cheat is so strong that companies must take preventative measures. Pritchard's rules have been accepted to the point that within the industry it's considered mandatory to have an anti-cheat solution in place before a game is launched. Online game companies typically ban detected cheaters, even canceling the accounts of paid subscribers.

Currently, Favstar has no mechanism for detecting tweet theft or cheating by its users.

Tweet theft existed long before Favstar, and hasn't been addressed technically by Twitter, although in The Twitter Rules it's defined as being spam: "If you repeatedly post other users' Tweets as your own."

Stolen tweets have become a common enough occurrence that one user has taken it upon himself to create @ThiefPolice, a Twitter account to "name, shame, and help shut down repetitive thieves' accounts."

Twitter's policy of requiring many complaints to close an account for spam works for blatant spammers but not for subtle content thieves. The @ThiefPolice account helps generate the collective action needed to close thieving accounts more quickly.

Tweet theft and Favstar, an example

Because it's tracking favorites, Favstar provides the information about which tweets are best to steal. Simply look up any user on Favstar and steal the first tweet on their "Best Of" page, which is their most popular tweet. Or steal any tweet on that page.

Favstar has nothing in place to prevent free users or paid subscribers from cheating. There are Favstar subscribers who are using it as a source of stolen tweets to gain recognition on Favstar itself! Not to mention more Twitter followers.

An example of tweet theft is the account @FunnyB1tch. Note that FunnyB1tch's Favstar page has a More link at the bottom, indicating a subscriber. Here's an excerpt from FunnyB1tch's timeline via a list on Twitter:

All the pictured tweets are stolen (as is almost all of the timeline):
  • As a woman, I like to here how pretty I am and that I.....aww fuck it, just slap my ass and pull my hair! Stolen from @Mommie_EJ with typo intact. First on her Best Of page.
  • If money cannot buy you happiness, give it to me. I'll be HAPPY. Stolen from @elisa212007, with obscenity edited out. Second on her Best Of page.
  • If Red Bull gives you wings, meth gives you a jet pack. Stolen from @HaHaWhitePPL. #12 on his Best Of page.
  • I often confuse serious psychic shit with being hungover. Stolen from @joeinverarity.
  • Why don't witches wear panties when flying on their broomsticks? For better grip. This ancient joke is found all over Twitter and the Internet.
Here's a case where Favstar rewarded tweet theft:

  • Orgasms are like washing your car, you can do it yourself, but it’s so much better when a man does it for you. Stolen from @Jazzzzzmina. First on her Best Of page.
The blue and orange star icon indicates that the above Tweet was retweeted by Favstar itself! The retweet was via Favstar's experimental @favstar_pop account. In addition, since it was favorited by 10 users, the Tweet would briefly appear on the Most recent tweets with 10 favs leaderboard.

What Favstar could do

Before discussing measures Favstar could take to combat cheating, first some definitions of three severities of tweet theft:
  • Verbatim: The tweet is stolen unchanged. Automated detection is easy.
  • Minor edit: Minor edits, often to remove profanity or correct a typo. Automated detection is harder*.
  • Major rewrite: The content is rewritten in a manner that drastically changes the tweet. Can be very difficult to distinguish from non-stolen tweets conceived in parallel which are similar due to a common theme. Automated detection is impossible.
*A stolen tweet may appear to be identical if tricks employed by email spammers attempting to evade detection are used, such as whitespace and Unicode character substitutions.
The severities fall on a continuum and the boundaries between them aren't fixed.

In Favstar's case, cheating need not be detected in real time. Processing required for cheat detection could happen as infrequently as once per day (during periods of light load) and still suffice.

Verbatim tweet theft could be detected automatically, at the cost of the CPU time needed to compute a cryptographic hash such as SHA-256 of each tweet's text and the database space to store the hashes.
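As a rough illustration of the hashing idea, here is a minimal sketch in Python. It assumes tweets arrive as (tweet_id, author, text) tuples in chronological order, so the first occurrence of a text is treated as the original. The function names and the tuple layout are hypothetical and are not part of Favstar or the Twitter API.

```python
import hashlib


def tweet_digest(text):
    """Return the SHA-256 hex digest of a tweet's text."""
    return hashlib.sha256(text.encode("utf-8")).hexdigest()


def find_verbatim_copies(tweets):
    """Find tweets whose text exactly matches an earlier tweet by another author.

    `tweets` is an iterable of (tweet_id, author, text) tuples, assumed to be
    sorted oldest first. Returns a list of (copy_id, original_id) pairs.
    """
    first_seen = {}  # digest -> (tweet_id, author) of the earliest tweet
    copies = []
    for tweet_id, author, text in tweets:
        digest = tweet_digest(text)
        if digest not in first_seen:
            first_seen[digest] = (tweet_id, author)
            continue
        original_id, original_author = first_seen[digest]
        # An author repeating their own tweet is not counted as theft here.
        if original_author != author:
            copies.append((tweet_id, original_id))
    return copies
```

Storing only the digest keeps the lookup table small, and a batch like this could run once a day as suggested above.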

Minor edits employing email spam tricks could be detected using conventional anti-spam measures to normalize texts before processing them for verbatim detection.
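A minimal sketch of such normalization in Python follows. The specific steps (Unicode NFKC folding, removal of a few zero-width characters, case folding, and whitespace collapsing) are assumptions chosen for illustration, not a complete anti-spam pipeline. In particular, this does not catch look-alike letters from other alphabets.

```python
import re
import unicodedata

# Zero-width and invisible characters sometimes inserted to defeat exact matching.
_INVISIBLE = dict.fromkeys(map(ord, "\u200b\u200c\u200d\u2060\ufeff"))


def normalize_tweet(text):
    """Reduce a tweet to a canonical form before it is hashed or compared."""
    # Fold compatibility characters (full-width letters, no-break spaces, ...).
    text = unicodedata.normalize("NFKC", text)
    # Delete invisible characters.
    text = text.translate(_INVISIBLE)
    # Ignore differences in letter case.
    text = text.casefold()
    # Collapse any run of whitespace into a single space.
    return re.sub(r"\s+", " ", text).strip()
```

Two tweets that normalize to the same string can then be treated as verbatim copies.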

Minor edits to content might be detectable using Bayesian methods at the cost of some false positives and negatives.
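A full Bayesian classifier is beyond a short example, so the sketch below swaps in a simpler technique: Jaccard similarity over the sets of words in two tweets. It is only meant to show how near-duplicates might be scored. The 0.8 threshold is an arbitrary assumption, and any real threshold would need tuning against the false positives and negatives mentioned above.

```python
import re

SIMILARITY_THRESHOLD = 0.8  # hypothetical cut-off, not a tuned value


def word_set(text):
    """Return the set of lowercase words (letters, digits, apostrophes) in a tweet."""
    return set(re.findall(r"[a-z0-9']+", text.lower()))


def jaccard_similarity(a, b):
    """Return |A & B| / |A | B| for the word sets of two tweets (0.0 to 1.0)."""
    words_a, words_b = word_set(a), word_set(b)
    if not words_a and not words_b:
        return 1.0
    return len(words_a & words_b) / len(words_a | words_b)


def looks_like_minor_edit(a, b):
    """Flag a pair of tweets as probable near-duplicates."""
    return jaccard_similarity(a, b) >= SIMILARITY_THRESHOLD
```

A pair flagged this way is a candidate for human review rather than proof of theft.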

No attempt should be made to automatically detect major edits.

Since there isn't an automated way to detect major edits or reliably detect minor ones, perhaps a user-based reporting or voting system could be implemented. There are many ways to implement this, ranging from entirely automated to having a number of volunteers from the community review the reports before declaring theft.

With enough agreement that two tweets are identical, theft could be declared. Too many thefts would result in the Favstar account being disabled.
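One way the fully automated end of that range might look is sketched below in Python. The class name, the method names, and both thresholds are hypothetical. Each reporter is counted once per pair of tweets, so a single user cannot get an account disabled alone.

```python
from collections import defaultdict

REPORTS_TO_DECLARE_THEFT = 5  # hypothetical: distinct reporters needed per tweet
THEFTS_TO_DISABLE = 3         # hypothetical: declared thefts before disabling


class TheftReports:
    """Tally user reports and decide when an account should be disabled."""

    def __init__(self):
        # (copy_id, original_id) -> set of users who reported that pair
        self._reporters = defaultdict(set)
        # accused account -> set of its tweet ids declared stolen
        self._thefts = defaultdict(set)

    def report(self, reporter, accused, copy_id, original_id):
        """Record one user's claim that copy_id was stolen from original_id."""
        pair = (copy_id, original_id)
        self._reporters[pair].add(reporter)
        if len(self._reporters[pair]) >= REPORTS_TO_DECLARE_THEFT:
            self._thefts[accused].add(copy_id)

    def is_disabled(self, account):
        """Return True once the account has too many declared thefts."""
        return len(self._thefts.get(account, ())) >= THEFTS_TO_DISABLE
```

A volunteer review step could be added by having `report` queue the pair for review instead of declaring theft directly.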

Whatever system is implemented, it should be recognized that some cases will be tough to judge. Many people tweet about common subjects, so similar but non-identical tweets may appear to be rewritten from or inspired by others when they were in fact conceived separately.


Because Favstar is running the game, it's Favstar's responsibility to deal with the cheaters. Favstar is currently the sole source of aggregated per-tweet favorite data, so it uniquely enables thieves to cherry-pick the best tweets. The thieves are both Favstar's own cheaters and Twitter spammers who are using it to identify the best content to steal.

[Update, 8/16] First, Favstar has a statement on plagiarism. It puts the onus on the user to deal with all plagiarism, first with the offender and then by filing a complaint with Twitter.

Second, Twitter is slowly rolling out the new @username and Activity tabs, which will make the favorite data readily available and thus remove Favstar's complete monopoly on the per-tweet data. Favstar will still be the only source of aggregate and historical data.


  1. I agree with the previous commentor on Haleys goldfish tweet. It applies here as well. You do so need to get a life! Tweets are used left right and centre by whomever, from whomever. Get off your high horse, sheesh!!

  2. Favstar is a combination of problems. Technically, it's totally unreliable. Ethically it's corrupt. Complain and you're blocked from viewing your tweets and from viewing those who have supported the tweets. The repitition is mind boggling. There is favoritism that doesn't give everyone the same opportunities. Read it, read it for several days. Check out the Leaderboard for those who receive 100+ stars on their tweets. Anything look familiar?