Tweets vs. Likes: What gets shared on Twitter vs. Facebook?

It always strikes me as curious that some posts get a lot of love on Twitter, while others get many more shares on Facebook:

Twitter Beats FB

FB Beats Twitter

What accounts for this difference? Some of it is surely site-dependent: maybe one blogger has a Facebook page but not a Twitter account, while another has these roles reversed. But even on sites maintained by a single author, tweet-to-likes ratios can vary widely from post to post.

So what kinds of articles tend to be more popular on Twitter, and which spread more easily on Facebook? To take a stab at an answer, I scraped data from a couple of websites over the weekend.

tl;dr Twitter is still for the techies: articles where tweets greatly outnumber FB likes tend to revolve around software companies and programming. Facebook, on the other hand, appeals to everyone else: the general public, and also technical folks outside of software.

FlowingData

The first site I looked at was Nathan Yau’s awesome FlowingData website on data visualization. To see which articles are more popular on Facebook and which are more popular on Twitter, let’s sort all the FlowingData articles by their # tweets / # likes ratio.
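As a rough illustration of that sort (a minimal sketch, not the code actually used for this post), here is one way to rank posts by their tweets-to-likes ratio in Python. The `Post` type, the `tweets_to_likes` helper, and all the counts below are made up for the example; treating a post with zero likes as infinitely Twitter-heavy is likewise just one possible convention.

```python
from typing import NamedTuple


class Post(NamedTuple):
    title: str
    tweets: int
    likes: int


def tweets_to_likes(post: Post) -> float:
    """Tweets per like; a post with zero likes sorts as infinitely Twitter-heavy."""
    return post.tweets / post.likes if post.likes else float("inf")


# Hypothetical counts purely for illustration; NOT the scraped FlowingData numbers.
posts = [
    Post("hypothetical comic", tweets=40, likes=400),
    Post("hypothetical API announcement", tweets=120, likes=6),
    Post("hypothetical infographic", tweets=90, likes=300),
]

ranked = sorted(posts, key=tweets_to_likes)
facebook_leaning = ranked[:10]  # lowest tweets-to-likes ratios
twitter_leaning = sorted(posts, key=tweets_to_likes, reverse=True)[:10]  # highest
```

The same ranking works for any site where each page has a tweet count and a like count.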

Here are the 10 posts with the lowest tweets-to-likes ratio (i.e., the posts that were especially popular with Facebook users):

FlowingData Facebook

And here are the 10 posts with the highest tweets-to-likes ratio (i.e., the posts especially popular with Twitter users):

FlowingData Twitter

Notice any differences between the two?

  • Instant gratification infographics, cuteness, comics, and pop culture get liked on Facebook.
  • APIs, datasets, visualizations related to techie sites (Delicious, foursquare, Twitter, LinkedIn), and picture-less articles get tweeted instead.

Interestingly, it also looks like the colors in the top 10 Facebook articles tend toward the red end of the spectrum, while the colors in the top 10 Twitter articles tend toward the blue end. Does this pattern hold if we look at more data? Here’s a meta-visualization of the FlowingData articles, ordered from articles popular on Facebook in the top left to articles popular on Twitter in the bottom right (see here for some interactivity and more details):

FlowingData MetaViz

It does indeed look like the images at the top (the articles popular on Facebook) are more pink, while the images at the bottom (the articles popular on Twitter) are more blue (though it would be nice to quantify this in some way)!

Furthermore, we can easily see from the grid that articles with no visualizations (represented by lorem ipsum text in the grid) cluster at the bottom. Grabbing some actual numbers, we find that 32% of articles with at least one picture have more shares on Facebook than on Twitter, compared to only 4% of articles with no picture at all.

Effect of a visualization
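The picture vs. no-picture comparison boils down to a "percentage of articles where likes beat tweets" computed separately for each group. Here is a minimal sketch of that calculation; the dictionary keys (`has_picture`, `tweets`, `likes`) and the sample counts are assumptions for illustration and not the real data.

```python
def pct_facebook_wins(articles):
    """Percentage of articles with strictly more likes than tweets."""
    if not articles:
        return 0.0
    wins = sum(1 for a in articles if a["likes"] > a["tweets"])
    return 100.0 * wins / len(articles)


# Hypothetical articles; the field names and counts are invented.
articles = [
    {"has_picture": True, "tweets": 50, "likes": 80},
    {"has_picture": True, "tweets": 70, "likes": 20},
    {"has_picture": False, "tweets": 30, "likes": 2},
    {"has_picture": False, "tweets": 15, "likes": 1},
]

with_pic = [a for a in articles if a["has_picture"]]
without_pic = [a for a in articles if not a["has_picture"]]

pct_with = pct_facebook_wins(with_pic)
pct_without = pct_facebook_wins(without_pic)
```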

Finally, let’s break down the percentage of articles with more Facebook shares by category.

FlowingData Categories

(I filtered the categories so that each category in the plot above contains at least 5 articles.)
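For the curious, a per-category breakdown with that minimum-size filter could be sketched as follows. This assumes each article carries a list of category names (an article may sit in several categories) and uses invented counts; the function and field names are hypothetical.

```python
from collections import defaultdict

MIN_ARTICLES = 5  # the threshold mentioned above


def facebook_win_rate_by_category(articles, min_articles=MIN_ARTICLES):
    """Map each sufficiently large category to the % of its articles with likes > tweets."""
    by_category = defaultdict(list)
    for article in articles:
        for category in article["categories"]:
            by_category[category].append(article["likes"] > article["tweets"])
    return {
        category: 100.0 * sum(flags) / len(flags)
        for category, flags in by_category.items()
        if len(flags) >= min_articles  # drop categories with too few articles
    }


# Hypothetical data: 5 Software posts, 5 Infographics posts, 1 Maps post.
articles = (
    [{"categories": ["Software"], "tweets": 40, "likes": 3}] * 5
    + [{"categories": ["Infographics"], "tweets": 20, "likes": 90}] * 4
    + [{"categories": ["Infographics", "Maps"], "tweets": 60, "likes": 10}]
)

rates = facebook_win_rate_by_category(articles)
```

With this toy input, the single-article "Maps" category is filtered out, which is the whole point of the threshold: percentages over one or two articles are mostly noise.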

What do we find?

  • Every article in the Software, Online Applications, News, and Data sources categories (yawn) got more shares on Twitter than on Facebook.
  • Articles tagged with Data Underload (which seems to contain short and sweet visualizations of everyday things), Miscellaneous (which contains lots of comics or comic-like visualizations), and Infographics get the most shares on Facebook.
  • This category breakdown matches precisely what we saw in the top 10 examples above.

New Scientist

When looking at FlowingData, we saw that Twitter users are much bigger on sharing technical articles. But is this true for technical articles in general, or only for programming-related posts? (In my experience with Twitter, I haven’t seen many people from math and the non-computer sciences.)

To answer, I took articles from the Physics & Math and Technology sections of New Scientist, and

  • Calculated the percentage of shares each article received on Twitter (i.e., # tweets / (# tweets + # likes)).
  • Grouped articles by their number of tweets rounded to the nearest multiple of 25 (bin #1 contains articles close to 25 tweets, bin #2 contains articles close to 50 tweets, etc.).
  • Calculated the median percentage of shares on Twitter for each bin.
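The three steps above can be sketched in a few lines of Python. This is only an illustration under assumptions: articles are represented as hypothetical `(tweets, likes)` pairs, the sample counts are made up, and articles with no shares at all are skipped (how such articles should be handled is a guess on my part).

```python
from collections import defaultdict
from statistics import median

BIN_WIDTH = 25


def twitter_share(tweets, likes):
    """Fraction of an article's shares that came from Twitter."""
    return tweets / (tweets + likes)


def tweet_bin(tweets, width=BIN_WIDTH):
    """Index of the nearest multiple of `width` (1 -> ~25 tweets, 2 -> ~50, ...)."""
    return (tweets + width // 2) // width


def median_twitter_share_by_bin(articles):
    """articles: iterable of (tweets, likes) pairs for one section."""
    bins = defaultdict(list)
    for tweets, likes in articles:
        if tweets + likes == 0:
            continue  # no shares at all, so the Twitter share is undefined
        bins[tweet_bin(tweets)].append(twitter_share(tweets, likes))
    return {b: median(shares) for b, shares in sorted(bins.items())}


# Hypothetical (tweets, likes) pairs for a single section.
section = [(20, 20), (30, 10), (25, 75), (52, 48)]
result = median_twitter_share_by_bin(section)
```

Running this once per section (Technology, Physics & Math) gives one median-per-bin series for each, which is what gets plotted against the bin number.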

Here’s a graph of the result:

Technology vs. Physics & Math

Notice that:

  • The technology articles get consistently more shares from Twitter than the physics and math articles do.
  • Twitter accounts for the majority of the technology shares.
  • Facebook accounts for the majority of the physics and math shares.

So this suggests that Twitter really is for computer technology in particular, not technical matters in general (though it would be nice to look at areas other than physics and math as well).

Quora

To get some additional evidence on the computer science vs. math/physics divide, I

  • Scraped about 350 profiles of followers from each of the Computer Science, Software Engineering, Mathematics, and Physics categories on Quora;
  • Checked each user to see whether they link to their Facebook and Twitter accounts on their profile.

Here’s the ratio of the number of people linking to their Facebook account to the number of people linking to their Twitter account, sliced by topic:

Math/Physics vs. CS/Software

Math/Physics vs. CS/Software, Collapsed

We find exactly what we expect from the New Scientist data: people following the math and physics categories have noticeably higher Facebook-to-Twitter ratios than people following the computer science and software engineering categories (i.e., compared to computer scientists and software engineers, mathematicians and physicists are more likely to be on Facebook than on Twitter). What’s more, this difference is statistically significant: the graphs display individual 90% confidence intervals, which either don’t overlap at all or overlap only slightly, and the differences between categories are significant at the 95% level.

This corroborates the New Scientist evidence that Twitter gets the computer technology shares, while Facebook gets the math and physics shares.
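I haven't described how the confidence intervals above were computed, so take the following only as one plausible way to put an interval around a Facebook-to-Twitter ratio: a percentile bootstrap over profiles. Everything here is an assumption for illustration, including the `(links_facebook, links_twitter)` representation, the function names, the number of resamples, and the made-up profile counts.

```python
import random


def fb_to_twitter_ratio(profiles):
    """profiles: list of (links_facebook, links_twitter) booleans, one per user."""
    fb = sum(1 for has_fb, _ in profiles if has_fb)
    tw = sum(1 for _, has_tw in profiles if has_tw)
    return fb / tw if tw else float("inf")


def bootstrap_ci(profiles, level=0.90, n_resamples=2000, seed=0):
    """Percentile-bootstrap confidence interval for the Facebook-to-Twitter ratio."""
    rng = random.Random(seed)  # seeded so the sketch is reproducible
    n = len(profiles)
    stats = sorted(
        fb_to_twitter_ratio(rng.choices(profiles, k=n))  # resample with replacement
        for _ in range(n_resamples)
    )
    alpha = (1.0 - level) / 2.0
    lower = stats[int(alpha * n_resamples)]
    upper = stats[int((1.0 - alpha) * n_resamples) - 1]
    return lower, upper


# Hypothetical topic with 350 followers; the counts are invented.
profiles = (
    [(True, True)] * 100
    + [(True, False)] * 150
    + [(False, True)] * 50
    + [(False, False)] * 50
)

point_estimate = fb_to_twitter_ratio(profiles)
lower, upper = bootstrap_ci(profiles)
```

Computing one such interval per topic and checking how much the intervals overlap gives a rough visual comparison; a formal test on the difference between topics would be a separate step.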

XKCD

Finally, let’s take a look at which XKCD comics are especially popular on Facebook vs. Twitter.

Here are the 10 comics with the highest likes-to-tweets ratio (i.e., the comics especially popular on Facebook):

XKCD Facebook

Here are the 10 comics with the highest tweets-to-likes ratio (i.e., the comics especially popular on Twitter):

XKCD Twitter

Note that the XKCD comics popular on Facebook have more of a layman flavor, while the XKCD comics popular on Twitter are much more programming-related:

  • Of the XKCD comics popular on Twitter, one’s about server attention spans, another’s about IPv6 addresses, a third is about GNU info pages, another deals with cloud computing, a fifth talks about Java, and the last is about a bunch of techie sites. (This is just like what we saw with the FlowingData visualizations.)
  • Facebook, on the other hand, gets Ke$ha and Magic School Bus.
  • And while both top 10’s contain a flowchart, the one popular on FB is about cooking, while the one popular on Twitter is about code!
  • What’s more, if we look at the few technical-ish comics that are more popular on Facebook (the complex conjugate, mu, and Los Alamos comics), we see that they’re about physics and math, not programming (which matches our findings from the New Scientist articles).

Lesson

So why should you care? Here’s one takeaway:

  • If you’re blogging about technology, programming, and computer science, Twitter is your friend.
  • But if you’re blogging about anything else, be it math/physics or pop culture, don’t rely on a Twitter account alone; your shares are more likely to propagate on Facebook, so make sure to have a Facebook page as well.

What’s Next?

The three websites I looked at are all fairly tech-oriented, so it would be nice to gather data from other kinds of websites as well.

And now that we have an idea how Twitter and Facebook compare, the next burning question is surely: what do people share on Google+?!

Addendum

Let’s consider the following thought experiment. Suppose you come across the most unpopular article ever written. What will its FB vs. Twitter shares look like? Although no real person will ever share this article, I think Twitter has many more spambots (which tweet out any and every link) than FB does, so maybe unpopular articles will have more tweets than likes by default. Conversely, suppose you come across the most popular article ever written, which everybody wants to share. Then since FB has many more users than Twitter does, maybe popular articles will tend to have more likes than tweets anyway.

Thus, in order to find out which types of articles are especially popular on FB vs. Twitter, instead of looking at tweets-to-likes ratios directly, we could try to remove this baseline popularity effect. (Taking ratios instead of raw number of tweets or raw number of likes is one kind of normalization; this is another.)

So does this scenario (or something similar to it) actually play out in practice?

Overall Popularity vs. Facebook

Here I’ve plotted the overall popularity of a post (the total number of shares it received on either Twitter or FB) against the percentage of shares on Facebook alone, and we can see that as a post’s popularity grows, more and more shares do indeed tend to come from Facebook rather than Twitter.

Also, see the posts at the lower end of the popularity scale that are only getting shares on Twitter? Let’s take a look at the five most unpopular of these:

Notice that they’re all shoutouts to FlowingData’s sponsors! There’s pretty much no reason any real person would share these on Twitter or Facebook, and indeed, checking Twitter to see who actually tweeted out these links, we see that the tweeters are bots:

Now let’s switch to a slightly different view of the above scenario, where I plot number of tweets against number of likes:

FlowingData Tweets vs. Likes

We see that as popularity on Twitter increases, so too does popularity on Facebook, but at a slightly faster rate. (The form of the blue line plotted is roughly $\log(\text{likes}) = -3.87 + 1.70 \log(\text{tweets})$.)

So instead of looking at the ratios above, to figure out which articles are popular on FB vs. Twitter, we could look at the residuals of the above plot. Posts with large positive residuals would be posts that are especially popular on FB, and posts with negative residuals would be posts that are especially popular on Twitter.
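To make the residual idea concrete, here is a minimal sketch that fits an ordinary least-squares line to log(likes) vs. log(tweets) and returns each post's residual. It assumes natural logs and requires both counts to be positive (zero counts would need separate handling); the function names and the sample `(tweets, likes)` pairs are hypothetical.

```python
import math


def fit_line(xs, ys):
    """Ordinary least squares for y = intercept + slope * x."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return mean_y - slope * mean_x, slope


def log_log_residuals(posts):
    """posts: list of (tweets, likes) pairs, with both counts > 0."""
    xs = [math.log(tweets) for tweets, _ in posts]
    ys = [math.log(likes) for _, likes in posts]
    intercept, slope = fit_line(xs, ys)
    # Positive residual: more likes than the fitted trend predicts for that tweet count.
    return [y - (intercept + slope * x) for x, y in zip(xs, ys)]


# Hypothetical (tweets, likes) pairs, not the FlowingData numbers.
posts = [(10, 5), (100, 80), (1000, 900), (50, 200)]
residuals = log_log_residuals(posts)
most_facebook_leaning = posts[residuals.index(max(residuals))]
most_twitter_leaning = posts[residuals.index(min(residuals))]
```

Sorting posts by residual instead of by raw ratio then gives the popularity-adjusted ranking.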

In practice, however, there wasn’t much difference between looking at residuals vs. ratios directly when using the datasets I had, so to keep things simple in the main discussion above, I stuck to ratios alone. Still, it’s another option which might be useful when looking at different questions or different sources of data, so just for completeness, here’s what the FlowingData results look like if we use residuals instead.

The 10 articles with the highest residuals (i.e., the articles most popular on Facebook):

The 10 articles with the lowest residuals (i.e., the articles most popular on Twitter):

Here’s a density plot of article residuals, split by whether the article has a visualization or not (residuals of picture-free articles are clearly shifted towards the negative end):

Residuals

Here are the mean residuals per category (again, we see that the miscellaneous, data underload, data art, and infographics categories tend to be more popular on Facebook, while the data sources, software, online applications, and news categories tend to be more popular on Twitter):

Category Residuals

And that’s it! In the spirit of these findings, I hope this article gets liked a little and tweeted lots and lots.

Edwin Chen
