How Could Facebook Align its ML Systems to Human Values? A Data-Driven Approach

This is a crosspost from the official Surge AI blog.

For background on this blog post...

I used to work at Facebook, YouTube, and Twitter. One of the big problems I worked on: what was the right objective function to align our AI systems towards?

Optimizing for watch time at YouTube, for example, led to longer videos for the sake of longer videos, and clicks on videos with racy thumbnails.

Optimizing for engagement at Facebook led to low-quality clickbait, and to Hooters appearing as the top search result when you searched for restaurants in Houston.

Optimizing for favorites and replies at Twitter led to toxic content at the top of the feed.

So while watch time, engagement, and replies would always go up – were these really the products we wanted to build? What happened to Facebook's original mission of connecting users with their friends and family? What did "favorites" have to do with being the platform for public conversation at Twitter? A ton of engineering and data science time is spent measuring active users, but why were there no dashboards measuring progress toward broader product goals?

So could we figure out a metric that was better tuned to human values and preferences – but also fast, rigorous, and easily measurable? After all, we still need our A/B tests, ML objectives, and OKRs to align the company around.

This question is particularly important today, with all the troubles that social media platforms face, so I wrote up an approach I've often worked on, based on Facebook data. Read it on the Surge AI blog!

Edwin Chen

Founder at Surge AI: a unified data labeling platform and workforce, designed for the richness of AI.

Need high-quality, human-powered data? Reach out! We help top AI companies and research labs around the world create high-skill, human-labeled datasets.

Former AI, data science, and engineering lead at Google, Facebook, Twitter, Dropbox, and MSR. Pure math, theoretical CS, and linguistics at MIT.


