Stuff Harvard People Like

What types of students go to which schools? There are, of course, the classic stereotypes:

  • MIT has the hacker engineers.
  • Stanford has the laid-back, social folks.
  • Harvard has the prestigious leaders of the world.
  • Berkeley has the activist hippies.
  • Caltech has the hardcore science nerds.

But how well do these perceptions match reality? What are students at Stanford, Harvard, MIT, Caltech, and Berkeley really interested in? Following the path of my previous data-driven post on differences between Silicon Valley and NYC, I scraped the Quora profiles of a couple hundred followers of each school to find out.

Topics

So let’s look at what kinds of topics followers of each school are interested in*. (Skip past the lists for a discussion.)

MIT

Topics are followed by p(school = MIT|topic).

  • MIT Media Lab 0.893
  • Ksplice 0.69
  • Lisp (programming language) 0.677
  • Nokia 0.659
  • Public Speaking 0.65
  • Data Storage 0.65
  • Google Voice 0.609
  • Hacking 0.602
  • Startups in Europe 0.597
  • Startup Names 0.572
  • Mechanical Engineering 0.563
  • Engineering 0.563
  • Distributed Databases 0.544
  • StackOverflow 0.536
  • Boston 0.513
  • Learning 0.507
  • Open Source 0.498
  • Cambridge 0.496
  • Public Relations 0.493
  • Visualization 0.492
  • Semantic Web 0.486
  • Andreessen-Horowitz 0.483
  • Nature 0.475
  • Cryptography 0.474
  • Startups in Boston 0.452
  • Adobe Photoshop 0.451
  • Computer Security 0.447
  • Sachin Tendulkar 0.443
  • Hacker News 0.442
  • Games 0.429
  • Android Applications 0.428
  • Best Engineers and Programmers 0.427
  • College Admissions & Getting Into College 0.422
  • Co-Founders 0.419
  • Big Data 0.41
  • System Administration 0.4
  • Biotechnology 0.398
  • Higher Education 0.394
  • NoSQL 0.387
  • User Experience 0.386
  • Career Advice 0.377
  • Artificial Intelligence 0.375
  • Scalability 0.37
  • Taylor Swift 0.368
  • Google Search 0.368
  • Functional Programming 0.365
  • Bing 0.363
  • Bioinformatics 0.361
  • How I Met Your Mother (TV series) 0.361
  • Operating Systems 0.356
  • Compilers 0.355
  • Google Chrome 0.354
  • Management & Organizational Leadership 0.35
  • Literary Fiction 0.35
  • Intelligence 0.348
  • Fight Club (1999 movie) 0.344
  • Hip Hop Music 0.34
  • UX Design 0.337
  • Web Application Frameworks 0.336
  • Startups in New York City 0.333
  • Book Recommendations 0.33
  • Engineering Recruiting 0.33
  • Search Engines 0.329
  • Social Search 0.329
  • Data Science 0.328
  • History 0.328
  • Interaction Design 0.326
  • Classification (machine learning) 0.322
  • Startup Incubators and Seed Programs 0.321
  • Graphic Design 0.321
  • Product Design (software) 0.319
  • The College Experience 0.319
  • Writing 0.319
  • MapReduce 0.318
  • Database Systems 0.315
  • User Interfaces 0.314
  • Literature 0.314
  • C (programming language) 0.314
  • Television 0.314
  • Reading 0.313
  • Usability 0.312
  • Books 0.312
  • Computers 0.311
  • Stealth Startups 0.311
  • Daft Punk 0.31
  • Healthy Eating 0.309
  • Innovation 0.309
  • Skiing 0.305
  • JavaScript 0.304
  • Rock Music 0.304
  • Mozilla Firefox 0.304
  • Self-Improvement 0.303
  • McKinsey & Company 0.302
  • AngelList 0.301
  • Data Visualization 0.301
  • Cassandra (database) 0.301

Stanford

Topics are followed by p(school = Stanford|topic).

  • Stanford Computer Science 0.951
  • Stanford Graduate School of Business 0.939
  • Stanford 0.896
  • Stanford Football 0.896
  • Stanford Cardinal 0.896
  • Social Dance 0.847
  • Stanford University Courses 0.847
  • Romance 0.769
  • Instagram 0.745
  • College Football 0.665
  • Mobile Location Applications 0.634
  • Online Communities 0.621
  • Interpersonal Relationships 0.585
  • Food & Restaurants in Palo Alto 0.572
  • Your 20s 0.566
  • Men’s Fashion 0.548
  • Flipboard 0.537
  • Inception (2010 movie) 0.535
  • Tumblr 0.531
  • People Skills 0.522
  • Exercise 0.52
  • Joel Spolsky 0.516
  • Valuations 0.515
  • The Social Network (2010 movie) 0.513
  • LeBron James 0.506
  • Northern California 0.506
  • Evernote 0.5
  • Quora Community 0.5
  • Blogging 0.49
  • Downtown Palo Alto 0.487
  • The College Experience 0.485
  • Consumer Internet 0.477
  • Restaurants in San Francisco 0.477
  • Chad Hurley 0.47
  • Meditation 0.468
  • Yishan Wong 0.466
  • Arrested Development (TV series) 0.463
  • fbFund 0.457
  • Best Engineers at X Company 0.451
  • Language 0.45
  • Words 0.448
  • Happiness 0.447
  • Path (company) 0.446
  • Color Labs (startup) 0.446
  • Palo Alto 0.445
  • Woot.com 0.442
  • Beer 0.442
  • PayPal 0.441
  • Women in Startups 0.438
  • Techmeme 0.433
  • Women in Engineering 0.428
  • The Mission (San Francisco neighborhood) 0.427
  • iPhone Applications 0.416
  • Asana 0.413
  • Monetization 0.412
  • Repetitive Strain Injury (RSI) 0.4
  • IDEO 0.398
  • Spotify 0.397
  • San Francisco Giants 0.396
  • Fortune Magazine 0.389
  • Love 0.387
  • Human-Computer Interaction 0.382
  • Hip Hop Music 0.378
  • Self-Improvement 0.378
  • Food in San Francisco 0.375
  • Quora (company) 0.374
  • Quora Infrastructure 0.373
  • iPhone 0.371
  • Square (company) 0.369
  • Social Psychology 0.369
  • Network Effects 0.366
  • Chris Sacca 0.365
  • Walt Mossberg 0.364
  • Salesforce.com 0.362
  • Sex 0.361
  • Etiquette 0.361
  • David Pogue 0.361
  • Gowalla 0.36
  • iOS Development 0.354
  • Palantir Technologies 0.353
  • Mobile Computing 0.347
  • Sports 0.346
  • Video Games 0.345
  • Burning Man 0.345
  • Engineering Management 0.343
  • Cognitive Science 0.342
  • Dating & Relationships 0.341
  • Fred Wilson (venture investor) 0.337
  • Taiwan 0.333
  • Natural Language Processing 0.33
  • Eric Schmidt 0.329
  • Social Advice 0.329
  • Engineering Recruiting 0.328
  • Job Interviews 0.325
  • Mobile Phones 0.324
  • Twitter Inc. (company) 0.321
  • Engineering in Silicon Valley 0.321
  • San Francisco Bay Area 0.321
  • Google Analytics 0.32
  • Fashion 0.315
  • Interaction Design 0.314
  • Open Graph 0.313
  • Drugs & Pharmaceuticals 0.312
  • Electronic Music 0.312
  • Facebook Inc. (company) 0.309
  • Fitness 0.309
  • YouTube 0.308
  • TED Talks 0.308
  • Freakonomics (2005 Book) 0.307
  • Jack Dorsey 0.306
  • Nutrition 0.305
  • Puzzles 0.305
  • Silicon Valley Mergers & Acquisitions 0.304
  • Viral Growth & Analytics 0.304
  • Amazon Web Services 0.304
  • StumbleUpon 0.303
  • Exceptional Comment Threads 0.303

Harvard

  • Harvard Business School 0.968
  • Harvard Business Review 0.922
  • Harvard Square 0.912
  • Harvard Law School 0.912
  • Jimmy Fallon 0.899
  • Boston Red Sox 0.658
  • Klout 0.644
  • Oprah Winfrey 0.596
  • Ivanka Trump 0.587
  • Dalai Lama 0.569
  • Food in New York City 0.565
  • U2 0.562
  • TwitPic 0.534
  • 37signals 0.522
  • David Lynch (director) 0.512
  • Al Gore 0.508
  • TechStars 0.49
  • Baseball 0.487
  • Private Equity 0.471
  • Classical Music 0.46
  • Startups in New York City 0.458
  • HootSuite 0.449
  • Kiva 0.442
  • Ultimate Frisbee 0.441
  • Huffington Post 0.436
  • New York City 0.433
  • Charlie Cheever 0.433
  • The New York Times 0.431
  • Technology Journalism 0.431
  • McKinsey & Company 0.427
  • TweetDeck 0.422
  • How Does X Work? 0.417
  • Ashton Kutcher 0.414
  • Coldplay 0.402
  • Conan O’Brien 0.397
  • Fast Company 0.397
  • WikiLeaks 0.394
  • Michael Jackson 0.389
  • Guy Kawasaki 0.389
  • Journalism 0.384
  • Wall Street Journal 0.384
  • Cambridge 0.371
  • Seattle 0.37
  • Cities & Metro Areas 0.357
  • Boston 0.353
  • Tim Ferriss (author) 0.35
  • The New Yorker 0.343
  • Law 0.34
  • Mashable 0.338
  • Politics 0.335
  • The Economist 0.334
  • Barack Obama 0.333
  • Skiing 0.329
  • McKinsey Quarterly 0.325
  • Wired (magazine) 0.316
  • Bill Gates 0.31
  • Mad Men (TV series) 0.308
  • India 0.306
  • TED Talks 0.306
  • Netflix 0.304
  • Wine 0.303
  • Angel Investors 0.302
  • Facebook Ads 0.301

UC Berkeley

  • Berkeley 0.978
  • California Golden Bears 0.91
  • Internships 0.717
  • Web Marketing 0.484
  • Google Social Strategy 0.453
  • Southwest Airlines 0.451
  • WordPress 0.429
  • Stock Market 0.429
  • BMW (automobile) 0.428
  • Web Applications 0.423
  • Flickr 0.422
  • Snowboarding 0.42
  • Electronic Music 0.404
  • MySQL 0.401
  • Internet Advertising 0.399
  • Search Engine Optimization (SEO) 0.398
  • Yelp 0.396
  • Groupon 0.393
  • In-N-Out Burger 0.391
  • The Matrix (1999 movie) 0.389
  • Trading (finance) 0.385
  • jQuery 0.381
  • Hedge Funds 0.378
  • Social Media Marketing 0.377
  • San Francisco 0.376
  • Stealth Startups 0.362
  • Yahoo! 0.36
  • Cascading Style Sheets 0.359
  • Angel Investors 0.355
  • UX Design 0.35
  • StarCraft 0.348
  • Los Angeles Lakers 0.347
  • Mountain View 0.345
  • How I Met Your Mother (TV series) 0.338
  • Google+ 0.337
  • Ruby on Rails 0.333
  • Reading 0.333
  • Social Media 0.326
  • China 0.322
  • Palantir Technologies 0.319
  • Facebook Platform 0.315
  • Basketball 0.315
  • Education 0.314
  • Business Development 0.312
  • Online & Mobile Payments 0.305
  • Restaurants in San Francisco 0.302
  • Technology Companies 0.302
  • Seth Godin 0.3

Caltech

  • Pasadena 0.969
  • Chess 0.748
  • Table Tennis 0.671
  • UCLA 0.67
  • MacBook Pro 0.618
  • Physics 0.618
  • Haskell 0.582
  • Los Angeles 0.58
  • Electrical Engineering 0.567
  • Star Trek (movie) 0.561
  • Disruptive Technology 0.545
  • Science 0.53
  • Biology 0.526
  • Quantum Mechanics 0.521
  • LaTeX 0.514
  • Mathematics 0.488
  • xkcd 0.488
  • Genetics & Heredity 0.487
  • Chemistry 0.47
  • Medicine & Healthcare 0.448
  • Poker 0.445
  • C++ (programming language) 0.442
  • Data Structures 0.434
  • Emacs 0.428
  • MongoDB 0.423
  • Neuroscience 0.404
  • Science Fiction 0.4
  • Mac OS X 0.394
  • Board Games 0.387
  • Computers 0.386
  • Research 0.385
  • Finance 0.385
  • The Future 0.379
  • Linux 0.378
  • The Colbert Report 0.376
  • The Beatles 0.374
  • The Onion 0.365
  • Ruby 0.363
  • Cars & Automobiles 0.361
  • Quantitative Finance 0.359
  • Academia 0.359
  • Law 0.355
  • Cooking 0.354
  • Psychology 0.349
  • Eminem 0.347
  • Football (Soccer) 0.346
  • Computer Programming 0.343
  • Algorithms 0.343
  • Evolutionary Biology 0.337
  • Behavioral Economics 0.335
  • California 0.329
  • Machine Learning 0.326
  • Futurama 0.324
  • Social Advice 0.324
  • StarCraft II 0.319
  • Job Interview Questions 0.318
  • Game Theory 0.316
  • This American Life 0.315
  • Economics 0.314
  • Vim 0.31
  • Graduate School 0.309
  • Git (revision control) 0.306
  • Computer Science 0.303

What do we see?

  • First, in a nice validation of this approach, we find that each school is interested in exactly the locations we’d expect: Caltech is interested in Pasadena and Los Angeles; MIT and Harvard are both interested in Boston and Cambridge (Harvard is interested in New York City as well); Stanford is interested in Palo Alto, Northern California, and San Francisco Bay Area; and Berkeley is interested in Berkeley, San Francisco, and Mountain View.
  • More interestingly, let’s look at where each school likes to eat. Stereotypically, we expect Harvard, Stanford, and Berkeley students to be more outgoing and social, and MIT and Caltech students to be more introverted. This is indeed what we find:
    • Harvard follows Food in New York City; Stanford follows Food & Restaurants in Palo Alto, Restaurants in San Francisco, and Food in San Francisco; and Berkeley follows Restaurants in San Francisco and In-N-Out Burger. In other words, Harvard, Stanford, and Berkeley love eating out.
    • Caltech, on the other hand, loves Cooking, and MIT loves Healthy Eating – both signs, perhaps, of a preference for eating in.
  • And what does each school use to quench its thirst? Harvard students like to drink wine (classy!), while Stanford students prefer beer (the social drink of choice).
  • What about sports teams? MIT and Caltech couldn’t care less, though Harvard follows the Boston Red Sox, Stanford follows the San Francisco Giants (as well as their own Stanford Football and Stanford Cardinal), and Berkeley follows the Los Angeles Lakers (and the California Golden Bears).
  • For sports themselves, MIT students like skiing; Stanford students like general exercise, fitness, and sports; Harvard students like baseball, ultimate frisbee, and skiing; and Berkeley students like snowboarding. Caltech, in a league of its own, enjoys table tennis and chess.
  • What does each school think of social? Caltech students look for Social Advice. Berkeley students are interested in Social Media and Social Media Marketing. MIT, on the more technical side, wants Social Search. Stanford students, predictably, love the whole spectrum of social offerings, from Social Dance and The Social Network, to Social Psychology and Social Advice. (Interestingly, Caltech and Stanford are both interested in Social Advice, though I wonder if it’s for slightly different reasons.)
  • What’s each school’s relationship with computers? Caltech students are interested in Computer Science, MIT hackers are interested in Computer Security, and Stanford students are interested in Human-Computer Interaction.
  • Digging into the MIT vs. Caltech divide a little, we see that Caltech students really are more interested in the pure sciences (Physics, Science, Biology, Quantum Mechanics, Mathematics, Chemistry, etc.), while MIT students are more on the applied and engineering sides (Mechanical Engineering, Engineering, Distributed Databases, Cryptography, Computer Security, Biotechnology, Operating Systems, Compilers, etc.).
  • Regarding programming languages, Caltech students love Haskell (hardcore purity!), while MIT students love Lisp.
  • What does each school like to read, both offline and online? Caltech loves science fiction, xkcd, and The Onion; MIT likes Hacker News; Harvard loves journals, newspapers, and magazines (Huffington Post, the New York Times, the Wall Street Journal, the New Yorker, the Economist, and so on); and Stanford likes Techmeme.
  • What movies and television shows does each school like to watch? Caltech likes Star Trek, The Colbert Report, and Futurama. MIT likes Fight Club (I don’t know what this has to do with MIT, though I will note that on my first day as a freshman in a new dorm, Fight Club was precisely the movie we all went to a lecture hall to see). Stanford likes The Social Network and Inception. Harvard, rather fittingly, likes Mad Men and TED Talks.
  • Let’s look at the startups each school follows. MIT, of course, likes Ksplice. Berkeley likes Yelp and Groupon. Stanford likes just about every startup under the sun (Instagram, Flipboard, Tumblr, Path, Color Labs, etc.). And Harvard, that bastion of hard-won influence and prestige? To the surprise of precisely no one, Harvard enjoys Klout.

Let’s end with a summarized view of each school:

  • Caltech is very much into the sciences (Physics, Biology, Quantum Mechanics, Mathematics, etc.), as well as many pretty nerdy topics (Star Trek, Science Fiction, xkcd, Futurama, StarCraft II, etc.).
  • MIT is dominated by everything engineering and tech.
  • Stanford loves relationships (interpersonal relationships, people skills, love, network effects, sex, etiquette, dating and relationships, romance), health and appearance (fashion, fitness, nutrition, happiness), and startups (Instagram, Flipboard, Path, Color Labs, etc.).
  • Berkeley, sadly, is perhaps too large and diverse for an overall characterization.
  • Harvard students are fascinated by famous figures (Jimmy Fallon, Oprah Winfrey, Ivanka Trump, Dalai Lama, David Lynch, Al Gore, Bill Gates, Barack Obama), and by prestigious newspapers, journals, and magazines (the New York Times, the Wall Street Journal, the Economist, and so on). Other very fitting interests include Kiva, classical music, and Coldplay.

*I pulled about 400 followers from each school and added a couple of filters to try to ensure that followers were actual attendees of the schools rather than people who were simply interested in them. Topics are sorted using a naive Bayes score and filtered to have at least 5 counts. Also, a word of warning: my dataset was fairly small, and users on Quora are almost certainly not representative of their schools as a whole (though I tried to be rigorous with what I had).
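For readers curious how a score like p(school|topic) could be computed, here is a minimal Python sketch. It is not the code used for this post. The function name `school_topic_scores`, the input layout (a dict from school to a list of per-user topic sets), the uniform prior over schools, and the reading of "at least 5 counts" as "at least 5 followers within that school" are all assumptions made for illustration.

```python
from collections import Counter


def school_topic_scores(followers, min_count=5):
    """Rank topics per school by an estimate of p(school | topic).

    followers: dict mapping school -> list of sets of topics (one set per user).
    Returns: dict mapping school -> list of (topic, score), highest score first.
    """
    # How many sampled followers of each school follow each topic.
    counts = {
        school: Counter(topic for topics in users for topic in topics)
        for school, users in followers.items()
    }
    # p(topic | school): fraction of the school's sampled followers on the topic.
    rates = {
        school: {t: c / len(followers[school]) for t, c in counts[school].items()}
        for school in followers
    }

    ranked = {}
    for school in followers:
        scored = []
        for topic, count in counts[school].items():
            # Assumed filter: drop topics with too few followers in this school.
            if count < min_count:
                continue
            # Bayes' rule with an assumed uniform prior over schools:
            # p(school | topic) = p(topic | school) / sum_s p(topic | s)
            total = sum(rates[s].get(topic, 0.0) for s in followers)
            scored.append((topic, rates[school][topic] / total))
        ranked[school] = sorted(scored, key=lambda pair: (-pair[1], pair[0]))
    return ranked


# Toy data (made up): 10 users per school.
toy = {
    "MIT": [{"Lisp", "Books"}] * 6 + [{"Books"}] * 4,
    "Caltech": [{"Physics", "Books"}] * 5 + [{"Lisp"}] * 2 + [{"Books"}] * 3,
}
ranked = school_topic_scores(toy)
for school, topics in ranked.items():
    print(school, [(t, round(p, 3)) for t, p in topics])
```

In the toy data, Lisp is followed by 6 of 10 MIT users and 2 of 10 Caltech users, so it scores high for MIT, while for Caltech it falls below the minimum count and is dropped. A topic like Books, followed by nearly everyone, ends up near 0.5 for both schools, which is why generic topics sink toward the bottom of each list.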

Edwin Chen
