Review: The Secret Life of Pronouns: What Our Words Say About Us

The Secret Life of Pronouns: What Our Words Say About Us by James W. Pennebaker

More than a grammatical look at pronouns, this is really a psychology book about how we put our words together and what that can tell us. It covers the whole class of “function words” (pronouns included) that make up a substantial part of our speech. Function words establish the structure that we fill in with content words (nouns, verbs, and other words with primarily semantic content), so Pennebaker looks at patterns in function-word frequencies and finds strong correlations with interesting real-world classifications: personality types, rhetoric, political speech, gender differences, even income and education gaps. Elucidating these correlations forms the majority of the book.

Using function word analysis and modern Natural Language Processing techniques, Pennebaker shows how you can make predictions about the author of an anonymous text and perform simple culturomics (e.g. gauging national “mood” after the 9/11 disaster) by surveying the text of blog posts across the internet, all without recourse to more complex semantic information.
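To give a feel for what counting function words looks like in practice, here is a minimal Python sketch. It is not Pennebaker's actual method or tooling: the tiny `FUNCTION_WORDS` list, the `function_word_rates` helper, and the sample sentence are all assumptions made up for illustration, and a real analysis would use a far larger dictionary and far more text.

```python
import re
from collections import Counter

# Hypothetical, deliberately tiny function-word list (an assumption for
# illustration; real analyses use much larger dictionaries).
FUNCTION_WORDS = {
    "i", "me", "my", "we", "us", "our", "you", "he", "she", "it", "they",
    "the", "a", "an", "of", "in", "on", "to", "and", "but", "or", "not",
}


def function_word_rates(text):
    """Return each function word's share of all tokens in `text`."""
    # Lowercase the text and split it into simple alphabetic tokens.
    tokens = re.findall(r"[a-z']+", text.lower())
    if not tokens:
        return {}
    # Count only the tokens that appear in the function-word list.
    counts = Counter(t for t in tokens if t in FUNCTION_WORDS)
    return {word: n / len(tokens) for word, n in counts.items()}


# Made-up sample sentence, not taken from the book.
sample = "I think we should go, but I am not sure it is the right time."
rates = function_word_rates(sample)

# Print the function words from most to least frequent.
for word, rate in sorted(rates.items(), key=lambda kv: -kv[1]):
    print(f"{word:>4}: {rate:.2%}")
```

Comparing rate tables like this one across many authors or many blog posts is the kind of raw material the correlations in the book are built on; the sketch only does the counting, not the statistics.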

A recommended read for anyone interested in psychology and language, and also for those curious to see what modern technology applied to language analysis can tell us about ourselves.


Vanishing Words Tell Illuminating Tales

The Library of Congress set up a deal a few weeks ago to acquire Twitter’s complete archive of public messages. It’s not a particularly impressive number of bytes by itself, but it’s a goldmine for computational analysis. And that academic potential is behind the government wanting to obtain what might seem like a vast cacophony of meaningless chatter.

In the WNYC Radiolab podcast released today, “Vanishing Words”, Jad and Robert look at linguistic computation, specifically the idea that you can identify and predict dementia by analyzing the words in someone’s personal history: say, a collection of letters or diary entries. Or, if you’re Agatha Christie, crime novels. If you’ve got a minute, let Jad Abumrad & Robert Krulwich tell you about this:

Picking up on Jad’s mention of “the age of Twitter”: online services like Twitter, Facebook, Google, and so on are quite earnestly working with words as scientific data; it’s a core element of staying competitive in their business. Computational language analysis is a fascinating field, and luckily it also seems to have a powerful economic incentive.

Word data is probably still the easiest way to directly get highly personalized information about a person (e.g. a status update, a tweet). Facebook Data Scientists, for example, work primarily to teach computer models to translate the words used in Facebook status updates into meaningful demographic data. The computers gather information and the scientists pick out interesting patterns so that better, more personalized advertising can be served. Better targeted ads translate to actual interest in ads, which translates to business.

Computational research and analysis (like the studies mentioned in this Radiolab podcast) is exploding commercially and academically, like a virtual internet gold rush. Supply is growing exponentially as hundreds of millions of people use online services to communicate publicly. Demand is blowing up too, because we’re realizing, like these scientists discovering something deeply personal about Agatha Christie, just how much we can learn from a simple collection of words.

It’s exciting to consider how much we may be able to learn about ourselves using non-contextual information. Words unrelated to each other in everyday usage still form patterns unseen on a larger scale. Everything you do leaves a mark on the world, and soon we may be able to better understand our markings and appreciate our histories holistically.

I imagine the future like learning the answers to questions we never thought to ask.

Edit 5/11/10: Agatha Christie also wrote dozens of diary entries and notes about books that may have shown signs of dementia. (via @JadAbumrad: “Agatha Christie’s deranged notebooks (interesting to read after the latest @wnycradiolab podcast)”)

Edit 5/14/10: For an interesting exemplar of Facebook linguistic data-mining, see their Gross National Happiness trend index. The study describing the methodology used is cited below the chart.

What do you mean he’s not singing? Just look!

Эдуард Анатольевич Хиль (Edward Anatolevich Hill) is gaining fame again for a once-forgotten performance in Soviet Russia over thirty years ago in 1976. Back then it was considered genuine pop TV entertainment, but in today’s culture it has resurfaced as the “trololo” internet meme because of its strangeness more than its catchy tune. Why it is strange to modern viewers isn’t hard to see once you start watching:

I realize many of you don’t speak Russian, so I’ve transcribed the complete lyrics here so you can follow along:

Ahhhhh ya ya yaaaah, ya ya yaaah, yaaah, ya yah.
Ohohohoooo! Oh ya yaaah, ya ya yaaah, yaaah, ya yah.
Ye-ye-ye-ye-yeh ye-ye-yeh ye-ye-yeh, oh hohohoh.
Ye-ye-ye-ye-yeh ye-ye-yeh ye-ye-yeh, oh hohohooooooooooo!
-aaaaoooooh, aaaooo hooo haha

Nah-nah-nah-nah-nuh-nuh, nah nuhnuh, nah nuh-nuh, nah nuhnuh, nuh-nah.
Nah-nah-nah-nun, nun-ah-nah, nun-ah-nah, nah-nah-nah-nah-nah!
Nah-nah-nah-nah-naaaaaaaaaaaaaaaaaaaaaaaaaah! Dah dah daaaaaaaaah…
Da-da-daaah, daaah, daa-daah.

Lololololoooooooo! La la-laaaaaah, la la laah, lol, haha.
Oh-ho-ho-ho-ho, ho-ho-ho, ho-ho-ho, oh-ho-ho-ho-ho!
Oh-ho-ho-ho, ho-ho-ho, ho-ho-ho, lo-lo-loooo!

Luh luh lah, lah, lah-lah.
Da-da-daaah, daaah, daa-daah.

Lololololo, lololo, lololol, la la la la yaah!
Trolololo la, la-la-la, la-la-la-
Oh hahahaho! Hahaheheho! Hohohoheho! Hahahaheho!
Lolololololololo, lololololololol, lololololololol, lololo LOL! *

Ahhhhh! La-la-laaah! La la-laaah, laaah, la-la.
Oh-ho-ho-ho-hoooooo! La, la-laaaah, lalala, lol, haha.
Lolololo-lololo-lololo, oh-ho-ho-ho-ho!
Lolololo-lololo-lololo, oh-ho-ho-ho hooooooooooooooooooooo!
(Wave goodbye)

Note: Transcribing these lyrics took longer than you might think.
Note: You can download Trololo Sing Along with the lyrics from Vimeo for free (see About This Video).


Getting serious now, why does a song like this, with no discernible words (vokaliz style), still work as a music video? Body language! Eduard isn’t using words, but he’s a recognizable performer singing a story about a feeling, Ostrovskii’s “I Am So Happy to Finally Be Back Home” (Cyrillic: Я очень рад, ведь я, наконец, возвращаюсь домой), using his facial expression, posture, and tonality.

It’s a strange sight in contrast to modern Western norms, but considering that human communication is more non-verbal than verbal, a singer lip-syncing to non-words is actually saying a lot.

Edit 3/7/10: Looks like since this writing the meme has picked up enough momentum to generate an English Wikipedia page for Eduard Khil’ (in addition to its Russian counterpart). There’s an interesting quote from Hill, now living in St. Petersburg, Russia, who was recently asked about his new-found internet fame by a Russian news outlet. Here’s his reply:

I haven’t heard anything about it. It’s nice, of course!
Thereby hangs a tale about this song. Lyrics were written for it, but they were poor. I mean, they were good, but one couldn’t publish them at that time. They contained words like these: “I’m riding my stallion, so-and-so mustang, and my beloved Mary is thousand miles away knitting a stocking for me”. Of course, we failed to publish it at that time, and we, Arkady Ostrovsky and I, decided to make it a vocalise. But the essence remained in the title. Yes, it’s a little prankish – it has no lyrics, so we had to make up something for people would listen to it, and so there was an interesting arrangement.
Eduard Khil, Life News (Russian)

Edit 3/15/10: Eduard has been further pressed by Russian media and he seems to be gladly embracing the new popularity trend. He’s even posted a video address to the world and recently sat down to watch YouTube parodies on live TV.

Addendum 4/14/10: Read more about trololo and the reasoning behind the vocal lyrics in the new thought posted here.