Thiessen, E. (2009). Statistical learning. In E.L. Bavin (Ed.), The Cambridge handbook of child language (pp. 35-50). New York: Cambridge University Press.

“That’s one small step for (a) man, one giant leap for mankind.” More than 40 years after these inspirational words were first uttered, most of us still cannot hear the “a” between “for” and “man”, despite Neil Armstrong’s insistence that he definitely said it. This one short sentence has thus become one giant annoyance to Armstrong, and perhaps to the whole space exploration community. After doing some analyses on the audio, I think the framework of statistical learning might explain why people cannot hear the “a”. The figure below shows the pitch (F0) contour of Armstrong’s quote. The audio file was downloaded from the NASA website (NASA, 2014) and analyzed in the speech signal processing software Praat (Boersma & Weenink, 2001).
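To make the idea of a “pitch reset” concrete, here is a toy Python sketch. It is not the actual Praat analysis: the contour values and the 30 Hz jump threshold are invented for illustration, and a real analysis would extract F0 from the recording itself. The sketch simply flags frames where the pitch jumps sharply upward relative to the previous frame.

```python
# Toy illustration: flag "pitch resets" (sharp upward F0 jumps) in a
# made-up pitch contour. The contour values and the 30 Hz threshold are
# invented for illustration; a real analysis would extract F0 with Praat.

def pitch_resets(f0, jump_hz=30):
    """Return indices where F0 rises by more than jump_hz from the previous frame."""
    resets = []
    for i in range(1, len(f0)):
        if f0[i] - f0[i - 1] > jump_hz:
            resets.append(i)
    return resets

# Invented contour: each "word" starts high and declines,
# so word onsets show up as upward jumps.
contour = [180, 160, 140,   # word 1
           175, 150, 130,   # word 2
           185, 165, 145]   # word 3

print(pitch_resets(contour))  # → [3, 6]
```

On this made-up contour, the detector fires exactly at the two word onsets (frames 3 and 6), which is the pattern the essay argues listeners come to rely on.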

As stated in Thiessen (2009), word boundaries in English are accompanied by changes in pitch and/or loudness. It is apparent from the graph above that every word starts at a relatively high pitch which then declines. The pitch contour thus shows a dramatic shift from low to high at each word boundary, a phenomenon linguists call “pitch reset”. Listeners are very likely to associate pitch reset with word boundaries. Although there is a time interval for “a”, the background noise has unfortunately masked its pitch signature. Had “for” shown a slight pitch increase at its end, most of us would probably have heard the “a”. Therefore, I think the misperception of Armstrong’s famous quote is very likely a result of the high transitional probability between pitch reset and word boundaries.
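The transitional-probability statistic that underlies this argument (TP(x→y) = frequency of the pair xy divided by the frequency of x) can be sketched in a few lines of Python. The three-syllable “words” and their ordering below are invented for illustration, in the spirit of the artificial-language experiments Thiessen discusses; they are not from the chapter itself.

```python
from collections import Counter

def transitional_probabilities(syllables):
    """TP(x -> y) = count of the pair xy / count of x as a pair's first element."""
    pair_counts = Counter(zip(syllables, syllables[1:]))
    first_counts = Counter(syllables[:-1])
    return {pair: n / first_counts[pair[0]] for pair, n in pair_counts.items()}

# Invented mini-language with three "words"; word order varies, so
# within-word TPs stay at 1.0 while across-word TPs drop below 1.
words = {"pabiku": ["pa", "bi", "ku"],
         "tibudo": ["ti", "bu", "do"],
         "golatu": ["go", "la", "tu"]}
sequence = ["pabiku", "tibudo", "pabiku", "golatu",
            "tibudo", "golatu", "pabiku", "tibudo"]
stream = [syl for w in sequence for syl in words[w]]

tp = transitional_probabilities(stream)
print(tp[("pa", "bi")])            # within-word pair: 1.0
print(round(tp[("ku", "ti")], 2))  # across-word pair: 0.67
```

The dip in TP at word boundaries is the segmentation cue; the essay's suggestion is that pitch reset becomes statistically yoked to those same boundary positions.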

The extraordinary implication of this misperception is that the effect of statistical learning is so strong that it can even cause people to ignore semantics. Since “man” could also mean “mankind” in 1969, the whole misperceived utterance means “a small step for mankind, and a giant leap for mankind”, which is repetitive if not nonsensical. People who heard only “for man” were therefore essentially ignoring the meaning of the whole sentence. This finding contrasts with claims that adults are more sensitive to semantic information than to phonological information (counting prosody as phonological), an opinion that is echoed in Thiessen (2009). No matter which cues people are biased toward, the bias itself indicates that statistical learning is constrained.

The preference for one linguistic cue over another is what Bayesian statisticians would call “the prior”. But what is this prior? Where does it come from? Thiessen (2009) tentatively answered these questions by invoking Universal Grammar (UG). He then walked back from UG a little by noting that our biases are not limited to linguistic stimuli but extend to non-linguistic stimuli as well. However, this does not answer the fundamental question of why infants prefer patterns that are likely to occur cross-linguistically over patterns that are unlikely to occur cross-linguistically. Chomskyans would say that infants’ preference for language-universal patterns is a good indicator of the language acquisition device (LAD), which to me is more of a philosophical concept. It does not claim that we have a black box in our heads that only governs language activity. Nor do I think our brains are modular; many different parts of the brain probably all contribute to language production and perception. What the LAD does speak to is the fact that our brains are pre-wired in a way that gives rise to different kinds of biases. These biases are not learned; they pre-exist. Theologians might say they are God-given. Evolutionists might say they are a result of adaptation, the mutation of the FOXP2 gene for example.

Thiessen’s objection to UG is perhaps that we have biases in domains other than language, and that these biases are not limited to humans. This is a good point, and I would very much like to admit that humans are no better than animals. However, since we humans tend to over-generalize, a careful comparison of similar biases in humans and animals is needed to substantiate any claim that we are not “special”. Rats, for example, cannot easily learn the association between audiovisual cues and nausea (p. 45). Humans, on the other hand, are very sensitive to the relationship between auditory cues and food taste. A recent study by the Oxford scholar Charles Spence (Ward, 2015) showed that people enjoyed their Chinese food more when the background music was one of Taylor Swift’s songs; when the music switched to Justin Bieber’s “Baby”, however, the same food suddenly became less tasty. Given that the participants were familiar with neither Swift nor Bieber, we can probably say that some types of melody are more closely associated with good taste. The reason for this association is yet to be explained. Is it innate, or is it learned? We don’t know. What we do know is that human and nonhuman biases are probably not the same; human biases tend to be more sophisticated.

As for biases in other domains, evidence is still needed to show that they are equivalent to biases in the language domain. I realize that it is the Chomskyans’ burden to show the inequivalence. However, it is beyond my capability right now to adequately support or refute innateness. What I came to realize, after reading Thiessen’s account of statistical learning, is that language learning involves not only language input but also prior biases. These biases, innate or otherwise, might relate to how our brains are wired.
NASA. (2014). July 20, 1969: One Giant Leap For Mankind. Retrieved February 10, 2016, from
Boersma, P., & Weenink, D. (2001). Praat, a system for doing phonetics by computer. Glot International, 5(9/10), 341–345.
Ward, V. (2015). A blast of Taylor Swift with your Chinese takeaway will make it taste better, claims Oxford academic. Retrieved from