Back in 2009, Heidi Harley and I wrote a few inter-related posts looking at the linguistics job market and how it compares with the distribution of new PhD theses. Since then, people have occasionally written to me (and probably Heidi too) about updating the posts with new numbers. I’ve been reluctant to do that because I’ve always been worried that my data-gathering methods (scraping Linguist List and ProQuest) were problematic. (It would be great if Linguist List released analyses based on its internal database of job ads!)

I’m pleased to report that Stephanie Shih and Rebecca Starr have done the work for me, and they did a careful job indeed. Here’s the summary picture; below the fold, I’ve included their notes on where the data came from. Comments are open so that people can add their own analyses.

Linguistics dissertations and academic jobs 2009-2012

Thanks Stephanie and Rebecca!

Here are Stephanie and Rebecca’s notes, in their own words, on the data that went into the above plot:

We tried to follow the methodology of the original 2009 study, with a few modifications.

  • We focused on academic jobs, excluding industry positions, both because we felt that Linguist List wasn’t the most accurate representation of all of the industry jobs available to linguistics graduates and because we were mostly curious about the ability of academia to absorb the number of PhDs produced.
  • “All academic jobs” includes permanent and temporary (e.g., postdoc, visiting) positions advertised on Linguist List. “Tenure track jobs” includes permanent positions (in the UK, for instance, this would include “Lecturer” positions).
  • Job ads were counted by hand from the Linguist List archives, using keywords for each subfield (the less obvious ones listed below):
    • sociolinguistics: variation, discourse analysis, socio, anthro
    • psycholinguistics: psychology, psycholinguistics
    • cognitive science: cognition, cognitive
    • computational: comp, nlp, natural language processing, machine translation
  • Dissertations on ProQuest were counted automatically with the following search terms for each subfield:
    dissertations/thesis
    since 1/1/2009
    SU(linguistics) AND
    
    IF("phonetics")
    IF("phonology")
    IF("morphology")
    IF("syntax")
    IF("semantics")
    IF("historical linguistics")
    IF("sociolinguistics")
    IF("psycholinguistics")
    IF("cognition")
    IF("computation") OR IF("natural language processing") OR IF("natural language engineering")
    
  • We added morphology and historical linguistics as subfields.
  • We excluded the “applied” subfield from this graph for two reasons. The overall number of jobs listing applied linguistics as their subfield was huge and swamped the rest of the graph. Also, the vast majority of those jobs were specific to teaching a particular language, so this number does not relate clearly to the number of PhDs in applied linguistics.
  • One caveat here is that the job numbers are probably still inflated. Many job listings ask for more than one subfield: for example, a “Syntax and Semantics” job was counted once for syntax and once for semantics. Furthermore, we made no effort to subcategorize within subfields (e.g., jobs specifically for theoretical syntax, or jobs calling for specific language specialties like Spanish phonology).
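
For readers who want to try a similar count themselves, here is a minimal Python sketch of keyword-based subfield tagging. It is only an illustration, not the procedure Stephanie and Rebecca used: they counted the job ads by hand. The keyword lists are copied from the notes above; the entries for syntax and semantics, the function names, and the sample ad titles are assumptions made up for the example. Plain substring matching is also cruder than a human reader (a short keyword like "comp" would match unrelated words such as "competitive"), so real use would need manual checking.

```python
from collections import Counter

# Keyword lists for the "less obvious" subfields, copied from the notes above.
SUBFIELD_KEYWORDS = {
    "sociolinguistics": ["variation", "discourse analysis", "socio", "anthro"],
    "psycholinguistics": ["psychology", "psycholinguistics"],
    "cognitive science": ["cognition", "cognitive"],
    "computational": ["comp", "nlp", "natural language processing",
                      "machine translation"],
    # Assumption: the notes list no keywords for the "obvious" subfields,
    # so the subfield name itself is used here for illustration.
    "syntax": ["syntax"],
    "semantics": ["semantics"],
}


def tag_subfields(ad_text):
    """Return the set of subfields whose keywords occur in the ad text."""
    text = ad_text.lower()
    return {
        subfield
        for subfield, keywords in SUBFIELD_KEYWORDS.items()
        if any(keyword in text for keyword in keywords)
    }


def count_by_subfield(ads):
    """Count ads per subfield; an ad naming two subfields counts toward both."""
    counts = Counter()
    for ad in ads:
        counts.update(tag_subfields(ad))
    return counts


# Hypothetical ad titles, invented for this example (not real data).
SAMPLE_ADS = [
    "Assistant Professor in Syntax and Semantics",
    "Postdoc in Sociophonetics and Language Variation",
    "Lecturer in Natural Language Processing",
]

counts = count_by_subfield(SAMPLE_ADS)
total_tags = sum(counts.values())
print(counts)
print(f"{len(SAMPLE_ADS)} ads produced {total_tags} subfield counts")
```

Because the "Syntax and Semantics" ad is tallied under both subfields, the per-subfield counts add up to more than the number of ads. That is the inflation described in the last caveat above.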

from Language Log http://languagelog.ldc.upenn.edu/nll/?p=4349