Phonetics and Phonology (retrieved from the UPenn Phonetics Lab website)
Transcriber is a program for creating (typically orthographic) transcriptions of sound recordings, time-linked (typically at the phrase level) to a digital audio file. It deals conveniently with long recordings of an hour or more.
Audio I/O and waveform display are handled by the Snack sound toolkit, the same foundation that wavesurfer (see below) is built on, and in the near future the two programs should be integrated to some extent. Versions will be available soon that can maintain multiple transcripts in parallel (for highly interactive conversation, for instance), that are specialized for interlinear transcription on several levels (e.g. orthographic, morphemic, phonemic, phonetic), and so on.
Source code and binary distributions are available at http://www.ldc.upenn.edu/mirror/Transcriber/
A recent MS Windows binary is available here.
On the unix machines, the command is “trans”.
WaveSurfer is a simple but powerful program for interactive display of waveforms, spectrograms, pitch tracks and transcriptions (phonetic, orthographic etc.). Source code and various binary distributions are available at http://www.speech.kth.se/wavesurfer/
The current MS Windows binary is here.
On the unix machines, the command is “wavesurfer”.
Praat is a “research, publication, and productivity tool for phoneticians.” It includes a comprehensive set of capabilities, usable both interactively and via a scripting language. Although it is not yet free software, it soon will be, according to its creator, Paul Boersma.
For now, you can download it from here.
On the unix machines, the command is “praat”.
R is a free-software implementation of the improved version of the S statistics language; the proprietary implementation goes by the name of “Splus”.
A page with lots of information about R, especially useful as a local Penn reference, is: http://finzi.psych.upenn.edu/
The main page for R is at: http://www.r-project.org/
If you must use Microsoft Windows, a binary version of R can be downloaded from here.
A repository of code and datasets for S and Splus, most of which will also run under R, can be found at http://lib.stat.cmu.edu/S/.
A nice, short, and simple introduction to R can be found at: http://lib.stat.cmu.edu/R/CRAN/doc/contrib/kickstart/index.html.
SoX (“Sound eXchange”) is a command line utility that can convert various formats of computer audio files to other formats, also changing sampling rate and performing some other modifications as instructed. The command line syntax is difficult. Here are instructions on how to perform the usual tasks.
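As a rough illustration of the kind of conversion described above, the following Python sketch builds a SoX command line that converts a file to another format while changing the sampling rate and channel count. The helper name sox_convert_command and the file names are hypothetical and are not taken from the lab's instructions. The sketch assumes only the standard SoX conventions that format options such as -r (rate) and -c (channels) placed before the output file apply to the output, and that the output format is inferred from the file extension.

```python
import shlex


def sox_convert_command(src, dst, rate_hz=None, channels=None):
    """Build a SoX argument list that converts src to dst.

    Hypothetical helper for illustration; it only assembles the command.
    """
    cmd = ["sox", src]
    # Format options placed before the output file apply to the output.
    if rate_hz is not None:
        cmd += ["-r", str(rate_hz)]
    if channels is not None:
        cmd += ["-c", str(channels)]
    cmd.append(dst)
    return cmd


if __name__ == "__main__":
    # Hypothetical file names: AIFF in, 16 kHz mono WAV out.
    cmd = sox_convert_command("interview.aiff", "interview.wav",
                              rate_hz=16000, channels=1)
    print(shlex.join(cmd))
    # prints: sox interview.aiff -r 16000 -c 1 interview.wav
```

To actually run the conversion on a machine where SoX is installed, the list can be passed to subprocess.run(cmd, check=True).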
AWK is a text-processing language commonly used for massaging data. Details.
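To give a flavor of what "massaging data" means here, the sketch below mimics in Python what a typical AWK one-liner does: split each line into whitespace-separated fields, test a condition on one field, and print selected fields. The column layout (label, F1 in Hz, duration in ms), the sample rows, and the function name select_long_tokens are invented for illustration; the AWK program in the comment is only a rough equivalent.

```python
def select_long_tokens(lines, min_duration_ms):
    """Yield (label, duration) for rows whose third field is >= min_duration_ms.

    Roughly equivalent to the AWK program:  $3 >= 100 { print $1, $3 }
    (hypothetical columns: label, F1 in Hz, duration in ms).
    """
    for line in lines:
        fields = line.split()  # like AWK's default split on runs of whitespace
        if len(fields) < 3:
            continue  # skip rows with no third field
        duration = float(fields[2])
        if duration >= min_duration_ms:
            yield fields[0], duration


if __name__ == "__main__":
    # Invented sample data, not real measurements.
    rows = ["iy 310 95", "aa 720 140", "uw 330 100"]
    for label, duration in select_long_tokens(rows, 100):
        print(label, duration)
```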
Other interesting things
(These are not necessarily installed on all lab machines).
Emu: “a collection of software tools for the creation, manipulation and analysis of speech databases.” It is designed to work with the R statistics package (see above).
The Festival speech synthesis package.
Speech software from ISIP at Mississippi State.
Intra is a transcription tool that incorporates synthesis for checking.
Slang, a C++-based software platform for speech processing (and especially speech recognition) research.
Sphinx, a free-software speech recognition system from CMU.
The CSLU speech toolkit.
The UCL SFS (Speech Filing System).
Pointers to some other systems can be found in the Linguistic Annotation Page.