The Acoustic Features of Speech Sound
Assignment Sheet
Speech is the preferred means of human communication. Language is the acoustic code that gives meaning to a sequence of spoken sounds. The smallest constituent of spoken language is the phoneme. Every language has its own set of phonemes; English has approximately 40. Different combinations of these phonemes make up every word in the English language.
Phonemes can be characterized in a number of ways: manner of production, manner of articulation, temporal characteristics, or spectral characteristics. This project examines the phonemes of the English language using a MATLAB function, phoneme_analyzer. The function takes as input the name of the .WAV file, the example word, the phoneme symbol, the phoneme start time, and the phoneme end time. It opens three figure windows:
Figure 1: Example word time waveform, 30 ms phoneme waveform, phoneme's magnitude spectrum and spectral envelope
Figure 2: Example word time waveform, narrowband spectrogram of word, wideband spectrogram of word
Figure 3: 3-D plot of frequency vs. time vs. power spectral density
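The narrowband and wideband spectrograms in Figure 2 differ in the length of the analysis window. The sketch below shows how window length trades time resolution for frequency resolution at a 16 kHz sampling rate. The 40 ms and 5 ms durations are illustrative assumptions, not values taken from phoneme_analyzer, and the helper names are hypothetical. The spacing computed is the nominal DFT bin spacing; the resolution actually achieved also depends on the window shape.

```python
def window_samples(duration_s, fs):
    """Number of samples in an analysis window of the given duration."""
    return int(round(duration_s * fs))


def bin_spacing_hz(n_samples, fs):
    """Nominal spacing between DFT bins for a window of n_samples."""
    return fs / n_samples


fs = 16000  # sampling rate in Hz

# Assumed window durations, chosen only for illustration.
for label, duration_s in (("narrowband", 0.040), ("wideband", 0.005)):
    n = window_samples(duration_s, fs)
    print(label, n, "samples,", bin_spacing_hz(n, fs), "Hz per bin")
```

Under these assumptions the long window gives 640 samples and 25 Hz per bin, fine enough to separate the harmonics of a typical voice. The short window gives 80 samples and 200 Hz per bin, which blurs the harmonics together but follows rapid changes in time.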
Below are links to each of the 40 phonemes examined. Each link shows the function output for that phoneme's recorded .WAV file. Each word was recorded as a mono .WAV file at a 16 kHz sampling rate and a bit depth of 16 bits per sample. Silence was removed from the beginning and end of the recorded word, and the recording was normalized. The function phoneme_analyzer will work with any .WAV file of a word, as long as the user knows the temporal location of the phoneme in that word. Also below is a .ZIP archive containing all 40 recordings used for the links, as well as the MATLAB function created to do the analysis.
Click HERE for an archive containing the .WAV files of all 40 phonemes. (LeonDangio_English_Phonemes.zip)
Click HERE for the MATLAB function used to analyze each phoneme. (phoneme_analyzer.m)
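As a rough illustration of how a phoneme's start and end times map onto samples of a 16 kHz recording, here is a minimal Python sketch. It is not the code in phoneme_analyzer.m. The helper names phoneme_slice and peak_normalize are hypothetical, and peak normalization is an assumption, since the page does not say which normalization was applied.

```python
def phoneme_slice(samples, fs, start_s, end_s):
    """Return the samples between start_s and end_s (in seconds)."""
    if not 0 <= start_s < end_s:
        raise ValueError("need 0 <= start_s < end_s")
    first = int(round(start_s * fs))
    last = min(int(round(end_s * fs)), len(samples))
    return samples[first:last]


def peak_normalize(samples, peak=1.0):
    """Scale samples so the largest absolute value equals peak.

    Assumption: peak normalization; the original method is not stated.
    """
    largest = max((abs(x) for x in samples), default=0.0)
    if largest == 0.0:
        return list(samples)  # all-zero input cannot be scaled
    return [x * peak / largest for x in samples]


fs = 16000               # sampling rate in Hz
word = [0.0] * fs        # placeholder for one second of audio
segment = phoneme_slice(word, fs, 0.100, 0.130)
print(len(segment))      # 480 samples, i.e. 30 ms at 16 kHz
```

A 30 ms segment at 16 kHz is 480 samples, which matches the length of the phoneme waveform shown in Figure 1.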
Vowels
Vowels are produced by vibration of the vocal folds, with the resulting sound shaped by the vocal tract. They fall into three categories (center, front, and back) that describe the position of the tongue in the mouth. This position dictates the effective length of the vocal tract.
A temporal characteristic of a vowel is its quasi-periodic waveform, which is caused by the vibration of the vocal folds. Taking the reciprocal of the approximate period yields the fundamental frequency of the speaker's voice. This quasi-periodicity also gives rise to harmonics in the frequency domain. These can be seen in the spectrograms for each word, either as horizontal striations in the narrowband spectrogram (the individual harmonics) or as vertical striations in the wideband spectrogram (the individual pitch periods). Another spectral characteristic of the vowel is the presence of resonances in the spectral envelope. These resonances, or formants, correspond to the poles of the vocal tract when it is modeled as a linear system.
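The relation between period and fundamental frequency can be sketched in Python. This example is an assumption-laden illustration, not the method used by phoneme_analyzer: it estimates the period of a voiced frame from the peak of its autocorrelation, then takes the reciprocal. The function name estimate_f0, the 60 to 400 Hz search range, and the synthetic test frame are all hypothetical choices.

```python
import math


def estimate_f0(frame, fs, f_min=60.0, f_max=400.0):
    """Estimate the fundamental frequency (Hz) of a voiced frame.

    Finds the lag with the largest autocorrelation inside the assumed
    pitch range and returns fs / lag, the reciprocal of the period.
    """
    lag_min = int(fs / f_max)
    lag_max = min(int(fs / f_min), len(frame) - 1)
    best_lag, best_value = 0, float("-inf")
    for lag in range(lag_min, lag_max + 1):
        value = sum(frame[n] * frame[n + lag]
                    for n in range(len(frame) - lag))
        if value > best_value:
            best_lag, best_value = lag, value
    return fs / best_lag


fs = 16000                 # sampling rate in Hz
true_f0 = 125.0            # period of exactly 128 samples at 16 kHz
n_samples = 480            # a 30 ms frame

# Synthetic vowel-like frame: five harmonics with decaying amplitude.
frame = [
    sum(math.sin(2 * math.pi * k * true_f0 * n / fs) / k
        for k in range(1, 6))
    for n in range(n_samples)
]

print(estimate_f0(frame, fs))  # 125.0
```

On real speech the waveform is only quasi-periodic, so the estimate varies from frame to frame, and a simple peak picker like this one can land on a multiple of the true period.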
Semi-Vowels
Semi-vowels, classified as glides or liquids, are vowel-like sounds that form diphthongs with full syllabic vowels. Their characteristics are similar to those of vowels.
Consonants
- Nasals
- Plosives
- Whispers
- Fricatives
Consonants are articulated with full or partial closure of the vocal tract.
Affricates
Diphthongs