Acoustic properties

Why do we need it?

Sound properties

  1. Vibration (longitudinal wave)
  2. Propagates through a medium
  3. Variations in air pressure
  4. Different configurations of the vocal tract shape the airstream
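
To make the "variations in air pressure" point concrete, here is a minimal sketch that builds the simplest possible sound, a pure tone, as a list of pressure samples. The function name `pure_tone` and the 8000 Hz sample rate are assumptions chosen for illustration.

```python
import math

def pure_tone(freq_hz, duration_s, sample_rate=8000, amplitude=1.0):
    """Return pressure samples of a sine wave (a pure tone).

    Each sample is the deviation from resting air pressure at one instant.
    """
    n = round(duration_s * sample_rate)
    return [
        amplitude * math.sin(2 * math.pi * freq_hz * i / sample_rate)
        for i in range(n)
    ]

# 20 ms of a 100 Hz tone: two full cycles of compression and rarefaction.
samples = pure_tone(100, 0.02)
samples_per_cycle = 8000 / 100  # one period lasts 80 samples
```

Plotting `samples` against time gives exactly the kind of waveform discussed in the next section.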

Reading a waveform

Air pressure –> Recognition in human ears

The ear's first response is to break the sound down into its component frequencies.

Extract the frequency components of speech sounds

Linguists rarely work from waveforms alone, because a waveform does not show the frequency components directly.

Reading a spectrogram

Speech sounds have different frequency profiles, which is why spectrograms are more useful for linguists: it is easier to deduce articulatory profiles from them.

You need the waveform to get the spectrogram.

Spectral Analysis

Take a short slice of time from the waveform, calculate the frequencies present in it (from the periods of its component waves), and then stack the resulting slices side by side to build up the spectrogram.
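
The slice-and-stack idea can be sketched in a few lines. This is a bare-bones illustration, not how a real tool computes spectrograms: it uses a plain discrete Fourier transform with no windowing or overlap, and the names `dft_magnitudes` and `spectrogram`, the 8000 Hz sample rate, and the 160-sample (20 ms) frame length are all assumptions.

```python
import cmath
import math

def dft_magnitudes(frame):
    """Magnitude of each frequency bin (up to half the sample rate) for one slice."""
    n = len(frame)
    return [
        abs(sum(frame[t] * cmath.exp(-2j * math.pi * k * t / n) for t in range(n)))
        for k in range(n // 2)
    ]

def spectrogram(samples, frame_len, hop):
    """Cut the waveform into slices and analyse each one; one column per slice."""
    starts = range(0, len(samples) - frame_len + 1, hop)
    return [dft_magnitudes(samples[s:s + frame_len]) for s in starts]

sample_rate = 8000
# Test signal: 40 ms of a 500 Hz tone followed by 40 ms of a 1000 Hz tone.
first = [math.sin(2 * math.pi * 500 * i / sample_rate) for i in range(320)]
second = [math.sin(2 * math.pi * 1000 * i / sample_rate) for i in range(320)]
spec = spectrogram(first + second, frame_len=160, hop=160)

bin_hz = sample_rate / 160  # each bin covers 50 Hz
# Strongest frequency in each slice.
peaks = [max(range(len(col)), key=col.__getitem__) * bin_hz for col in spec]
# peaks -> [500.0, 500.0, 1000.0, 1000.0]
```

Each column of `spec` is one time slice; drawing the magnitudes as darkness, with time on the horizontal axis and frequency on the vertical axis, gives the familiar spectrogram picture.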

A sound can have more than one frequency! In fact, all sounds except a pure tone, such as that of a tuning fork, have a complex structure.

Speech sounds contain several pitches (frequency components) at the same time.
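
As a rough illustration of a complex sound, the sketch below adds three sine waves together and then measures how strongly a given frequency is present by correlating the result with a sine and a cosine at that frequency. The helper names `complex_tone` and `amplitude_at`, the sample rate, and the component frequencies and amplitudes are invented for the example.

```python
import math

SAMPLE_RATE = 8000  # assumed sample rate in Hz

def complex_tone(components, n_samples):
    """Sum several sine waves; components is a list of (freq_hz, amplitude)."""
    return [
        sum(a * math.sin(2 * math.pi * f * i / SAMPLE_RATE) for f, a in components)
        for i in range(n_samples)
    ]

def amplitude_at(samples, freq_hz):
    """Estimate the amplitude of one frequency component in the signal."""
    n = len(samples)
    re = sum(s * math.cos(2 * math.pi * freq_hz * i / SAMPLE_RATE)
             for i, s in enumerate(samples))
    im = sum(s * math.sin(2 * math.pi * freq_hz * i / SAMPLE_RATE)
             for i, s in enumerate(samples))
    return 2 * math.hypot(re, im) / n

# 50 ms of a sound with components at 200, 400 and 600 Hz.
wave = complex_tone([(200, 1.0), (400, 0.5), (600, 0.25)], 400)

found = {f: round(amplitude_at(wave, f), 3) for f in (200, 300, 400, 600)}
# The three components are recovered; 300 Hz, which was never added, is absent.
```

The single waveform `wave` looks like one wiggly line, yet it contains three frequencies at once, which is the situation for every speech sound.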

Pitch

High/low pitch $\approx$ rate of vibration of the vocal folds
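
One common way to estimate this rate of vibration from a waveform is autocorrelation: find the time shift at which the signal best matches itself, which corresponds to one period. The sketch below is a simplified version under stated assumptions: the function name `estimate_f0`, the 75 to 400 Hz search range, and the synthetic "voiced" test signal are all illustrative choices, not taken from the notes.

```python
import math

def estimate_f0(samples, sample_rate, fmin=75, fmax=400):
    """Estimate the fundamental frequency via normalised autocorrelation."""
    min_lag = int(sample_rate / fmax)
    max_lag = int(sample_rate / fmin)

    def autocorr(lag):
        n_terms = len(samples) - lag
        return sum(samples[i] * samples[i + lag] for i in range(n_terms)) / n_terms

    # The lag where the signal is most similar to itself is one period long.
    best_lag = max(range(min_lag, max_lag + 1), key=autocorr)
    return sample_rate / best_lag

sample_rate = 8000
# Stand-in for a voiced sound: 125 Hz fundamental plus two weaker harmonics.
voiced = [
    math.sin(2 * math.pi * 125 * i / sample_rate)
    + 0.5 * math.sin(2 * math.pi * 250 * i / sample_rate)
    + 0.25 * math.sin(2 * math.pi * 375 * i / sample_rate)
    for i in range(512)
]
f0 = estimate_f0(voiced, sample_rate)
# f0 -> 125.0
```

A higher `f0` means faster vocal fold vibration and a higher perceived pitch.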

Overtone pitches or Formants

Formants are resonances of the vocal tract: in effect, echoes inside it that reinforce certain frequencies.

High/low pass filter

Vowels

  1. First formant (F1)
    1. [i, ɪ, ɛ, æ], [u, ʊ, ɔ, ɑ]: From the lowest F1 to the highest
    2. High vowels are associated with low F1 [i]
    3. Low vowels are associated with high F1 [æ]
  2. Second formant (F2)
    1. Front vowels have higher F2 than back vowels
    2. Front high vowels are associated with high F2 [i]
    3. Front low vowels are associated with lower F2 [æ]
    4. This implies that [i] is more front than [æ]
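
The F1 and F2 patterns above can be checked against numbers. The values below are approximate averages for adult male speakers of American English, as commonly cited from Peterson and Barney (1952); they are an assumption brought in for illustration and are not part of these notes. Individual speakers will differ.

```python
# Approximate (F1, F2) in Hz; commonly cited averages, assumed for illustration.
FORMANTS = {
    "i": (270, 2290), "ɪ": (390, 1990), "ɛ": (530, 1840), "æ": (660, 1720),
    "u": (300, 870), "ʊ": (440, 1020), "ɔ": (570, 840), "ɑ": (730, 1090),
}
FRONT = ["i", "ɪ", "ɛ", "æ"]
BACK = ["u", "ʊ", "ɔ", "ɑ"]

def sorted_by_f1(vowels):
    """Order vowels from lowest to highest F1 (that is, from high to low vowels)."""
    return sorted(vowels, key=lambda v: FORMANTS[v][0])

front_by_f1 = sorted_by_f1(FRONT)  # expected to match the order [i, ɪ, ɛ, æ]
back_by_f1 = sorted_by_f1(BACK)    # expected to match the order [u, ʊ, ɔ, ɑ]

# Every front vowel should have a higher F2 than every back vowel.
lowest_front_f2 = min(FORMANTS[v][1] for v in FRONT)
highest_back_f2 = max(FORMANTS[v][1] for v in BACK)
front_above_back = lowest_front_f2 > highest_back_f2
```

With these values, F1 tracks vowel height and F2 tracks frontness, which is the pattern the list describes.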

Together, these first two formants show the following.

Vowel Chart and Acoustics

The people who first created the vowel chart thought they were mapping tongue placement.

BUT it actually represents the acoustics (frequency profiles) of the sounds.

The absolute formant values differ from speaker to speaker.

Vowels look relatively clear on a chart, but what about consonants??

Diphthongs: the formants shift continuously over the course of the vowel.

Consonants

Consonants convey their quality through their effects on adjacent vowels.

Oral closure lowers the first formant (which is why higher vowels have lower F1s).

Plosives/Stops

Bilabials: lower F2 and F3 at the transition

Dentals/alveolars: raise F2 and F3 at the boundary

Velars: Cause F2/F3 to come together (velar pinch)
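
The three transition patterns can be written down as a toy rule: compare F2 and F3 at the consonant boundary with their values in the middle of the vowel. This is only a sketch of the rules of thumb above; the function name `guess_stop_place` and all formant values in the examples are made up for illustration, and real spectrograms are far messier.

```python
def guess_stop_place(f2_boundary, f3_boundary, f2_vowel, f3_vowel):
    """Guess a stop's place of articulation from F2/F3 transitions (toy rule)."""
    f2_lower = f2_boundary < f2_vowel
    f3_lower = f3_boundary < f3_vowel
    f2_higher = f2_boundary > f2_vowel
    f3_higher = f3_boundary > f3_vowel

    if f2_lower and f3_lower:
        return "bilabial"   # both formants dip at the transition
    if f2_higher and f3_higher:
        return "alveolar"   # both formants are raised at the boundary
    if f2_higher and f3_lower:
        return "velar"      # F2 and F3 converge: the velar pinch
    return "unclear"

# Hypothetical measurements in Hz (boundary F2, boundary F3, vowel F2, vowel F3).
examples = [
    guess_stop_place(1000, 2300, 1200, 2500),
    guess_stop_place(1800, 2800, 1200, 2500),
    guess_stop_place(1900, 2200, 1200, 2500),
]
# examples -> ["bilabial", "alveolar", "velar"]
```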

The ear doesn't register silence, so it makes use of the transitional cues.

Voiced sounds: faint striations near the baseline for every single voiced sound (including vowels).

Fricatives

Looks like random noise on the spectrogram

Releasing air from the glottis

Dental, alveolar, and post-alveolar fricatives are much easier to identify:

Very high-frequency energy (extending up to around 8000 Hz)
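
A crude way to see the difference between high-frequency energy and a low-pitched voiced sound without computing a spectrogram is the zero-crossing rate: the more often the waveform changes sign per second, the higher its dominant frequency. In the sketch below, simple tones stand in for real speech (a real fricative is noise spread over a band, not a single tone); the function names, the 16000 Hz sample rate, and the 6000 Hz and 150 Hz test frequencies are assumptions.

```python
import math

def zero_crossing_rate(samples, sample_rate):
    """Number of sign changes per second between consecutive samples."""
    crossings = sum(1 for a, b in zip(samples, samples[1:]) if a * b < 0)
    duration_s = (len(samples) - 1) / sample_rate
    return crossings / duration_s

def tone(freq_hz, n_samples, sample_rate, phase=0.1):
    """Sine tone; the small phase offset avoids samples that are exactly zero."""
    return [
        math.sin(2 * math.pi * freq_hz * i / sample_rate + phase)
        for i in range(n_samples)
    ]

SAMPLE_RATE = 16000
hiss_like = tone(6000, 1600, SAMPLE_RATE)   # stand-in for fricative energy
vowel_like = tone(150, 1600, SAMPLE_RATE)   # stand-in for a voiced sound

hiss_rate = zero_crossing_rate(hiss_like, SAMPLE_RATE)    # roughly 12000 per second
vowel_rate = zero_crossing_rate(vowel_like, SAMPLE_RATE)  # roughly 300 per second
```

Since a tone crosses zero twice per cycle, halving the rate gives a rough estimate of the dominant frequency.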

Voiced: similar to voiceless counterparts but have voice bars

Approximants