JCPSLP
Volume 14, Number 3 2012
Journal of Clinical Practice in Speech-Language Pathology
correlate with reduced PVI_Dur measures (see Discussion).
The significantly reduced PVI_f0 and PVI_dB values were
consistent with the perception of reduced pitch variability
and possibly low speaking volume.
Discussion
The aim of this study was to demonstrate the use of a small
set of acoustic measures of speech using accessible
software and readily executed measurements. By exploring
the relationships between various acoustic measures and
our perceptions of different aspects of speech and voice
quality, we can develop more objective and reliable
measures of change with time and with treatment. We can
also start to unpack the different acoustic signals that come
together to form our perceptions of, at times, more
holistic constructs (Kent, 1997).
We predicted that the individuals with spastic or
flaccid dysarthria would demonstrate abnormal vocal
quality measures (e.g., jitter, shimmer, HNR), associated
with perceived abnormal quality. The individual with
ataxic dysarthria and pitch breaks and vocal tremor was
expected to show high variability of f0 on sustained ah. All
participants were expected to have reduced speech rate
in diadochokinetic and connected speech tasks. Reduced
PVI_Dur should be associated with perception of equal
stress and reduced PVI_f0 and PVI_dB with perception of
reduced pitch and loudness variability in connected speech.
Vocal quality
HNR appears to be a useful indicator of abnormal vocal
quality (Bhuta et al., 2004; Kent et al., 2000; Yumoto &
Gould, 1982). It has been linked to hoarseness, although
here P1 and P3 were perceived to have strained-strangled
and breathy quality, respectively. It is possible that HNR is
useful as an indicator of pathology, rather than a specific
type, or alternatively that the different vocal quality
descriptors are difficult to differentiate in clinical practice
(Kreiman & Gerratt, 2000). Consistent with the present findings,
previous studies have not found strong links between jitter and
shimmer measures and abnormal vocal quality (e.g., Bhuta
et al., 2004; see Thompson-Ward & Theodoros, 1998).
Inclusion of HNR in a diagnostic protocol is worthwhile to
aid objective identification of abnormal quality or to track
changes with intervention, provided recording and
measurement methods are controlled across time points.
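For readers who want to see what these vocal quality measures capture, the following is a minimal sketch rather than the algorithm used in this study. It assumes the commonly used "local" definitions of jitter and shimmer (mean absolute difference between consecutive cycles, relative to the mean) and the autocorrelation-based definition of HNR. The function names and the example values are hypothetical, and the per-cycle periods and amplitudes are assumed to have been extracted already.

```python
import math


def local_perturbation(values):
    """Mean absolute difference between consecutive cycle values,
    relative to the mean value, expressed in percent.

    Pass glottal periods (s) for local jitter, or per-cycle peak
    amplitudes for local shimmer.
    """
    if len(values) < 2:
        raise ValueError("need at least two cycles")
    diffs = [abs(a - b) for a, b in zip(values, values[1:])]
    mean_diff = sum(diffs) / len(diffs)
    mean_value = sum(values) / len(values)
    return 100.0 * mean_diff / mean_value


def hnr_db(r):
    """Harmonics-to-noise ratio in dB, given the normalised
    autocorrelation r (0 < r < 1) at the lag of one pitch period."""
    if not 0.0 < r < 1.0:
        raise ValueError("r must lie strictly between 0 and 1")
    return 10.0 * math.log10(r / (1.0 - r))


# Hypothetical glottal periods in seconds, for illustration only.
periods = [0.010, 0.012, 0.010]
jitter_percent = local_perturbation(periods)
```

A perfectly periodic signal gives 0% jitter and shimmer, and an autocorrelation peak of 0.5 (equal harmonic and noise energy) gives an HNR of 0 dB. Lower HNR values indicate a noisier voice signal.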
The measures of average f0 and standard deviation of f0
during sustained ah production were equivocal here.
P1 had elevated average f0, counter to the tendency for
reduced pitch with laryngeal spasticity (Duffy, 2005). This
was not likely to be due to perceived mild pitch breaks,
as these were minimal during the ah sample. The average
f0 was 5.2 Hz outside the normal range; possibly the
threshold for perceiving high pitch does not correspond
precisely with the normal range. As predicted, the elevated
variability of P2 supported the perception of irregular pitch
breaks and vocal tremor in sustained ah.
Speech rate and prosody
The measures of speech rate are by no means novel but
are made considerably easier within the visual spectrographic
display of PRAAT. As reported numerous times, all
participants showed slowed rate in all tasks (Duffy, 2005).
The measures of prosody are less widespread. The PVI
is a useful measure that correlates well with perceptions
of stress production in words and connected speech
(Ballard et al., 2010; Low et al., 2000). Our hypotheses
were largely supported, with equal stress, monopitch,
and monoloudness reflected in reduced PVI values. Kim,
Hasegawa, and Perlman (2010) have reported similar
findings in spastic dysarthria from cerebral palsy. The lack
of a significant difference for PVI_f0 and PVI_dB for P1 and
P2 suggests that poor control over syllable/vowel duration
was mainly responsible for the perception of equal stress.
This result is not surprising for P2, as her irregular pitch and
loudness variations were distributed relatively randomly with
respect to the distribution of stress. P1 was perceived to
have monopitch and monoloudness, but this was not borne
out in the PVI measures.
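To make the PVI concrete, here is a minimal sketch that assumes the normalised pairwise variability index in the form usually attributed to Low et al. (2000): the mean absolute difference between adjacent units, each divided by the mean of the pair, multiplied by 100. The exact formulation used in this study is not restated in this section, so the function name `npvi` and the example values are hypothetical.

```python
def npvi(values):
    """Normalised pairwise variability index (percent).

    `values` holds one measurement per successive syllable or vowel,
    e.g. durations in ms for a PVI_Dur-style measure. Each adjacent
    pair contributes |a - b| divided by the mean of the pair.
    """
    pairs = list(zip(values, values[1:]))
    if not pairs:
        raise ValueError("need at least two values")
    total = 0.0
    for a, b in pairs:
        pair_mean = (a + b) / 2.0
        if pair_mean == 0:
            raise ValueError("pair mean must be non-zero")
        total += abs(a - b) / pair_mean
    return 100.0 * total / len(pairs)


# Hypothetical vowel durations in ms, for illustration only.
contrastive_stress = npvi([300, 100, 300])  # strong-weak-strong pattern
equal_stress = npvi([200, 200, 200])        # no duration contrast
```

With these illustrative values, the alternating pattern yields an index of 100 and the equal-duration pattern yields 0. This is the sense in which a reduced PVI_Dur corresponds to the perception of equal stress.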
P3 had significantly reduced PVI for all three measures.
While he was not perceived to have equal or excess stress,
the reduced duration variability may be related to perceived
vowel and consonant prolongations. Such prolongations
are also a feature of acquired apraxia of speech, with
these individuals disproportionately prolonging vowels
in unstressed syllables (Vergis & Ballard, 2012). P3 was
perceived to have consistently reduced pitch variation,
which appeared more related to PVI_f0 than the irregular
pitch variation of P1 and P2.
Conclusions
The aim of this paper was to demonstrate how some
acoustic measurements are within easy reach of standard
speech pathology clinics and can provide quick objective
measures for supporting diagnostic and treatment
decisions. While not all measures match squarely onto
perceptual constructs, there is value in exploring how
different acoustic features may combine to map onto more
holistic percepts. We must also be aware that the inherent
variability of the pathological speech signal and/or
limitations in applying a “generic” software algorithm to
pathological speech may at times yield inaccurate
measurements. The need to use a good quality
microphone, to ensure samples are collected in a quiet
environment, and to standardise recording and analysis
protocols across time points cannot be overstated.
The measures and methods presented here provide the
clinician with a starting point for documenting treatment
effectiveness and accountability in a less subjective manner
than using perceptual measures alone. We hope that,
by documenting some of these methods with illustrative
cases, we may encourage and facilitate translation of these
techniques into clinical practice (Graham et al., 2006) and,
over time, stimulate development of large normative and
patient databases for comparison.
Acknowledgments
The initial stage of this work was conducted while the first
two authors were employed as speech pathologists in the
Brain Injury Rehabilitation Service at the Royal Rehabilitation
Centre Sydney. We thank the three patients for their
participation in the study.
References
Ballard, K. J., Robin, D. A., McCabe, P., & McDonald, J.
(2010). Treating dysprosody in childhood apraxia of speech.
Journal of Speech, Language, and Hearing Research, 53,
1227–1245.
Bhuta, T., Patrick, L., & Garnett, J. D. (2004). Perceptual
evaluation of voice quality and its correlation with acoustic
measurements. Journal of Voice, 18(3), 299–304.
Boersma, P., & Weenink, D. (2010). PRAAT: Doing
phonetics by computer (version 5.1.31). Amsterdam,
Netherlands: Institute of Phonetic Sciences.




