Purpose The purpose of this study was to evaluate the potential for estimating subglottal air pressure using a neck-surface accelerometer and to compare the accuracy of predicting subglottal air pressure relative to predicting acoustic sound pressure level (SPL).
Method Indirect estimates of subglottal pressure (Psg′) were obtained from 10 vocally healthy speakers during loud-to-soft repetitions of 3 different /p/–vowel gestures (/pa/, /pi/, /pu/) at 3 pitch levels in the modal register. Intraoral air pressure, neck-surface acceleration, and radiated acoustic pressure were recorded, and the root-mean-square amplitude of the acceleration signal was correlated with Psg′ and SPL.
Results The coefficient of determination between accelerometer level and Psg′ was high when data were pooled from all vowel and pitch contexts for each participant (r² = .68–.93). These relationships were stronger than corresponding relationships between accelerometer level and SPL (r² = .46–.81). The average 95% prediction interval for estimating Psg′ using accelerometer level was ±2.53 cm H2O, ranging from ±1.70 to ±3.74 cm H2O across participants.
Conclusions Accelerometer signal amplitude correlated more strongly with Psg′ than with SPL. Future work is warranted to investigate the robustness of the relationship in nonmodal voice qualities, individuals with voice disorders, and accelerometer-based ambulatory monitoring of subglottal pressure.
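The analysis summarized above (an amplitude measure of the accelerometer signal regressed against Psg′, reported as r² with a 95% prediction interval) can be illustrated with a minimal sketch. This is not the study's code: the function names, the dB conversion with an arbitrary reference, and the choice of an ordinary least-squares line are all assumptions made for illustration, and the caller supplies the critical t value.

```python
import math


def rms(samples):
    """Root-mean-square amplitude of one analysis window."""
    return math.sqrt(sum(s * s for s in samples) / len(samples))


def level_db(amplitude, reference=1.0):
    """Amplitude expressed as a level in dB (the reference is an assumption)."""
    return 20.0 * math.log10(amplitude / reference)


def fit_line(x, y):
    """Ordinary least-squares line y = intercept + slope * x, with r^2."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    syy = sum((yi - mean_y) ** 2 for yi in y)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    ss_res = sum((yi - (intercept + slope * xi)) ** 2 for xi, yi in zip(x, y))
    return {
        "slope": slope,
        "intercept": intercept,
        "r2": 1.0 - ss_res / syy,
        "n": n,
        "mean_x": mean_x,
        "sxx": sxx,
        # Residual standard deviation with n - 2 degrees of freedom.
        "resid_sd": math.sqrt(ss_res / (n - 2)),
    }


def prediction_half_width(fit, x0, t_crit):
    """Half-width of the prediction interval for a new observation at x0.

    t_crit is the two-sided critical t value for n - 2 degrees of freedom,
    looked up by the caller (hypothetical parameter, not from the study).
    """
    spread = 1.0 + 1.0 / fit["n"] + (x0 - fit["mean_x"]) ** 2 / fit["sxx"]
    return t_crit * fit["resid_sd"] * math.sqrt(spread)
```

With 10 data points, for example, the 95% two-sided critical value for 8 degrees of freedom is 2.306, so `prediction_half_width(fit, x0, 2.306)` gives the ± value in the same units as the pressure estimates.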
Monitoring subglottal neck-surface acceleration has received renewed attention due to the ability of low-profile accelerometers to confidentially and noninvasively track properties related to normal and disordered voice characteristics and behavior. This study investigated the ability of subglottal neck-surface acceleration to yield vocal function measures traditionally derived from the acoustic voice signal and help guide the development of clinically functional accelerometer-based measures from a physiological perspective. Results are reported for 82 adult speakers with voice disorders and 52 adult speakers with normal voices who produced the sustained vowels /A/, /i/, and /u/ at a comfortable pitch and loudness during the simultaneous recording of radiated acoustic pressure and subglottal neck-surface acceleration. As expected, timing-related measures of jitter exhibited the strongest correlation between acoustic and neck-surface acceleration waveforms (r ≈ 0.99), whereas amplitude-based measures of shimmer correlated less strongly (r ≈ 0.74). Additionally, weaker correlations were exhibited by spectral measures of harmonics-to-noise ratio (r ≈ 0.69) and tilt (r ≈ 0.57), whereas the cepstral peak prominence correlated more strongly (r ≈ 0.90). These empirical relationships provide evidence to support the use of accelerometers as effective complements to acoustic recordings in the assessment and monitoring of vocal function in the laboratory, clinic, and during an individual’s daily activities.
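To make the comparison above concrete, the sketch below computes one common definition of local jitter and local shimmer from per-cycle periods and peak amplitudes, then correlates a measure across the two sensors with Pearson's r. The definitions, function names, and input format are assumptions for illustration. The study may have used different formulations, and cycle detection is assumed to have been done beforehand.

```python
import math


def local_perturbation(values):
    """Mean absolute difference between consecutive cycles, divided by the mean.

    Applied to cycle periods this gives local jitter; applied to cycle peak
    amplitudes it gives local shimmer (both as fractions, not percentages).
    """
    diffs = [abs(b - a) for a, b in zip(values, values[1:])]
    mean_diff = sum(diffs) / len(diffs)
    mean_value = sum(values) / len(values)
    return mean_diff / mean_value


def pearson_r(x, y):
    """Pearson correlation between paired measures, e.g. one pair per speaker."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    syy = sum((yi - mean_y) ** 2 for yi in y)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    return sxy / math.sqrt(sxx * syy)
```

In use, `local_perturbation` would be run once on the acoustic signal and once on the accelerometer signal for each speaker, and `pearson_r` applied to the two resulting lists.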
Objectives: Clinical management of phonotraumatic vocal fold lesions (nodules, polyps) is based largely on assumptions that abnormalities in habitual levels of sound pressure level (SPL), fundamental frequency (f0), and/or amount of voice use play a major role in lesion development and chronic persistence. This study used ambulatory voice monitoring to evaluate whether significant differences in voice use exist between patients with phonotraumatic lesions and normal matched controls. Methods: Subjects were 70 adult females: 35 with vocal fold nodules or polyps and 35 age-, sex-, and occupation-matched normal individuals. Weeklong summary statistics of voice use were computed from anterior neck surface acceleration recorded using a smartphone-based ambulatory voice monitor. Results: Paired t tests and Kolmogorov-Smirnov tests resulted in no statistically significant differences between patients and matched controls regarding average measures of SPL, f0, vocal dose measures, and voicing/voice rest periods. Paired t tests comparing f0 variability between the groups resulted in statistically significant differences with moderate effect sizes. Conclusions: Individuals with phonotraumatic lesions did not exhibit differences in average ambulatory measures of vocal behavior when compared with matched controls. More refined characterizations of underlying phonatory mechanisms and other potentially contributing causes are warranted to better understand risk factors associated with phonotraumatic lesions.
Abstract Purpose: The authors discuss the rationale behind the term laryngeal high-speed videoendoscopy to describe the application of high-speed endoscopic imaging techniques to the visualization of vocal fold vibration. Method: Commentary on the advantages of using accurate and consistent terminology in the field of voice research is provided. Specific justification is described for each component of the term high-speed videoendoscopy, which is compared and contrasted with alternative terminologies in the literature. Results: In addition to the ubiquitous high-speed descriptor, the term endoscopy is necessary to specify the appropriate imaging technology and distinguish among modalities such as ultrasound, magnetic resonance imaging, and nonendoscopic optical imaging. Furthermore, the term video critically indicates the electronic recording of a sequence of optical still images representing scenes in motion, in contrast to strobed images using high-speed photography and non-optical high-speed magnetic resonance imaging. High-speed videoendoscopy thus concisely describes the technology and can be appended by the desired anatomical nomenclature such as laryngeal. Conclusions: Laryngeal high-speed videoendoscopy strikes a balance between conciseness and specificity when referring to the typical high-speed imaging method performed on human participants. Guidance for the creation of future terminology provides clarity and context for current and future experiments and the dissemination of results among researchers.
In, the third sentence of the second paragraph in Section III-D should have read as follows: “We first divided data using leave-one-out cross validation (LOOCV) to generate 12 subject subsets, where each subject subset consisted of randomly selected data across the 12 pairs. For each test subset, all windows from the 11 other subsets were then subdivided using fivefold cross validation (1/5th validation and 4/5th training in each fold).”
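The corrected splitting procedure can be sketched as a nested loop: hold out one of the 12 subsets for testing, then divide the pooled windows of the other 11 into five folds for training and validation. This is a minimal illustration, not the authors' code. The function name, the seeded shuffle before folding, and the representation of windows as list items are assumptions.

```python
import random


def nested_splits(subsets, n_folds=5, seed=0):
    """Yield (test_index, fold_index, train, validation, test) tuples.

    subsets: list of subsets, each a list of data windows (hypothetical format).
    The outer loop holds out one subset at a time; the inner loop rotates
    which 1/n_folds share of the remaining windows is used for validation.
    """
    rng = random.Random(seed)
    for test_index, test_windows in enumerate(subsets):
        # Pool every window from the subsets that are not held out.
        pool = [w for i, s in enumerate(subsets) if i != test_index for w in s]
        rng.shuffle(pool)  # shuffling before folding is an assumption
        folds = [pool[k::n_folds] for k in range(n_folds)]
        for fold_index in range(n_folds):
            validation = folds[fold_index]
            train = [w for j, f in enumerate(folds) if j != fold_index for w in f]
            yield test_index, fold_index, train, validation, list(test_windows)
```

With 12 subsets and five folds this yields 60 train/validation/test combinations, and the held-out subset never contributes windows to training or validation.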