Publications by Year: 2021

J. H. Van Stan, et al., “Changes in the Daily Phonotrauma Index following the use of voice therapy as the sole treatment for phonotraumatic vocal hyperfunction in females,” Journal of Speech, Language, and Hearing Research, vol. 64, no. 9, pp. 3446–3455, 2021.
Purpose: The aim of this study was to use the Daily Phonotrauma Index (DPI) to quantify group-based changes in the daily voice use of patients with phonotraumatic vocal hyperfunction (PVH) after receiving voice therapy as the sole treatment. This is part of an ongoing effort to validate an updated theoretical framework for PVH. Method: A custom-designed ambulatory voice monitor was used to collect 1 week of pre- and posttreatment data from 52 female patients with PVH. Normative weeklong data were also obtained from 52 matched controls. Each week was represented by the DPI, which is a combination of neck-surface acceleration magnitude skewness and the standard deviation of the difference between the first and second harmonic magnitudes. Results: Compared to pretreatment, the DPI decreased significantly towards normal in the patient group after treatment (Cohen's d = -0.25). The posttreatment patient group's DPI was still significantly higher than that of the control group (d = 0.68). Conclusions: The DPI showed the pattern of improved ambulatory voice use in a group of patients with PVH following voice therapy that was predicted by the updated theoretical framework. Per the prediction, voice therapy was associated with a decreased potential for phonotrauma in daily voice use, but the posttreatment patient group data were still significantly different from the normative control group data. This posttreatment difference is interpreted as reflecting the impact on voice use of the persistence of phonotrauma-induced structural changes to the vocal folds. Further validation of the DPI is needed to better understand its potential clinical use.
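As a rough illustration of the statistics named in this abstract, the sketch below computes the two distributional inputs to the DPI (skewness of neck-surface acceleration magnitude, standard deviation of H1-H2) and a Cohen's d effect size. It is not the authors' implementation. The abstract does not give the weights that combine the two inputs into the DPI, so no combination is attempted. The function names, the choice of population skewness, and the pooled-standard-deviation form of Cohen's d are all assumptions made for illustration.

```python
import math
from statistics import fmean, pstdev, stdev


def skewness(values):
    """Population skewness: third central moment divided by the cubed standard deviation."""
    mu = fmean(values)
    sd = pstdev(values, mu)
    return fmean([(v - mu) ** 3 for v in values]) / sd ** 3


def dpi_input_features(accel_magnitude, h1_h2):
    """Return the two weeklong statistics the abstract names as DPI inputs.

    Hypothetical helper: the abstract does not state how the two values are
    weighted into a single index, so they are returned separately.
    """
    return skewness(accel_magnitude), stdev(h1_h2)


def cohens_d(group_a, group_b):
    """Cohen's d using a pooled standard deviation (assumed form; the
    abstract does not say which variant was used)."""
    na, nb = len(group_a), len(group_b)
    ma, mb = fmean(group_a), fmean(group_b)
    var_a = sum((x - ma) ** 2 for x in group_a) / (na - 1)
    var_b = sum((x - mb) ** 2 for x in group_b) / (nb - 1)
    pooled_sd = math.sqrt(((na - 1) * var_a + (nb - 1) * var_b) / (na + nb - 2))
    return (ma - mb) / pooled_sd
```

A negative d, as reported for the pre- versus posttreatment comparison, means that the first group's mean is lower than the second's, in units of the pooled standard deviation.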
L. E. Toles, et al., “Differences between female singers with phonotrauma and vocally healthy matched controls in singing and speaking voice use during 1 week of ambulatory monitoring,” American Journal of Speech-Language Pathology, vol. 30, no. 1, pp. 199–209, 2021.
Purpose: Previous ambulatory voice monitoring studies have included many singers and have combined speech and singing in the analyses. This study applied a singing classifier to the ambulatory recordings of singers with phonotrauma and healthy controls to determine if analyzing speech and singing separately would reveal voice use differences that could provide new insights into the etiology and pathophysiology of phonotrauma in this at-risk population. Method: Forty-two female singers with phonotrauma (vocal fold nodules or polyps) and 42 healthy matched controls were monitored using an ambulatory voice monitor. Weeklong statistics (average, standard deviation, skewness, kurtosis) for sound pressure level (SPL), fundamental frequency, cepstral peak prominence, the magnitude ratio of the first two harmonics (H1-H2), and three vocal dose measures were computed from the neck-surface acceleration signal and separated into singing and speech using a singing classifier. Results: Mixed analysis of variance models found expected differences between singing and speech in each voice parameter, except SPL kurtosis. SPL skewness, SPL kurtosis, and all H1-H2 distributional parameters differentiated patients and controls when singing and speech were combined. Interaction effects were found in H1-H2 kurtosis and all vocal dose measures. Patients had significantly higher vocal doses in speech compared to controls. Conclusions: Consistent with prior work, the pathophysiology of phonotrauma in singers is characterized by more abrupt/complete glottal closure (decreased mean and variation for H1-H2) and increased laryngeal forces (negatively skewed SPL distribution) during phonation. Application of a singing classifier to weeklong data revealed that singers with phonotrauma spent more time speaking on a weekly basis, but not more time singing, compared to controls. Results are used as a basis for hypothesizing about the role of speaking voice in the etiology of phonotraumatic vocal hyperfunction in singers.
J. H. Van Stan, et al., “Differences in daily voice use measures between female patients with nonphonotraumatic vocal hyperfunction and matched controls,” Journal of Speech, Language, and Hearing Research, vol. 64, no. 5, pp. 1457–1470, 2021.
Purpose: The purpose of this study was to obtain a more comprehensive understanding of the pathophysiology and impact on daily voice use of nonphonotraumatic vocal hyperfunction (NPVH). Method: An ambulatory voice monitor collected 1 week of data from 36 patients with NPVH and 36 vocally healthy matched controls. A subset of 11 patients with NPVH were monitored after voice therapy. Daily voice use measures included neck-skin acceleration magnitude, fundamental frequency (fo), cepstral peak prominence (CPP), and the difference between the first and second harmonic magnitudes (H1-H2). Additional comparisons included 118 patients with phonotraumatic vocal hyperfunction (PVH) and 89 additional vocally healthy controls. Results: The NPVH group, compared to the matched control group, exhibited increased fo (Cohen's d = 0.6), reduced CPP (d = -0.9), and less positive H1-H2 skewness (d = -1.1). Classifiers used CPP mean and H1-H2 mode to maximally differentiate the NPVH and matched control groups (area under the receiver operating characteristic curve of 0.78). Classifiers performed well on unseen data: the logit decreased in patients with NPVH after therapy; ≥ 85% of the control and PVH groups were identified as "normal" or "not NPVH," respectively. Conclusions: The NPVH group's daily voice use is less periodic (CPP), is higher pitched (fo), and has less abrupt vocal fold closure (H1-H2 skew) compared to the matched control group. The combination of CPP mean and H1-H2 mode appears to reflect a pathophysiological continuum in NPVH patients of inefficient phonation with minimal potential for phonotrauma. Further validation of the classification model is needed to better understand potential clinical uses.
D. D. Mehta, et al., “Direct measurement and modeling of intraglottal, subglottal, and vocal fold collision pressures during phonation in an individual with a hemilaryngectomy,” Applied Sciences, vol. 11, no. 16, Art. no. 7256, 2021.
The purpose of this paper is to report on the first in vivo application of a recently developed transoral, dual-sensor pressure probe that directly measures intraglottal, subglottal, and vocal fold collision pressures during phonation. Synchronous measurement of intraglottal and subglottal pressures was accomplished using two miniature pressure sensors mounted on the end of the probe and inserted transorally in a 78-year-old male who had previously undergone surgical removal of his right vocal fold for treatment of laryngeal cancer. The endoscopist used one hand to position the custom probe against the surgically medialized scar band that replaced the right vocal fold and used the other hand to position a transoral endoscope to record laryngeal high-speed videoendoscopy of the vibrating left vocal fold contacting the pressure probe. Visualization of the larynx during sustained phonation allowed the endoscopist to place the dual-sensor pressure probe such that the proximal sensor was positioned intraglottally and the distal sensor subglottally. The proximal pressure sensor was verified to be in the strike zone of vocal fold collision during phonation when the intraglottal pressure signal exhibited three characteristics: an impulsive peak at the start of the closed phase, a rounded peak during the open phase, and a minimum value around zero immediately preceding the impulsive peak of the subsequent phonatory cycle. Numerical voice production modeling was applied to validate model-based predictions of vocal fold collision pressure using kinematic vocal fold measures. The results successfully demonstrated feasibility of in vivo measurement of vocal fold collision pressure in an individual with a hemilaryngectomy, motivating ongoing data collection that is designed to aid in the development of vocal dose measures that incorporate vocal fold impact collision and stresses.
E. J. Ibarra, et al., “Estimation of subglottal pressure, vocal fold collision pressure, and intrinsic laryngeal muscle activation from neck-surface vibration using a neural network framework and a voice production model,” Frontiers in Physiology, vol. 12, Art. no. 732244, 2021.
The ambulatory assessment of vocal function can be significantly enhanced by having access to physiologically based features that describe underlying pathophysiological mechanisms in individuals with voice disorders. This type of enhancement can improve methods for the prevention, diagnosis, and treatment of behaviorally based voice disorders. Unfortunately, the direct measurement of important vocal features such as subglottal pressure, vocal fold collision pressure, and laryngeal muscle activation is impractical in laboratory and ambulatory settings. In this study, we introduce a method to estimate these features during phonation from a neck-surface vibration signal through a framework that integrates a physiologically relevant model of voice production and machine learning tools. The signal from a neck-surface accelerometer is first processed using subglottal impedance-based inverse filtering to yield an estimate of the unsteady glottal airflow. Seven aerodynamic and acoustic features are extracted from the neck surface accelerometer and an optional microphone signal. A neural network architecture is selected to provide a mapping between the seven input features and subglottal pressure, vocal fold collision pressure, and cricothyroid and thyroarytenoid muscle activation. This non-linear mapping is trained solely with 13,000 Monte Carlo simulations of a voice production model that utilizes a symmetric triangular body-cover model of the vocal folds. The performance of the method was compared against laboratory data from synchronous recordings of oral airflow, intraoral pressure, microphone, and neck-surface vibration in 79 vocally healthy female participants uttering consecutive /pæ/ syllable strings at comfortable, loud, and soft levels. 
The mean absolute error and root-mean-square error for estimating the mean subglottal pressure were 191 Pa (1.95 cm H₂O) and 243 Pa (2.48 cm H₂O), respectively, which are comparable with previous studies but with the key advantage of not requiring subject-specific training and yielding more output measures. The validation of vocal fold collision pressure and laryngeal muscle activation was performed with synthetic values as reference. These initial results provide valuable insight for further vocal fold model refinement and constitute a proof of concept that the proposed machine learning method is a feasible option for providing physiologically relevant measures for laboratory and ambulatory assessment of vocal function.
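The two error metrics and the unit conversion quoted in this abstract can be checked with a short sketch. It is illustrative only and is not taken from the paper. The function names are hypothetical, and the conversion factor is the conventional value of 98.0665 Pa per cm H₂O, which the abstract does not state.

```python
import math

# Conventional conversion factor (assumed; not stated in the abstract).
PA_PER_CM_H2O = 98.0665


def pa_to_cm_h2o(pressure_pa):
    """Convert a pressure from pascals to centimeters of water."""
    return pressure_pa / PA_PER_CM_H2O


def mean_absolute_error(estimates, references):
    """Average absolute difference between estimated and reference values."""
    return sum(abs(e - r) for e, r in zip(estimates, references)) / len(estimates)


def root_mean_square_error(estimates, references):
    """Square root of the average squared difference."""
    squared = [(e - r) ** 2 for e, r in zip(estimates, references)]
    return math.sqrt(sum(squared) / len(squared))


# Reproduce the parenthetical conversions reported in the abstract.
print(round(pa_to_cm_h2o(191), 2))  # 1.95
print(round(pa_to_cm_h2o(243), 2))  # 2.48
```

Because squaring weights large deviations more heavily, the root-mean-square error is never smaller than the mean absolute error, which agrees with the reported 243 Pa versus 191 Pa.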
H. Ghasemzadeh, D. D. Deliyski, R. E. Hillman, and D. D. Mehta, “Method for horizontal calibration of laser-projection transnasal fiberoptic high-speed videoendoscopy,” Applied Sciences, vol. 11, no. 2, Art. no. 822, 2021.

Objective: Calibrated horizontal measurements (e.g., mm) from endoscopic procedures could be utilized for advancement of evidence-based practice and personalized medicine. However, the size of an object in endoscopic images is not readily calibrated and depends on multiple factors, including the distance between the endoscope and the target surface. Additionally, acquired images may have significant non-linear distortion that would further complicate calibrated measurements. This study used a recently developed in-vivo laser-projection fiberoptic laryngoscope and proposes a method for calibrated spatial measurements.

Method: A set of circular grids were recorded at multiple working distances. A statistical model was trained that would map from pixel length of the object, the working distance, and the spatial location of the target object into its mm length.

Result: A detailed analysis of the performance of the proposed method is presented. The analyses have shown that the accuracy of the proposed method does not depend on the working distance and length of the target object. The estimated average magnitude of error was 0.27 mm, which is three times lower than the existing alternative.

Conclusion: The presented method can achieve sub-millimeter accuracy in horizontal measurement.

Significance: Evidence-based practice and personalized medicine could significantly benefit from the proposed method. Implications of the findings for other endoscopic procedures are also discussed.

Keywords: Flexible endoscopy; High-speed videoendoscopy; Horizontal calibrated measurements; Image distortion; Instrumental voice assessment; Laser calibration; Laser projection endoscope.

M. Brockmann-Bauser, J. H. Van Stan, M. Carvalho Sampaio, J. E. Bohlender, R. E. Hillman, and D. D. Mehta, “Effects of vocal intensity and fundamental frequency on cepstral peak prominence in patients with voice disorders and vocally healthy controls,” Journal of Voice, vol. 35, no. 3, pp. 411–417, 2021.


Objectives

Cepstrum-based voice measures, such as smoothed cepstral peak prominence (CPPS), are influenced by voice sound pressure level (SPL) in vocally healthy adults. Since it is unclear if similar effects hold in voice disordered adults and how these interact with natural fundamental frequency (fo) changes, this study examines voice SPL and fo effects on CPPS in women with vocal hyperfunction and vocally healthy controls.

Study Design

Retrospective matched case-control study.


Methods

Fifty-eight women with vocal hyperfunction were individually matched with 58 vocally healthy women for occupation and approximate age. The patient group comprised women exhibiting phonotraumatic vocal hyperfunction associated with vocal fold nodules (n = 39) or polyps (n = 5), and nonphonotraumatic vocal hyperfunction associated with primary muscle tension dysphonia (n = 14). All participants sustained the vowel /a/ at soft, comfortable, and loud loudness conditions. Voice SPL, fo, and CPPS (dB) were computed from acoustic voice recordings using Praat. The effects of loudness condition, measured voice SPL, and fo on CPPS were assessed with linear mixed models. Pairwise correlations among voice SPL, fo, and CPPS were assessed using multiple regression analysis.


Results

Increasing voice SPL correlated significantly (P < 0.001) with higher CPPS in both patient (r² = 0.53) and normative groups (r² = 0.45). fo had statistically significant effects on CPPS (P < 0.001), but with a weak relation for the patient (r² = 0.02) and control groups (r² = 0.05).


Conclusions

In women with and without voice disorder, CPPS is highly affected by the individual's voice SPL in vowel phonation. Future studies could investigate how these effects should be controlled for to improve the diagnostic value of acoustic-based cepstral measures.

D. D. Deliyski, et al., “Laser-calibrated system for transnasal fiberoptic laryngeal high-speed videoendoscopy,” Journal of Voice, vol. 35, no. 1, pp. 122–128, 2021.