Publications

2015
T. F. Quatieri, et al., “Vocal biomarkers to discriminate cognitive load in a working memory task,” Proceedings of InterSpeech, pp. 2684-2688, 2015. Paper
Y.-A. S. Lien, et al., “Voice relative fundamental frequency via neck-skin acceleration in individuals with voice disorders,” Journal of Speech, Language, and Hearing Research, vol. 58, no. 5, pp. 1482-1487, 2015. Publisher's Version

Purpose: This study investigated the use of neck-skin acceleration for relative fundamental frequency (RFF) analysis. Method: Forty individuals with voice disorders associated with vocal hyperfunction and 20 age- and sex-matched control participants were recorded with a subglottal neck-surface accelerometer and a microphone while producing speech stimuli appropriate for RFF. Rater reliabilities, RFF means, and RFF standard deviations derived from the accelerometer were compared with those derived from the microphone. Results: RFF estimated from the accelerometer had slightly higher intrarater reliability and identical interrater reliability compared with values estimated with the microphone. Although sensor type and the Vocal Cycle × Sensor and Vocal Cycle × Sensor × Group interactions showed significant effects on RFF means, the typical RFF pattern could be derived from either sensor. For both sensors, the RFF of individuals with vocal hyperfunction was lower than that of the controls. Sensor type and its interactions did not have significant effects on RFF standard deviations. Conclusions: RFF can be reliably estimated using an accelerometer, but these values cannot be compared with those collected via microphone. Future studies are needed to determine the physiological basis of RFF and examine the effect of sensors on RFF in practical voice assessment and monitoring settings.

Paper
2014
M. L. Cooke, D. D. Mehta, and R. E. Hillman, “Relationships between the Cepstral-Spectral Index of Dysphonia and vocal fold vibratory function during phonation,” Proceedings of the 43rd Annual Symposium of the Voice Foundation: Care of the Professional Voice, 2014. Poster
J. Guðnason, D. D. Mehta, and T. F. Quatieri, “Closed phase estimation for inverse filtering the oral airflow waveform,” Proceedings of the IEEE International Conference on Acoustics, Speech, and Signal Processing, pp. 920-924, 2014.

Glottal closed phase estimation during speech production is critical to inverse filtering and, although addressed for radiated acoustic pressure analysis, must be better understood for the analysis of the oral airflow volume velocity signal that provides important properties of healthy and disordered voices. This paper compares the estimation of the closed phase from the acoustic speech signal and the oral airflow waveform recorded using a pneumotachograph mask. Results are presented for ten adult speakers with normal voices who sustained a set of vowels at a comfortable pitch and loudness. With electroglottography as reference, the identification rate and accuracy of glottal closure instants for the oral airflow are 96.8% and 0.28 ms, whereas these metrics are 99.4% and 0.10 ms for the acoustic signal. We conclude that glottal closure detection is adequate for closed phase inverse filtering but that improvements to detection of glottal opening instants on the oral airflow signal are warranted.

Paper
D. D. Mehta, J. H. Van Stan, and R. E. Hillman, “Deriving acoustic voice quality measures from subglottal neck-surface acceleration,” Proceedings of the International Conference on Voice Physiology and Biomechanics, 2014. Poster
M. Ghassemi, et al., “Learning to detect vocal hyperfunction from ambulatory neck-surface acceleration features: Initial results for vocal fold nodules,” IEEE Transactions on Biomedical Engineering, vol. 61, no. 6, pp. 1668-1675, 2014. Publisher's Version

Voice disorders are medical conditions that often result from vocal abuse/misuse which is referred to generically as vocal hyperfunction. Standard voice assessment approaches cannot accurately determine the actual nature, prevalence, and pathological impact of hyperfunctional vocal behaviors because such behaviors can vary greatly across the course of an individual's typical day and may not be clearly demonstrated during a brief clinical encounter. Thus, it would be clinically valuable to develop noninvasive ambulatory measures that can reliably differentiate vocal hyperfunction from normal patterns of vocal behavior. As an initial step toward this goal we used an accelerometer taped to the neck surface to provide a continuous, noninvasive acceleration signal designed to capture some aspects of vocal behavior related to vocal cord nodules, a common manifestation of vocal hyperfunction. We gathered data from 12 female adult patients diagnosed with vocal fold nodules and 12 control speakers matched for age and occupation. We derived features from weeklong neck-surface acceleration recordings by using distributions of sound pressure level and fundamental frequency over 5-min windows of the acceleration signal and normalized these features so that intersubject comparisons were meaningful. We then used supervised machine learning to show that the two groups exhibit distinct vocal behaviors that can be detected using the acceleration signal. We were able to correctly classify 22 of the 24 subjects, suggesting that in the future measures of the acceleration signal could be used to detect patients with the types of aberrant vocal behaviors that are associated with hyperfunctional voice disorders.

Paper
R. E. Hillman, D. Mehta, J. H. Van Stan, M. Zañartu, M. Ghassemi, and J. V. Guttag, “Subglottal ambulatory monitoring of vocal function to improve voice disorder assessment,” The Journal of the Acoustical Society of America, vol. 136, pp. 2260-2260, 2014.
J. R. Williamson, T. F. Quatieri, B. S. Helfer, G. Ciccarelli, and D. D. Mehta, “Vocal and facial biomarkers of depression based on motor incoordination and timing,” Proceedings of the Fourth International Audio/Visual Emotion Challenge (AVEC 2014), 22nd ACM International Conference on Multimedia, pp. 65-72, 2014. Paper
2013
J. R. Williamson, T. F. Quatieri, B. S. Helfer, R. L. Horwitz, B. Yu, and D. D. Mehta, “Vocal and facial biomarkers of depression based on motor incoordination,” Third International Audio/Visual Emotion Challenge (AVEC 2013), 21st ACM International Conference on Multimedia, pp. 1-4, 2013. Paper
M. Zañartu, J. C. Ho, D. D. Mehta, R. E. Hillman, and G. R. Wodicka, “Acoustic coupling during incomplete glottal closure and its effect on the inverse filtering of oral airflow,” Proceedings of Meetings on Acoustics, vol. 19, pp. 060241-7, 2013. Paper
N. Roy, et al., “Evidence-based clinical voice assessment: A systematic review,” American Journal of Speech-Language Pathology, vol. 22, pp. 212-226, 2013. Publisher's Version

Purpose: To determine what research evidence exists to support the use of voice measures in the clinical assessment of patients with voice disorders. Method: The American Speech-Language-Hearing Association (ASHA) National Center for Evidence-Based Practice in Communication Disorders staff searched 29 databases for peer-reviewed English-language articles between January 1930 and April 2009 that included key words pertaining to objective and subjective voice measures, voice disorders, and diagnostic accuracy. The identified articles were systematically assessed by an ASHA-appointed committee employing a modification of the critical appraisal of diagnostic evidence rating system. Results: One hundred articles met the search criteria. The majority of studies investigated acoustic measures (60%) and focused on how well a test method identified the presence or absence of a voice disorder (78%). Only 17 of the 100 articles were judged to contain adequate evidence for the measures studied to be formally considered for inclusion in clinical voice assessment. Conclusion: Results provide evidence for selected acoustic, laryngeal imaging-based, auditory-perceptual, functional, and aerodynamic measures to be used as effective components in a clinical voice evaluation. However, there is clearly a pressing need for further high-quality research to produce sufficient evidence on which to recommend a comprehensive set of methods for a standard clinical voice evaluation.

Paper
R. E. Hillman, et al., “Future directions in the development of ambulatory monitoring for clinical voice assessment,” Proceedings of the 10th International Conference on Advances in Quantitative Laryngology, Voice and Speech Research, 2013.
D. D. Mehta, et al., “High-speed videomicroscopy and acoustic analysis of ex vivo vocal fold vibratory asymmetry,” Proceedings of the 10th International Conference on Advances in Quantitative Laryngology, Voice and Speech Research, 2013. Paper
D. D. Mehta, M. Zañartu, J. H. Van Stan, S. W. Feng, H. A. Cheyne II, and R. E. Hillman, “Smartphone-based detection of voice disorders by long-term monitoring of neck acceleration features,” Proceedings of the IEEE International Conference on Body Sensor Networks, pp. 1-6, 2013. Paper
M. Zañartu, J. C. Ho, D. D. Mehta, R. E. Hillman, and G. R. Wodicka, “Subglottal impedance-based inverse filtering of voiced sounds using neck surface acceleration,” IEEE Transactions on Audio, Speech, and Language Processing, vol. 21, pp. 1929-1939, 2013.

A model-based inverse filtering scheme is proposed for an accurate, non-invasive estimation of the aerodynamic source of voiced sounds at the glottis. The approach, referred to as subglottal impedance-based inverse filtering (IBIF), takes as input the signal from a lightweight accelerometer placed on the skin over the extrathoracic trachea and yields estimates of glottal airflow and its time derivative, offering important advantages over traditional methods that deal with the supraglottal vocal tract. The proposed scheme is based on mechano-acoustic impedance representations from a physiologically-based transmission line model and a lumped skin surface representation. A subject-specific calibration protocol is used to account for individual adjustments of subglottal impedance parameters and mechanical properties of the skin. Preliminary results for sustained vowels with various voice qualities show that the subglottal IBIF scheme yields comparable estimates with respect to current aerodynamics-based methods of clinical vocal assessment. A mean absolute error of less than 10% was observed for two glottal airflow measures (maximum flow declination rate and amplitude of the modulation component) that have been associated with the pathophysiology of some common voice disorders caused by faulty and/or abusive patterns of vocal behavior (i.e., vocal hyperfunction). The proposed method further advances the ambulatory assessment of vocal function based on the neck acceleration signal, which previously has been limited to the estimation of phonation duration, loudness, and pitch. Subglottal IBIF is also suitable for other ambulatory applications in speech communication, in which further evaluation is underway.

Paper
M. Zañartu, et al., “Toward an objective aerodynamic assessment of vocal hyperfunction using a voice health monitor,” Proceedings of the 8th International Workshop on Models and Analysis of Vocal Emissions for Biomedical Applications, 2013. Paper
2012
D. D. Mehta and R. E. Hillman, “Current role of stroboscopy in laryngeal imaging,” Current Opinion in Otolaryngology & Head and Neck Surgery, vol. 20, no. 6, pp. 429-436, 2012. Publisher's Version

PURPOSE OF REVIEW: To summarize recent technological advancements and insight into the role of stroboscopy in laryngeal imaging. RECENT FINDINGS: Although stroboscopic technology has not undergone major technological improvements, recent clarifications have been made to the application of stroboscopic principles to video-based laryngeal imaging. Also recent advances in coupling stroboscopy with high-definition video cameras provide higher spatial resolution of vocal fold vibratory function during phonation. Studies indicate that the interrater reliability of visual stroboscopic assessment varies depending on the laryngeal feature being rated and that only a subset of features may be needed to be representative of an entire assessment. High-speed videoendoscopy (HSV) judgments have been shown to be more sensitive than stroboscopy for evaluating vocal fold phase asymmetry, pointing to the future potential of complementing stroboscopy with alternative imaging modalities in hybrid systems. Laryngeal videostroboscopy alone continues to play a central role in clinical voice assessment. Even though HSV may provide more detailed information about phonatory function, its eventual clinical adoption will depend on how remaining practical, technical, and methodological challenges will be met. SUMMARY: Laryngeal videostroboscopy continues to be the modality of choice for imaging vocal fold vibration, but technological advancements in HSV and associated research findings are driving increased interest in the clinical adoption of HSV to complement videostroboscopic assessment.

Paper
M. Ghassemi, et al., “Detecting voice modes for vocal hyperfunction prevention,” Proceedings of the 7th Annual Workshop for Women in Machine Learning, 2012.
D. D. Mehta, et al., “Duration of ambulatory monitoring needed to accurately estimate voice use,” Proceedings of InterSpeech: Annual Conference of the International Speech Communication Association, 2012. Paper Poster
D. D. Mehta and R. E. Hillman, “The evolution of methods for imaging vocal fold phonatory function,” Perspectives on Speech Science and Orofacial Disorders, vol. 22, no. 1, pp. 5-13, 2012. Publisher's Version

In this article, we provide a brief summary of the major technological advances that led to current methods for imaging vocal fold vibration during phonation including the development of indirect laryngoscopy, imaging of rapid motion, fiber optics, and digital image capture. We also provide a brief overview of new emerging technologies that could be used in the future for voice research and clinical voice assessment, including advances in laryngeal high-speed videoendoscopy, depth-kymography, and dynamic optical coherence tomography.

Paper
