Publications

2013
M. Zañartu, et al., “Toward an objective aerodynamic assessment of vocal hyperfunction using a voice health monitor,” Proceedings of the 8th International Workshop on Models and Analysis of Vocal Emissions for Biomedical Applications, 2013.
2012
D. D. Mehta and R. E. Hillman, “Current role of stroboscopy in laryngeal imaging,” Current Opinion in Otolaryngology & Head and Neck Surgery, vol. 20, no. 6, pp. 429-436, 2012.

PURPOSE OF REVIEW: To summarize recent technological advancements and insight into the role of stroboscopy in laryngeal imaging. RECENT FINDINGS: Although stroboscopic technology has not undergone major technological improvements, recent clarifications have been made to the application of stroboscopic principles to video-based laryngeal imaging. Also recent advances in coupling stroboscopy with high-definition video cameras provide higher spatial resolution of vocal fold vibratory function during phonation. Studies indicate that the interrater reliability of visual stroboscopic assessment varies depending on the laryngeal feature being rated and that only a subset of features may be needed to be representative of an entire assessment. High-speed videoendoscopy (HSV) judgments have been shown to be more sensitive than stroboscopy for evaluating vocal fold phase asymmetry, pointing to the future potential of complementing stroboscopy with alternative imaging modalities in hybrid systems. Laryngeal videostroboscopy alone continues to play a central role in clinical voice assessment. Even though HSV may provide more detailed information about phonatory function, its eventual clinical adoption will depend on how remaining practical, technical, and methodological challenges will be met. SUMMARY: Laryngeal videostroboscopy continues to be the modality of choice for imaging vocal fold vibration, but technological advancements in HSV and associated research findings are driving increased interest in the clinical adoption of HSV to complement videostroboscopic assessment.

M. Ghassemi, et al., “Detecting voice modes for vocal hyperfunction prevention,” Proceedings of the 7th Annual Workshop for Women in Machine Learning, 2012.
D. D. Mehta, et al., “Duration of ambulatory monitoring needed to accurately estimate voice use,” Proceedings of InterSpeech: Annual Conference of the International Speech Communication Association, 2012.
D. D. Mehta and R. E. Hillman, “The evolution of methods for imaging vocal fold phonatory function,” Perspectives on Speech Science and Orofacial Disorders, vol. 22, no. 1, pp. 5-13, 2012.

In this article, we provide a brief summary of the major technological advances that led to current methods for imaging vocal fold vibration during phonation including the development of indirect laryngoscopy, imaging of rapid motion, fiber optics, and digital image capture. We also provide a brief overview of new emerging technologies that could be used in the future for voice research and clinical voice assessment, including advances in laryngeal high-speed videoendoscopy, depth-kymography, and dynamic optical coherence tomography.

D. D. Mehta, S. M. Zeitels, J. A. Burns, A. D. Friedman, D. D. Deliyski, and R. E. Hillman, “High-speed videoendoscopic analysis of relationships between cepstral-based acoustic measures and voice production mechanisms in patients undergoing phonomicrosurgery,” Annals of Otology, Rhinology, and Laryngology, vol. 121, pp. 341-347, 2012.
D. D. Mehta, D. Rudoy, and P. J. Wolfe, “Kalman-based autoregressive moving average modeling and inference for formant and antiformant tracking,” The Journal of the Acoustical Society of America, vol. 132, no. 3, pp. 1732-1746, 2012.
D. D. Mehta, M. Zañartu, S. W. Feng, H. A. Cheyne II, and R. E. Hillman, “Mobile voice health monitoring using a wearable accelerometer sensor and a smartphone platform,” IEEE Transactions on Biomedical Engineering, vol. 59, no. 11, pp. 3090-3096, 2012.

Many common voice disorders are chronic or recurring conditions that are likely to result from faulty and/or abusive patterns of vocal behavior, referred to generically as vocal hyperfunction. An ongoing goal in clinical voice assessment is the development and use of noninvasively derived measures to quantify and track the daily status of vocal hyperfunction so that the diagnosis and treatment of such behaviorally based voice disorders can be improved. This paper reports on the development of a new, versatile, and cost-effective clinical tool for mobile voice monitoring that acquires the high-bandwidth signal from an accelerometer sensor placed on the neck skin above the collarbone. Using a smartphone as the data acquisition platform, the prototype device provides a user-friendly interface for voice use monitoring, daily sensor calibration, and periodic alert capabilities. Pilot data are reported from three vocally normal speakers and three subjects with voice disorders to demonstrate the potential of the device to yield standard measures of fundamental frequency and sound pressure level and model-based glottal airflow properties. The smartphone-based platform enables future clinical studies for the identification of the best set of measures for differentiating between normal and hyperfunctional patterns of voice use.

2011
R. E. Hillman and D. D. Mehta, “Ambulatory monitoring of daily voice use,” Perspectives on Voice and Voice Disorders, vol. 21, no. 2, pp. 56-61, 2011.
S. S. Karajanagi, et al., “Assessment of canine vocal fold function after injection of a new biomaterial designed to treat phonatory mucosal scarring,” Annals of Otology, Rhinology, and Laryngology, vol. 120, no. 3, pp. 175-184, 2011.
D. D. Mehta, D. D. Deliyski, T. F. Quatieri, and R. E. Hillman, “Automated measurement of vocal fold vibratory asymmetry from high-speed videoendoscopy recordings,” Journal of Speech, Language, and Hearing Research, vol. 54, no. 1, pp. 47-54, 2011.

Purpose: In prior work, a manually derived measure of vocal fold vibratory phase asymmetry correlated to varying degrees with visual judgments made from laryngeal high-speed videoendoscopy (HSV) recordings. This investigation extended this work by establishing an automated HSV-based framework to quantify 3 categories of vocal fold vibratory asymmetry. Method: HSV-based analysis provided for cycle-to-cycle estimates of left-right phase asymmetry, left-right amplitude asymmetry, and axis shift during glottal closure for 52 speakers with no vocal pathology producing comfortable and pressed phonation. An initial cross-validation of the automated left-right phase asymmetry measure was performed by correlating the measure with other objective and subjective assessments of phase asymmetry. Results: Vocal fold vibratory asymmetry was exhibited to a similar extent in both comfortable and pressed phonations. The automated measure of left-right phase asymmetry strongly correlated with manually derived measures and moderately correlated with visual-perceptual ratings. Correlations with the visual-perceptual ratings remained relatively consistent as the automated measure was derived from kymograms taken at different glottal locations. Conclusions: An automated HSV-based framework for the quantification of vocal fold vibratory asymmetry was developed and initially validated. This framework serves as a platform for investigating relationships between vocal fold tissue motion and acoustic measures of voice function.

M. Döllinger, J. B. Kobler, D. A. Berry, D. D. Mehta, G. Luegmair, and C. Bohr, “Experiments on analysing voice production: Excised (human, animal) and in vivo (animal) approaches,” Current Bioinformatics, vol. 6, no. 3, pp. 286-304, 2011.
D. D. Mehta, S. M. Zeitels, J. A. Burns, A. D. Friedman, D. D. Deliyski, and R. E. Hillman, “High-speed videoendoscopic analysis of relationships between cepstral-based acoustic measures and voice production mechanisms in patients undergoing phonomicrosurgery,” Proceedings of the American Laryngological Association, 2011.
D. D. Mehta, M. Zañartu, T. F. Quatieri, D. D. Deliyski, and R. E. Hillman, “Investigating acoustic correlates of human vocal fold vibratory phase asymmetry through modeling and laryngeal high-speed videoendoscopy,” The Journal of the Acoustical Society of America, vol. 130, pp. 3999-4009, 2011.

Vocal fold vibratory asymmetry is often associated with inefficient sound production through its impact on source spectral tilt. This association is investigated in both a computational voice production model and a group of 47 human subjects. The model provides indirect control over the degree of left-right phase asymmetry within a nonlinear source-filter framework, and high-speed videoendoscopy provides in vivo measures of vocal fold vibratory asymmetry. Source spectral tilt measures are estimated from the inverse-filtered spectrum of the simulated and recorded radiated acoustic pressure. As expected, model simulations indicate that increasing left-right phase asymmetry induces steeper spectral tilt. Subject data, however, reveal that none of the vibratory asymmetry measures correlates with spectral tilt measures. Probing further into physiological correlates of spectral tilt that might be affected by asymmetry, the glottal area waveform is parameterized to obtain measures of the open phase (open/plateau quotient) and closing phase (speed/closing quotient). Subjects' left-right phase asymmetry exhibits low, but statistically significant, correlations with speed quotient (r=0.45) and closing quotient (r=-0.39). Results call for future studies into the effect of asymmetric vocal fold vibration on glottal airflow and the associated impact on voice source spectral properties and vocal efficiency.

D. D. Mehta, D. Rudoy, and P. J. Wolfe, “Joint source-filter modeling using flexible basis functions,” Proceedings of the IEEE International Conference on Acoustics, Speech, and Signal Processing, pp. 5888-5891, 2011.
M. Zañartu, D. D. Mehta, J. C. Ho, G. R. Wodicka, and R. E. Hillman, “Observation and analysis of in vivo vocal fold tissue instabilities produced by nonlinear source-filter coupling: A case study,” The Journal of the Acoustical Society of America, vol. 129, pp. 326-339, 2011.
D. D. Mehta, M. Zañartu, T. F. Quatieri, D. D. Deliyski, and R. E. Hillman, “Use of laryngeal high-speed videoendoscopy systems to study voice production mechanisms in human subjects,” Proceedings of the Acoustical Society of America, vol. 130, pp. 2439-2439, 2011.
2010
D. D. Mehta, M. Zañartu, T. F. Quatieri, D. D. Deliyski, and R. E. Hillman, “Acoustic correlates of human vocal fold vibratory phase asymmetry through modeling and laryngeal high-speed videoendoscopy,” Proceedings of the International Conference on Advances in Quantitative Laryngology, 2010.
S. S. Karajanagi, et al., “Assessment of canine vocal fold function after injection of a new biomaterial designed to treat phonatory mucosal scarring,” Proceedings of the American Broncho-Esophagological Association, 2010.
D. D. Mehta, D. D. Deliyski, and R. E. Hillman, “Commentary on why laryngeal stroboscopy really works: Clarifying misconceptions surrounding Talbot's law and the persistence of vision,” Journal of Speech, Language, and Hearing Research, vol. 53, no. 5, pp. 1263-1267, 2010.

PURPOSE: The purpose of this article is to clear up misconceptions that have propagated in the clinical voice literature that inappropriately cite Talbot's law (1834) and the theory of persistence of vision as the scientific principles that underlie laryngeal stroboscopy. METHOD: After initial research into Talbot's (1834) original studies, it became clear that his experiments were not designed to explain why stroboscopy works. Subsequently, a comprehensive literature search was conducted for the purpose of investigating the general principles of stroboscopic imaging from primary sources. RESULTS: Talbot made no reference to stroboscopy in designing his experiments, and the notion of persistence of vision is not applicable to stroboscopic motion. Instead, two visual phenomena play critical roles: (a) the flicker-free perception of light and (b) the perception of apparent motion. In addition, the integration of stroboscopy with video-based technology in today's voice clinic requires additional complexities to include synchronization with camera frame rates. CONCLUSIONS: References to Talbot's law and the persistence of vision are not relevant to the generation of stroboscopic images. The critical visual phenomena are the flicker-free perception of light intensity and the perception of apparent motion from sampled images. A complete understanding of how laryngeal stroboscopy works will aid in better interpreting clinical findings during voice assessment.

