Publications

2012
D. D. Mehta, S. M. Zeitels, J. A. Burns, A. D. Friedman, D. D. Deliyski, and R. E. Hillman, “High-speed videoendoscopic analysis of relationships between cepstral-based acoustic measures and voice production mechanisms in patients undergoing phonomicrosurgery,” Annals of Otology, Rhinology, and Laryngology, vol. 121, pp. 341-347, 2012. Paper
D. D. Mehta, D. Rudoy, and P. J. Wolfe, “Kalman-based autoregressive moving average modeling and inference for formant and antiformant tracking,” The Journal of the Acoustical Society of America, vol. 132, no. 3, pp. 1732-1746, 2012. Publisher's Version Paper Code
D. D. Mehta, M. Zañartu, S. W. Feng, H. A. Cheyne II, and R. E. Hillman, “Mobile voice health monitoring using a wearable accelerometer sensor and a smartphone platform,” IEEE Transactions on Biomedical Engineering, vol. 59, no. 11, pp. 3090-3096, 2012. Publisher's Version Abstract

Many common voice disorders are chronic or recurring conditions that are likely to result from faulty and/or abusive patterns of vocal behavior, referred to generically as vocal hyperfunction. An ongoing goal in clinical voice assessment is the development and use of noninvasively derived measures to quantify and track the daily status of vocal hyperfunction so that the diagnosis and treatment of such behaviorally based voice disorders can be improved. This paper reports on the development of a new, versatile, and cost-effective clinical tool for mobile voice monitoring that acquires the high-bandwidth signal from an accelerometer sensor placed on the neck skin above the collarbone. Using a smartphone as the data acquisition platform, the prototype device provides a user-friendly interface for voice use monitoring, daily sensor calibration, and periodic alert capabilities. Pilot data are reported from three vocally normal speakers and three subjects with voice disorders to demonstrate the potential of the device to yield standard measures of fundamental frequency and sound pressure level and model-based glottal airflow properties. The smartphone-based platform enables future clinical studies for the identification of the best set of measures for differentiating between normal and hyperfunctional patterns of voice use.

Paper
2011
R. E. Hillman and D. D. Mehta, “Ambulatory monitoring of daily voice use,” Perspectives on Voice and Voice Disorders, vol. 21, no. 2, pp. 56-61, 2011. Publisher's Version Paper
S. S. Karajanagi, et al., “Assessment of canine vocal fold function after injection of a new biomaterial designed to treat phonatory mucosal scarring,” Annals of Otology, Rhinology, and Laryngology, vol. 120, no. 3, pp. 175-184, 2011. Publisher's Version Paper
D. D. Mehta, D. D. Deliyski, T. F. Quatieri, and R. E. Hillman, “Automated measurement of vocal fold vibratory asymmetry from high-speed videoendoscopy recordings,” Journal of Speech, Language, and Hearing Research, vol. 54, no. 1, pp. 47-54, 2011. Publisher's Version Abstract

Purpose: In prior work, a manually derived measure of vocal fold vibratory phase asymmetry correlated to varying degrees with visual judgments made from laryngeal high-speed videoendoscopy (HSV) recordings. This investigation extended this work by establishing an automated HSV-based framework to quantify 3 categories of vocal fold vibratory asymmetry. Method: HSV-based analysis provided for cycle-to-cycle estimates of left-right phase asymmetry, left-right amplitude asymmetry, and axis shift during glottal closure for 52 speakers with no vocal pathology producing comfortable and pressed phonation. An initial cross-validation of the automated left-right phase asymmetry measure was performed by correlating the measure with other objective and subjective assessments of phase asymmetry. Results: Vocal fold vibratory asymmetry was exhibited to a similar extent in both comfortable and pressed phonations. The automated measure of left-right phase asymmetry strongly correlated with manually derived measures and moderately correlated with visual-perceptual ratings. Correlations with the visual-perceptual ratings remained relatively consistent as the automated measure was derived from kymograms taken at different glottal locations. Conclusions: An automated HSV-based framework for the quantification of vocal fold vibratory asymmetry was developed and initially validated. This framework serves as a platform for investigating relationships between vocal fold tissue motion and acoustic measures of voice function.

Paper
M. Döllinger, J. B. Kobler, D. A. Berry, D. D. Mehta, G. Luegmair, and C. Bohr, “Experiments on analysing voice production: Excised (human, animal) and in vivo (animal) approaches,” Current Bioinformatics, vol. 6, no. 3, pp. 286-304, 2011. Publisher's Version Paper
D. D. Mehta, S. M. Zeitels, J. A. Burns, A. D. Friedman, D. D. Deliyski, and R. E. Hillman, “High-speed videoendoscopic analysis of relationships between cepstral-based acoustic measures and voice production mechanisms in patients undergoing phonomicrosurgery,” Proceedings of the American Laryngological Association, 2011.
D. D. Mehta, M. Zañartu, T. F. Quatieri, D. D. Deliyski, and R. E. Hillman, “Investigating acoustic correlates of human vocal fold vibratory phase asymmetry through modeling and laryngeal high-speed videoendoscopy,” The Journal of the Acoustical Society of America, vol. 130, pp. 3999-4009, 2011. Abstract

Vocal fold vibratory asymmetry is often associated with inefficient sound production through its impact on source spectral tilt. This association is investigated in both a computational voice production model and a group of 47 human subjects. The model provides indirect control over the degree of left-right phase asymmetry within a nonlinear source-filter framework, and high-speed videoendoscopy provides in vivo measures of vocal fold vibratory asymmetry. Source spectral tilt measures are estimated from the inverse-filtered spectrum of the simulated and recorded radiated acoustic pressure. As expected, model simulations indicate that increasing left-right phase asymmetry induces steeper spectral tilt. Subject data, however, reveal that none of the vibratory asymmetry measures correlates with spectral tilt measures. Probing further into physiological correlates of spectral tilt that might be affected by asymmetry, the glottal area waveform is parameterized to obtain measures of the open phase (open/plateau quotient) and closing phase (speed/closing quotient). Subjects' left-right phase asymmetry exhibits low, but statistically significant, correlations with speed quotient (r=0.45) and closing quotient (r=-0.39). Results call for future studies into the effect of asymmetric vocal fold vibration on glottal airflow and the associated impact on voice source spectral properties and vocal efficiency.

Paper
D. D. Mehta, D. Rudoy, and P. J. Wolfe, “Joint source-filter modeling using flexible basis functions,” Proceedings of the IEEE International Conference on Acoustics, Speech, and Signal Processing, pp. 5888-5891, 2011. Paper
M. Zañartu, D. D. Mehta, J. C. Ho, G. R. Wodicka, and R. E. Hillman, “Observation and analysis of in vivo vocal fold tissue instabilities produced by nonlinear source-filter coupling: A case study,” The Journal of the Acoustical Society of America, vol. 129, pp. 326-339, 2011. Paper
D. D. Mehta, M. Zañartu, T. F. Quatieri, D. D. Deliyski, and R. E. Hillman, “Use of laryngeal high-speed videoendoscopy systems to study voice production mechanisms in human subjects,” Proceedings of the Acoustical Society of America, vol. 130, p. 2439, 2011.
2010
D. D. Mehta, M. Zañartu, T. F. Quatieri, D. D. Deliyski, and R. E. Hillman, “Acoustic correlates of human vocal fold vibratory phase asymmetry through modeling and laryngeal high-speed videoendoscopy,” Proceedings of the International Conference on Advances in Quantitative Laryngology, 2010.
S. S. Karajanagi, et al., “Assessment of canine vocal fold function after injection of a new biomaterial designed to treat phonatory mucosal scarring,” Proceedings of the American Broncho-Esophagological Association, 2010.
D. D. Mehta, D. D. Deliyski, and R. E. Hillman, “Commentary on why laryngeal stroboscopy really works: Clarifying misconceptions surrounding Talbot's law and the persistence of vision,” Journal of Speech, Language, and Hearing Research, vol. 53, no. 5, pp. 1263-1267, 2010. Publisher's Version Abstract

PURPOSE: The purpose of this article is to clear up misconceptions that have propagated in the clinical voice literature that inappropriately cite Talbot's law (1834) and the theory of persistence of vision as the scientific principles that underlie laryngeal stroboscopy. METHOD: After initial research into Talbot's (1834) original studies, it became clear that his experiments were not designed to explain why stroboscopy works. Subsequently, a comprehensive literature search was conducted for the purpose of investigating the general principles of stroboscopic imaging from primary sources. RESULTS: Talbot made no reference to stroboscopy in designing his experiments, and the notion of persistence of vision is not applicable to stroboscopic motion. Instead, two visual phenomena play critical roles: (a) the flicker-free perception of light and (b) the perception of apparent motion. In addition, the integration of stroboscopy with video-based technology in today's voice clinic requires additional complexities to include synchronization with camera frame rates. CONCLUSIONS: References to Talbot's law and the persistence of vision are not relevant to the generation of stroboscopic images. The critical visual phenomena are the flicker-free perception of light intensity and the perception of apparent motion from sampled images. A complete understanding of how laryngeal stroboscopy works will aid in better interpreting clinical findings during voice assessment.

Paper
D. D. Mehta, R. E. Hillman, and T. F. Quatieri, “Impact of human vocal fold vibratory asymmetries on acoustic characteristics of sustained vowel phonation,” Massachusetts Institute of Technology, 2010. Thesis
R. E. Hillman and D. D. Mehta, “The science of stroboscopic imaging,” K. A. Kendall and R. J. Leonard, Eds. New York, NY: Thieme Medical Publishers, Inc., 2010, pp. 101-109. Publisher's Version
D. D. Mehta, D. D. Deliyski, S. M. Zeitels, T. F. Quatieri, and R. E. Hillman, “Voice production mechanisms following phonosurgical treatment of early glottic cancer,” Annals of Otology, Rhinology, and Laryngology, vol. 119, pp. 1-9, 2010. Publisher's Version Abstract

Objectives: Although near-normal conversational voices can be achieved with the phonosurgical management of early glottic cancer, there are still acoustic and aerodynamic deficits in vocal function that must be better understood to help further optimize phonosurgical interventions. Stroboscopic assessment is inadequate for this purpose. Methods: A newly developed color high-speed videoendoscopy (HSV) system that included time-synchronized recordings of the acoustic signal was used to perform a detailed examination of voice production mechanisms in 14 subjects. Digital image processing techniques were used to quantify glottal phonatory function and to delineate relationships between vocal fold vibratory properties and acoustic perturbation measures. Results: The results for multiple measurements of vibratory asymmetry showed that 31% to 62% of subjects displayed higher-than-normal average values, whereas the mean values for glottal closure duration (open quotient) and periodicity of vibration fell within normal limits. The average HSV-based measures did not correlate significantly with the acoustic perturbation measures, but moderate correlations were exhibited between the acoustic measures and the SDs of the HSV-based parameters. Conclusions: The use of simultaneous, time-synchronized HSV and acoustic recordings can provide new insights into postoperative voice production mechanisms that cannot be obtained with stroboscopic assessment.

Paper
2009
M. Zañartu, J. C. Ho, D. D. Mehta, R. E. Hillman, and G. R. Wodicka, “An impedance-based inverse filtering scheme with glottal coupling,” Proceedings of the Acoustical Society of America, 2009.
D. D. Mehta, D. D. Deliyski, S. M. Zeitels, M. Zañartu, and R. E. Hillman, “Integration of ultra high-speed color videoendoscopy with time-synchronized measures of vocal function,” Proceedings of The Triological Society (Eastern Section), 2009. Poster
