《Speaker height estimation from speech: Fusing spectral regression and statistical acoustic models: The Journal of the Acoustical Society of America: Vol 138, No 2》

  • Source topic: Underwater Acoustics Information Monitoring
  • Compiled by: ioalib
  • Published: 2016-12-20
  • Estimating speaker height can assist in voice forensic analysis and provide additional side knowledge to benefit automatic speaker identification or acoustic model selection for automatic speech recognition. In this study, a statistical approach to height estimation that incorporates acoustic models within a non-uniform height bin width Gaussian mixture model structure, as well as a formant analysis approach that employs linear regression on selected phones, are presented. The accuracy and trade-offs of these systems are explored by examining the consistency of the results, the location and causes of error, and a fusion of the two systems, using data from the TIMIT corpus. Open set testing is also presented using the Multi-session Audio Research Project corpus and publicly available YouTube audio to examine the effect of channel mismatch between training and testing data and to provide a realistic open domain testing scenario. The proposed algorithms achieve performance that is highly competitive with previously published results. Although the different data partitioning in the literature and this study may prevent performance comparisons in absolute terms, the mean average error of 4.89 cm for males and 4.55 cm for females provided by the proposed algorithm on TIMIT utterances containing selected phones suggests a considerable decrease in estimation error compared to past efforts.

Related reports
  • 《Acoustics of Italian Historical Opera Houses: The Journal of the Acoustical Society of America: Vol 138, No 2》

    • Source topic: Underwater Acoustics Information Monitoring
    • Compiled by: ioalib
    • Published: 2016-12-20
    • Opera houses represent a large group of performance spaces characterized by great complexity and, at the same time, versatility with respect to different usage (from opera to symphonic music and ballet). This kind of building originated in Italy during the 17th century and later spread across the country and then Europe and the rest of the world, slowly evolving into modern theatre shapes. As a consequence of the changes undergone by the interior space, the original acoustic features, which likely influenced many composers, experienced important variations. Thanks to acoustic measurement campaigns inside Italian Historical Opera Houses, promoted by National and Regional Projects, the distinctive features of these spaces were investigated in comparison to modern spaces. In this work, the newly acquired data are merged with data in the literature in order to present and discuss some of the distinctive acoustic features of historical spaces as regards their original function. Moreover, specific issues such as listening in stalls and boxes and the criteria governing the preference judgment of listeners are considered. The concept and the crucial role of the balance between stage and pit sources are also discussed by means of previous literature studies.
  • 《Numerical analysis of three-dimensional acoustic propagation in the Catoche Tongue: The Journal of the Acoustical Society of America: Vol 138, No 4》

    • Source topic: Underwater Acoustics Information Monitoring
    • Compiled by: ioalib
    • Published: 2016-12-20
    • Analysis of modeled time series data is presented to provide insight into propagation physics of horizontally refracted sound in the Catoche Tongue region of the Gulf of Mexico. The analysis is motivated by the observation of out-of-plane arrivals in measured time series data. In particular, the extended duration of the refracted arrivals is shown to be caused by interaction with multiple locations along the steep sides of the Tongue. Comparison of the modeled time series is made to previous work by Sturm [J. Acoust. Soc. Am. 117(3), 1058–1079 (2005)], who examined the frequency dependence of out-of-plane modal arrivals for the wedge-shaped ocean.