《Relative contribution of envelope and fine structure to the subcortical encoding of noise-degraded speech》

  • Source topic: 水声领域信息监测 (Underwater Acoustics Information Monitoring)
  • Published: 2016-11-14
  • Brainstem frequency-following responses (FFRs) were elicited by the speech token /ama/ in noise containing only envelope (ENV) or temporal fine structure (TFS) cues to assess the relative contribution of these temporal features to the neural encoding of degraded speech. Successive cue removal weakened FFRs, with noise having the most deleterious effect on TFS coding. Neuro-acoustic and response-to-response correlations revealed that speech-FFRs are dominated by stimulus ENV for clean speech, with TFS making a stronger contribution at moderate noise levels. Results suggest that the relative weighting of temporal ENV and TFS cues in the neural transcription of speech depends critically on the degree of noise in the soundscape.
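The ENV/TFS split described above is conventionally obtained from the analytic signal: the instantaneous amplitude gives the envelope and the unit-amplitude cosine of the instantaneous phase gives the fine structure. A minimal sketch (the paper's actual stimuli were likely generated with a filterbank-based vocoder; the function name here is illustrative):

```python
import numpy as np
from scipy.signal import hilbert

def env_tfs(x):
    """Split a signal into temporal envelope (ENV) and temporal
    fine structure (TFS) via the Hilbert analytic signal."""
    analytic = hilbert(x)
    env = np.abs(analytic)            # instantaneous amplitude (ENV)
    tfs = np.cos(np.angle(analytic))  # unit-amplitude carrier (TFS)
    return env, tfs

# demo: an amplitude-modulated tone
fs = 16000
t = np.arange(0, 0.1, 1.0 / fs)
carrier = np.cos(2 * np.pi * 1000 * t)
modulator = 1.0 + 0.5 * np.cos(2 * np.pi * 10 * t)
x = modulator * carrier

env, tfs = env_tfs(x)
# env tracks the 10-Hz modulator; env * tfs reconstructs x exactly,
# since |a|·cos(arg a) is just the real part of the analytic signal
```

Note the reconstruction identity `env * tfs == x` only holds for a broadband signal taken as a whole; ENV- or TFS-only stimuli keep one factor and discard the other.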

Related reports
  • 《Predicting speech intelligibility based on a correlation metric in the envelope power spectrum domain》

    • Source topic: 水声领域信息监测 (Underwater Acoustics Information Monitoring)
    • Published: 2016-11-14
    • A speech intelligibility prediction model is proposed that combines the auditory processing front end of the multi-resolution speech-based envelope power spectrum model [mr-sEPSM; Jørgensen, Ewert, and Dau (2013). J. Acoust. Soc. Am. 134(1), 436–446] with a correlation back end inspired by the short-time objective intelligibility measure [STOI; Taal, Hendriks, Heusdens, and Jensen (2011). IEEE Trans. Audio Speech Lang. Process. 19(7), 2125–2136]. This “hybrid” model, named sEPSMcorr, is shown to account for the effects of stationary and fluctuating additive interferers as well as for the effects of non-linear distortions, such as spectral subtraction, phase jitter, and ideal time-frequency segregation (ITFS). The model shows a broader predictive range than both the original mr-sEPSM (which fails in the phase-jitter and ITFS conditions) and STOI (which fails to predict the influence of fluctuating interferers), albeit with lower accuracy than the source models in some individual conditions. Similar to other models that employ a short-term correlation-based back end, including STOI, the proposed model fails to account for the effects of room reverberation on speech intelligibility. Overall, the model might be valuable for evaluating the effects of a large range of interferers and distortions on speech intelligibility, including consequences of hearing impairment and hearing-instrument signal processing.
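The "correlation back end" family this abstract refers to scores intelligibility by correlating clean and degraded internal representations over short windows and averaging. A toy sketch of that idea, operating on generic envelope vectors (this is not the published sEPSMcorr or STOI implementation; window length and representation are placeholder assumptions):

```python
import numpy as np

def short_time_correlation(clean, degraded, win=30):
    """Mean short-time Pearson correlation between a clean and a
    degraded envelope representation (STOI-style back end)."""
    scores = []
    for start in range(0, len(clean) - win + 1, win):
        c = clean[start:start + win]
        d = degraded[start:start + win]
        c = c - c.mean()
        d = d - d.mean()
        denom = np.linalg.norm(c) * np.linalg.norm(d)
        if denom > 0.0:                       # skip silent windows
            scores.append(float(np.dot(c, d) / denom))
    return float(np.mean(scores)) if scores else 0.0

# demo: a slow "speech envelope" and a noise-corrupted copy
rng = np.random.default_rng(0)
t = np.linspace(0.0, 1.0, 300)
clean_env = 1.0 + np.sin(2 * np.pi * 4 * t)
noisy_env = clean_env + 0.5 * rng.standard_normal(t.size)
# identical inputs score 1.0; added noise lowers the score
```

The short-window averaging is what lets such back ends track fluctuating interferers, and, as the abstract notes, it is also why they struggle with reverberation, whose smearing extends across window boundaries.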
  • 《Influences of noise-interruption and information-bearing acoustic changes on understanding simulated electric-acoustic speech》

    • Source topic: 水声领域信息监测 (Underwater Acoustics Information Monitoring)
    • Published: 2016-11-25
    • In simulations of electric-acoustic stimulation (EAS), vocoded speech intelligibility is aided by preservation of low-frequency acoustic cues. However, the speech signal is often interrupted in everyday listening conditions, and the effects of interruption on hybrid speech intelligibility are poorly understood. Additionally, listeners rely on information-bearing acoustic changes to understand full-spectrum speech (as measured by cochlea-scaled entropy [CSE]) and vocoded speech (CSECI), but how listeners utilize these informational changes to understand EAS speech is unclear. Here, normal-hearing participants heard noise-vocoded sentences with three to six spectral channels in two conditions: vocoder-only (80–8000 Hz) and simulated hybrid EAS (vocoded above 500 Hz; original acoustic signal below 500 Hz). In each sentence, four 80-ms intervals containing high-CSECI or low-CSECI acoustic changes were replaced with speech-shaped noise. As expected, performance improved with the preservation of low-frequency fine-structure cues (EAS). This improvement decreased for continuous EAS sentences as more spectral channels were added, but increased as more channels were added to noise-interrupted EAS sentences. Performance was impaired more when high-CSECI intervals were replaced by noise than when low-CSECI intervals were replaced, but this pattern did not differ across listening modes. Utilizing information-bearing acoustic changes to understand speech is predicted to generalize to cochlear implant users who receive EAS inputs.
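The EAS condition described above combines an unprocessed low band with a noise vocoder for the band above it. A rough sketch of that signal chain, assuming a 500-Hz crossover as in the abstract (filter orders, channel spacing, and the `simulate_eas` name are illustrative assumptions, not the authors' implementation):

```python
import numpy as np
from scipy.signal import butter, sosfiltfilt, hilbert

def simulate_eas(x, fs, n_channels=4, cutoff=500.0, top=7000.0, seed=0):
    """Rough EAS simulation: pass the original signal below `cutoff`
    unchanged and noise-vocode the band above it."""
    rng = np.random.default_rng(seed)
    # "acoustic" part: low-pass the original signal
    sos_lp = butter(4, cutoff, btype="lowpass", fs=fs, output="sos")
    acoustic = sosfiltfilt(sos_lp, x)
    # "electric" part: channel envelopes modulate band-limited noise
    edges = np.geomspace(cutoff, top, n_channels + 1)
    vocoded = np.zeros_like(x)
    for lo, hi in zip(edges[:-1], edges[1:]):
        sos = butter(4, [lo, hi], btype="bandpass", fs=fs, output="sos")
        band = sosfiltfilt(sos, x)
        env = np.abs(hilbert(band))                      # channel envelope
        carrier = sosfiltfilt(sos, rng.standard_normal(x.size))
        vocoded += env * carrier
    return acoustic + vocoded

# demo: a two-tone signal spanning the acoustic and vocoded regions
fs = 16000
t = np.arange(0, 0.2, 1.0 / fs)
x = np.sin(2 * np.pi * 300 * t) + 0.5 * np.sin(2 * np.pi * 2000 * t)
y = simulate_eas(x, fs)
```

In this sketch the 300-Hz component survives intact through the low-pass path, while the 2000-Hz component is reduced to its envelope imposed on noise, mirroring the information available to an EAS listener.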