Binaural rendering of spherical microphone array recordings by directly synthesizing the spatial pattern of the head-related transfer function

Shuichi Sakamoto, César Salvador, Jorge Treviño, Yôiti Suzuki

Research output: Contribution to conference › Paper › peer-review

Abstract

Binaural technologies can convey rich spatial auditory information to listeners using simple equipment such as headphones. Advanced binaural recording and reproduction methods use spherical microphone arrays and head-related transfer function (HRTF) datasets. Mainstream techniques, such as binaural Ambisonics, characterize the recorded sound field as a weighted sum of spherical harmonic functions. In contrast, this research seeks to generate individualized binaural signals directly from the microphone recordings, without relying on intermediate sound field representations. The approach, known as SENZI, applies a set of weighting filters to the recorded microphone signals, resulting in the target spatial pattern defined by the HRTF dataset. The proposal therefore requires finding the appropriate weighting filters by inverting a linear system. Binaural synthesis methods based on the solution to an inverse problem belong to one of two categories: HRTF modeling (type 1) or microphone signal modeling (type 2). The SENZI method considered here belongs to the HRTF modeling category. In addition, the problem is generally over- or underdetermined, depending on the number of microphones in the array and the number of HRTFs in the dataset. This also affects the accuracy of the synthesized binaural signals. A design problem, therefore, is to choose the most appropriate number of microphones and HRTFs. Fortunately, large HRTF datasets, as well as massively multi-channel arrays, are now available. An example of the latter is a real-time implementation of the SENZI method using a 252-channel spherical microphone array and an FPGA-based processing subsystem. This research evaluates the binaural synthesis accuracy in relation to the number of microphones and HRTFs used to derive the weighting filters. Numerical simulations show that underdetermined systems generally yield better results than overdetermined ones.
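
The inverse problem described in the abstract can be illustrated with a minimal sketch for one ear and one frequency bin. Everything here is an assumption made for illustration and is not taken from the paper: the matrix C (array response of M microphones to D directions), the target vector h (HRTF values for the same D directions), the Tikhonov regularization constant reg, the helper names, and the random placeholder data. The sketch shows only how the weights could be obtained in the overdetermined (D > M) and underdetermined (D < M) cases. It is not the authors' implementation and does not reproduce their simulations.

```python
import numpy as np


def solve_weights(C, h, reg=1e-6):
    """Tikhonov-regularized least-squares solution of C @ w ~= h.

    C   : (D, M) complex array, response of M microphones to D directions
    h   : (D,) complex array, target HRTF values for the same D directions
    reg : small positive regularization constant (hypothetical default)
    """
    D, M = C.shape
    CH = C.conj().T
    if D >= M:
        # More directions than microphones: solve the M x M normal equations.
        return np.linalg.solve(CH @ C + reg * np.eye(M), CH @ h)
    # Fewer directions than microphones: solve the equivalent D x D system,
    # which gives the same regularized solution at lower cost.
    return CH @ np.linalg.solve(C @ CH + reg * np.eye(D), h)


def relative_error(C, h, w):
    """Relative misfit between the synthesized pattern C @ w and the target h."""
    return np.linalg.norm(C @ w - h) / np.linalg.norm(h)


def random_system(rng, D, M):
    """Placeholder data only: random complex values, not a real array or HRTF set."""
    C = rng.standard_normal((D, M)) + 1j * rng.standard_normal((D, M))
    h = rng.standard_normal(D) + 1j * rng.standard_normal(D)
    return C, h


rng = np.random.default_rng(0)

# Overdetermined: 64 HRTF directions, 16 microphones.
C_over, h_over = random_system(rng, D=64, M=16)
w_over = solve_weights(C_over, h_over)
err_over = relative_error(C_over, h_over, w_over)

# Underdetermined: 16 HRTF directions, 64 microphones.
C_under, h_under = random_system(rng, D=16, M=64)
w_under = solve_weights(C_under, h_under)
err_under = relative_error(C_under, h_under, w_under)

print(f"overdetermined misfit:  {err_over:.3f}")
print(f"underdetermined misfit: {err_under:.2e}")
```

With weights of this kind, the ear signal in that frequency bin would be the weighted sum of the microphone spectra, for example np.dot(p, w) for a hypothetical length-M vector p of microphone spectra. Note that the misfit computed here is measured only at the directions used to derive the weights, on random data, so it illustrates the algebra and says nothing about synthesis accuracy at other directions or about the results reported in the paper.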

Original language: English
Status: Published - 2017
Externally published
Event: 24th International Congress on Sound and Vibration, ICSV 2017 - London, United Kingdom
Duration: 23 Jul 2017 to 27 Jul 2017

Conference

Conference: 24th International Congress on Sound and Vibration, ICSV 2017
Country/Territory: United Kingdom
City: London
Period: 23/07/17 to 27/07/17
