Abstract
The Moving Picture Experts Group 1 (MPEG-1) perceptual audio compression scheme is a successful family of audio codecs described in the standard ISO/IEC 11172-3. Currently, there is no general framework for emulating either the MPEG-1 psychoacoustic model or any other psychoacoustic model, even though such models are a core component of many perceptual codecs. This work presents a successful implementation of a convolutional neural network that emulates psychoacoustic model 1 from the MPEG-1 standard, termed “MCNN-PM” (Multiscale Convolutional Neural Network – Psychoacoustic Model). The network is then integrated into the MPEG-1 Layer I codec. Using the objective difference grade (ODG) to evaluate audio quality, the MCNN-PM MPEG-1 Layer I codec outperforms the original MPEG-1 Layer I codec by up to 17% at 96 kbps and 14% at 128 kbps, and performs almost equally at 192 kbps. This work shows that convolutional neural networks are a viable alternative to standard psychoacoustic models and can be used successfully as part of perceptual audio codecs.
| Original language | English |
|---|---|
| Pages (from-to) | 6963-6974 |
| Number of pages | 12 |
| Journal | Multimedia Tools and Applications |
| Volume | 83 |
| Issue number | 3 |
| DOI | |
| Status | Published - Jan 2024 |
Fingerprint
Dive into the research topics of 'MPEG-1 psychoacoustic model emulation using multiscale convolutional neural networks'. Together they form a unique fingerprint.
Press/Media
- Research Data from Peruvian University of Applied Sciences Update Understanding of Networks (Mpeg-1 Psychoacoustic Model Emulation Using Multiscale Convolutional Neural Networks)
Kemper Vasquez, G. L. & Sanchez Huapaya, A. S.
7/07/23
1 item of media coverage