IMPACT CASE STUDY

Sound check: audio quality technologies for entertainment, communication, and accessibility

  • 22 January 2025
  • Dr Andrew Hines
  • Academic, Cultural, Economic, Health, Technological

 

Summary

Audio quality plays a crucial role in our daily lives, significantly impacting the human experience. High-quality audio enhances our enjoyment of music, movies, and online content, making entertainment more immersive and enjoyable. Audio quality is also the cornerstone of communication, ensuring that video calls and voice messages are clear and intelligible. Bad audio can cause issues such as miscommunication of important information and mental listening fatigue. For a brand to remain trusted, it must be able to deliver high-quality audio on a large scale. This requires new technologies to continuously monitor audio quality.

Dr Hines has developed a suite of software tools for testing audio quality at scale: ViSQOL (2015), AMBIQUAL (2018) and Go Listen (2020). These tools have been widely adopted by some of the largest companies in the world, including Google, Meta and many others. ViSQOL now processes the top 30k YouTube videos every day to ensure audio quality meets the required standards before 1 billion YouTube viewers consume the content. In 2018, ViSQOL won the US-Ireland Research Innovation Award for this collaboration with Google. Dr Hines’s subjective listening test platform, Go Listen, has also seen wide adoption from researchers in industry and academia. Dr Hines’s internal metrics show that more than 300 researchers have now conducted over 1000 experiments using this technology, involving more than 6000 subjective listeners.

Research description

In 2012, Dr Hines began a collaboration with Google that led to the development of ViSQOL. ViSQOL is a software algorithm that evaluates the quality of speech or music and estimates how a human listener would perceive the audio quality. As a result, millions of hours of audio content can be assessed without the need for human evaluations. Until 2018, no technology existed to evaluate immersive spatial audio, so the team developed AMBIQUAL, a first-of-its-kind model to predict audio quality for surround sound spatial audio.

To validate models like ViSQOL or any audio enhancement technologies, it is necessary to conduct subjective listening tests with human listeners. In 2020, the team developed Go Listen, a web-based subjective listening test platform which allows audio researchers anywhere in the world to create and share subjective listening tests in minutes. Dr Hines’s future work around audio quality will focus on audio enhancement for hearing-impaired listeners and augmented reality systems.
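As an illustration of what such a validation can involve, the sketch below averages listener ratings into a mean opinion score per clip and measures how closely a model's predictions track those scores using Pearson correlation. It is a minimal sketch under stated assumptions, not the team's validation procedure: the 1 to 5 rating scale, the clip names, the ratings and the predicted scores are all hypothetical.

```python
from math import sqrt
from statistics import mean


def mean_opinion_scores(ratings_by_clip):
    """Average the listener ratings (assumed 1-5 scale) for each clip."""
    return {clip: mean(ratings) for clip, ratings in ratings_by_clip.items()}


def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    mx, my = mean(xs), mean(ys)
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = sqrt(sum((x - mx) ** 2 for x in xs))
    sy = sqrt(sum((y - my) ** 2 for y in ys))
    return cov / (sx * sy)


# Hypothetical listening-test ratings: three listeners per clip.
ratings = {
    "clip_a": [5, 4, 5],
    "clip_b": [3, 3, 4],
    "clip_c": [1, 2, 2],
}

# Hypothetical scores predicted by an objective quality model for the same clips.
predicted = {"clip_a": 4.5, "clip_b": 3.4, "clip_c": 1.9}

mos = mean_opinion_scores(ratings)
clips = sorted(mos)
r = pearson([mos[c] for c in clips], [predicted[c] for c in clips])

print(f"Correlation between subjective and predicted quality: {r:.3f}")
```

A correlation close to 1 would suggest that the model ranks and scales clips much as the listeners did. Real studies typically use many more clips and listeners and report additional statistics.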

Research team and collaborators

 

  • Dr Andrew Hines, Dr Dan Barry, Dr Alessandro Ragano, Qijian Zhang, Pheobe Wenyi Sun - School of Computer Science - Insight Centre for Data Analytics - University College Dublin;
  • Prof. Anil Kokaram, Prof. Naomi Harte - Sigmedia Group, School of Engineering, Trinity College Dublin, Ireland;
  • Dr Jan Skoglund, Andrew Allen, Dr Damien Kelly, Dr Michael Chinen, Felicia Lim, Nikita Gureev - Google, San Francisco, CA, USA;
  • Dr Miroslaw Narbutt - School of Computing, Technological University Dublin, Ireland;
  • Prof. Peter Pocta - Department of Telecommunications and Multimedia, FEE, University of Zilina, Slovakia;
  • Dr Hugh Melvin - College of Engineering & Informatics, National University of Ireland, Galway, Ireland.

 

Funding

 

  • The research has been funded through a variety of awards: Google’s University Programme, two Google Research Unrestricted Gifts, Enterprise Ireland IPP, an SFI Centre co-funded project and an EU Marie Curie Fellowship. Following collaboration with product groups based in Google, such as Chrome and YouTube, the team have competed for continued funding on an annual basis. The team had to demonstrate the success of previous projects in subsequent proposals through tangible impacts such as return on investment, patents, and integration of tools into products or pipelines. Co-funded projects allowed multi-year engagements to be established. The work has been funded since 2012, with current projects funded until 2026.
  • Google: (2012) USD$127,055; (2019) €139,300; (2023) €100,000; (2024) €100,000
  • SFI Connect-Google co-fund: (2013) €238,166; (2017) €276,967
  • SFI Insight: (2019) €642,638
  • EU MSCA Fellowship: (2024) €215,534

Research impact

Technological impact

Dr Hines’s audio quality testing tools have now been widely adopted by some of the largest companies in the world. Some recognisable names actively using the technology in research and product development include: Google, Meta, Apple, Microsoft, Sony, Dolby (Spatial Audio), Nvidia (Consumer Electronics), Starkey (Hearing Aids), Mathworks (MATLAB R&D Software) and many others from startups to multinationals.

Google is the most prolific user of the technology and ViSQOL has been adopted as a quality metric across multiple product groups within the company. Its uses range from developing new ways to compress and transmit audio for Google Meet, so that files sound clear and take up less space, to quality testing on YouTube and designing codecs for Pixel phones. More than half of YouTube’s billion daily views come from mobile devices, and uploaded content needs to be converted into many different formats to accommodate the various devices and network conditions. This conversion can degrade quality, so content must be monitored daily to ensure quality standards are met.

ViSQOL is deployed within YouTube as an automated regression testing tool that can provide quality warning alerts. Essentially, ViSQOL ‘listens’ to the top 30k YouTube videos each day, in all formats, to ensure audio quality meets the required standards before 1 billion+ YouTube viewers consume the content.
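The idea of an automated quality regression check can be sketched as follows. This is a minimal illustration under stated assumptions and does not describe YouTube's actual pipeline: the score scale (roughly 1 to 5), the thresholds `max_drop` and `min_score`, the video identifiers, the format names and the scores themselves are all hypothetical.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class QualityAlert:
    video_id: str
    fmt: str
    baseline: Optional[float]
    current: float


def find_regressions(baseline, current, max_drop=0.3, min_score=3.0):
    """Flag (video, format) pairs whose quality score is too low or has dropped.

    `baseline` and `current` map (video_id, format) to a quality score.
    Both thresholds are illustrative assumptions, not published values.
    """
    alerts = []
    for (video_id, fmt), score in current.items():
        previous = baseline.get((video_id, fmt))
        dropped = previous is not None and (previous - score) > max_drop
        if score < min_score or dropped:
            alerts.append(QualityAlert(video_id, fmt, previous, score))
    return alerts


# Hypothetical scores from yesterday's run and today's run.
yesterday = {
    ("vid1", "opus_128k"): 4.4,
    ("vid1", "aac_64k"): 4.1,
    ("vid2", "opus_128k"): 4.3,
}
today = {
    ("vid1", "opus_128k"): 4.4,
    ("vid1", "aac_64k"): 3.6,
    ("vid2", "opus_128k"): 4.25,
}

alerts = find_regressions(yesterday, today)
for alert in alerts:
    print(f"Quality warning: {alert.video_id} [{alert.fmt}] "
          f"{alert.baseline} -> {alert.current}")
```

Comparing against the previous run as well as an absolute floor means a check of this kind can catch both a format that was never good enough and one that has recently become worse.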

“ViSQOL has become a crucial tool for both development and production. We are utilising the metric extensively in the development of speech and audio enhancement... Every day, YouTube probes the quality of the 30K most popular videos worldwide using ViSQOL as the measure... We have heard from [...] Microsoft and Apple that it is frequently used in [...] their organisations.”

Adrian Grange, Senior Technical Program Manager, Google

Another example of industry adoption of the tools is MATLAB, a computing platform that is used for engineering and scientific applications like data analysis, audio and image processing, machine learning and robotics. It is used by over 4 million researchers worldwide. In 2024, ViSQOL was included in Mathworks' MATLAB 2024a release as their standard speech and audio quality metric.

Academic impact

ViSQOL has received several awards for its novelty and impact including: the SFI Connect 2018 Impact Award, QoMEX Best Paper Award 2013 and the 2018 RIA/Amcham US-Ireland Research Innovation Award. ViSQOL and AMBIQUAL have also been granted 3 US patents. The project’s quality metrics and tools have had a significant impact in terms of academic contribution to knowledge, with over 600 citations across the published papers on the work.

Dr Hines’s subjective listening test platform, Go Listen, has also seen wide adoption from researchers in industry and academia. Internal metrics show that more than 300 researchers have now conducted over 1000 tests using the platform, involving more than 6000 subjective listeners. The platform continues to grow with new users and organisations joining daily.

Health impact

Dr Hines continues to push the boundaries in audio quality research. The team’s latest project, Hear AI, is developing music enhancement algorithms for people with hearing impairments. This research could have a significant impact on a health issue that affects over 1.5 billion people globally.

“GoListen makes listening tests easy. User-friendly, simple setup, and free to use. They also store your data, making it a fantastic platform! Carrying out tests on GoListen saves us time and money.”

— Cyran Aouameur, Research Engineer, Sony

“MATLAB is used by more than 5 million engineers and scientists [...] The addition of ViSQOL to Audio Toolbox allowed us to respond to growing requests for easy access to objective audio quality estimation. ViSQOL can match and outperform other well-established audio quality algorithms”.

— Jimmy Lapierre, Principal Engineer, Mathworks

References

 

  • A. Hines, J. Skoglund, A. C. Kokaram, and N. Harte, “ViSQOL: An objective speech quality model,” EURASIP Journal on Audio, Speech, and Music Processing, vol. 2015, no. 1, 2015. DOI:10.1186/S13636-015-0054-9 https://asmp-eurasipjournals.springeropen.com/articles/10.1186/s13636-015-0054-9; software: https://github.com/google/visqol
  • MATLAB: https://www.mathworks.com/help/audio/ref/visqol.html?s_tid=doc_ta
  • C. Sloan, N. Harte, D. Kelly, A. C. Kokaram, and A. Hines, “Objective assessment of perceptual audio quality using ViSQOL Audio,” IEEE Transactions on Broadcasting, vol. 63, no. 4, pp. 693–705, 2017. DOI:10.1109/TBC.2017.2704421 https://ieeexplore.ieee.org/document/7940042
  • M. Narbutt, A. Allen, J. Skoglund, M. Chinen, and A. Hines, “Ambiqual-a full reference objective quality metric for ambisonic spatial audio,” in 2018 Tenth International Conference on Quality of Multimedia Experience (QoMEX), pp. 1–6, IEEE, 2018. DOI: 10.1109/QoMEX.2018.8463408. https://ieeexplore.ieee.org/document/8463408
  • M. Narbutt, A. Allen, J. Skoglund, M. Chinen, and A. Hines, “Objective quality metrics for ambisonics spatial audio,” US Patent 10,672,405, Google LLC, 2018. https://patents.google.com/patent/US10672405B2/en
  • Barry, D., Zhang, Q., Sun, P.W. and Hines, A., 2021. Go Listen: An End-to-End Online Listening Test Platform. Journal of Open Research Software, 9(1), p.20. DOI: http://doi.org/10.5334/jors.361; platform website: https://golisten.ucd.ie
  • A. Hines, P. Počta and H. Melvin, "Detailed comparative analysis of PESQ and ViSQOL behaviour in the context of playout delay adjustments introduced by VOIP jitter buffer algorithms," 2013 Fifth International Workshop on Quality of Multimedia Experience (QoMEX), Klagenfurt am Wörthersee, Austria, 2013, pp. 18-23, DOI: https://doi.org/10.1109/QoMEX.2013.6603195
  • CONNECT Centre, CONNECT wins US-Ireland research innovation award  
  • AJ Hines, J Skoglund, N Harte, A Kokaram, “Detection of chopped speech,” US Patent 9,263,061 https://patents.google.com/patent/US9263061B2/en
  • J Skoglund, AJ Hines, N Harte, A Kokaram, “Objective speech quality metric,” US Patent 9,524,733 https://patents.google.com/patent/US9524733B2/en
  • A. Ragano, J. Skoglund, and A. Hines, “Nomad: Unsupervised learning of perceptual embeddings for speech enhancement and non-matching reference audio quality assessment,” in ICASSP 2024 - 2024 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), IEEE, Apr. 2024. DOI: 10.1109/ICASSP48485.2024.10448028 https://ieeexplore.ieee.org/document/1044802