Personalised Headphone Listening • Part One
Only recently have those who record and present audio mechanically, using microphones, loudspeakers and headphones, geared up to approach the true realism of how we hear audio. Now, we talk about ‘immersive audio’, meaning audio that actually tries to mimic what we experience in reality.
For a wide seating area, we can introduce a dense grid of loudspeakers - real sources - that everyone in the audience can locate correctly. This allows audio to reach the listener from both horizontal and vertical directions and is the foundation for all modern loudspeaker-based immersive audio presentations.
The localisation of sound for humans is traditionally explained using the concepts of the Interaural Time Difference (ITD) and Interaural Level Difference (ILD). ILD is suggested to be the main localisation mechanism above the frequency where the human head becomes a significant shadow for sound (above 1500 Hz), while ITD comes into effect below that frequency. However, both of these are part of a more elaborate mechanism, that of the Head-Related Transfer Function (HRTF).
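To give a feel for the size of the time differences involved, here is a minimal sketch using Woodworth's classic spherical-head approximation, ITD = (a / c)(θ + sin θ). This formula is not part of the Aural ID method. The head radius (0.0875 m) and the speed of sound (343 m/s) are assumed typical values, and the function name is purely illustrative.

```python
import math

def woodworth_itd(azimuth_deg, head_radius_m=0.0875, speed_of_sound=343.0):
    """Approximate ITD in seconds for a distant source.

    azimuth_deg: 0 is straight ahead, 90 is directly to one side.
    The spherical-head model is only meant for 0..90 degrees.
    Default radius and speed of sound are assumed typical values.
    """
    theta = math.radians(azimuth_deg)
    return (head_radius_m / speed_of_sound) * (theta + math.sin(theta))

# Print the modelled ITD in microseconds for a few azimuths.
for az in (0, 30, 60, 90):
    print(f"{az:3d} deg -> {woodworth_itd(az) * 1e6:6.1f} us")
```

With these assumed values, a source directly to the side gives an ITD of roughly 0.66 ms. A real head is not a sphere, which is one reason the simple ITD and ILD picture is only part of the story.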
An HRTF acts as a filter for each ear separately, with a different response for each direction of arrival, both horizontal and vertical, and it incorporates all spectral and time-domain effects, including ILD and ITD. It describes the total change that sound undergoes as it approaches an ear from a given orientation. To build a complete picture of these directional effects, we need measurements of a large number of HRTFs, covering all directions of arrival over a sphere surrounding the listener’s head.
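Because HRTFs exist only for a finite set of directions on that sphere, any system using them has to decide which stored direction to use for a given source position. The sketch below shows one simple way to do this: pick the stored direction with the smallest angle to the requested one. The grid spacing, the function names and the nearest-neighbour strategy are all assumptions for illustration; real renderers often interpolate between neighbouring HRTFs instead.

```python
import math

def unit_vector(azimuth_deg, elevation_deg):
    """Convert azimuth/elevation in degrees to a 3D unit vector."""
    az, el = math.radians(azimuth_deg), math.radians(elevation_deg)
    return (math.cos(el) * math.cos(az), math.cos(el) * math.sin(az), math.sin(el))

def nearest_direction(grid, azimuth_deg, elevation_deg):
    """Return the (azimuth, elevation) pair in grid closest to the request.

    Closeness is the dot product of unit vectors: the larger the dot
    product, the smaller the angle between the two directions.
    """
    target = unit_vector(azimuth_deg, elevation_deg)

    def closeness(direction):
        return sum(a * b for a, b in zip(unit_vector(*direction), target))

    return max(grid, key=closeness)

# Hypothetical coarse grid: 30-degree azimuth steps at four elevations.
grid = [(az, el) for el in (-30, 0, 30, 60) for az in range(0, 360, 30)]
print(nearest_direction(grid, 47, 10))
```

A coarse grid like this one forces large jumps between stored directions, which illustrates why a dense set of HRTFs over the whole sphere is needed.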
When headphones are placed over the ears (or inside the ears), the contribution that head size and external ear shape make to sound localisation is completely removed from the equation, and this is why headphone sound mostly seems to be ‘inside’ the listener’s head. However, once a listener’s HRTFs have been measured, we can deliver sound to them properly via headphones. The process is relatively simple, but it requires some signal processing and additional information about the direction from which each sound should arrive.
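The signal processing involved can be sketched as follows, assuming the HRTFs for one direction are available as a pair of time-domain impulse responses (often called HRIRs, a term not used above). Each ear's headphone signal is the source signal convolved with that ear's impulse response. The toy impulse responses below are invented for illustration and are not real HRTF data, and the function name is hypothetical.

```python
import numpy as np

def render_binaural(mono, hrir_left, hrir_right):
    """Filter a mono signal with one impulse response per ear.

    Returns an array of shape (samples, 2): column 0 is the left ear,
    column 1 the right ear. Both impulse responses must have equal length.
    """
    left = np.convolve(mono, hrir_left)
    right = np.convolve(mono, hrir_right)
    return np.stack([left, right], axis=1)

# Toy impulse responses (hypothetical): a source on the left, so the
# right ear receives the sound three samples later and at half the level.
hrir_left = np.array([1.0, 0.0, 0.0, 0.0])
hrir_right = np.array([0.0, 0.0, 0.0, 0.5])

mono = np.array([1.0, 0.5, 0.25])
stereo = render_binaural(mono, hrir_left, hrir_right)
print(stereo.shape)
```

In this toy case, the delay and level drop at the right ear play the roles of ITD and ILD. A measured or calculated HRTF pair adds the direction-dependent spectral detail contributed by the head and outer ear.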
The key to enabling this process is access to the HRTF information. Scientists have found that although we all share the same broad principles of how our HRTFs appear, we are all individual with slight differences in details. These details, they have concluded, do not successfully translate from one person to another - and borrowing someone else’s HRTFs for listening will not be successful. You need your own, personal HRTF information.
This HRTF information can be measured directly, but this must be done in an anechoic room, which eliminates the sound of the room from the result, with microphones placed in both ears of the listener and a loudspeaker positioned at every orientation needed. This is obviously a complicated, time-consuming process that can still be prone to errors.
However, after years of research, Genelec Aural ID now offers a more robust method of obtaining personal HRTFs, yielding anechoic HRTFs but without the requirement for an anechoic room.
Aural ID is able to achieve this by using numerical modelling of the acoustic field at the head instead of direct acoustic measurement with microphones. This method requires a video of the listener, capturing the head and upper torso (shoulders) while the camera circles them, so that these details are shown from all angles. The Aural ID calculation process then analyses the video and builds a three-dimensional model of the listener using photogrammetric methods, by looking at the differences between individual images in the video. This model is then used to calculate how sound is altered when it arrives at the two ears from different orientations. Finally, using this information, the HRTF filters can be obtained for almost a thousand different orientations. This method is advantageous because it eliminates the uncertainty related to placing microphones in the listener’s ears, and listener movements do not affect the quality of the final result.
At the moment, there are many companies working on HRTF-related solutions that rely on matching a few anatomic features of the listener. Based on some criteria, the best match for the listener’s real HRTFs is then selected from a pre-existing HRTF database. These methods are fast, but they are unable to include sufficient anatomic detail about the person’s actual physical characteristics. The library-chosen HRTFs may not render audio very well, because they are not actually the listener’s own.
So, who is Genelec Aural ID aimed at? In reality, anyone who relies on information about the direction of arrival for sound will benefit, including users of Virtual Reality and Augmented Reality systems, game engines calculating the audio presentations dynamically as part of the gaming process, and researchers working with 3D audio. Ultimately, we can all benefit, once the method of presenting immersive audio develops to include information about the direction of sound arrival and the processing needed to plug in the personal HRTF information to obtain the correct presentation over headphones. This will elevate headphone listening to be a much more reliable experience.
Genelec’s Aural ID measurement method is aimed at audio professionals, opening new possibilities to the whole audio industry. It does this at a totally personal level, maintaining complete accuracy and full personal detail, with the robustness and convenience needed to bring HRTFs to all professional applications.
Aki Mäkivirta
R&D Director
Read How Does it Sound on Headphones? Personalised Headphone Listening • Part Two by Thomas Lund.