Formation Audio Ltd

Future paths with multi-channel audio

The following discussion is taken from my Audio Mastering book ‘Separating the science from fiction’, chapter 16, ‘Restrictions of delivery formats’, and chapter 17, ‘What’s next?’.

Available at www.routledge.com/9781032359021

Future paths with multi-channel audio

Multi-channel surround sound music formats for the general consumer have been a recurring theme in audio for many years, from Ambisonics principles, Quad and UHJ encoding, through DVD-Audio in 5.1, to Dolby Atmos with Apple Spatial Audio today. The earlier formats never caught on widely for music-only consumption, and the outcome may be the same for Atmos/Apple Spatial Audio as a purely music listening format. A consumer just wants to listen to a song and feel its emotional power through the melodies and lyrics, changing their mood or perspective. Neither of these requires surround sound to be appreciated, nor is stereo required. Many listeners use mono playback to listen to music, though we do have two ears and most of us appreciate sound coming from more than one source. But with more than two sources, inherent phase issues smear the audio unless the system is set up correctly, which brings us back to two speakers or one, because it sounds more coherent. Headphone use should not be forgotten either: with the improvements in quality and usability over the last few decades, many listeners now hear stereo, or binaural encodes of surround formats, this way. But encoding can lead to phase issues, smearing and image instability unless the product was recorded for binaural or is matrixed from an Ambisonics source rather than a surround sound source. This is in part why these formats do not catch on, which comes back to convenience over quality. If people do not feel an added value, where is the incentive for a consumer? Most just want the music to come out of a playback device. Our job as mastering engineers is to make sure it sounds as good as it can when it does.

There is one format, stereo vinyl, that did catch on easily, because by design it was 100% compatible with mono vinyl. You can play a mono record on a stereo deck, and a stereo record on a mono deck. With convenience over quality, listeners could play whatever they wanted, however they wanted. For the supplier, this leads in time to stopping the manufacture of mono, because a stereo record also plays as mono. Stereo was also the source being supplied more and more by studios, so over time the transition was to manufacturing stereo vinyl for the majority of releases.

In essence, for a consumer, if they can hear the tune and the lyric, job done. Who cares about the format? In the end, mono is the queen and the king of music.

Virtual reality (VR) utilising Ambisonics and a binaural playback configuration is very effective at linking sound to the visual aspects in view. However, music as an ingest into this soundscape is often most effective if the playback source has context in the environment, which leads us back to mono or stereo playback from a point source. Alternatively, the music simply plays in stereo in the headphones, as a layer separate from the visual environment. At a virtual gig, the sound comes directly from the stacks left and right of the stage. Unless we can walk between the performers, where control of the mix would be required, mastering would not be a consideration in the outcome of this type of production other than some master bus control, which would have to be real-time in the playback engine to respond to the ever-changing mix balance, in the same way as with any real-time game audio engine.

I truly appreciate the aesthetic virtues of multi-channel delivery, both from a technical perspective and as an immersive experience when correctly set up. There are few ways to fully appreciate it as a consumer without investing in a correctly set up listening room or VR system. Encoding to binaural for headphones has varied results, whereas stereo and mono downmixing are clearly controllable. Maybe I am too old to see the future, but this future was invented decades before I was born and it still has not caught on with the general consumer in a purely listening context. Obviously, as soon as this audio is linked to the visual, it becomes much bigger than its individual parts. In this regard, every new format can become an opportunity for the engineer/entrepreneur who is quick to adapt to change and learn from the most recent developments in the field.

To fully enjoy an album that is mixed for Dolby Atmos, the consumer will require a dedicated playback system in their home, such as a 7.1.2 system as a minimum, which you would imagine is not a high background noise environment. Equally, headphones using spatial audio would not be subject to high background noise if the user was trying to have an immersive experience. This means the concerns with dynamic range and translation in mastering should not apply. This potentially usable wide dynamic range should be used to the full in the mixing process. The EQ balance should also be easy to manage because the Dolby mix room should sound excellent and be well controlled. All this means there is no need ‘to master’ the outcome, in the same way as with all commercial film outcomes mixed in post-production. You still require an engineer with excellent critical listening and sound balance skills, but not directly a mastering engineer, though this could be one and the same person. Apple are keen to develop a multi-channel consumer base in this area, though the Atmos mix will still be supplied with a stereo mix version as part of the delivery. This stereo mix could be mastered for normal consumer use on Apple Music, but the spatial audio playback on this platform in binaural should be unique to the Atmos mix. Otherwise, the spatial audio encoder is just up-mixing stereo audio to create the impression of an immersive experience. Personally, I have never heard any music that sounded better in an up-mix than in its stereo version, using any multi-channel tool for this purpose.

Perspectives with multi-channel audio

Music does not need to be more than mono to be enjoyed. It does not need more than one speaker, and that speaker does not even need to be good quality. As the previous consumer formats have taught us, multi-channel systems are not needed by the average consumer to get full enjoyment out of their music, but a small minority might buy into the requirements. The majority just want to listen to the music. The rationale is different for cinematic sound or multi-channel sound design, especially in gaming/VR where the sonics are part of the visual landscape. Overall, the consumer dictates the playback outcomes, not the technology or our wants as audio engineers or artists. Focus needs to be on making our output the best quality possible for the simplest format translation from mono and stereo, and it will translate on all current and future systems. We can deal with other multi-channel formats in whatever configuration is required, as and when, but mono is the queen and king.

That said, I do not mean to sound negative toward this exciting area of creativity. I have an open heart towards multi-channel audio; it is a wonderful experience to work with sonically. Professionally, the opportunities are limited in a purely audio arena, but if you find your niche, especially with a genre, there is a world of opportunities to embrace. For further reading in this area, Darcy Proper and Thor Legvold submitted an informative chapter for Mastering In Music [11], published by Routledge.

In fully understanding the translation/downmix from stereo to mono, all other multi-channel systems can be understood. LR is MS: this is one dimension. Add another MS for front/back and another for up/down, and the outcome is basically B-format, WXYZ. Ambisonics, or true phase-coherent three-dimensional sound capture and playback, is used in VR/gaming audio engines. Everything in audio eventually comes back around to amplitude and phase. Understand that fully and you can interact with any audio outcome proficiently and professionally, applying knowledge and skill to the challenges presented.
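
As a minimal sketch of the LR/MS relationship described above, the Python below encodes left/right samples to mid/side, decodes them back, and treats the mono downmix as the mid signal. The function names, the 0.5 scaling on the sum and difference, and the test tone are illustrative assumptions, not taken from the book. The last function assumes the traditional first-order B-format convention, where W is the omnidirectional signal attenuated by 1/√2, X is front/back, Y is left/right and Z is up/down; other Ambisonics conventions scale and order the channels differently.

```python
import math


def lr_to_ms(left, right):
    """Encode left/right samples as mid (sum) and side (difference)."""
    mid = [(l + r) / 2 for l, r in zip(left, right)]
    side = [(l - r) / 2 for l, r in zip(left, right)]
    return mid, side


def ms_to_lr(mid, side):
    """Decode mid/side samples back to left/right."""
    left = [m + s for m, s in zip(mid, side)]
    right = [m - s for m, s in zip(mid, side)]
    return left, right


def mono_downmix(left, right):
    """The mono downmix is the mid signal; the side signal is discarded."""
    mid, _ = lr_to_ms(left, right)
    return mid


def peak(signal):
    """Largest absolute sample value."""
    return max(abs(x) for x in signal)


def encode_b_format(sample, azimuth, elevation):
    """Pan one mono sample to first-order B-format (W, X, Y, Z).

    Assumed convention: angles in radians, azimuth positive to the left,
    W scaled by 1/sqrt(2).
    """
    w = sample / math.sqrt(2)                            # omni, the 'mid' of all axes
    x = sample * math.cos(azimuth) * math.cos(elevation)  # front/back
    y = sample * math.sin(azimuth) * math.cos(elevation)  # left/right
    z = sample * math.sin(elevation)                      # up/down
    return w, x, y, z


# Illustrative test tone: two cycles of a sine wave.
tone = [math.sin(2 * math.pi * i / 32) for i in range(64)]
inverted = [-x for x in tone]

# Identical channels: everything lands in the mid, nothing in the side.
mid_same, side_same = lr_to_ms(tone, tone)

# Polarity-inverted right channel: the mono downmix cancels completely.
mono_cancelled = mono_downmix(tone, inverted)

print(peak(mid_same), peak(side_same), peak(mono_cancelled))
```

With identical channels the side signal is silent and the mono downmix keeps the full level, whereas a polarity-inverted channel cancels entirely in mono. That is the amplitude and phase relationship the paragraph above refers to, and in this sketch the B-format channels extend the same sum/difference idea to the front/back and up/down axes.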