Serial ADM Metadata and Next Generation Audio

By Larry Schindel on Feb 9, 2024 7:55:04 PM

 

Next Gen TV audio promises viewers a tailored experience. How does promise become reality?

A Quick Look At Metadata

Metadata, as we all know, is information that describes the configuration of the audio program. Since its creation, metadata has been used, misused, set properly, improperly, or simply left at the default values. It is often misunderstood, yet it is key to getting modern audio correct and will become even more important in the coming years.

Currently, the most important and commonly used metadata describes: 

• The number of channels contained in an audio program (mono, stereo, 5.1, etc.)

• The average loudness of a program (sometimes called “dialnorm” or Program Reference Level)

• Dynamic range compression information, which is used by decoders to adjust the program’s dynamic range to what’s appropriate for the playback system and environment.

Other metadata exists to describe the type of stereo downmix to perform when necessary, gain of the center and surround channels when downmixing, whether a 2-channel program is a stereo mix or has been surround encoded, copyright information, and so on, but this is less critical.
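To make the three core items above concrete, here is a minimal Python sketch of how they might be modeled. The class and field names are hypothetical, not taken from any codec specification, and the attenuation rule assumes a decoder that normalizes dialog to a target of -31, as AC-3 style decoders commonly do.

```python
from dataclasses import dataclass

# Hypothetical lookup from layout name to channel count.
CHANNEL_COUNTS = {"mono": 1, "stereo": 2, "5.1": 6}


@dataclass(frozen=True)
class ProgramMetadata:
    """Illustrative container for the three most commonly used items."""
    channel_layout: str  # "mono", "stereo", "5.1", ...
    dialnorm_db: int     # average dialog loudness, from -1 down to -31
    drc_profile: str     # hypothetical label for the compression profile


def channel_count(meta: ProgramMetadata) -> int:
    """Number of audio channels implied by the layout name."""
    return CHANNEL_COUNTS[meta.channel_layout]


def decoder_attenuation_db(meta: ProgramMetadata, target_db: int = -31) -> int:
    """Attenuation (in dB) a decoder would apply to reach the target level.

    Assumption: the decoder lowers the program so that its dialog
    lands at target_db; a program already at the target is untouched.
    """
    if not -31 <= meta.dialnorm_db <= -1:
        raise ValueError("dialnorm must be between -31 and -1")
    return meta.dialnorm_db - target_db
```

Under these assumptions, a 5.1 program with a dialnorm of -24 would be attenuated by 7 dB, while one set to -31 would pass through unchanged. This is also why a value left at the default can make a program play back louder or quieter than intended.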

NGA + Metadata: A Match Made in Heaven?

Where metadata really starts to get interesting and becomes far more mission-critical is when we talk about emerging audio formats such as those used in Next Gen Audio (NGA). NGA supports advanced features such as immersive audio, enhanced dialog for improved clarity and listenability, and personalization.

Personalization is where NGA really shines, and it is accomplished with Object Based Audio (OBA), in which each sound or audio element stands by itself and is considered its own “object”. This allows the broadcaster to send multiple dialog objects (such as different languages, or in the case of sports, different team announcers).

From a production standpoint, this isn’t much different from what’s done today, when a production creates multiple mixes simultaneously from its mixing console. There’s the International “Clean” Feed with no dialog. There’s the home team’s feed with their announcer mixed in, and there’s the visiting team’s feed with their announcer mixed in. Oftentimes an audio description (AD) track for visually impaired viewers is added further down the line before the final transmission to the viewers.

Today these requirements are satisfied by transmitting multiple complete mixes, and often the additional mixes are limited to stereo, or even mono! NGA allows them to be delivered far more efficiently, by using personalization to send a common bed (base) mix and individual dialogs to the viewer. The viewer then selects which dialog(s) they want to hear, and their receiver mixes these signals together based on metadata set by the broadcaster and settings chosen by the viewer. The net effect is that, no matter which dialog(s) they choose, viewers can hear any language, announcer, and/or AD service with the full multichannel bed mix, which is a vast improvement over many of today’s experiences.
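The receiver-side mixing described above can be sketched in a few lines of Python. This is a simplified illustration, not how any particular NGA decoder works: the function names, the choice to place dialog in the center channel, and the per-object gain table are all assumptions made for the example.

```python
def db_to_gain(db: float) -> float:
    """Convert a level in dB to a linear amplitude factor."""
    return 10 ** (db / 20)


def render_presentation(bed, dialogs, selected, gains_db, center_index=2):
    """Mix the viewer's chosen dialog objects into a copy of the bed.

    bed          -- list of channels, each a list of samples
                    (assumed 5.1 order: L, R, C, LFE, Ls, Rs)
    dialogs      -- dict mapping a dialog name to its mono samples,
                    assumed to be the same length as the bed channels
    selected     -- dialog names chosen by the viewer
    gains_db     -- per-dialog gain in dB, standing in for the
                    broadcaster's metadata (0 dB if not listed)
    center_index -- channel that receives the dialog (an assumption)
    """
    out = [list(channel) for channel in bed]  # leave the input bed untouched
    for name in selected:
        gain = db_to_gain(gains_db.get(name, 0.0))
        for i, sample in enumerate(dialogs[name]):
            out[center_index][i] += gain * sample
    return out
```

The point of the sketch is that the bed is transmitted once, and every dialog choice is rendered from it, so the home announcer, the away announcer, and the AD service all get the full multichannel mix.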

Serial ADM To The Rescue

NGA and S-ADM

The broadcaster needs to create the metadata to deliver this experience to the viewer, and they need to get it right. Imagine if the viewer’s receiver doesn’t know which channels are the bed mix, which is the English announcer, which is the Spanish one, and which is the AD service. This is easy with NGA authoring/creation tools, but what happens at program transitions? What happens when the commercial doesn’t support all the languages? What happens when you transition from a football game containing a home team announcer and an away team announcer to a drama which includes English dialog, Spanish dialog, and an AD service? How do you signal that, and do so in a reliable manner?
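One way to picture the transition problem is as a receiver re-resolving the viewer's preference each time the program's set of dialogs changes. The Python sketch below assumes a hypothetical `ProgramSegment` record and a simple ordered preference list; real receivers and signaling will differ.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ProgramSegment:
    """Hypothetical description of what one program segment carries."""
    title: str
    dialogs: tuple[str, ...]  # dialog objects signaled for this segment
    default: str              # broadcaster's default dialog


def select_dialog(segment: ProgramSegment, preferences: list[str]) -> str:
    """Pick the viewer's highest-ranked dialog that the segment carries.

    Falls back to the broadcaster's default when none of the viewer's
    preferences is available, e.g. during a commercial.
    """
    for label in preferences:
        if label in segment.dialogs:
            return label
    return segment.default


schedule = [
    ProgramSegment("Football", ("Home announcer", "Away announcer"), "Home announcer"),
    ProgramSegment("Commercial", ("English",), "English"),
    ProgramSegment("Drama", ("English", "Spanish", "AD"), "English"),
]
viewer_preferences = ["Spanish", "Away announcer"]
selections = [select_dialog(segment, viewer_preferences) for segment in schedule]
```

Here the viewer hears the away announcer during the game, falls back to English for the commercial, and gets Spanish once the drama starts. None of this works unless each segment's metadata reliably states which dialogs it carries.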

An emerging technology called Serial ADM (S-ADM) might just be the answer. Serial ADM is specified in Recommendation ITU-R BS.2125 and is a serialized version of ADM (Audio Definition Model) metadata, which grew out of work at the EBU.

ADM is not really suitable or intended for live production and streaming applications, whereas Serial ADM enables these workflows and provides a method to define the critical metadata about an audio program and transmit it with the audio program in a real-time/streaming manner. Similar in concept to the metadata defined in SMPTE ST 2020, which could be carried alongside the PCM program audio, S-ADM defines the audio elements that make up a complete program, along with various presentations or presets, and can be carried alongside the related PCM audio tracks. S-ADM can be used to simplify (“squeeze”) more complex post-produced cinematic content for broadcast TV, or be used in natively produced NGA content.

The metadata defined in S-ADM can be static for an entire program or it can be dynamic, adding or changing dialog and/or AD tracks for the viewer to choose from, or (eventually) varying the position or level of various audio objects during the program.
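The idea of serializing metadata can be illustrated with a small Python sketch that cuts a program's timed metadata into fixed-length frames, each listing what is active during its own time span. This is not the BS.2125 frame format; the `MetadataBlock` type, the millisecond timing, and the frame tuples are assumptions chosen only to show static versus dynamic metadata.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class MetadataBlock:
    """Hypothetical timed description of one audio element."""
    label: str
    start_ms: int
    end_ms: int


def serialize(blocks, program_end_ms, frame_ms):
    """Split program metadata into consecutive frames.

    Returns a list of (frame_start_ms, frame_end_ms, active_labels),
    where active_labels names every block overlapping that frame.
    """
    frames = []
    frame_start = 0
    while frame_start < program_end_ms:
        frame_end = min(frame_start + frame_ms, program_end_ms)
        active = [
            block.label
            for block in blocks
            if block.start_ms < frame_end and block.end_ms > frame_start
        ]
        frames.append((frame_start, frame_end, active))
        frame_start = frame_end
    return frames


program = [
    MetadataBlock("Bed", 0, 4000),
    MetadataBlock("English", 0, 4000),
    MetadataBlock("Spanish", 2000, 4000),  # added partway through
]
frames = serialize(program, program_end_ms=4000, frame_ms=1000)
```

In this example the bed and English entries are static across all four frames, while Spanish appears only from the third frame onward. A receiver joining at any frame can see what is available at that moment without needing the whole program's metadata up front.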

Since S-ADM is more of an open standard than a proprietary metadata format, it is, in theory, a more universal set of metadata to define NGA audio programs, whether the final delivery format is a Dolby® format, MPEG-H, or something else. There is still some work to be done to make it a truly interchangeable and universal format, but it is much closer to being that than where the industry was previously. Although Serial ADM is still an emerging format, it is definitely one to watch over the next few years. It has the capability to make more complex, personalized workflows easy to support, regardless of the final delivery format.

Conclusion

Metadata has long played a crucial role in shaping the audio experience, from basic parameters like channel count and loudness to more advanced features in emerging formats like Next Gen Audio (NGA). The advent of Object Based Audio (OBA) in NGA introduces a new level of personalization, allowing viewers to choose specific audio elements. Serial ADM (S-ADM), an evolving technology, addresses the challenges of delivering dynamic metadata in real-time and streaming scenarios. As an open standard, S-ADM holds the promise of simplifying complex workflows and enhancing the universal definition of NGA audio programs. While still emerging, Serial ADM is poised to play a pivotal role in the evolution of audio production and delivery over the coming years.


This article first appeared in TV Technology Magazine's January 2024 eBook entitled Sounding Off: How IP, AI are advancing audio for TV.
Reprinted with permission.

Topics: Television Audio, Next Generation Audio, Metadata
