
White Paper – ST2110-41: Revolutionizing IP-Based Metadata Workflows


19 March 2025


Executive Summary

SMPTE ST 2110-41 is a pivotal standard within the SMPTE ST 2110 suite, designed to enhance the transport of metadata in IP-based broadcast infrastructures. It enables the carriage, synchronization, and description of separate elementary essence streams over IP for real-time production, playout, and other professional media applications. This standard has been under discussion for several years, but the SMPTE specifications were recently approved in March 2024 and now complete the 2110 family of standards related to IP transport.

The SMPTE ST 2110 family of standards

In traditional Serial Digital Interface (SDI) systems, metadata is embedded within the same stream as audio and video, which limits flexibility and scalability. SMPTE ST 2110-41 addresses these issues by allowing video, audio, and metadata streams to be routed separately over IP networks. This separation enables more efficient workflows and lets broadcasters manage each component independently, improving operational efficiency. Because it runs over standard Ethernet networks, SMPTE ST 2110-41 also allows broadcasters to expand operations by adding metadata streams without deploying new physical infrastructure. This scalability is crucial to accommodate the growing demand for high-quality content and services.

In summary, SMPTE ST 2110-41 plays a crucial role in modernizing metadata transport within IP-based broadcast infrastructures, offering enhanced flexibility, scalability, and future-proofing capabilities for broadcasters.

2. Why do we need a new standard such as 2110-41 to handle metadata?

For several years, metadata in 2110 environments has used the 2110-40 format, which transposes the historical HANC/VANC ancillary data of SDI into the 2110 environment. SMPTE ST 2110-40 is a standard that defines the transport of ancillary data (ANC) packets, such as closed captions, subtitles, and timecodes, over IP networks within professional media environments. While it offers significant advancements over traditional SDI-based systems by enabling the separate routing of ANC data, it also presents certain limitations:

SDI Dependency: ST 2110-40 is tailored for metadata formats defined in the SDI realm, specifically SMPTE ST 291-1. This focus restricts its ability to handle newer, non-SDI metadata types that have emerged with advanced media applications.

Limited Flexibility: The standard’s design is closely aligned with SDI’s structure, which can be rigid and less adaptable to the diverse and dynamic metadata requirements of modern IP-based workflows.

Scalability Issues: As media productions evolve, there’s an increasing need to transmit a broader range of metadata. ST 2110-40’s framework may not efficiently support the scalability demanded by these expanding metadata types.

The 2110-40 format was an essential first step, perfectly compatible with the SDI metadata used since the 1990s. However, the limitations of this standard necessitated the emergence of a new standard: 2110-41.

3. Overview of 2110-41

SMPTE ST 2110-41 is a standard developed by the Society of Motion Picture and Television Engineers (SMPTE), published in March 2024, that defines a flexible framework for transporting metadata over IP networks in professional media environments. This standard is part of the broader SMPTE ST 2110 suite, which focuses on the carriage, synchronization, and description of separate elementary essence streams (such as video, audio, and data) over IP for real-time production and other professional media applications.

Key Features of SMPTE ST 2110-41:

  • Flexible RTP Payload Framework: The standard introduces a versatile Real-time Transport Protocol (RTP) payload format designed to handle various types of data items. This framework accommodates metadata that is either tightly synchronized with video or audio streams or operates independently, ensuring precise timing and integration across different media components. It is natively designed to transport payloads such as XML, JSON, or binary files.
  • Support for Diverse Metadata Types: SMPTE ST 2110-41 expands beyond the capabilities of previous standards by enabling the transport of a wide array of metadata formats. This includes traditional ancillary data as well as emerging metadata types, facilitating the integration of new services and technologies within existing IP-based infrastructures. 
  • Enhanced Synchronization and Interoperability: By providing a standardized method for metadata transport, the standard ensures that data remains synchronized with corresponding audio and video streams. This synchronization is crucial for applications like audio enhancement, closed captioning, subtitles, or dynamic ad insertion, where timing accuracy directly impacts the viewer experience. 

SMPTE ST 2110-41 plays a pivotal role in modernizing metadata workflows within IP-based broadcast infrastructures. Its flexible framework and support for diverse metadata types offer broadcasters enhanced capabilities for delivering rich, synchronized content in today’s dynamic media landscape. Like the other ST 2110 streams, 2110-41 uses RTP and relies on PTP for timing synchronization. A Data Item Type (DIT) identifies the application of the metadata, and all DIT values are registered with SMPTE.
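To make this concrete, here is a minimal sketch of how a receiver might dispatch incoming 2110-41 data items according to their DIT. The DIT values, the pre-parsed DataItem structure, and the handlers below are invented placeholders for illustration only; real DIT values are those registered with SMPTE, and the actual RTP payload layout is defined by the standard itself.

```python
# Illustrative sketch only: dispatching ST 2110-41 data items by their
# Data Item Type (DIT). The DIT constants and the already-parsed DataItem
# structure are hypothetical placeholders, not registered values
# (see https://www.smpte-ra.org/smpte-st2110-41-ar for the real registry).
import json
from dataclasses import dataclass

@dataclass
class DataItem:
    dit: int             # Data Item Type identifying the metadata application
    rtp_timestamp: int   # RTP timestamp derived from PTP time
    payload: bytes       # opaque payload: XML, JSON or binary

# Hypothetical DIT assignments, for the purpose of this sketch only.
DIT_SADM_XML = 0x000100
DIT_HDR_JSON = 0x000200

def handle_sadm(item: DataItem) -> None:
    xml_text = item.payload.decode("utf-8")
    print(f"S-ADM frame at ts={item.rtp_timestamp}: {len(xml_text)} chars of XML")

def handle_hdr(item: DataItem) -> None:
    metadata = json.loads(item.payload)
    print(f"Dynamic HDR metadata at ts={item.rtp_timestamp}: {metadata}")

HANDLERS = {DIT_SADM_XML: handle_sadm, DIT_HDR_JSON: handle_hdr}

def dispatch(item: DataItem) -> None:
    handler = HANDLERS.get(item.dit)
    if handler is None:
        print(f"Unknown or unregistered DIT 0x{item.dit:06x}, ignoring")
        return
    handler(item)

# Example usage with a fabricated data item.
dispatch(DataItem(dit=DIT_HDR_JSON, rtp_timestamp=90000, payload=b'{"max_cll": 1000}'))
```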


Because the standard is so new, only a few DIT values have been registered as of this writing (February 2025), but this list is very likely to grow rapidly:

Registered ST 2110-41 DIT values

(source: https://www.smpte-ra.org/smpte-st2110-41-ar)

Future applications are still under discussion, such as ISXD for Dolby Vision.

If we want to compare pros and cons:

2110-40

Pros:
  • ST 291 / SDI-like
  • Always in sync with video (by design)
  • Perfect for hybrid SDI/2110 environments

Cons:
  • Limited bandwidth
  • Keeps the “old-style” data mapping from the SDI world

2110-41

Pros:
  • High bandwidth for future applications
  • Can natively transmit XML, JSON, or binary data
  • PTP synchronization

Cons:
  • It’s new, so we will have to deal with interoperability issues

4. ST2110-41 & Audio applications

One significant application of ST 2110-41 is the carriage of Serial Audio Definition Model (S-ADM) metadata. S-ADM provides a standardized method for describing complex audio scenes, including immersive and object-based audio formats. By utilizing the ST 2110-41 framework, S-ADM metadata can be transmitted efficiently alongside corresponding audio streams, ensuring precise synchronization and consistency in live production environments.


S-ADM facilitates advanced audio features such as immersive sound and dialogue enhancement by providing detailed descriptions of audio elements and their spatial attributes. This capability allows for object-based personalization in TV productions, enabling viewers to customize their audio experience. For instance, during live sports broadcasts, S-ADM can manage multiple audio objects, such as different commentary tracks or ambient sounds, allowing audiences to select their preferred audio mix. As an open standard, S-ADM promotes interoperability across various Next Generation Audio (NGA) systems, including formats like Dolby Atmos and MPEG-H. Its adoption is gaining momentum, with broadcasters in Europe planning to implement S-ADM for large-scale events, enhancing the delivery of immersive and personalized audio experiences to audiences.

Before the advent of SMPTE ST 2110 and specifically ST 2110-41, certain applications of Serial Audio Definition Model (S-ADM) metadata were transported over SDI by allocating dedicated audio channels for S-ADM data. However, these proprietary and complex solutions made it challenging to integrate such metadata into broadcast workflows effectively. Thanks to SMPTE ST 2110-41, the use of Serial Audio Definition Model (S-ADM) has been greatly simplified, leveraging an all-IP architecture.

« By leveraging Serialized ADM metadata within ST 2110-41, our processors can now deliver future-proof, object-based audio — enabling immersive, interactive, and accessible productions », explains Roman Rehausen, Senior Product Manager at Jünger Audio. « Already today, embedding audio metadata directly at the production stage ensures it stays reliably linked to the corresponding audio streams, making separate channel documentation a thing of the past. »

Compared to SDI, which is often limited to 16 audio channels (with minimal metadata transmitted via AES), the combination of SMPTE ST 2110-30 (or 31) with ST 2110-41 opens new possibilities by managing a significantly larger number of channels. For example, it enables the handling of multiple Next Generation Audio (NGA) services in various languages, all utilizing Serial Audio Definition Model (S-ADM) metadata.
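As a rough illustration of how such companion streams can be kept together, the sketch below re-associates audio frames from a 2110-30 stream with S-ADM frames from a 2110-41 stream using their PTP-derived RTP timestamps. The buffering strategy, the matching tolerance, and the assumption of a common timestamp scale are simplifications made for this example, not requirements of the standards.

```python
# Minimal sketch, not a conformant implementation: pairing ST 2110-30 audio
# frames with companion ST 2110-41 S-ADM frames. Both streams are assumed to
# carry RTP timestamps on the same PTP-derived clock scale; the tolerance and
# buffering choices below are arbitrary illustration values.
from collections import deque

class SadmPairing:
    def __init__(self, tolerance_ticks: int = 48):  # tolerance is an assumption
        self.tolerance = tolerance_ticks
        self.pending_sadm = deque()  # queue of (rtp_timestamp, sadm_xml_bytes)

    def on_sadm_item(self, rtp_timestamp: int, sadm_xml: bytes) -> None:
        self.pending_sadm.append((rtp_timestamp, sadm_xml))

    def on_audio_frame(self, rtp_timestamp: int, pcm: bytes):
        """Return (pcm, matching S-ADM frame or None) for downstream processing."""
        while self.pending_sadm:
            ts, sadm = self.pending_sadm[0]
            if abs(ts - rtp_timestamp) <= self.tolerance:
                self.pending_sadm.popleft()
                return pcm, sadm            # metadata matched to this audio frame
            if ts < rtp_timestamp - self.tolerance:
                self.pending_sadm.popleft()  # stale metadata, drop it
                continue
            break                            # metadata is ahead of the audio
        return pcm, None
```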

5. ST2110-41 & other applications

SMPTE ST 2110-41 is a standard that facilitates the transport of various data types, including file-based data formats like XML or JSON, within a 2110 stream. This capability enables numerous applications, such as the transmission of dynamic HDR metadata—like Dolby Vision ISXD—synchronized with video content. By embedding this metadata directly into the 2110 stream, broadcasters can ensure precise temporal alignment, enhancing the delivery of high-quality HDR experiences to viewers.

It is worth noting that some companies have already registered ‘private data’ DITs for SCTE-104-related applications. In fact, ST 2110-41 appears ideally suited for the transmission of ad insertion markers. 

ST 2110-41 is also an excellent candidate for defining future formats for synchronized subtitle transmission. Note, however, that ST 2110-43 already specifically addresses the carriage of Timed Text Markup Language (TTML) for subtitles and captions, ensuring synchronized and accurate delivery of textual content alongside video streams.

Example of companion 2110-41 tracks for Video (HDR) & Audio (S-ADM) on BBright playout

The flexibility of the format allows for many future applications. For example, a contribution decoder (TS to 2110) could retain statistical information about the properties of the incoming TS encoding and transmit it as a 2110-41 stream, so that at the end of the chain the broadcast encoder can use this data to optimize its own encoding.
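As a purely illustrative sketch of that idea, a contribution decoder could serialize its encoding statistics as JSON and hand them over as the payload of a 2110-41 data item; every field name below is invented, since no such schema is defined by the standard.

```python
# Sketch only: a contribution decoder packages statistics about the incoming TS
# encoding as a JSON payload that could travel downstream in an ST 2110-41 data
# item. The field names are invented for illustration.
import json

def build_encoding_stats_item(avg_qp: float, gop_length: int, bitrate_kbps: int) -> bytes:
    stats = {
        "upstream_codec": "HEVC",
        "average_qp": avg_qp,
        "gop_length": gop_length,
        "bitrate_kbps": bitrate_kbps,
    }
    return json.dumps(stats).encode("utf-8")

payload = build_encoding_stats_item(avg_qp=28.4, gop_length=32, bitrate_kbps=20000)
# `payload` would then be carried as a 2110-41 data item under a private or
# registered DIT, for use by the broadcast encoder at the end of the chain.
```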

There are surely many applications to be developed in channel branding as well. Since 2110-41 can transport binary files, why not offload graphic insertion to a third-party device by sending the graphic elements (PNG or HTML5) via 2110-41?
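A toy sketch of that branding idea follows, with the caveat that how the bytes are segmented into RTP packets is governed by ST 2110-41 itself and is not reproduced here; the function and its return shape are purely illustrative.

```python
# Toy sketch of the channel-branding idea: load a PNG and describe when it
# should be displayed by the downstream graphics device.
from pathlib import Path

def build_graphic_item(png_path: str, display_rtp_timestamp: int) -> tuple[int, bytes]:
    png_bytes = Path(png_path).read_bytes()
    if not png_bytes.startswith(b"\x89PNG\r\n\x1a\n"):
        raise ValueError("not a PNG file")
    # The timestamp tells the third-party device when to insert the graphic.
    return display_rtp_timestamp, png_bytes
```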

« As broadcasters strive to deliver Next Generation Audio and High Dynamic Range Video to consumers, new technology is required to cope with the production and distribution of increasingly complex content », says James Cowdery, Senior Staff Architect at Dolby Laboratories, Inc. « Metadata format standards such as SMPTE ST 2110-41 are important building blocks that the industry needs to meet this challenge, while maintaining interoperability. Dolby is excited to collaborate with organizations like BBright who are creating the products enabling these new workflows. »

6.  Let’s no longer think of media workflows as limited to video and audio: Metadata is crucial!

With the emergence of new broadcast formats and their associated use cases, metadata is becoming crucial:

  • S-ADM for advanced audio applications such as immersive sound experiences and personalized audio streams.
  • HDR static or dynamic metadata (Dolby Vision, Advanced HDR by Technicolor, etc.).
  • Ad insertion markers for personalized advertising.
  • Subtitles & captioning.

The use of ST 2110-41, along with standardization efforts at the media file level, enables the development of end-to-end workflows where metadata is preserved throughout the entire chain—from post-production or live production to the end user.

The ST 2127 standard, for example, specifies how to store a set of PCM audio tracks and their accompanying S-ADM metadata as MGA (Metadata Guided Audio) tracks within an MXF file, which can be used natively for playout. These internationally standardized formats define solutions that preserve native post-production metadata, ensuring high-quality data is available to optimize broadcast encoders.
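The playout side of this flow can be pictured with the simplified sketch below: each MGA audio frame read from the MXF file is split into PCM samples for the 2110-30 stream and the S-ADM description for the companion 2110-41 stream. The frame iterable and the two sender callbacks are placeholders standing in for an ST 2127-capable MXF reader and conformant 2110 senders.

```python
# Highly simplified sketch of the playout flow described above. Reading the MGA
# track out of the MXF file (per ST 2127) is assumed to be done upstream and is
# represented here by a plain iterable of per-frame tuples.
from typing import Callable, Iterable, Tuple

AudioFrame = Tuple[int, bytes, bytes]  # (rtp_timestamp, pcm_samples, sadm_xml)

def play_out(frames: Iterable[AudioFrame],
             send_audio: Callable[[int, bytes], None],
             send_metadata: Callable[[int, bytes], None]) -> None:
    for rtp_timestamp, pcm, sadm_xml in frames:
        # PCM samples go out on the ST 2110-30 stream ...
        send_audio(rtp_timestamp, pcm)
        # ... while the S-ADM description travels on a companion ST 2110-41
        # stream, sharing the same PTP-derived timing so both stay in sync.
        send_metadata(rtp_timestamp, sadm_xml)

# Dummy demonstration with a single fabricated frame.
play_out(
    [(48000, b"\x00" * 192, b"<frame>...</frame>")],
    send_audio=lambda ts, pcm: print(f"2110-30 packet @ {ts}: {len(pcm)} bytes PCM"),
    send_metadata=lambda ts, xml: print(f"2110-41 item  @ {ts}: {len(xml)} bytes S-ADM"),
)
```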

MXF file -> MGA (Metadata Guided Audio) parsing and 2110-30 + 41 flows

Below is a highly simplified example of a complete workflow enabling a TV channel to ensure metadata preservation from post-production and live production through to the end user: ST 2110-41 allows the seamless transport of this metadata, without alteration, to the distribution encoder, which can then utilize it for NGA audio encoding (such as Dolby Atmos or AC-4), or adapt it for transport via SEI messages within the compressed bitstream.

Complete workflow from Live/Post production to end user, with full metadata preservation through ST 2110-41

With an ecosystem that natively supports these 2110-41 streams, it becomes entirely feasible to mix premium content featuring NGA audio across 10 or even 16 channels with standard stereo content (commercials, legacy catalogs, etc.). The S-ADM metadata transported via 2110-41 enables dynamic reconfiguration of the distribution encoder on the fly. Similarly for HDR, premium content can carry dynamic HDR metadata, such as Dolby Vision, and seamlessly follow standard HDR10 content. The 2110-41 stream allows HDR metadata and configuration to be adjusted precisely on a frame-by-frame basis.

Pierre Maillat, Technical Architect at Canal+, explains: « As our company is committed to offering the best quality of experience, adapting S-ADM and 2110-41 will allow us to enhance the audio experience of our subscribers, adding immersion, personalization and dialogue enhancement; ISXD plus 2110-41 is the opportunity to offer a full end-to-end Dolby Vision workflow. Also, using 2110-41 will allow us to carry localization information to support the international growth of our group. »

7.  Joys of interoperability

ST 2110-41 defines a transport mechanism but does not mandate a specific metadata format. This means different implementations may use XML, JSON, or binary formats. Without strict guidelines on metadata encoding and interpretation, interoperability between different manufacturers’ devices can be inconsistent. Efforts such as AMWA NMOS and EBU guidelines help define best practices, but adoption is still evolving.

Let’s take a simple example, such as frame rate conversion: in the case of S-ADM, should we modify the audio cadence of the S-ADM, update the audio frame durations, or consider that ST 2110, which natively decouples video, audio, and metadata streams, inherently allows the audio and its S-ADM to remain unchanged when the video frame rate is converted? On top of this, S-ADM, which is essentially an XML document, can be transported in 2110-41 either uncompressed (XML) or compressed (gzip), adding yet another variable that complicates interoperability.
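At a minimum, a receiver can handle the compression ambiguity defensively by sniffing the gzip magic bytes before parsing the XML, as in this small sketch:

```python
# Defensive handling of the compression ambiguity discussed above: an S-ADM
# payload received over 2110-41 may be raw XML or gzip-compressed XML.
# Checking for the gzip magic bytes (0x1f 0x8b) lets a receiver accept both.
import gzip

def decode_sadm_payload(payload: bytes) -> str:
    if payload[:2] == b"\x1f\x8b":
        payload = gzip.decompress(payload)
    return payload.decode("utf-8")
```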

There are still many gray areas in the ST 2110-41 use-case documentation, but the efforts of AMWA, the EBU, manufacturers, and other organizations will facilitate interoperability. Industry groups such as the JT-NM (Joint Task Force on Networked Media) will also limit ST 2110-41 adoption issues by multiplying and documenting interoperability tests.

Conclusion and Call to Action

SMPTE ST 2110-41 represents a significant advancement in IP-based metadata workflows, providing broadcasters with the flexibility, scalability, and future-proofing essential in today’s rapidly evolving media landscape. By decoupling metadata from traditional audio and video streams, ST 2110-41 enables broadcasters to manage diverse data types more efficiently, from immersive audio metadata (S-ADM) to dynamic HDR metadata and ad insertion markers. Although interoperability and metadata standardization remain ongoing efforts, collaborative initiatives by industry groups like AMWA, EBU, and JT-NM continue to drive clarity and adoption. Broadcasters who embrace ST 2110-41 are well-positioned to deliver innovative services and maintain operational agility in an increasingly metadata-rich environment.

To further advance the adoption of ST 2110-41 for metadata transport, it’s essential to develop more Recommended Practices (RPs), specifications, and technical clarifications. For instance, establishing a direct and bidirectional gateway between ST 2110-40 and ST 2110-41 would facilitate a smoother transition to ST 2110-41. Similarly, providing clear and precise documentation for mapping metadata transmitted via SEI messages in compressed streams into ST 2110-41 would greatly ease the adoption of HDR technologies.

To conclude, let’s focus on the key question: What is the real benefit for the end user? In essence, the new 2110-41 metadata will empower both traditional linear TV and OTT platforms to deliver content with state-of-the-art immersive technologies (UltraHD, HDR, NGA…). This advancement bridges the gap between broadcast and the unparalleled experience of premium cinema—whether for live events or feature films. And the best uses of this format are yet to be invented.

Download the full White Paper in PDF here.