As someone who watches video content on a variety of devices, I sometimes find that I can’t make out what is being said even though the overall audio levels are fine. Have you experienced this? It is annoying, and it makes you wonder how the content was approved for publishing with such an obvious problem.
You may be surprised to learn that such content can pass the loudness criteria commonly used in audio QC testing and, if it is not manually reviewed, be made available to consumers. To verify that the portions of the content containing dialog have proper loudness levels, a more sophisticated form of loudness testing, called ‘dialog-gated loudness’, must be used. Without dialog-gated loudness verification, content can pass general loudness criteria and still have dialog that viewers struggle to hear. This can hurt content providers, who continuously compete to gain and retain consumers by maintaining the quality of their content, both technically and editorially. Nowadays, OTT service providers such as Netflix require dialog-gated loudness compliance.
In this article, we will discuss how such issues can be detected by using Dialog-gating and how our QC products can help content providers achieve this in a fully automated manner.
Gating is a process that passes only the audio that satisfies given criteria and excludes everything else from the loudness measurement. The gating criteria may be an absolute audio level, a relative audio level, or the audio type, such as speech or non-speech.
Dialog gating is a process that passes only the audio segments that contain speech. All non-speech segments are rejected and excluded from the loudness measurement.
Level gating is a process that passes only audio above a particular level. There are two common level-gating techniques:
- Absolute gate. All audio segments below a particular level (typically -70 LKFS/LUFS) are rejected.
- Relative gate. All audio segments that fall more than a particular amount (typically 10 LU) below the average loudness of the segments that passed the absolute gate are rejected.
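The two level gates described above can be sketched in Python as follows. This is a minimal illustration, not a standards-compliant meter: it assumes the audio has already been split into blocks and that each block's loudness in LUFS is available as a number. The function names are hypothetical, and the thresholds are the typical values quoted above.

```python
import math

ABSOLUTE_GATE_LUFS = -70.0  # typical absolute threshold quoted above
RELATIVE_GATE_LU = 10.0     # typical relative offset quoted above


def mean_loudness(block_loudness):
    """Average per-block loudness values (LUFS) in the power domain."""
    if not block_loudness:
        return float("-inf")
    mean_power = sum(10 ** (l / 10) for l in block_loudness) / len(block_loudness)
    return 10 * math.log10(mean_power)


def level_gated_loudness(block_loudness):
    """Apply the absolute gate, then the relative gate, and average what is left."""
    # Absolute gate: reject blocks at or below -70 LUFS.
    above_abs = [l for l in block_loudness if l > ABSOLUTE_GATE_LUFS]
    # Relative gate: the threshold sits 10 LU below the absolute-gated average.
    relative_threshold = mean_loudness(above_abs) - RELATIVE_GATE_LU
    above_rel = [l for l in above_abs if l > relative_threshold]
    return mean_loudness(above_rel)
```

Calling `level_gated_loudness` with a list of block loudness values returns the gated average in LUFS, or negative infinity if no block survives the gates. Averaging is done on power values rather than directly on the dB figures, because loudness is a logarithmic quantity.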
Let’s consider a 5.1 audio stream. The channels in such a stream are L, R, C, Ls, Rs and LFE. Speech is normally carried in the Center (C) channel, but sometimes it is also carried in the Left (L) and Right (R) channels. For this reason, only L, R and C are considered when calculating dialog-gated loudness, and all other channels are ignored.
Real workflows that require dialog-gated loudness measurement take an adaptive gating approach: the dialog-gated measurement is used if the audio contains sufficient speech content; otherwise, level gating is used to perform the loudness measurement.
The diagram below shows such a workflow for a 5.1 audio stream:
The upper half of this diagram takes the audio from the L, R and C channels for dialog-gated loudness measurement. The audio is passed through “Dialogue Intelligence” to determine whether it contains speech. Audio with speech is assigned a gain of one; all other audio is assigned a gain of zero. The resulting gain-adjusted channels are fed to the dialog-gating process, so only the segments containing speech pass through, along with their loudness level and the amount of speech content. Non-speech segments are dropped. Adaptive gate selection then decides whether to use the dialog-gated or the level-gated loudness, depending on the overall amount of speech content.
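The adaptive selection described above can be sketched as follows. This is an illustrative simplification under stated assumptions, not the algorithm used by any particular product: per-block loudness values are assumed to be measured from the L, R and C channels only, the speech/non-speech flags are assumed to come from a separate speech classifier, and the level-gated result is assumed to be computed elsewhere. The `MIN_SPEECH_FRACTION` value and the function names are hypothetical, since the article does not state how much speech counts as sufficient.

```python
import math

# Hypothetical threshold: how much speech counts as "sufficient" is an assumption.
MIN_SPEECH_FRACTION = 0.1


def mean_loudness(block_loudness):
    """Average per-block loudness values (LUFS) in the power domain."""
    if not block_loudness:
        return float("-inf")
    mean_power = sum(10 ** (l / 10) for l in block_loudness) / len(block_loudness)
    return 10 * math.log10(mean_power)


def dialog_gated_loudness(block_loudness, is_speech):
    """Keep speech blocks (gain 1) and drop non-speech blocks (gain 0)."""
    speech_blocks = [l for l, speech in zip(block_loudness, is_speech) if speech]
    total = len(block_loudness)
    speech_fraction = len(speech_blocks) / total if total else 0.0
    return mean_loudness(speech_blocks), speech_fraction


def adaptive_loudness(block_loudness, is_speech, level_gated_value):
    """Return ("dialog", value) if there is enough speech, else ("level", value)."""
    dialog_value, speech_fraction = dialog_gated_loudness(block_loudness, is_speech)
    if speech_fraction >= MIN_SPEECH_FRACTION:
        return "dialog", dialog_value
    return "level", level_gated_value
```

Returning the chosen mode alongside the value makes it easy for a QC report to state which gating method produced the reported loudness.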
This measurement provides a true picture of the actual speech levels in the audio, so content providers can be confident that their audience’s audio experience is preserved.
Venera’s automated QC tools, Pulsar™ and Quasar®, support automated measurement of dialog-gated loudness. The following options, shown here for the EBU mode (a popular standard in Europe), are available in both solutions:
These options are available for ATSC (popular standard in North America), OP-59 (popular standard in Australia), and ARIB TR-B32 (popular standard in Japan) modes as well.
In addition to measuring dialog-gated loudness, users can also measure the difference between the dialog-gated and level-gated loudness values. This gives them a practical view of the audio composition of the content they provide to their consumers.
Pulsar™ also allows users to normalize audio levels automatically, eliminating manual intervention in making content compliant.
Quasar® is a native cloud content QC service that auto-scales to process hundreds of files simultaneously and offers a wide range of content security capabilities, so our users can process their content with peace of mind. Quasar® can be integrated using a REST API for highly automated workflows. Visit www.veneratech.com/quasar to read more about Quasar® and request a free trial.
Pulsar™ is an on-premise automated file QC system that scales by clustering multiple Verification Units in the user’s datacenter or office location. Pulsar™ is the fastest QC system in the market, running up to 6x faster than real time for HD content. Pulsar™ can be integrated using an XML/SOAP API for highly automated workflows. Visit www.veneratech.com/pulsar to read more about Pulsar™ and request a free trial.
Get in touch with us today and we would be happy to discuss with you how we can help solve your content QC challenges efficiently!