In this NoTube blog post we give an update on how we have recently improved loudness normalisation, following the results of the user evaluation we carried out in November 2010. But first, let’s recap…
The problem is the annoying loudness jumps that occur when we watch and listen to various video clips or audio-only content from the internet, or zap between TV or radio stations and programs (see also our pages on Loudness Normalisation). In fact, the problem has been well known for decades. Of course, international standards for audio levelling have also existed for decades. These are important for making optimal use of the available audio channels and for international program exchange.
But these recommendations are generally not suited to loudness normalisation, because they generally refer to the maximum peak level within an audio piece. In other words, peak normalisation is recommended, but this is not at all related to loudness normalisation.
First of all, an indispensable requirement for loudness normalisation is a standardised loudness meter. Since the introduction of appropriate loudness algorithms and meters in recent years, and mainly since the standardisation in ITU-R BS.1770 (2006) and the publication of EBU Technical Recommendation R128 with its related documents (2010), the problem of loudness differences between various sources can in principle be solved.
In 2008 the European Broadcasting Union (EBU) established the project group P‑LOUD, with more than 100 members including IRT, to work on refinements of the ITU-R standard, such as the definition of a target loudness level, short-term and long-term loudness and loudness range, and to set up EBU guidelines on loudness measurement. In Europe, two years after the inception of P-LOUD, the new levelling recommendations based on ITU-R BS.1770 have now been established and will probably replace part of the existing level recommendations. The work of P-LOUD is almost finished. It was accompanied by the publication of the following bundle of recommendations:
- EBU Technical Recommendation R128: Loudness Normalization and Permitted Maximum Level of Audio Signals
- EBU Technical Document 3341: Loudness Metering. ‘EBU Mode’ Metering to Supplement Loudness Normalization According to EBU Technical Recommendation R128
- EBU Technical Document 3342: Loudness Range. A Descriptor to Supplement Loudness Normalization According to EBU Technical Recommendation R128
- EBU Technical Document 3343: Practical Guidelines in Accordance with EBU Technical Recommendation R128
- EBU Technical Document 3344: Practical Guidelines for Distribution Systems in Accordance with EBU Technical Recommendation R128
These documents are authoritative because they define concrete specifications for measuring normalised loudness in broadcasting and for complying with it. This covers the target loudness and the characteristics of measuring instruments, as well as corresponding recommendations that address, amongst others, production, archiving and program exchange. Please note that the EBU group also gives dedicated workshops where broadcasters can learn what EBU R 128 loudness monitoring and levelling is all about and how it can be implemented.
How to solve loudness variations of different sources?
With the help of the above-mentioned recommendations, two fundamental problems of audio reproduction can be solved. The first problem is the loudness jumps described above, which are observed when programs from different sources are not loudness normalised (see also our demo video). To carry out loudness normalisation according to EBU R128, you first need a loudness meter that implements the ITU-R BS.1770 algorithm and an “EBU Mode” following EBU Technical Document 3341. With it you can measure the so-called “Integrated Loudness” LUFS “I”, which is the average over the complete audio sequence.
As an example, let’s assume a loudness of -20 LUFS is measured for a given media item. The actual loudness normalisation is then done by a leveller, which adapts the measured integrated loudness to the EBU R128 target of -23 LUFS. In this example the audio level has to be reduced by 3 LU (or 3 dB) to meet the target of -23 LUFS.
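The arithmetic of this example can be sketched in a few lines of Python. This is only an illustration of the levelling step, not the NoTube leveller itself; the function names are hypothetical, and the integrated loudness is assumed to have been measured already by a BS.1770-compliant meter.

```python
EBU_R128_TARGET_LUFS = -23.0  # target level named in the text


def normalisation_gain_db(measured_lufs, target_lufs=EBU_R128_TARGET_LUFS):
    """Gain in dB (equal to the offset in LU) that moves the measured level to the target."""
    return target_lufs - measured_lufs


def db_to_linear(gain_db):
    """Convert a gain in dB to the linear factor applied to each audio sample."""
    return 10.0 ** (gain_db / 20.0)


gain_db = normalisation_gain_db(-20.0)  # the -20 LUFS item from the example
scale = db_to_linear(gain_db)
print(f"gain: {gain_db:+.1f} dB, sample scale factor: {scale:.3f}")
```

For the -20 LUFS item this gives a gain of -3 dB, i.e. every sample is multiplied by roughly 0.71.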
How to solve Loudness Range adaptation to different listening environments?
The second fundamental problem of audio reproduction occurs when the “dynamic range” of a program, i.e. the range between the soft “ppp” and loud “fff” passages of an audio program, does not fit the reproduction environment. Assume a symphony with a high dynamic range is to be reproduced in a city flat with clearly perceptible traffic noise. After setting the volume to a comfortable value, the ppp passages may well be masked by the ambient noise. Up to now the relevant signal characteristics have been specified as “dynamic range”, implying a relationship between defined audio levels. Now they can be adequately specified by reference to EBU R128. In detail, the definition is based on the level statistics of “short-term” loudness, with the actual loudness range LRA defined as the range between the 10th and 95th percentiles.
Typical values of LRA are:
- small LRA < 5 LU
- medium LRA ~10 LU
- large LRA > 15 LU.
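The percentile definition of LRA given above can be sketched as follows. This is a simplified illustration, not our analysis program: it assumes the short-term loudness values (in LUFS) have already been measured, it uses the -70 LUFS silence threshold and the -20 LU loudness range gating offset that appear in the descriptor list further down, and the nearest-rank percentile lookup is an assumption made for brevity.

```python
import math


def loudness_range(short_term_lufs, abs_gate=-70.0, rel_gate_offset=-20.0):
    """Estimate LRA in LU from a list of short-term loudness values in LUFS."""
    # Absolute gate: discard silence.
    gated = [l for l in short_term_lufs if l > abs_gate]
    if not gated:
        return 0.0
    # Relative gate: threshold sits rel_gate_offset LU relative to the
    # mean of the remaining values, averaged in the power domain.
    mean_power = sum(10.0 ** (l / 10.0) for l in gated) / len(gated)
    rel_threshold = 10.0 * math.log10(mean_power) + rel_gate_offset
    gated = sorted(l for l in gated if l > rel_threshold)

    def percentile(p):
        # Nearest-rank lookup (assumption; other interpolation rules exist).
        return gated[round((len(gated) - 1) * p / 100)]

    return percentile(95) - percentile(10)


# Hypothetical short-term values rising from -40 to -20 LUFS in 1 LU steps.
example = [float(v) for v in range(-40, -19)]
print(loudness_range(example), "LU")
```

With these made-up values the 10th percentile is -38 LUFS and the 95th is -21 LUFS, so the sketch reports 17 LU, which would count as a large LRA in the list above.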
The following figure presents LRA measurement results of a selection of typical TV test items used in the user evaluation.
The following figures show typical listening situations. In these figures the reproducible LRA is bounded by the corresponding maximum listening sound pressure level (SPL) and the noise level in each case. It is easy to see which LRA each listening situation “allows”, in other words where the corresponding music can be listened to in a relaxed way without particular passages being masked by the ambient noise.
If the LRA of a program exceeds the limits of the listening situation, there are procedures that can help to adapt the audio signal to the reproduction environment. Normally this is achieved with so-called audio compressors. The LRA of an individual program can be matched to the listening situation by choosing suitable static and dynamic response curves.
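To make the term “static response curve” concrete, here is a minimal sketch of a hard-knee downward compressor curve. It is a generic textbook form, not the compressor planned for NoTube; the threshold and ratio values are arbitrary assumptions, and the dynamic behaviour (attack and release times) is left out entirely.

```python
def static_curve_db(level_db, threshold_db=-30.0, ratio=2.0):
    """Map an input level (dB) to an output level (dB) on a hard-knee curve."""
    if level_db <= threshold_db:
        # Below the threshold the signal passes unchanged.
        return level_db
    # Above the threshold every `ratio` dB of input yields 1 dB of output.
    return threshold_db + (level_db - threshold_db) / ratio


for level in (-40.0, -30.0, -20.0, -10.0):
    print(level, "->", static_curve_db(level))
```

With these assumed settings a 30 dB span of input levels (-40 to -10 dB) is squeezed into a 20 dB span of output levels. A make-up gain would normally follow to bring the quieter passages above the ambient noise.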
Analysis stage of the loudness harmonization module
With respect to loudness harmonization in NoTube, IRT has so far concentrated on audio analysis. For this purpose we have developed a software program that calculates ITU-R BS.1770 and EBU R128 compliant audio descriptors of the audio file under test. A prototype was demoed at the Loudness Workshop at IRT in January 2011. The input is either given as the URL of the audio file, or the file is moved onto the input mask by simple “drag & drop”.
The output of the analysis program is the following list of audio descriptors referring to EBU R128.
URL = file:/.../Blechschaden-Blue%20Brass.wav
File: /.../Blechschaden-Blues Brass.wav
Target Loudness (integrated) = -23.0 LUFS
Peak Sample Headroom = -1.0 dBFS
Silence Threshold Loudness = -70.0 LUFS
Loudness Gating Offset = -8.0 LU
Loudness Range Gating Offset = -20.0 LU
Loudness Integration Time = 0.4 s
Loudness Range Integration Time = 3.0 s

Loudness (integrated) (ungated) = -26.146193 LUFS
Loudness (integrated) (gated) = -22.504574 LUFS
Loudness Range (ungated) = 28.604418 LU
Loudness Range (gated) = 24.440937 LU
Peak Sample = -7.4672046 dBFS
True Peak = -7.464197 dBFS
Peak Loudness (momentary) = -17.74213 LUFS
Peak Loudness (short-term) = -20.006647 LUFS
Gain Needed For Equalized Loudness (safe) = -0.49542618 dB
Gain Needed For Equalized Loudness = -0.49542618 dB
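The “Gain Needed” values in this listing are consistent with subtracting the gated integrated loudness from the target loudness, as the short check below shows. The “safe” variant is sketched here under an assumption that is ours and not documented above: that the gain is capped so the signal peak does not rise above the configured headroom level. The function names are hypothetical.

```python
def gain_for_target(gated_loudness_lufs, target_lufs=-23.0):
    """Gain in dB needed to bring the gated integrated loudness to the target."""
    return target_lufs - gated_loudness_lufs


def safe_gain(gain_db, peak_dbfs, headroom_dbfs=-1.0):
    """ASSUMPTION: cap the gain so the peak stays at or below the headroom level."""
    return min(gain_db, headroom_dbfs - peak_dbfs)


gain = gain_for_target(-22.504574)  # gated integrated loudness from the listing
print(round(gain, 6), round(safe_gain(gain, -7.464197), 6))
```

For this file the peak lies well below -1 dBFS and the gain is negative, so under this assumption the cap never applies and both values coincide, as they do in the listing.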
Synthesis stage of the loudness harmonization module
In the first phase of the synthesis stage we carried out loudness normalisation based on the descriptor “Integrated Loudness” LUFS “I”, both “gated” and “ungated”, for a selection of typical TV test items (16 short video clips). In addition, the loudness-normalised clips, together with the originals and peak QPPM-normalised versions (-9 dB QPPM, 10 ms integration time), were evaluated by ranking (rank indication: 1=excellent / 2=good / 3=fair / 4=poor). This evaluation shows that loudness harmonisation using the loudness descriptors derived in the analysis stage of the loudness module already gives a significant improvement over both the original video clips without any normalisation and the current European QPPM normalisation (see following figure).
In a further phase of the synthesis stage we specifically want to make use of the audio descriptor loudness range (LRA). By developing suitable audio compressors we want to adapt the LRA of the reproduced program to the individual listening environment in the NoTube replay module. The compressors have to meet high quality requirements: their static and dynamic characteristics must be designed so that the LRA can be varied without perceivable artifacts such as “pumping”, “noise tails”, “distortion” etc.
One important aspect of the investigations into loudness range is to check whether the compression can really be performed in the NoTube replay module by interpreting the corresponding metadata. Alternatively, a number of LRA variants could be generated at the broadcast side, from which NoTube users can request the one that suits their listening environment.
(This post was contributed by Gerhard Spikofski, IRT)