At present, playing back audio streams from different sources is often an annoying experience for listeners. The loudness levels of these streams can differ widely, so they do not add up to a comfortable listening experience on end users’ devices.
Past measurements have shown that even similar audio streams which are broadcast over the broadcast network and simultaneously on the Web, e.g. Internet radio services, can have very different loudness levels. Even larger differences can be expected when streams from completely different sources are combined to form a Rich Media Service.
Why was this of interest to NoTube?
On the NoTube platform, media from different sources are presented to the end user. This carries a high risk of annoying loudness jumps, for example between news items provided by different broadcasters. A loudness module in the NoTube platform could eliminate this effect by normalising the loudness levels.
What NoTube has done in this area
As a first element of the NoTube loudness module, a loudness analysis component has been developed which analyses the loudness of media entering the NoTube platform. It creates loudness descriptors for each media item, which are subsequently used for normalisation. This metadata is generated by loudness measurements according to ITU-R BS.1770 and EBU Technical Recommendation R128.
The loudness analyser has been implemented as a web service that returns loudness descriptors, one of which characterises the loudness level of each audio item. The extracted loudness descriptors are added to the metadata set associated with the media item. They can then be applied to normalise the loudness level before reproduction, i.e. playback of the media clips.
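How a loudness level descriptor can drive normalisation before playback is sketched below. This is only an illustration, not the NoTube implementation: the function names are hypothetical, and the target of -23 LUFS is the programme loudness target recommended in EBU R128, assumed here as the playback target.

```python
TARGET_LUFS = -23.0  # EBU R128 target level; assumed as the playback target


def normalisation_gain_db(measured_lufs, target_lufs=TARGET_LUFS):
    """Gain in dB that moves a clip from its measured loudness to the target."""
    return target_lufs - measured_lufs


def db_to_linear(gain_db):
    """Convert a gain in dB to a linear amplitude factor."""
    return 10.0 ** (gain_db / 20.0)


def apply_gain(samples, gain_db):
    """Scale every sample by the linear factor that corresponds to gain_db."""
    factor = db_to_linear(gain_db)
    return [s * factor for s in samples]


# Hypothetical clip whose descriptor reports a loudness level of -17 LUFS:
gain = normalisation_gain_db(-17.0)        # 6 dB too loud, so gain is -6.0 dB
quieter = apply_gain([0.5, -0.25, 0.1], gain)
```

Because the gain is a single value per clip, it can be computed once from the stored descriptor and applied at playback time without analysing the audio again.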
Furthermore, harmonising the loudness by applying the metadata from this analysis (the loudness descriptors) has been evaluated in a comprehensive user test. The evaluation shows a significant improvement compared to the original clips without any normalisation, to peak normalisation at 0 dBFS (used in CD mastering) and to the European broadcast PPM normalisation at -9 dBFS.
The following video demonstrates the differences between these ways of normalising the audio level.
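A small numerical sketch can illustrate why peak normalisation alone does not align perceived loudness. It uses two made-up clips and takes the RMS level as a crude stand-in for loudness; this is an assumption for illustration only, since an ITU-R BS.1770 measurement additionally applies frequency weighting and gating.

```python
import math


def peak_dbfs(samples):
    """Peak level in dBFS (full scale = 1.0). Undefined for silent clips."""
    return 20.0 * math.log10(max(abs(s) for s in samples))


def rms_dbfs(samples):
    """RMS level in dB relative to full scale; a rough proxy, not BS.1770."""
    mean_square = sum(s * s for s in samples) / len(samples)
    return 10.0 * math.log10(mean_square)


def peak_normalise(samples, target_dbfs=0.0):
    """Scale the clip so that its highest peak reaches target_dbfs."""
    factor = 10.0 ** ((target_dbfs - peak_dbfs(samples)) / 20.0)
    return [s * factor for s in samples]


# Two hypothetical clips with the same peak but very different density.
dense = [0.5, -0.5, 0.5, -0.5]
sparse = [0.5, 0.0, 0.0, 0.0]

dense_level = rms_dbfs(peak_normalise(dense))    # about 0 dB
sparse_level = rms_dbfs(peak_normalise(sparse))  # about -6 dB
```

Both clips end up with identical peaks at 0 dBFS, yet their average levels still differ by about 6 dB, which is the kind of jump a loudness-based normalisation is meant to remove.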
In another dedicated evaluation of loudness normalisation on the web, we investigated the applicability of ITU-R BS.1770 and EBU R128 to media on the web. The results show that both recommendations work well for the loudness normalisation of web content. For loudness range adaptation the situation is more complex: individual listening situations differ too much to allow a simple solution. Nevertheless, the evaluation showed that loudness ranges higher than 20 LU should be avoided for web content. For detailed results please check this blog post.
The results of the evaluations showed the applicability of ITU-R BS.1770 and EBU R128 to media on the web. However, a generic approach to loudness normalisation on the web with dedicated metadata could not be achieved: neither the concept of a dedicated player with associated audio metadata exchange nor real-time normalisation on the end-user side was applicable in the context of NoTube.
Nevertheless, a proof of concept for a loudness normalisation process on the content provider side was realised in collaboration with RAI. Loudness metadata is used in this process to allow the normalisation to be reversed if required.
The web evaluation also showed that loudness range normalisation is a much more complex issue. Loudness ranges of more than 20 LU should be avoided for web applications, but a generic approach for individual listening situations is not possible for the time being. However, future research building on the findings made in NoTube will be carried out at IRT.
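One way such a reversible, provider-side normalisation could work is sketched below. It is a minimal illustration under stated assumptions, not the process realised with RAI: the function names and the metadata keys `original_lufs` and `applied_gain_db` are hypothetical, and the -23 LUFS target is taken from EBU R128.

```python
def normalise(samples, measured_lufs, target_lufs=-23.0):
    """Apply a static gain and record it so that the step can be undone."""
    gain_db = target_lufs - measured_lufs
    factor = 10.0 ** (gain_db / 20.0)
    metadata = {
        "original_lufs": measured_lufs,  # loudness before normalisation
        "applied_gain_db": gain_db,      # gain that was applied
    }
    return [s * factor for s in samples], metadata


def reverse_normalisation(samples, metadata):
    """Undo the normalisation by applying the inverse of the recorded gain."""
    factor = 10.0 ** (-metadata["applied_gain_db"] / 20.0)
    return [s * factor for s in samples]


# Hypothetical clip measured at -15 LUFS.
original = [0.4, -0.2, 0.05]
normalised, meta = normalise(original, measured_lufs=-15.0)
restored = reverse_normalisation(normalised, meta)
```

Storing the applied gain alongside the media keeps the operation lossless up to floating-point rounding, provided no clipping or re-encoding happens in between.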