First Results of the User Evaluation on Loudness Harmonisation on the Web

An earlier NoTube evaluation of different loudness harmonisation methods, carried out as part of our work on loudness normalisation, clearly showed the excellent performance of EBU-R 128 (see also our previous blog post). However, that test was carried out under quasi-studio listening conditions. Home environments cover a wide range of listening conditions, which we are currently investigating in a dedicated Loudness Web Evaluation that was launched in October 2011.

In this evaluation we consider the actual listening conditions of the participants. Based on the tools defined in EBU-R 128 and the corresponding technical documents, we want to investigate the interdependence between loudness harmonisation, loudness range characteristics and listening conditions. In the context of NoTube, the user’s listening environment is typically computer based, with, among others, built-in laptop speakers or small external PC speakers. Taking into account the ongoing convergence of internet and TV (e.g. HbbTV), we also consider studio-quality stereo systems and home cinema speaker systems.

The evaluation is based on a set of short video clips covering different genres such as “Movie”, “Commentary”, “Concert”, “Sport”, “Show”, “Commercial” and “News”. The audio of the selected clips was varied with respect to Programme Loudness and Loudness Range. In each case three variants are presented to the user, and the participant’s task is simply to indicate the preferred variant.

Although EBU-R 128 defines a Target Programme Loudness, it does not specify any relation to the reproduced sound pressure level. This relation depends on individual taste, personal preferences, user habits and the listening environment. Loudness evaluations are therefore strongly influenced by the “individual volume” set by the listener. To include this important parameter, we prepared a procedure to adjust and register the “individual volume” (see below). It defines the individual reference volume, which must stay constant during the complete evaluation, so the test persons are asked not to change the volume once it has been adjusted in this procedure. In contrast to other audio evaluations, where a fixed listening level is prescribed, each participant can choose his preferred listening level individually.

Loudness Adaptation

For the evaluation of the loudness adaptation, short extracts were selected from ten video clips representing different genres. The three variants of audio adaptation under test were produced by measuring and adapting the Programme Loudness with tools that follow EBU-R 128.
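As a rough illustration of what adapting the Programme Loudness means in practice, the sketch below computes the static gain that would move a clip from its measured Programme Loudness to the EBU-R 128 target of -23 LUFS. The clip names and measured values are made up for illustration, and the function names are our own. This is not the tool chain used to prepare the test items, and it does not reproduce the three variants listed in Table 1.

```python
TARGET_LUFS = -23.0  # Target Programme Loudness recommended by EBU-R 128


def normalisation_gain_db(measured_lufs: float, target_lufs: float = TARGET_LUFS) -> float:
    """Static gain in dB (1 dB of gain shifts loudness by 1 LU) to reach the target."""
    return target_lufs - measured_lufs


def db_to_linear(gain_db: float) -> float:
    """Convert a gain in dB to a linear amplitude factor."""
    return 10.0 ** (gain_db / 20.0)


# Hypothetical measured Programme Loudness values, for illustration only.
measured = {"news": -19.5, "movie": -27.0}
gains = {name: normalisation_gain_db(lufs) for name, lufs in measured.items()}

for name, gain in gains.items():
    print(f"{name}: {gain:+.1f} dB (amplitude factor {db_to_linear(gain):.3f})")
```

A clip that is louder than the target receives a negative gain, a quieter one a positive gain. The Loudness Range of the clip is not affected by such a static offset.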

Table 1: Audio Adaptation Types

Loudness Range Adaptation

For the evaluation of the Loudness Range adaptation, five extracts from video clips of different genres were selected. The Loudness Range adaptation used for the test is strictly based on the measurement of the loudness descriptor “Loudness Range (LRA)” as specified in EBU-TECH 3342. The following figure shows, for each of the five loudness range items under test, the resulting LRA values in LU (Loudness Units as defined in EBU-R 128), both uncompressed and after LRA compression.

Figure 1: Loudness Range of items under test (each uncompressed and adjusted compression c1_15 dB/c1_25 dB)
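For readers unfamiliar with the LRA descriptor, the following is a simplified sketch of how a Loudness Range value can be derived from a series of short-term loudness measurements: an absolute gate at -70 LUFS, a relative gate 20 LU below the mean of the remaining values, and then the spread between the 10th and 95th percentiles. It assumes the short-term loudness values (3-second windows) have already been measured, uses a simple nearest-rank percentile, and is not a compliant implementation of EBU-TECH 3342 nor the tool used for the test items.

```python
import math

ABS_GATE_LUFS = -70.0  # absolute gate
REL_GATE_LU = -20.0    # relative gate, applied below the absolute-gated mean


def _energy_mean_lufs(values):
    """Mean loudness computed in the power domain, returned in LUFS."""
    power = sum(10.0 ** (v / 10.0) for v in values) / len(values)
    return 10.0 * math.log10(power)


def _percentile(sorted_values, pct):
    """Nearest-rank style percentile on an already sorted list."""
    idx = round((len(sorted_values) - 1) * pct / 100.0)
    return sorted_values[idx]


def loudness_range(short_term_lufs):
    """Simplified LRA in LU from a list of short-term loudness values in LUFS."""
    gated = [v for v in short_term_lufs if v >= ABS_GATE_LUFS]
    if not gated:
        return 0.0
    rel_threshold = _energy_mean_lufs(gated) + REL_GATE_LU
    gated = sorted(v for v in gated if v >= rel_threshold)
    return _percentile(gated, 95) - _percentile(gated, 10)


# Made-up short-term loudness values from -40 to -20 LUFS in 1 LU steps.
example = [float(v) for v in range(-40, -19)]
print(loudness_range(example))
```

The two gates keep silence and very quiet background passages from inflating the result, and the percentiles make it robust against isolated loud events.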

Listening level

To enable the test person to adjust and indicate his individual listening level (sound pressure), a short extract from a news broadcast was selected as the test item. Our ears are especially familiar with the sound of human speech, so a news anchor is well suited to help the listener find a comfortable (or even the correct, i.e. original) listening level. The listening level finally adjusted by the user is taken as the reference volume and should not be changed during the complete test procedure. The individually adjusted listening levels are registered using a special test signal containing announcements at different loudness levels, where each announced level corresponds to the loudness at which it is played.

These level announcements are presented after the individual reference listening level has been adjusted. The participants are asked to indicate the first level announcement they can clearly understand by clicking the corresponding button. This method is known as the “Hearing Threshold Method” and is presumed to be notably precise. The announced levels differ by 7 LU. From a psychoacoustic point of view, this difference can be described as “clearly distinguishable”. The relationship to the corresponding sound pressure level can be estimated by measuring the average reproduced sound pressure level of the anchorman in an individual listening condition. The relationship between announcement and loudness level is presented in the following table.

Table 2: Listening level announcements – Relationship between reference loudness and sound pressure listening level
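To give a feeling for the 7 LU spacing between announcements, the sketch below builds a ladder of gains in 7 dB steps (a gain change of 1 dB shifts the loudness by 1 LU) together with the corresponding linear amplitude factors. The number of levels and the choice of starting at 0 dB and stepping downwards are assumptions made for illustration. The actual levels of the test signal are those given in Table 2.

```python
STEP_LU = 7.0  # spacing between announced levels, as stated in the text


def announcement_gains(num_levels: int, step_lu: float = STEP_LU):
    """Return a list of (gain_db, linear_factor), loudest announcement first."""
    ladder = []
    for k in range(num_levels):
        gain_db = 0.0 - k * step_lu
        ladder.append((gain_db, 10.0 ** (gain_db / 20.0)))
    return ladder


ladder = announcement_gains(5)  # 5 levels is a hypothetical count
for gain_db, factor in ladder:
    print(f"{gain_db:6.1f} dB -> amplitude factor {factor:.4f}")
```

Each 7 LU step scales the signal amplitude by a factor of roughly 0.45, which is why neighbouring announcements are easy to tell apart.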

Introduction and evaluation

The adjustment of the listening level is part of the introduction to the evaluation. The introduction also contains a video sequence composed of nine short clips with different loudness adaptations, which familiarises the test person with the loudness adaptations under test. In addition, it includes a questionnaire that collects the age of the test person and the characteristics of the individual listening condition:

  • age
  • type of speaker
  • size of speaker
  • distance to speaker
  • background noise

The loudness evaluation is arranged as follows: each clip under test is presented in the three variants described above. The participant is asked to listen to each of the three variants of the current clip and then to indicate which loudness or loudness range variant he personally prefers, considering his taste and habits as well as his individual listening environment. After the preferred variant has been indicated by clicking the corresponding button, the next clip with its three versions is presented for evaluation. Both the order of the clips and the order of the variants are randomised.
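The randomised presentation described above could be sketched as follows: the clip order is shuffled once, and the variant order is shuffled independently for every clip. The function name, the variant labels and the total of 15 clips (ten loudness items plus five loudness range items in one run) are assumptions for illustration and do not describe the actual survey software.

```python
import random


def presentation_plan(clip_ids, variant_ids, seed=None):
    """Return a list of (clip, [variants]) with clip and variant order randomised."""
    rng = random.Random(seed)  # a seed makes the plan reproducible for testing
    clips = list(clip_ids)
    rng.shuffle(clips)
    plan = []
    for clip in clips:
        variants = list(variant_ids)
        rng.shuffle(variants)  # independent variant order per clip
        plan.append((clip, variants))
    return plan


plan = presentation_plan(range(1, 16), ["A", "B", "C"], seed=42)
for clip, variants in plan[:3]:
    print(clip, variants)
```

Shuffling the variants separately for each clip avoids a fixed button position always carrying the same kind of adaptation, which could otherwise bias the preferences.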

Preliminary Results

The preliminary results presented here are based on the evaluation period starting September 19th 2011. Besides partners from the NoTube consortium, we invited participants from other communities, e.g. the EBU audio expert group “FAR-PLOUD” and the “Surround-sound-Forum” within VDT (Verband Deutscher Tonmeister). The results presented below are only a preliminary subset based on the 48 valid test runs collected so far. At this stage of the analysis, the data has to be considered purely descriptive: only the percentage calculated for each collected attribute is presented. In this preliminary presentation we focus on listening level, listening situation, background noise and the main results of the loudness and loudness range evaluation.
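The descriptive figures reported below are simple percentage distributions per attribute. A minimal sketch of that calculation is shown here. The answers in the example are made up and are not the survey data, and the function name is our own.

```python
from collections import Counter


def percentage_distribution(answers):
    """Map each distinct answer to its share of all answers, in percent."""
    counts = Counter(answers)
    total = sum(counts.values())
    return {value: 100.0 * n / total for value, n in counts.items()}


# Made-up answers for one attribute, NOT the survey data.
sample = ["weak", "weak", "none", "weak", "strong", "none", "weak", "none"]
shares = percentage_distribution(sample)
for value, share in sorted(shares.items(), key=lambda item: -item[1]):
    print(f"{value}: {share:.1f}%")
```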

The distribution of the listening level in figure 2 shows that the majority of participants chose a rather high listening level. The preferred level was Level 3, which was selected by almost 40% of the test persons.

Figure 2: Distribution of Listening level

The distribution of the speaker type in figure 3 shows a nearly equal representation of headphones (both in-ear and on-ear), built-in speakers and external speakers (proportion of subwoofers related to PC and stereo speakers only) with a predominance of headphones.

Figure 3: Distribution of Speaker Type

The distribution of the indicated background noise in figure 4 shows a clear dominance of weak background noise. Only one user reported strong background noise.

Figure 4: Distribution of Background Noise

The results of the evaluation of both loudness and loudness range adaptation are presented in the following figures.

Figure 5: Preferences of Loudness Adaptation

Figure 6: Preferences of Loudness Range Adaptations

Conclusions

The first conclusions that can be drawn from the descriptive data presented above confirm the excellent performance of EBU-R 128 with respect to Target Programme Loudness. This answers the open question of whether the preference for loudness harmonisation following EBU-R 128, which was clearly observed in the previous evaluation, depends on individual listening conditions. For the listening conditions covered in this loudness web evaluation, no such influence is observable.

With respect to the different loudness range adaptations, a tendency can be identified. The participants seem to prefer medium or even strong loudness range compression (compression characteristics c1_15 dB/c1_25 dB in figure 1) over uncompressed audio with a high loudness range. A first analysis of the data for correlations between loudness range preferences and the type of speaker or background noise showed no noticeable interrelation. The result could nevertheless be an indication that loudness ranges of 25 LU and more do not suit home listening environments or the expectations of the majority of home listeners. To answer these questions, an analysis of a larger dataset is necessary.
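A first look at such interrelations can be taken with a simple cross-tabulation, counting how often each preferred variant occurs per speaker type. The sketch below shows the idea. The records, field names and variant labels are invented for illustration and are not taken from the survey data, and no statistical test is implied.

```python
from collections import defaultdict


def crosstab(records, row_key, col_key):
    """Count occurrences of each (row, column) combination in a list of dicts."""
    table = defaultdict(lambda: defaultdict(int))
    for rec in records:
        table[rec[row_key]][rec[col_key]] += 1
    return {row: dict(cols) for row, cols in table.items()}


# Invented records, NOT the survey data.
records = [
    {"speaker": "headphones", "preferred": "c1_15"},
    {"speaker": "headphones", "preferred": "c1_25"},
    {"speaker": "built-in", "preferred": "c1_15"},
    {"speaker": "built-in", "preferred": "c1_15"},
    {"speaker": "external", "preferred": "uncompressed"},
]
table = crosstab(records, "speaker", "preferred")
for speaker, prefs in table.items():
    print(speaker, prefs)
```

With only a few dozen test runs, the cells of such a table are sparsely filled, which is one reason why a larger dataset is needed before drawing firm conclusions.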

Call for participation

We thus invite you to participate in the web evaluation which will be open for another two weeks. The test can be carried out online. Simply go to http://survey.irt.de/notube to take part and to learn more about Loudness Normalisation on the Web! The evaluation will be open until Friday, December 16th 2011.

(This post was contributed by Gerhard Spikofski, IRT) 

About Peter

I'm a multimedia research engineer working in the television production systems department at Institut für Rundfunktechnik. We are based in Germany's greatest outdoor city: Munich - which is why I love working and living here... :)