NoTube: bringing Web and TV closer together

The NoTube project was an EU-funded project that ran from February 2009 to January 2012.

When NoTube began, its vision of bringing Web and TV closer together via shared data models and content across multiple devices was ambitious. As we close, it is noteworthy how much of the TV industry has caught up with this vision, at least in individual, closed technologies and products. Yet the NoTube results are now more relevant than ever: TV platforms remain proprietary and cross-device communication remains non-standardized. NoTube services could form the backbone for personalised TV applications in which users still control their own data. See this slideset for an overview of NoTube’s activity and results.

In particular, these results have been used to create a set of NoTube showcases: personalised news from RAI, a personalised programme guide and advertising from Stoneroos and Thomson, and personalised social TV and second-screen sharing from the BBC.

NoTube also closes with a number of Web-based demonstrators of some of the technologies created in the project, including N-Screen content recommendation and sharing, Beancounter social Web user profiling, and the NoTube services accessible via the NoTube portal.

There’s plenty more in NoTube, including TV metadata mappings, the LUpedia concept extraction service, automated advertisement insertion in video, and much more that you can explore by looking at the Research topics and Things to read. Or just contact us to explore how to benefit from NoTube’s results or to collaborate with us on new projects!

Posted in About Notube | Leave a comment

Be part of our user research into Social Web & TV

Please help us with our user research by taking a moment to fill in our survey, which collects opinions about the implications of integrating the Social Web with TV, including attitudes to privacy, sharing of personal data, and user control.

We’ll be reporting back on our findings from the survey on this blog soon. Many thanks in advance for your help.

Go to NoTube’s Social Web and TV survey.

Posted in Evaluations, Privacy, Recommendations, Social TV, User Experience | Leave a comment

IRT publishes NoTube article in FKT Journal

IRT has published a comprehensive article about NoTube in the current edition of the German-language professional journal FKT. An extract of the article, which focuses on TV metadata interoperability, can be found at http://fkt.schiele-schoen.de/117/17029/FKT21201_55/NoTube_Metadaten_fuer_TV_und_Web.html.

Posted in Uncategorized | Leave a comment

Visualisation of key findings from N-Screen user testing

N-Screen user testing - summary of findings

You can read a more detailed description of the results in a previous blog post.

Posted in Evaluations, Recommendations, Second screens, Social TV, User Experience | Leave a comment

Two days left to participate in NoTube’s Loudness Web Evaluation!

In NoTube we are investigating loudness normalisation in multimedia environments. To investigate different listening situations and user preferences, we launched a web evaluation in October 2011 and have already published some preliminary results.

To gather more test data and improve the results, we are again inviting everyone to participate in the web evaluation, which can be carried out online until Friday, December 16th 2011. So there are now only two days left! Simply go to http://survey.irt.de/notube to take part and to learn more about Loudness Normalisation on the Web!

Posted in Audio, Evaluations, Loudness | 1 Comment

Preliminary findings of N-Screen user testing

Libby and I recently spent two days testing NoTube’s ‘N-Screen’ prototype with members of the public at the BBC R&D user testing lab in London.

As Libby has described previously on this blog, N-Screen is a second screen prototype application designed to help a small group of people explore a collection of on-demand programmes and choose one to watch together in real-time, with participants either being in the same room together or in separate locations. The scenario imagines a future world in which most people will have their own personalised connected device such as a tablet or smartphone.

We recruited ten participants to test the app: five men and five women across a spread of ages between 20 and 64. All participants described themselves as TV enthusiasts, regularly watching at least 2 hours of TV a day.

Following some introductory questions about watching TV in general, during each session we showed the participant a version of N-Screen containing BBC iPlayer catch-up programmes (about 1,000 programmes) and walked them through a group-watching scenario – with Libby and me taking the role of the participant’s N-Screen ‘friends’!

Programme suggestions and explanations

N-Screen supports different TV recommendation and browsing strategies across the spectrum from cold-start to fully personalised, combined into a single user interface. This provides multiple ways of helping people to find something interesting to watch from a large collection of video content.

Suggestions for you

Each participant in an N-Screen group starts with a different set of personalised programme recommendations based on NoTube’s Beancounter user profiling service. For testing we had to show mock-up examples of these types of suggestions, and we had to ask our participants to imagine these were based on their user profile. Despite this, all the participants liked the concept of seeing programme suggestions based on things they’d done in the past.

Quite a good idea…might bring up programmes that haven’t come to your attention.

Really good – tries to tailor it to me.

"Suggestions for you" in N-Screen

However, they weren’t so keen on the idea of getting recommendations based solely on age and/or gender. Several participants thought that this just wouldn’t work for them because they thought that their TV tastes didn’t match their age or gender profile; others thought that people didn’t like to be “pigeon-holed” in this way.

Tapping on a programme suggestion in N-Screen displays an overlay with a brief programme synopsis, and an explanation as to why it has been recommended – for example: “Recommended because you watched That’s Britain which also has Nick Knowles in it”. The idea here is to present the pathways through the Linked Data graph showing the connection that led to a recommendation being made. Several participants were particularly keen on the idea of being suggested programmes based on links between the people in them, such as actors or TV personalities, as long as those people are considered significant and interesting.

I like the idea of plotting actors through their career… I like it that you’ve gone for the actor – I want to see more of specific actors I like.

That’s a good idea if it’s a particular actor you follow…but if it was an actor in Eastenders I don’t know if that would really appeal to you because you’re watching Eastenders for Eastenders, and not necessarily the actor. But if it was Kelly Holmes and I’d like her as a character and I saw she was in Bargain Hunt, then I’d think let’s watch that.

I like Stephen Fry and I would be interested to see what he’s doing.

 I like the idea that it’s also got Ian Hislop in it.

 Someone’s in it who you like… A way of trying out something new that you might not know about – I like that.

However, in general, people didn’t seem to care as much about the explanation for a recommendation as we’d expected, based on research we’d read about the value of explanations for enhancing users’ trust in recommendation systems. It’s possible, though, that the explanations would have had more resonance with our participants if they’d been based on the individual’s real activities and data.
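
To make the explanation overlay described above a little more concrete, here is a minimal sketch of how a “recommended because…” string could be derived from a shared link between two programmes. It is not the N-Screen or Beancounter implementation: the `CAST` table, the placeholder entries and the `explain` function are assumptions made up for illustration, and a real system would follow links in a Linked Data graph rather than a hard-coded dictionary.

```python
from typing import Dict, Optional, Set

# Hypothetical toy data standing in for a Linked Data graph:
# programme title -> people who appear in it.
CAST: Dict[str, Set[str]] = {
    "That's Britain": {"Nick Knowles"},
    "DIY SOS": {"Nick Knowles"},
    "Programme X": {"Person Y"},  # placeholder entry with no shared link
}


def explain(watched: str, candidate: str,
            cast: Dict[str, Set[str]] = CAST) -> Optional[str]:
    """Return an explanation if the two programmes share a person, else None."""
    shared = sorted(cast.get(watched, set()) & cast.get(candidate, set()))
    if not shared:
        return None
    # Use the first shared person (alphabetically) as the visible pathway.
    return (f"Recommended because you watched {watched} "
            f"which also has {shared[0]} in it")


print(explain("That's Britain", "DIY SOS"))
```

Returning `None` when no link exists leaves the interface free to fall back to another kind of suggestion instead of showing an empty explanation.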

More like this

Beneath the main programme information, the overlay screen also shows a list of programmes related to the selected programme based on collaborative filtering techniques. The idea here is to expand out the selection of potentially interesting programmes for the user because the list of personalised programmes could be quite small. Again, all our participants thought that these types of suggestions could be useful as another means of bubbling up content that may be of interest, but they didn’t find the associated Amazon-style explanations (“Recommended because people who watched DIY SOS also watched this”) particularly useful.

N-Screen recommendations and explanations
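
As a rough illustration of the “people who watched X also watched this” idea, the sketch below counts how often other programmes co-occur with a given programme in viewing histories. This is only one simple form of item-based collaborative filtering and is not necessarily the technique used in N-Screen; the function name, the `histories` structure and the sample titles are assumptions.

```python
from collections import Counter
from typing import Iterable, List, Set


def also_watched(histories: Iterable[Set[str]], programme: str,
                 top_n: int = 3) -> List[str]:
    """Rank programmes by how often they were watched alongside `programme`."""
    counts: Counter = Counter()
    for watched in histories:
        if programme in watched:
            # Count every other programme in this viewer's history once.
            counts.update(set(watched) - {programme})
    # Sort by descending count, then by title, so the result is deterministic.
    ranked = sorted(counts.items(), key=lambda item: (-item[1], item[0]))
    return [title for title, _ in ranked[:top_n]]


# Hypothetical viewing histories, one set of titles per viewer.
histories = [
    {"DIY SOS", "Programme A"},
    {"DIY SOS", "Programme A", "Programme B"},
    {"Programme B", "Programme C"},
]
print(also_watched(histories, "DIY SOS"))
```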

Random selection

N-Screen also offers a “random selection” option as an alternative means of surfacing content buried in the video collection, or for times when a user might reach a dead-end with the recommendations approach.  The idea is to add an extra element of serendipity to the experience. Our user trial of the NoTube Archive Browser prototype, conducted earlier this year, suggested that people found interesting new programmes in a BBC archive collection regardless of whether they saw similar or random programmes.

Most participants said they thought they would find this feature useful as another way of finding new programmes, and a couple of them said it was one of the things they liked best about N-Screen – although we did discover a few usability issues with the user interface.

Sometimes you get stuck. It’s like shuffle on iPod – definitely a good idea.

If you’re not sure what you want or what you’re in the mood for, if you didn’t want to watch the usual…yes I’d give it a try.

"Random selection" in N-Screen

Sharing and receiving suggestions from friends

Finding interesting niche video content and using drag and drop to share these ‘hidden gems’ with friends is key to the N-Screen design. Since the earliest iterations of N-Screen, this aspect of the user experience has always appealed to people – together with the accompanying whooshing noise which provides an audio cue that you’ve received a suggestion from a friend. Similarly, once they’d got the hang of it, the majority of our participants also enjoyed dragging and dropping to swap programme suggestions, because they found it “simple”, “fun”, and “instant”.

We discovered a few initial usability issues around grabbing items to drag, and dropping them in the right place, and most people didn’t realise at first that programme items could be dragged. However, it’s possible that this was because 9 out of 10 of them were not tablet owners, and were therefore not familiar with using drag-and-drop, an interaction style that is becoming increasingly common with the rise in numbers of touchscreen devices.

If all tablets are as easy to use as this then I’d be happy to drag and drop things – it’s so simple, really easy.

Easy to use, even if you hadn’t been here I would have figured it out…it’s easy to drag and drop.

Using drag and drop to share programme suggestions with friends

The idea of sharing and receiving suggestions for things to watch with friends in this way was a highlight of the app for many. However, not all participants imagined that this would necessarily be done in real-time; several of them talked about swapping programme ideas to watch later, in the same way that they might currently use email or texts to send links to interesting programmes.

I don’t think it’s on to recommend things for others to watch instantly.

Neither could the majority of the participants see themselves using N-Screen on multiple devices in the same room as other people; they couldn’t see the point.

Sharing…I think it works more if we’re in a different location. I couldn’t imagine using it in the same room. What would be the point of that?

It takes away the point of chatting.

Obviously that would be us being lazy and not wanting to talk to each other.

It’s considered impolite to get your iPad out when you’re in a social group.

Several participants said they could imagine scenarios for using N-Screen with friends or family located remotely, but these tended to be associated with one specific individual rather than a group: with “my mum”, “my friend back home”, or “my best friend”. Again, several participants talked about sharing items that could be watched later, rather than immediately.

Not for watching something instantly, only for making suggestions for things you could choose to watch later if you wanted to.

My mother keeps saying ‘you should watch this’ – and I’m not always able to at the time she suggests…If my mother sends me a text about a show, I might not be looking at the phone, so it would be good if she could use this and we could watch apart or together, and we could watch it now or later.

I’d liken it to a reading group – I wouldn’t say ‘let’s all watch Eastenders now all together’ but I’d imagine creating a list that you watch on your own later.

Sharing with the group

When we asked participants if they would share different programmes with the whole group in N-Screen, rather than with specific individuals in it, their responses suggest that empathy in considering other people’s preferences is a strong influence over deciding which programmes to share with whom.

Yes, it would depend on the friends and their tastes. For some programmes, like Strictly [Come Dancing], that everyone likes this would be great. I’m into scifi but not all my friends are.

There’s only certain people you can recommend things to on this scale. So it would be limited to people I thought who would be interested. Some friends I have nothing in common with taste-wise when it comes to entertainment.

Yes, I know different friends like different things.

Changing the TV display

Once the group has decided what to watch, the idea is that one of the N-Screen participants drags the programme to the TV icon in the top-right to start playing the programme on the TV screen. All the participants liked this feature.

Ah, this is too much, it’s awesome. I love this.

Fantastic, you’re often watching something on the iPlayer and you really want it on the TV, so this really speeds things up. It’s amazingly quick.

That’s magic. I like that.

Changing the TV

For the scenario in which N-Screen friends are remote, our initial idea was that they could watch something ‘together apart’, with their TVs being synchronised – so that dragging a programme to the ‘shared TV’ icon would automatically start playing the programme on everyone’s TV.

Some participants caught on to the idea of watching ‘together apart’ in real-time and thought it could work well.

I like the idea of me being in my house and a friend being in their house and watching something at the same time, but not in the same place.

That we’d all watch together at the same time, simultaneously in real-time is good.

 If they had the same set-up, I’d expect them to watch same thing at the same time.

However, nearly all participants were against the idea of programmes on someone else’s TV being changed remotely and the majority felt that each individual should be in control of their own TV, unless explicit permission had been given.

I would hope they’d be in control of what they’re watching. I can’t change their TV can I? I’d like to warn them I’m about to change their TV…

It shouldn’t change the other person’s TV. It would be like a ghost…

I’d like to drag and drop for my own TV but I’d be annoyed if someone else changed my TV – we’d end up having wars! It takes control away.

A couple of participants mentioned that a small alert in the corner of the other person’s TV screen might be a useful compromise.

Some initial conclusions

  • Overall, participants were complimentary about trying out N-Screen; they mostly liked it and found it fun and easy to use, but not necessarily for collaborative browsing and watching TV in real-time with others.
  • They were positive about the different types of programme suggestions and the concept of sharing and receiving suggestions with friends.
  • However, several of the older participants couldn’t see it replacing other ways of sharing TV recommendations such as texting or emailing. These participants were also sceptical about the concept of getting together with friends to decide what to watch on TV without having pre-planned it.
  • The idea of dragging a programme to the TV icon as a way of controlling what’s playing on the TV was universally liked, so long as it didn’t also change their friends’ TVs.

Next we’ll be taking a closer look at the implications of these findings.


Posted in Recommendations, Second screens, Showcase, Social TV, User Experience | 1 Comment

First Results of the User Evaluation on Loudness Harmonisation on the Web

The evaluation of different loudness harmonisation methods previously carried out in NoTube as part of our work on loudness normalisation clearly showed the excellent performance of EBU-R 128 (see also our previous blog post). However, we have to keep in mind that this test was carried out under quasi-studio listening conditions. Listening set-ups in home environments cover a wide range of conditions, which we are currently investigating in a dedicated Loudness Web Evaluation that was launched in October 2011.

In this evaluation we consider the actual listening conditions of listeners. Using the tools specified by EBU-R 128 and the corresponding technical documents, we want to investigate the interdependence between loudness harmonisation, loudness range characteristics and listening conditions. In the context of NoTube, the user’s listening environment is typically computer-based, with, among others, built-in laptop speakers or small external PC speakers. But taking into account the ongoing convergence of internet and TV (e.g. HbbTV), we also consider studio-quality stereo systems and home cinema speaker systems.

The evaluation is based upon a sufficient number of short video clips covering different genres like “Movie”, “Commentary”, “Concert”, “Sport”, “Show”, “Commercial” and “News”. The audio part of the clips selected for evaluation was varied with respect to Programme Loudness and Loudness Range. In each case three variants are presented to the user. The task of the participants is simply to indicate the preferred variant.

Although EBU-R 128 defines a Target Programme Loudness, it does not specify any relation to the reproduced sound pressure level. This relation depends on individual taste, personal preferences, user habits and the listening environment. Consequently, such loudness evaluations are strongly influenced by the “individual volume” set by the listener. To include this important individual parameter, we prepared a procedure to adjust and register the “individual volume” (see below). It defines the individual reference volume, which should stay constant throughout the evaluation, so test persons are asked not to change the volume level once it has been adjusted in this procedure. In contrast to other audio evaluations, where a constant listening level is prescribed, each participant is able to choose their preferred listening level individually.

Loudness Adaptation

For the evaluation of the loudness adaptation, short extracts were selected from ten video clips representing different genres. The three audio adaptation variants under test were produced by measuring and adapting the Programme Loudness with tools based on EBU-R 128.

Table 1: Audio Adaptation Types
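
The basic arithmetic behind adapting Programme Loudness to a target can be sketched as follows. This is an illustrative calculation only, not the tool chain used for the test items: it assumes the measured Programme Loudness already comes from an EBU-R 128 compliant meter, uses the Target Level of -23 LUFS from EBU-R 128, and the function names are our own.

```python
TARGET_LUFS = -23.0  # Target Level defined in EBU-R 128


def normalisation_gain_db(measured_lufs: float,
                          target_lufs: float = TARGET_LUFS) -> float:
    """Gain in dB that moves the measured Programme Loudness to the target."""
    return target_lufs - measured_lufs


def db_to_linear(gain_db: float) -> float:
    """Convert a gain in dB to the linear factor applied to sample amplitudes."""
    return 10.0 ** (gain_db / 20.0)


# Hypothetical clip measured at -17 LUFS: it has to be attenuated by 6 dB.
gain = normalisation_gain_db(-17.0)
print(gain, db_to_linear(gain))
```

Because the gain is a single static value per clip, this kind of adaptation shifts the overall level without changing the clip’s dynamics.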

Loudness Range Adaptation

For the evaluation of the Loudness Range adaptation, five extracts from video clips of different genres were selected. The Loudness Range adaptation used for the test is strictly based on the measurement of the loudness descriptor “Loudness Range (LRA)” as specified in EBU-TECH 3342. The following figure shows the resulting LRA values in LU (Loudness Units as defined in EBU-R 128) of the five loudness range items under test, uncompressed and after LRA compression.

Figure 1: Loudness Range of items under test (each uncompressed and adjusted compression c1_15 dB/c1_25 dB)
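
For readers unfamiliar with the LRA descriptor, the following sketch shows the general shape of the computation described in EBU-TECH 3342: short-term loudness values are gated (an absolute gate at -70 LUFS, then a relative gate 20 LU below the mean of what remains), and LRA is the spread between the 10th and 95th percentiles of the gated values. It is a simplified illustration, not the measurement tool used for the test items: it assumes the short-term loudness values have already been measured, and it uses a simple nearest-rank percentile that may differ slightly from a particular meter.

```python
import math
from typing import List, Sequence

ABS_GATE_LUFS = -70.0  # absolute gating threshold
REL_GATE_LU = -20.0    # relative gate, applied below the gated mean loudness


def _percentile(sorted_values: List[float], percent: int) -> float:
    """Nearest-rank percentile on an ascending list (rounding half up)."""
    index = int((len(sorted_values) - 1) * percent / 100 + 0.5)
    return sorted_values[index]


def loudness_range(short_term_lufs: Sequence[float]) -> float:
    """Estimate LRA in LU from a sequence of short-term loudness values."""
    abs_gated = [v for v in short_term_lufs if v >= ABS_GATE_LUFS]
    if not abs_gated:
        return 0.0
    # Average in the power domain, then convert back to a loudness level.
    mean_power = sum(10.0 ** (v / 10.0) for v in abs_gated) / len(abs_gated)
    rel_threshold = 10.0 * math.log10(mean_power) + REL_GATE_LU
    gated = sorted(v for v in abs_gated if v >= rel_threshold)
    return _percentile(gated, 95) - _percentile(gated, 10)


# Hypothetical short-term loudness values from -30 to -10 LUFS in 1 LU steps.
values = [-30.0 + step for step in range(21)]
print(loudness_range(values))
```

The gating keeps silence and very quiet background passages from inflating the result, and the percentiles stop a single loud or quiet moment from dominating it.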

Listening level

To enable test persons to adjust and indicate their individual listening level (sound pressure), a short extract from a news broadcast was selected as the test item. Our ears are especially familiar with the sound of human speech, so news anchors are ideally suited to help listeners adjust to a convenient (or even the correct or original) listening level. The individual listening level finally adjusted by the user is taken as the reference volume and should not be changed throughout the complete test procedure. The individually adjusted listening levels are obtained using a special test signal with announcements at different loudness levels, where each announced listening level matches the corresponding loudness level.

These level announcements are presented after the individual reference listening level has been adjusted. The participants are asked to indicate the first level announcement that they can clearly understand by clicking the corresponding button. This method is known as the “Hearing Threshold Method” and is presumed to be notably precise. The announced levels differ by 7 LU. From a psychoacoustic point of view this difference can be described as “clearly distinguishable”. The relationship to the corresponding sound pressure level can be estimated by measuring the resulting reproduced average sound pressure level of the anchorman in an individual listening condition. The relationship between announcement and loudness level is presented in the following table.

Table 2: Listening level announcements – Relationship between reference loudness and sound pressure listening level

Introduction and evaluation

The adjustment of the listening level is part of the introduction to the evaluation. The introduction contains an additional video sequence composed of nine short clips with different loudness adaptations to familiarise the test person with the loudness adaptation under test. The introduction also includes a questionnaire to collect the age of the test person and characteristics of the individual listening condition:

  • age
  • type of speaker
  • size of speaker
  • distance to speaker
  • background noise

The loudness evaluation is arranged as follows. Each clip under test is presented in the three variants described above. The participant is asked to listen to each of the three variants of the current clip and then to indicate which loudness or loudness range variant they personally prefer, considering their taste and habits and their individual listening environment. After the participant indicates the preferred variant by clicking the corresponding button, the next clip with its three versions is presented for evaluation. The order of both the clips and the variants is randomised.
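
A randomised presentation of this kind could be generated along the following lines. This is a hedged sketch rather than the code behind the web evaluation: the function name, the clip identifiers and the variant labels are placeholders, and the optional `seed` is only there to make a plan reproducible.

```python
import random
from typing import List, Optional, Sequence, Tuple


def presentation_plan(clips: Sequence[str], variants: Sequence[str],
                      seed: Optional[int] = None) -> List[Tuple[str, List[str]]]:
    """Shuffle the clip order and, independently, the variant order per clip."""
    rng = random.Random(seed)
    clip_order = list(clips)
    rng.shuffle(clip_order)
    plan = []
    for clip in clip_order:
        variant_order = list(variants)
        rng.shuffle(variant_order)  # a fresh order for every clip
        plan.append((clip, variant_order))
    return plan


# Hypothetical clip identifiers and variant labels.
for clip, order in presentation_plan(["clip1", "clip2", "clip3"],
                                     ["A", "B", "C"], seed=1):
    print(clip, order)
```

Shuffling the variants separately for each clip avoids a preferred variant always appearing in the same button position.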

Preliminary Results

The preliminary results presented here are based on the evaluation period from September 19th 2011. Besides partners from the NoTube consortium, we invited participants from other communities, e.g. the EBU audio expert group “FAR-PLOUD” and the “Surround-sound-Forum” within the VDT (Verband Deutscher Tonmeister). However, the results presented below are only a preliminary subset based on the 48 valid test runs collected so far. In general, the data presented at this stage of the analysis has to be considered purely descriptive: only the calculated percentage for each attribute collected is presented. In this preliminary presentation of the results we focus only on listening level, listening situation, background noise and the main results of the loudness and loudness range evaluation.
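
The purely descriptive percentages mentioned above amount to counting answers per attribute. A minimal sketch follows, with a made-up list of responses that does not reflect the actual survey data:

```python
from collections import Counter
from typing import Dict, Sequence


def percentage_distribution(responses: Sequence[str]) -> Dict[str, float]:
    """Share of each answer as a percentage of all responses, to one decimal."""
    counts = Counter(responses)
    total = sum(counts.values())
    if total == 0:
        return {}
    return {answer: round(100.0 * n / total, 1) for answer, n in counts.items()}


# Hypothetical background-noise answers from four test runs.
print(percentage_distribution(["weak", "weak", "none", "strong"]))
```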

The distribution of the listening level in figure 2 shows that the majority of participants chose a rather high listening level. The preferred level was Level 3 which was selected by almost 40% of the test persons.

Figure 2: Distribution of Listening level

The distribution of the speaker type in figure 3 shows a nearly equal representation of headphones (both in-ear and on-ear), built-in speakers and external speakers (proportion of subwoofers related to PC and stereo speakers only) with a predominance of headphones.

Figure 3: Distribution of Speaker Type

The distribution of the reported background noise in figure 4 shows a clear dominance of weak background noise. Only one user reported strong background noise.

Figure 4: Distribution of Background Noise

The results of the evaluation of both loudness and loudness range adaptation are presented in the following figures.

Figure 5: Preferences of Loudness Adaptation

Figure 6: Preferences of Loudness Range Adaptations

Conclusions

First conclusions on the evaluation of loudness and loudness range adaptation, drawn from the descriptive data presented above, confirm the excellent performance of EBU-R 128 with respect to Target Programme Loudness. This answers the open question of whether the preference for loudness harmonisation following EBU-R 128, which was clear in the previous evaluation, depends on individual listening conditions. For the listening conditions covered in this loudness web evaluation, no influence is observable.

With respect to the evaluation of the different loudness range adaptations, a tendency is identifiable. The participants seem to prefer medium or even strong loudness range compression (compression characteristics c1_15 dB/c1_25 dB in figure 1) over uncompressed audio with a high loudness range. A first analysis of the data for correlations between loudness range adaptation and the type of speaker or background noise showed no noticeable interrelation. On the other hand, this result could indicate that loudness ranges of 25 LU and more do not suit home listening environments or the expectations of the majority of home listeners. But answering these questions requires an analysis of a larger dataset.

Call for participation

We thus invite you to participate in the web evaluation which will be open for another two weeks. The test can be carried out online. Simply go to http://survey.irt.de/notube to take part and to learn more about Loudness Normalisation on the Web! The evaluation will be open until Friday, December 16th 2011.

(This post was contributed by Gerhard Spikofski, IRT) 

Posted in Audio, Evaluations, Loudness | 1 Comment