Designing a new user interface for NoTube’s Beancounter

Managing the large volumes of data generated by the Social Web presents many challenges. In considering the user experience of NoTube’s Beancounter we have been thinking about how to present this kind of data to users in meaningful ways, as well as ensuring we implement robust models of sharing, privacy and ownership.

The importance of getting these things right is reflected in the growing interest outside of the NoTube project in data mining of activity and social data for the provision of social recommendations.

To re-cap, NoTube’s Beancounter technology supports the automatic generation of an implicit user interests profile. This is based on re-use of an individual’s activity on social media services, such as the content of their tweets and Facebook ‘likes’, to determine their interests. The idea is to re-use the scattered and disparate activity data and make it useful by combining it, looking for patterns, and using it to suggest things to watch.

Beancounter is currently being re-built in the backend. Alongside this we have been working with visual designers from the design agency Fabrique in Amsterdam to develop a new user interface for the Web front-end. This blog post outlines some of the design challenges we have encountered along the way.

Challenge 1: Displaying an overview of your interests

To create your Beancounter profile you need to link one or more social media accounts (Facebook, Twitter, Last.fm etc.) to your Beancounter account. Beancounter retrieves activity data from these sources, interprets the information contained in the activities and matches it to concepts from the Linked Open Data cloud (DBpedia concepts) to give you an overview of your interests. These interests include programmes, movies, people, locations and genres. Each interest is assigned a weight in the profile; the more instances of an interest in your activities, the higher the weight of the interest – and the more influence it has on your recommendations. These weightings change over time as the Beancounter continually adjusts to your activities. For example, if you watch or listen to lots of coverage of The Olympic Games but aren’t generally interested in sport the rest of the time, the weightings for sport-related interests will temporarily increase during The Olympics.
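To make the weighting idea concrete, here is a minimal sketch in Python. It is not the Beancounter's actual algorithm: it assumes each activity has already been matched to a list of interest labels, and it simply uses each interest's share of all mentions as its weight. The function name and the data layout are hypothetical.

```python
from collections import Counter

def interest_weights(activities):
    """Return {interest: weight}, where the weight is the interest's
    share of all interest mentions across the given activities."""
    counts = Counter(
        interest
        for activity in activities
        for interest in activity["interests"]
    )
    total = sum(counts.values())
    if total == 0:
        return {}
    return {interest: n / total for interest, n in counts.items()}

# Hypothetical activities, already matched to interest labels.
activities = [
    {"source": "twitter", "interests": ["Olympic Games", "Sport"]},
    {"source": "lastfm", "interests": ["Tom Waits"]},
    {"source": "facebook", "interests": ["Sport"]},
]
weights = interest_weights(activities)
```

Recomputing the weights over a recent window of activities would be one simple way to get the temporary rise and fall described in the Olympics example.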

This interests profile is useful because it can be automatically used as input for personalised TV recommendations (via NoTube’s N-Screen prototype for example) to help you decide what to watch. And because the profile is portable, it could also be used as input for other applications.

During implementation of the first version of the Beancounter it became clear that there are potentially a very large number of interests to display for any given user, many of which have very low weightings because they have only appeared once or twice in the user’s activities and therefore have very little significance for recommendations.

We wanted a simple way to clearly present the spread of interests, which also immediately conveys the influence (i.e. weight) of each interest for recommendations, and takes account of the fact that the number of interests could quickly become quite large. As a solution, the designers adapted the tag cloud metaphor into a scalable grid, with variable font and cell sizes to indicate relative weightings and hierarchy, so that the most influential concepts are prioritised at the top.

Beancounter design: interests overview

Design mock-up for displaying Beancounter interests

Beancounter interest details

Draft Beancounter design mock-up showing the activities that contributed to a particular interest in the user's Profile

Challenge 2: Showing how each activity affects your interests

Another of the UI design challenges that emerged from implementation of the first version of the Beancounter is how to show the contribution of an activity to the ‘evolution’ of your interests. For example, as described in a video of an earlier demo, clicking on the activity “You watched Timecop…” in the user’s activity stream opens a pop-up (shown in the screenshot below) using bar charts to show the ‘before’ and ‘after’ status of the user’s interests based on watching the film Timecop. The two existing concepts ‘1990s science fiction films’ and ‘Films shot anamorphically’ now have a greater weighting than they did before. The other concepts (where only the blue bar is displayed) are new interests associated with Timecop, which didn’t already exist in the profile.

Screenshot of early Beancounter demo

Early Beancounter UI showing the effects on a user's interests of watching the film Timecop

We wanted to simplify the presentation of this information, and to make it easier to understand what is happening and why it might be interesting. We’re getting there with the new design, although we’re still thinking about the best way to convey that the length of the bar represents ‘the influence of this concept on your recommendations’, and about how to provide a relative scale against which to measure that influence.

Design mock-up for Beancounter interest details

Draft design mock-up for Beancounter showing the effects of an activity on the user's interests

This design could also be extended so that you can manually adjust the weighting of a particular concept (for example, by sliding the bar to the left or the right) to give it more or less influence over recommendations.

Challenge 3: Displaying on-the-fly data analytics

In addition to displaying your interests, it has always been our intention to offer some analysis of the data that the Beancounter has collected about you. This is based on the premise that people are usually interested in information about themselves, and the initial inspiration came from the Dopplr annual report and the BBC’s RadioPop prototype for social radio listening.

Beancounter offers the potential for a range of detailed analytics, including what you’ve watched and listened to most often and when, the things you are most interested in now and at previous points in time, people who have similar watching and listening habits, and those who are least similar. For more design inspiration we looked at many examples of beautiful data visualisations. We particularly liked the infographics from Hunch.com and The Feltron Annual Reports. However, many of these were hand-crafted, and our requirement is for attractive design modules that can be adapted for automated on-the-fly presentation.

Beancounter analytics

Draft Beancounter design mock-up displaying an analysis of the user's data

Challenge 4: Interacting with multiple layers of information

The way that data is stored in the new Beancounter allows any activity (e.g. listening to a Tom Waits track on Last.fm), interest (e.g. Tom Waits), type of interest (e.g. all people) or type of activity (e.g. all the things you’ve listened to) in the UI to be linked to the analytics relating to that object, providing timelines, comparative views, explanations and statistics. Whilst this enables users to delve deeper and gain extra insight into their data should they wish to, we want to make sure that the UI doesn’t become cluttered and confused with all these additional overlays. We’re therefore working with the visual designers to determine the most elegant model for interacting with these multiple layers of information without being overwhelmed by them.

Next steps

We’re still finalising the design work. Over the next few months we will be integrating these design mock-ups into a new Beancounter UI so that you will be able to try it out, get personalised TV recommendations in NoTube’s N-Screen, and perhaps discover some interesting new things about yourself…

Posted in Beancounter, User Experience

Algorithms for recommendations in various N-Screen implementations

We currently have three different versions of N-Screen running: a BBC Redux version, a TED talks version and a BBC iPlayer version.

They all have the same basic design with small tweaks for image size. They interoperate – you can drag and drop between them. The main differences lie in the collection of data for the backends and in the calculation of similarity between videos. For similarity calculation we use three different techniques for the three different datasets, as we had three different sets of data available.

The Redux version was our first experiment in this area. BBC Redux is a BBC research video on-demand testbed. We were lucky enough to be able to obtain anonymised watching data for programmes in a five-month subset of the period it covers. So our first experiment, led by Dan Brickley, was to take that watching data – around 1.2 million observations over 12,000 programmes – and use open source tools to generate similarity indexes. We were able to use a standard function in Mahout, Apache’s machine learning and data mining software, to generate similarity indexes using a Tanimoto Coefficient model. This function essentially uses people as links between programmes (“Bob watched both ‘Cash in the Attic’ and ‘Bargain Hunt’”), and sorts programme pairs according to how many people watched them both. With this dataset, this technique produced some nice examples of clearly related clusters (for example what you might call ‘daytime home-related factual’, see picture below).

A cluster of 'daytime home-related factual'
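To illustrate the idea, here is a minimal pure-Python sketch of Tanimoto similarity over watching data. It is not the Mahout function we used, and the viewers and data layout are made up for the example. The Tanimoto coefficient for two programmes is the number of people who watched both divided by the number who watched either.

```python
from itertools import combinations

def tanimoto(a, b):
    """Tanimoto coefficient of two sets: |a & b| / |a | b|."""
    union = len(a | b)
    return len(a & b) / union if union else 0.0

def similar_pairs(viewers_by_programme):
    """Return (programme, programme, score) tuples for every pair with
    at least one shared viewer, most similar pair first."""
    pairs = []
    for p, q in combinations(sorted(viewers_by_programme), 2):
        score = tanimoto(viewers_by_programme[p], viewers_by_programme[q])
        if score > 0:
            pairs.append((p, q, score))
    return sorted(pairs, key=lambda pair: pair[2], reverse=True)

# Hypothetical anonymised watching data: programme -> set of viewer ids.
viewers_by_programme = {
    "Cash in the Attic": {"bob", "alice", "carol"},
    "Bargain Hunt": {"bob", "alice"},
    "Newsnight": {"dave", "carol"},
}
ranked = similar_pairs(viewers_by_programme)
```

Comparing every pair like this is fine for a toy example, but it would not scale to 12,000 programmes without the kind of batching that a tool such as Mahout provides.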

It is quite rare to have access to this kind of data about what people have watched. It’s both valuable and private, and may not be readily available. It may not exist, if no-one has watched anything yet. For the TED dataset we therefore took a different approach. TED talks are a diverse set of talks by people prominent in their field, licensed under the Creative Commons BY-NC-ND license. From our point of view, the advantage of using this dataset was that transcripts were available for all talks. To calculate similarity between the talks for N-Screen we were therefore able to use a tf-idf algorithm. This technique treats each programme as a document and finds the most characteristic words for each document within the total corpus of documents. The documents can then be matched on the words selected. We were lucky enough to be able to use some Ruby software open sourced by a colleague at the BBC to do this.

This technique produced clearly similar clusters within the 600-video dataset. For example, in this selection you can clearly see items relating to women and also to drawing and art:

TED talks similarity example
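As a rough illustration of tf-idf matching, here is a minimal pure-Python sketch. It is not the Ruby code we used: the tokenisation is a plain whitespace split, the similarity measure (cosine) is our choice for the example, and the three "transcripts" are invented.

```python
import math
from collections import Counter

def tfidf_vectors(docs):
    """Map each document name to a {word: tf-idf weight} dictionary."""
    tokenised = {name: text.lower().split() for name, text in docs.items()}
    n_docs = len(tokenised)
    doc_freq = Counter()
    for words in tokenised.values():
        doc_freq.update(set(words))  # count each word once per document
    vectors = {}
    for name, words in tokenised.items():
        tf = Counter(words)
        vectors[name] = {
            word: (count / len(words)) * math.log(n_docs / doc_freq[word])
            for word, count in tf.items()
        }
    return vectors

def cosine(u, v):
    """Cosine similarity of two sparse vectors stored as dictionaries."""
    dot = sum(weight * v.get(word, 0.0) for word, weight in u.items())
    norm_u = math.sqrt(sum(x * x for x in u.values()))
    norm_v = math.sqrt(sum(x * x for x in v.values()))
    return dot / (norm_u * norm_v) if norm_u and norm_v else 0.0

def most_similar(name, vectors):
    """Return the name of the document closest to the named one."""
    others = [
        (other, cosine(vectors[name], vec))
        for other, vec in vectors.items()
        if other != name
    ]
    return max(others, key=lambda pair: pair[1])[0]

# Invented stand-ins for talk transcripts.
docs = {
    "talk_a": "drawing art sketch drawing colour",
    "talk_b": "art drawing painting gallery",
    "talk_c": "ocean fish coral reef ocean",
}
vectors = tfidf_vectors(docs)
closest = most_similar("talk_a", vectors)
```

Words that appear in every document get a weight of zero, which is what stops common words from dominating the match.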

Our third example is an iPlayer version of N-Screen. On any given day, there are about 1000 TV and radio programmes available to UK viewers on iPlayer, the BBC’s on-demand service. This is an interesting dataset for us because of its high-quality metadata, including categories, formats, channel and people featured. We were curious as to whether we could generate interestingly similar programme connections using only metadata. Our first approach was to try a Tanimoto similarity over the structured metadata, but the results were not particularly satisfactory – many programmes had no similar items. We then tried tf-idf over the metadata descriptions. This seemed to pick up characteristics of the text rather than of the programmes (for example repeated quirks in phrasing of the descriptions). The best approach we have tried (evaluated only informally) is tf-idf over a combination of metadata and the results of an entity-recognition technique.

We used the existing metadata from the /programmes JSON format (for example http://www.bbc.co.uk/programmes/b00k7pvx.json or http://www.bbc.co.uk/programmes/b015ms3r.json). As you can see from those examples, some have descriptions of people who are in the programme, with mappings to DBpedia where available. We can get more of these by using a service to extract entities from the description text. For this we used Lupedia, which was developed in the NoTube project by Ontotext. We took this data, coupled with the channel and the categories, to produce a list of keywords, and then ran tf-idf over the top of that. The result can be variable:

Example of a not very good similarity match

but in many cases, reasonably good:

Example of a good selection of similar material

and occasionally throws up an interesting surprise:

Unexpected link between programmes

The next stage is to evaluate these results formally.

Posted in Recommendations, Second screens, Social TV | 1 Comment

N-Screen backend: XMPP/Jabber and group chats

The idea of N-Screen (demo) is to have real-time small-group non-text communication – so for example, sharing a programme (or perhaps a specific point in a programme) with a person, with a TV, or with a group, using drag and drop.


N-Screen related content screenshot

We had a number of very specific requirements:

  • Real time communication
  • Different types of receivers (people, TV/video players, others)
  • Structured data transfer
  • Anonymous usage

We also needed good, open tools and libraries available because of the limited amount of time we had to implement.

Like several other groups, we’ve been using XMPP (Jabber) for the backend because it works in real time and has plenty of tools and libraries. Others have been using the PubSub framework to broadcast synchronised content to connected devices, but integral to our plan was to enable anyone watching to share as well. I had a surprising amount of success with using a central negotiator that allowed ad-hoc groups to be formed from anonymous users, populating each user’s roster with other people it knew about. However, a much less error-prone approach has been to use ad-hoc XMPP group chats, and this has enabled us to make a pure HTML/JavaScript implementation with no backend dependencies apart from an XMPP server and some simple APIs to the database of content.

I’ll talk a little about the requirements in more detail, mentioning some implementation issues as we go.

Requirements

Real-time communication

This is essential for drag and drop between devices to be ‘realistic’ – i.e. for a good user experience. Network issues can always be tricky here, particularly under demonstration (rather than real-life) conditions.

Different types of receivers

A ‘TV’ listens for ‘play’ and ‘pause’ messages and does something with them. A ‘person’ listens for ‘drop’ messages and displays them appropriately. There might also be other kinds of listeners – loggers perhaps, or bots that enhance or modify content dropped to them. All types need to take account of who is joining the group and the kind of thing that they are so that they can do the right thing and display them appropriately.

Structured data transfer

For user experience reasons a fair bit of data needs to be sent on most interactions. A shared item needs to have basic metadata (identifier, URL, title, description, image) and also who shared it. Other kinds of message include announcements about the kind of thing you are. We chose JSON as the body of the XML XMPP message, though XML would also have been fine or better. One issue is that ‘IQ’ (hidden data) messages cannot be sent to group chats, so all group messages are visible in a standard chat room if you connect to it with, say, PSI.
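As an illustration of the kind of message body this implies, here is a minimal sketch in Python. The field names (“type”, “from”, “item” and so on) are hypothetical, not the ones N-Screen actually uses. The sketch only shows the metadata listed above being packed into JSON and read back.

```python
import json

# Metadata fields every shared item is expected to carry (hypothetical names).
REQUIRED_FIELDS = ("id", "url", "title", "description", "image")

def make_drop_message(item, sender):
    """Build the JSON body for a 'drop' message shared by `sender`."""
    missing = [key for key in REQUIRED_FIELDS if key not in item]
    if missing:
        raise ValueError(f"missing metadata: {missing}")
    payload = {
        "type": "drop",
        "from": sender,
        "item": {key: item[key] for key in REQUIRED_FIELDS},
    }
    return json.dumps(payload)

def parse_message(body):
    """Return the decoded payload, or None if the body is not one of ours."""
    try:
        payload = json.loads(body)
    except ValueError:
        return None
    if isinstance(payload, dict) and "type" in payload:
        return payload
    return None

# Hypothetical item; the URLs are placeholders.
item = {
    "id": "talk-1",
    "url": "http://example.org/talks/1",
    "title": "An example talk",
    "description": "A placeholder description.",
    "image": "http://example.org/talks/1.jpg",
}
body = make_drop_message(item, "libby")
parsed = parse_message(body)
```

Because the room is an ordinary group chat, a client should expect plain text typed by someone connected with a normal chat client, which is why `parse_message` returns None rather than raising.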

Anonymous usage

Although there is plenty of potential for connecting N-Screen with Twitter and / or Facebook, we didn’t want to require it. In N-Screen you need to give a name so that other people using the application can refer to you, but that’s the limit of the requirement for identification. For scalability and maintainability reasons we didn’t want to create a lot of named users on the XMPP server. Fortunately, XMPP allows you to create group chats with anonymous users, which is perfect for our needs.

The setup

We’ve been through many iterations to get here but I’m now pretty happy with the setup we have.

Ejabberd server with BOSH and group chat enabled

Ejabberd is not particularly simple to set up, but once it is up it seems pretty stable. I’ve put some tips on troubleshooting here (scroll to the end). PSI is a great tool for debugging as you can set it to log the XML messages going past.


PSI view of a groupchat created behind the scenes of N-Screen

PSI XML view of a groupchat in N-Screen

One thing to note is that for ejabberd at least the group chat URL is

[room_name]@conference.[server]/[nick]

e.g.

default@conference.localhost/libby
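A small Python sketch of building and splitting addresses of this shape. The helper names are hypothetical, and the “conference.” prefix is simply the pattern shown above, so another server setup may use a different group chat host.

```python
def room_jid(room, server, nick):
    """Build a group chat address of the form room@conference.server/nick."""
    return f"{room}@conference.{server}/{nick}"

def split_room_jid(jid):
    """Split a group chat address into (room, group chat host, nick)."""
    bare, _, nick = jid.partition("/")
    room, _, host = bare.partition("@")
    return room, host, nick

jid = room_jid("default", "localhost", "libby")
parts = split_room_jid(jid)
```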

APIs to the content

I used a simple Ruby server and MySQL backend to generate JSON search and random APIs. For content-to-content recommendations for TED we have used TF-IDF analysis of the transcripts using this code by my BBC colleague Chris Lowis.

The workflow is as follows:

  • The user goes to a webpage, and gets an alert requesting their name
  • Based on the window hash (the bit after ‘#’), the JavaScript chooses which group chat to join or create, using Strophe over BOSH to make the connection. It then announces itself to the room using a presence message with the name provided by the user
  • The ejabberd server then automatically tells the user about the other participants in the room, and the JavaScript renders them either as users or as TVs
  • The ‘TV’ is also a piece of JavaScript / Strophe that additionally announces itself to all joiners of the room as a particular type of thing (a ‘TV’). Multiple TVs are allowed in the room.
  • All user pages keep a list of all TVs, and dropping a programme onto the TV sends it to all of them
  • On leaving the page the user is disconnected from the ejabberd server – this can take a few seconds to percolate to the user interface.
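The room logic in the steps above can be sketched without any XMPP at all. The following Python is a toy in-memory model rather than the N-Screen JavaScript: the class, the method names, the “default” fallback room and the example URL are all assumptions made for illustration.

```python
from urllib.parse import urlparse

DEFAULT_ROOM = "default"  # assumption: room used when the URL has no hash

def room_from_url(url):
    """Pick the group chat name from the part of the URL after '#'."""
    return urlparse(url).fragment or DEFAULT_ROOM

class Room:
    """Toy stand-in for a group chat: tracks who is present and what was sent."""

    def __init__(self, name):
        self.name = name
        self.participants = {}  # nick -> "person" or "tv"
        self.sent = []          # (sender, receiver, action, programme)

    def join(self, nick, kind="person"):
        self.participants[nick] = kind

    def leave(self, nick):
        self.participants.pop(nick, None)

    def tvs(self):
        return sorted(n for n, kind in self.participants.items() if kind == "tv")

    def drop_on_tv(self, sender, programme):
        # Dropping onto 'the TV' sends the programme to every TV in the room.
        for tv in self.tvs():
            self.sent.append((sender, tv, "play", programme))

room = Room(room_from_url("http://example.org/n-screen/#sofa"))
room.join("libby")
room.join("tv1", kind="tv")
room.join("tv2", kind="tv")
room.drop_on_tv("libby", "talk-1")
```

In the real application the server delivers the join and leave notifications, so the list of participants can lag behind reality by a few seconds, as noted in the last step.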

The rest is client-side, which I’ll talk about further in another post. Feel free to try out N-Screen here.

Posted in Uncategorized

N-Screen: a second screen application for small-group exploration of on-demand content

For our latest social prototype in NoTube we return to the problem of finding interesting things to watch within large video collections, and investigate how working together might help people find something interesting.

As we’ve seen, the problem with on-demand video is that too much choice is exhausting and demotivating, and leads to satisficing behaviour and sometimes to no choice at all, particularly in group-choice situations.

It looks as if watching together apart (watching the same thing in different physical locations) is going to be a big deal in the future. Both Google and Facebook are putting in place tools that allow people to hang out while watching videos together.

Let’s think about a group choosing what to watch from scratch. What sorts of things do they say?

  • who is s/he? (who is that actor / participant?)
  • who directed it? (who made it?)
  • what’s it about?

but also:

  • what do you want to watch?

The first set of questions are the kinds of questions metadata can answer. Who is in it, who created it, what sort of thing you can expect from it, what it is similar to.

The second type of question is much harder. We have preferences about each other’s future mental states, or to put it another way, we would usually like everyone to enjoy the content we will watch together, without fully knowing the other participants’ preferences or state of mind. It’s a hard decision problem, and it’s no wonder people give up quickly.

N-Screen is a second screen HTML / JavaScript web application that allows people to express their preferences to each other directly, by dragging and dropping content to each other individually or as a group, directly answering the second kind of question. When someone receives some content like this in N-Screen, they can click on it to see more information about it, answering the first kind of question.

The system is designed to be used in conjunction with an out-of-band communications channel (e.g. face to face chat, Skype, or IRC) for the direct negotiations, as much depends on the subtleties of communication – understanding how people are feeling – and this is best done using some familiar channel. It’s called ‘N-Screen’ because it might be the primary screen, or one of a bunch of equals; it could play video locally or remotely (in theory).

It’s primarily for tablets and laptops, but runs on anything with a modern Web browser, from smartphones to touch-tables and desktop PCs. It works very nicely on a desktop PC with a touch screen, whereas it serves only as a proof-of-concept on an iPhone or Android phone right now. Similarly, it can run on a touch table, but doesn’t make the most of its potential.

Once people have found something interesting to watch together one of them can drag and drop it to the TV and it will play.

One of the design aspirations behind this work was to explore practical ‘hands on’ notions of collective intelligence, particularly from groups (perhaps professionals with a common goal; perhaps school children) who are intensely exploring some collection or topic together.

We have a demonstration that you are welcome to try, which uses some of the wonderful TED Talks videos. It’s a work in progress and we’ll be adding features over the next few weeks. If you want to try playing it, use this URL in another window (uses Flash).

Do let us know if you have any comments.

Posted in Uncategorized | 4 Comments

Loudness Web Evaluation now online – be part of it!

At the end of 2010 we conducted a user evaluation to compare different ways of normalising the loudness of video clips. It showed that videos normalised according to EBU R 128 were a significant improvement over the original video clips without any normalisation, over the “Max PPM=0 dBFS” normalisation used in CD production, and over the former European Broadcast Recommendation “Max QPPM=-9 dBFS” (see the Research Topics pages for more information).

Following the results of this test, we decided to perform another user evaluation to investigate the variation of both the Programme Loudness and Loudness Range (LRA) considering different listening situations as they can occur in the context of NoTube, i.e. using a computer, a mobile device or a Hybrid TV. We prepared different versions of a number of video clips to evaluate the application of loudness and LRA adaptation for typical listening situations of users.

The test can be carried out via the Web, and to gather the largest possible number of test cases for this evaluation we invite everyone to participate! Go to http://survey.irt.de/notube or scan the QR code with your smartphone to take part and to learn more about Loudness Normalisation on the Web! The evaluation will be open until Wednesday, October 12th 2011.

Posted in Evaluations, Loudness | 2 Comments

NoTube presentation at NEM Summit 2011

RAI presents NoTube's Personalised Semantic News at the NEM Summit Conference 2011

NoTube was at the NEM Summit 2011. Two papers that were submitted by NoTube partners were presented at the conference which took place from September 27th to 29th at Politecnico in Torino.

NoTube Demo at NEM Summit 2011 Exhibition

Peter Altendorf from IRT presented a paper about the common ground of NoTube and HbbTV. Luca Vignaroli from RAI presented the Use Case 7a. The NoTube project was also demoed by RAI at the accompanying NEM Exhibition.

Posted in About Notube, News, Publication