There are many social TV platforms on the web where you can register and watch TV programs in a social environment. These services usually let you publish your watching activity on your preferred social networks, such as Twitter, so a lot of tweets end up describing users' watching experiences. This is an interesting user dataset that comes at (almost) no effort, and we can leverage it to test the NoTube profiling and recommendation services, evaluating the appeal of the recommended TV programs. Among the available social TV platforms we chose Miso. Here's an example of a Miso-published tweet: "I'm watching Modern Family S02E15 (via @gomiso) http://miso.io/aCe211". As you can see, these tweets have a rigid structure: I'm watching + [program title] + [episode code] + (via @gomiso) + [program URL on Miso]. We extracted the program titles from these tweets, along with the usernames of the users they belong to, obtaining a list of active users, each coupled with their watched TV programs.
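The extraction step above can be sketched with a simple regular expression over the rigid tweet template. This is a hypothetical illustration: the function name and the exact template handling are assumptions based on the example tweet, not NoTube's actual code.

```python
import re

# Template: I'm watching + [program title] + [episode code] + (via @gomiso) + [URL]
MISO_TWEET = re.compile(
    r"I'm watching (?P<title>.+?)"    # program title (lazy, stops before the code)
    r"(?: (?P<code>S\d{2}E\d{2}))?"   # optional episode code, e.g. S02E15
    r" \(via @gomiso\)"               # fixed Miso marker
    r" (?P<url>http://miso\.io/\S+)"  # short URL of the program on Miso
)

def parse_miso_tweet(username, text):
    """Return (username, program title) if the tweet matches the Miso template."""
    m = MISO_TWEET.search(text)
    if m is None:
        return None
    return (username, m.group("title"))
```

Running it over a stream of tweets mentioning @gomiso yields the (user, program) pairs that feed the next step.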
This will be our input for the Beancounter, which here acts as the NoTube semantic profiling system. For each user, the Beancounter inspects each watched program title and queries some useful subsets of the Linked Data cloud (e.g. DBpedia, the semantic version of Wikipedia) to infer interests from the watched program: from the TV program 'Lost', for example, it can go back to genres, related movies and actors. These items, along with the program itself, feed the user profile. A Beancounter user will thus have some people, some genres and some programs among their interests; this constitutes the user profile. Now we want to recommend these users something new to watch. In NoTube we are designing and implementing several recommenders; here we want to test our draft pattern-based recommender. It accepts as input a program, a user, a pattern and a Linked Data dataset. Program and user we have just discussed. A pattern here is a semantic path you can follow in a Linked Data dataset. Let's clarify this with the pattern we used for this testing session, expressed in SPARQL:
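As a rough sketch of the DBpedia lookup described above, the query below pulls genres and starring actors linked to a watched program. The property choices (dbo:genre, dbo:starring) and the helper name are illustrative assumptions; the Beancounter's real queries may use different subsets and predicates.

```python
DBPEDIA_ENDPOINT = "http://dbpedia.org/sparql"

def interests_query(programme_uri):
    """Build a SPARQL query fetching genres and actors of a DBpedia programme."""
    return f"""
PREFIX dbo: <http://dbpedia.org/ontology/>
SELECT DISTINCT ?interest WHERE {{
  {{ <{programme_uri}> dbo:genre ?interest }}
  UNION
  {{ <{programme_uri}> dbo:starring ?interest }}
}}
"""
```

The resources returned by such a query, plus the program itself, are what end up in the profile as interests.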
?film linkedmdb:sequel ?sequel . ?sequel linkedmdb:actor ?person
It means that we are looking for paths starting from a movie to its sequels and from them to the involved actors.
We use this pattern to query the LinkedMDB dataset (the semantic version of the IMDb data). What does it mean? Simply that we fix one of the three variables (film, sequel, actor) and query LinkedMDB for results. For example, we fix the actor and query LinkedMDB to get the movies (sequels included) that actor played in. So we call this recommender with the user data contained in the Beancounter user profile, made of people, movies and other things. The recommender then tries to get results from each of those user interests: it builds three queries, one guessing that the interest is an actor, one that it is a movie and one that it is a sequel. The final result, joining the three queries, is a list of things (actors, movies) linked to the input items according to the pattern we just saw. The list of Twitter users and recommended items looks like this (in a simple JSON syntax):
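The three-query expansion can be sketched as follows: since the recommender cannot know which role an interest plays in the pattern, it binds each pattern variable to the interest in turn. The prefix mapping follows the LinkedMDB movie vocabulary; the helper names are assumptions, not the actual NoTube implementation.

```python
PREFIXES = "PREFIX linkedmdb: <http://data.linkedmdb.org/resource/movie/>\n"

# The pattern from the post: film -> its sequels -> their actors.
PATTERN = "?film linkedmdb:sequel ?sequel . ?sequel linkedmdb:actor ?person ."

def expand_queries(interest_uri):
    """Bind the interest to each pattern variable, yielding three SPARQL queries."""
    queries = []
    for var in ("?film", "?sequel", "?person"):
        bound = PATTERN.replace(var, f"<{interest_uri}>")
        # the remaining free variables are what the query returns
        queries.append(f"{PREFIXES}SELECT DISTINCT * WHERE {{ {bound} }}")
    return queries
```

Joining the result sets of the three queries, per interest, gives the recommendation list shown below.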
{
  "unnamedi": ["Sean Bury", "Paul and Michelle", "Anicée Alvina"],
  "unthika": ["Blythe Danner", "Meet the Little Focker", "Robert De Niro", "Dustin Hoffman",
              "Ben Stiller", "Teri Polo", "Barbra Streisand", "Little Fockers", "Owen Wilson",
              "Jessica Alba", "Laura Dern", "Blythe Danner", "Meet the Little Focker"],
  "soulchainer": ["Ocean's Twelve", "Ocean's Thirteen", "Ocean's Eleven",
                  "Interview with the Vampire: The Vampire Chronicles", "Queen of the Damned",
                  "Seven Years in Tibet", "Enemy at the Gates"]
}
As you can see, each user gets a bunch of recommended items, both movies and actors, found by the recommender browsing semantic paths in LinkedMDB. These results obviously still need some refinement, which is in progress.