Although real time search is fairly new, as we end 2009, the ability to index and search fresh results is rapidly becoming a commodity, with Bing, various startups, and now Google all integrating status feeds from social networking services. The next set of challenges in 2010 will be around providing better relevance, information discovery, and topic exploration for social search, using signals from the dynamic behavior of users and their interaction with the social and topic graphs.
I gave a short talk on real time and social search for a panel at SES Chicago last week. I’ve been heads down for the past few months working on Bing Twitter Search, so now that the first launch is out the door it was a nice chance to talk with people about some of the work we’re doing. There was a lot of interest in the sentiment, trend, and social graph analysis slides (9 and 10). I will write about those in a separate post, but wanted to get the presentation up for those who have been asking about it.
What’s Different about Real Time and Social Search – HJL Slides For SES Chicago Dec 09 – Presentation Transcript
What’s different about real time and social search?
Ho John Lee
Principal Program Manager
Bing Social Search
Search Engine Strategies
Chicago – December 7, 2009
What’s Real Time Search Good For, Anyway?
Twitter is Great for Watching Uninformed Panics Unfold Live
…or finding balloons http://xkcd.com/574/
Some characteristics of Twitter / Social media
Immediacy, Sentiment, Brevity
Not always accurate
Feelings, reactions, impressions
Context is often essential to determine meaning
Gestural – @user, #hashtag, RT, favorites, follows
Self-organizing communities of attention and authority
Content follows attention
People talk about what others are talking about
Observations and commentary from everywhere
If there’s no content, you can ask for some
Extreme head and tail coverage
Low relevance “noise” can become “signal” in aggregate
Your product or brand could suddenly be at the center of a huge conversation
Tiger Woods
Balloon Boy
Breaking Story
Persistent Story
Big Story
Bigger Story
Some characteristics of Real time / Social Search
Real time and social search is qualitatively different from traditional web search
Differences in ranking, relevance, use model
Social graph, user behavior, location, event correlation and other input signals
Real time search is frequently about discovery, not search per se
“what is everyone talking about”, followed by “what are people saying about ”
Top real time and social search results will usually differ from top web search results
Bing Twitter Search at a glance
Top Tweets
Top Shared Links
Tweets/Sentiment per link
Adult/Spam filter; Tweets/Links ranking & relevance
Bing Fall 2009: Twitter vertical, News, MSN, Maps
MSN Local Edition
Page 2: Tweets or Links
Page 1: Tweets & Links
Twitter Answer on News SERP
MSN Hot Topics
Topic / sentiment range, volume, trend analysis
What is the baseline rate of mentions / sentiment per unit time?
Changes in attention flow around a subject, location, topic
Watch for correlated signals from multiple sources
Consider source relevance and authority as well
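As a rough illustration of the baseline question on this slide, here is a minimal sketch that compares each time bucket's mention count against the average of the buckets just before it. This is not how Bing computes trends. The function names, the six-bucket window, the z-score threshold of 3.0, and the sample counts are all assumptions made up for the example.

```python
from statistics import mean, stdev


def spike_scores(counts, window=6):
    """Score each bucket against the `window` buckets that precede it.

    The score is a z-score: (count - baseline mean) / baseline spread.
    Buckets with fewer than two prior observations get a score of 0.0.
    """
    scores = []
    for i, count in enumerate(counts):
        history = counts[max(0, i - window):i]
        if len(history) < 2:
            scores.append(0.0)
            continue
        baseline = mean(history)
        # Floor the spread at 1.0 so a perfectly flat history
        # does not cause a division by zero (an assumption, not a rule).
        spread = max(stdev(history), 1.0)
        scores.append((count - baseline) / spread)
    return scores


def find_spikes(counts, window=6, threshold=3.0):
    """Return indexes of buckets whose score meets the threshold."""
    scores = spike_scores(counts, window)
    return [i for i, z in enumerate(scores) if z >= threshold]


# Hypothetical mentions per hour for one topic.
mentions_per_hour = [12, 15, 11, 14, 13, 12, 90, 160, 40]
print(find_spikes(mentions_per_hour))
```

The same comparison could be run on a sentiment score per bucket instead of a raw mention count. Correlating spikes across several sources, and weighting each source by its authority, would sit on top of this.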
Graph analysis for relevance and ranking
Spam marketing campaign
Naturally connected community
Spammy communities are highly visible – don’t be part of one!
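The slide does not say which graph measures separate a spam campaign from a naturally connected community. As a stand-in, the sketch below uses two common ones: reciprocity (how many follows are returned) and local clustering (how many of an account's neighbors are connected to each other). The account names and follow edges are invented for the example.

```python
def reciprocity(follows):
    """Fraction of directed follow edges whose reverse edge also exists."""
    edges = {(a, b) for a, b in follows if a != b}
    if not edges:
        return 0.0
    mutual = sum(1 for a, b in edges if (b, a) in edges)
    return mutual / len(edges)


def local_clustering(follows, node):
    """Fraction of pairs of `node`'s neighbors that are linked to each other.

    Follow direction is ignored here; any follow counts as a connection.
    """
    neighbors = {}
    for a, b in follows:
        if a == b:
            continue
        neighbors.setdefault(a, set()).add(b)
        neighbors.setdefault(b, set()).add(a)
    around = sorted(neighbors.get(node, set()))
    if len(around) < 2:
        return 0.0
    pairs = [(u, v) for i, u in enumerate(around) for v in around[i + 1:]]
    linked = sum(1 for u, v in pairs if v in neighbors[u])
    return linked / len(pairs)


# Hypothetical spam campaign: one account follows many strangers,
# nobody follows back, and the targets do not know each other.
spam = [("spammer", "u%d" % i) for i in range(1, 6)]

# Hypothetical natural community: mutual follows among people who
# also follow each other's contacts.
community = [
    ("a", "b"), ("b", "a"),
    ("b", "c"), ("c", "b"),
    ("a", "c"), ("c", "a"),
    ("c", "d"), ("d", "c"),
]

print(reciprocity(spam), local_clustering(spam, "spammer"))
print(reciprocity(community), local_clustering(community, "a"))
```

In this toy data the spam hub scores zero on both measures, while the community scores high, which is one way the "highly visible" shape of a spammy cluster can show up.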
The session was moderated by Barbara Coll, CEO, WebMama.com Inc., with panelists Bill Fischer, Co-Founder & Director, Workdigital, Ltd., Rob Walk, Managing Partner, NovaRising, Nathan Stoll, Co-Founder, Aardvark, and Ho John Lee, Principal Program Manager, Social and Real Time Search, Microsoft Bing.
As some of you know, I have been exploring a variety of paths forward for SocialQuant, my real time social search and analytics project. My family, friends, and colleagues have given me much support, patience, and advice during this process, which has reached a crossroads, and as Yogi Berra says, “When you come to a fork in the road, take it!”
The rise of Twitter, Facebook, and other social media, combined with web-based applications, smartphones, and cloud computing, has set the stage for new applications and use models based on social discovery, collaboration, and communications, in addition to traditional search. What we’re all calling “real time search” lately isn’t exactly real time, nor is it exactly search in the sense of finding a definitive, authoritative answer. Much of the opportunity revolves around discovering people, discussions, and events that are relevant to you and bringing them to your attention in a timely, actionable fashion. Information streams from social media are transient, unreliable, and noisy. At the same time, the sheer volume of data can help provide the basis for building better filters. As an added bonus, you can ask questions of people in the social graph itself, and there are numerous examples of communities of interest forming around current events such as Barack Obama’s inauguration, the Iran elections, or even Michael Jackson’s funeral, all of which help surface information, opinion, and sentiment that were previously inaccessible online. One interesting aspect of real time social media is that it’s not just algorithmic; it’s based on human connections and emotions. So a message that “feels right” from people you trust can at times be more relevant than one that is “correct”.
The challenge then is to filter and rank the massive flow of information so that it directs the user’s limited (and non-expanding) time and attention to what is most valuable to them. With today’s information technology, amazing things are possible with limited resources. I personally have more computing and storage resources than the facility we launched HP’s original photo site with (for millions of dollars), at a fraction of the cost, and routinely push around datasets of millions of rows on the local development servers. Unfortunately, that’s just the ante to get started on the problem. Running ranking, clustering, and semantic analysis for filtering the ever-growing stream of social media eventually requires web scale computing, even with careful problem selection and data pruning. The bar is also going up every day as the social media user base grows, and as well funded teams make progress on their platforms (+Google). So very shortly, being competitive in real time social search and discovery is going to require access to lots of data and either getting a datacenter or working with someone who has one.
I look forward to working with Sean Suchter and the Microsoft Bing search team (and likely expanding their carbon footprint) in pursuit of new applications and services as the social media and online application space evolves.
You can follow along on Twitter (@hjl). As always, any and all opinions here are solely mine and do not reflect the position of any past, present, or future employer, partner, or business associate.
Twitter Data – A simple, open proposal for embedding data in Twitter messages – Home – "Twitter Data is a simple, open, semi-structured data representation format for embedding machine-readable, yet human-friendly, data in Twitter messages. This data can then be transmitted, received, and interpreted in real time to enable powerful new kinds of applications to be built on the Twitter platform."
OneRiot Announces API & Real-Time Search Partnerships – "Real-time social search outfit OneRiot today announced their API and partnership program for adding real-time search capabilities to browser add-ons, desktop applications, social websites and other services." Screenshots from the initial app, TwitterBar (a browser extension).