Microsoft’s Answer to PageRank: BrowseRank


According to CNET, a new paper out of Microsoft Research Asia (PDF) details what may eventually be Microsoft’s answer to the Google PageRank algorithm, which was in large part responsible for the Mountain View-based company’s ascension to the search engine throne. Microsoft’s version, called BrowseRank, would rank pages based on user behavior rather than on links.

The basic idea behind Google’s PageRank is that the more a page is linked to, the more important it must be. Microsoft says that link analysis algorithms like PageRank are flawed, though, because they’re easy to game and don’t take user behavior into consideration. Of course, Google’s actual implementation of PageRank is far from that simplistic, and the company updates its search algorithms hundreds of times each year. Further, Google reminds us often that PageRank is just one of many things that it uses to rank search results.

Still, Microsoft thinks that it can do better, and it had better hope that it can do a lot better. As we discussed earlier, beating Google with technology means you have to beat the pants off of them and really wow users with dramatically better search results.

Microsoft Research Asia’s BrowseRank algorithm ditches the link graph model that was popularized by Google, and instead creates a user browsing graph that looks at things like which links users clicked on and how long they stayed on each page.
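To make the idea of a user browsing graph concrete, here is a minimal Python sketch. It is not taken from the paper: the session format (a list of `(url, seconds_on_page)` pairs), the function name `build_browsing_graph`, and the example URLs are all assumptions made for illustration. It only records the two signals mentioned above: which page users moved to next, and how long they stayed on each page.

```python
from collections import defaultdict


def build_browsing_graph(sessions):
    """Aggregate browsing sessions into click counts and dwell times.

    Each session is a list of (url, seconds_on_page) tuples in visit order.
    This input format is hypothetical, chosen only for the example.
    """
    transitions = defaultdict(lambda: defaultdict(int))  # src -> dst -> clicks
    dwell = defaultdict(list)                            # url -> [seconds, ...]
    for session in sessions:
        for i, (url, seconds) in enumerate(session):
            dwell[url].append(seconds)
            # Record an edge to the next page visited in the same session.
            if i + 1 < len(session):
                next_url = session[i + 1][0]
                transitions[url][next_url] += 1
    return transitions, dwell


# Made-up sessions for three made-up sites.
sessions = [
    [("a.example", 30), ("b.example", 120)],
    [("a.example", 10), ("c.example", 5)],
    [("b.example", 90), ("a.example", 20), ("b.example", 60)],
]
transitions, dwell = build_browsing_graph(sessions)
```

Unlike a link graph, the edges here exist only where someone actually clicked, and each edge carries a count of how often that happened.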

“User behavior data can be recorded by Internet browsers at web clients and collected at a web server,” according to the researchers. Microsoft Research Asia said they gathered anonymous data from an “extremely large group of users under legal agreements with them” to put their theory to the test. The idea is that you can take anonymous browsing data from hundreds of millions of users and create a user browsing graph that can paint a picture of which pages are most important to users.

“The user browsing graph can more precisely represent the web surfer’s random walk process, and thus is more useful for calculating page importance. The more visits of the page made by the users [sic] and the longer time periods spent by the users on the page, the more likely the page is important,” say the researchers. “With this graph, we can leverage hundreds of millions of users’ implicit voting on page importance. In this regard, our approach is in accordance with the concept of Web 2.0.”
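The intuition in that quote (more visits plus longer stays means more important) can be sketched roughly as follows. This is a simplified illustration, not the researchers' actual algorithm: it estimates how often a random surfer following observed clicks lands on each page, then weights that by the average time spent there. The function name `browse_scores`, the `reset` parameter, the handling of dead-end pages, and the sample numbers are all assumptions.

```python
def browse_scores(transitions, mean_dwell, reset=0.15, iterations=50):
    """Score pages by visit likelihood multiplied by mean time on page.

    transitions: {src: {dst: click_count}} taken from observed browsing.
    mean_dwell:  {page: average seconds spent on the page}.
    reset:       assumed chance that the surfer jumps to a random page.
    """
    pages = sorted(
        set(mean_dwell)
        | set(transitions)
        | {dst for out in transitions.values() for dst in out}
    )
    n = len(pages)
    visit = {p: 1.0 / n for p in pages}
    for _ in range(iterations):
        nxt = {p: reset / n for p in pages}
        for src in pages:
            out = transitions.get(src, {})
            total = sum(out.values())
            if total == 0:
                # No observed clicks out of this page: spread evenly.
                for p in pages:
                    nxt[p] += (1 - reset) * visit[src] / n
            else:
                for dst, count in out.items():
                    nxt[dst] += (1 - reset) * visit[src] * count / total
        visit = nxt
    # Weight visit likelihood by time on page, then normalise to sum to 1.
    weighted = {p: visit[p] * mean_dwell.get(p, 0.0) for p in pages}
    norm = sum(weighted.values())
    return {p: w / norm for p, w in weighted.items()} if norm else weighted


# Made-up data: page "b" is visited often and holds attention longest.
example_transitions = {"a": {"b": 2, "c": 1}, "b": {"a": 1}}
example_dwell = {"a": 20.0, "b": 90.0, "c": 5.0}
scores = browse_scores(example_transitions, example_dwell)
```

With these sample numbers, page "b" ends up ranked above "a", and "a" above "c", because "b" combines frequent visits with long stays.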

Of course, by itself, user browsing behavior probably isn’t enough to rank pages. If BrowseRank were used on its own, it would be easy to see MySpace, Facebook, and video sites like Hulu shoot to the top of search results pages. However, Microsoft researchers think that it could be combined with other web page ranking algorithms to greatly enhance search results. “It is also possible to combine link graph and user behavior data to compute page importance,” they write. Researchers said that initial results from their tests using BrowseRank showed better performance than existing methods.
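The researchers say only that combining the two signals is possible, not how they would do it. One simple way to picture it is a weighted blend of a link-based score and a browsing-based score, as in the hypothetical sketch below. The function `combine_scores`, the `weight` parameter, and the sample scores are assumptions for illustration, not anything described in the paper.

```python
def combine_scores(link_scores, browse_scores, weight=0.5):
    """Blend link-based and browsing-based scores for each page.

    weight is the share given to the link-based score; the rest goes
    to the browsing-based score. A page missing from one input is
    treated as scoring 0.0 on that signal.
    """
    pages = set(link_scores) | set(browse_scores)
    return {
        page: weight * link_scores.get(page, 0.0)
        + (1 - weight) * browse_scores.get(page, 0.0)
        for page in pages
    }


# Made-up scores: a support site with many inbound links and short
# visits, and a video site with fewer links but long visits.
link = {"support.example": 0.6, "video.example": 0.4}
browse = {"support.example": 0.2, "video.example": 0.8}
combined = combine_scores(link, browse, weight=0.5)
```

Raising `weight` toward 1.0 favours well-linked pages such as the support site, while lowering it favours pages where users linger, which is the balancing problem the paragraph above describes.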

It wouldn’t be surprising to learn that Google had something similar under development. Google is already capturing user browsing behavior via its popular Google Toolbar, and appears to have put some of that data to use earlier this year with the launch of Ad Planner and enhancements to Google Trends that include web traffic. Using that data in search engine results rankings — or at least experimenting with doing so — isn’t a huge leap. Google is hardly a sleeping giant.

  • http://pixelsheaven.com Ronny

    So now instead of link selling, the new trend will be mass browser renting (coined just now).

    Missed those huge “Get PAID to see ADS!” popups? Apparently, Microsoft did. So here we go, only this time you’ll get paid to browse through websites, and the popups are AJAXed.

    I can’t see a way Microsoft could balance between online video or gaming sites and, for example, tech support sites, whose whole point is to quickly deliver the best answer.

    And what about Google? Their target is to send the user out of Google, via organic or sponsored links, as soon as possible. So maybe giving time spent on a page too much weight is not so bright.

    Of course they’ll try to balance it. I just don’t count on Microsoft to do it properly.

  • http://www.deanclatworthy.com Dean C

    lol, of course user behaviour can’t be gamed, can it, Microsoft? Buying links is far harder than creating a fake user script in PHP which’ll go around imitating thousands of users’ behaviours whilst constantly mining a free proxy list to avoid detection.

  • Stevie D

    Dean C – absolutely. I’m no programmer, but I’d have thought it wouldn’t be difficult to make your own spider and point it firmly at your sites – or the sites of people who paid you.

    My other problem with this is that so often users follow “dead end” links that don’t give them what they want. With so many websites and users opening links in new windows and tabs, I would think it’s going to be pretty difficult to accurately track what people are looking at and what route they’ve taken to get there.

  • huni

    more crap from micro(limpskin)soft!!!

  • http://www.mockriot.com/ Josh Catone

    By the same token, I would tend to think that if faking browsing is easier than faking back links, so is detecting it. Just like Google puts a lot of time and effort into sorting the legit links from the fakes, I am guessing Microsoft, if they ever use BrowseRank, would put a lot of time and effort into separating real browsing from faked.

    On some level, the BrowseRank system sort of makes me think of the system that Nielsen uses for television ratings — or in other words, a representative sample might be enough to capture useful data… not everyone need participate.

  • http://pixelsheaven.com Ronny

    Josh, you’re basically right about detecting fraud browsing, but then we’ll be back again to my first comment – Browser renting. Undetectable, although it’s pricey.

  • Chris

    Very interesting, although this has been tried before. DirectHit had a search engine built entirely on clickstream data (acquired by Ask.com in 2000). They got the data from ISPs in those days. The end result is really not that much better than PageRank.

    Me.dium, on the other hand (http://me.dium.com/search), is processing users’ clickstream data in real time to create a different lens based on what’s going on now. E.g., do a search for John Edwards on Google or Live, and you get johnedwards.com and wiki/johnedwards. Do the same search on Me.dium and you learn that today people care about his love child, pictures of his mistress, etc.

    The difference is real-time (what people are browsing now) vs. historical (what they browsed in the past). Social vs. Old School. Check it out. http://me.dium.com/search.

  • duh

    So the theory here is that Google is NOT using the data they get from the millions of users using the Google toolbar???

  • Anonymous