Tuesday, April 29, 2008

Google trumpets PageRank for pics

Nearly a decade ago, Google unveiled an algorithm called PageRank,
reinventing the way we search for web pages. Now, the company says, it
has a technology that can do much the same for online image search.

Last week, at the International World Wide Web Conference in
Beijing, two Google-affiliated researchers presented a paper called
"PageRank for Product Image Search," trumpeting a fledgling algorithm
that overhauls the primitive text-based methods used by the company's
current image search technologies.

"Our experiment results show significant improvement, in terms of
user satisfaction and relevancy, in comparison to the most recent
Google Image Search results," Shumeet Baluja and Yushi Jing tell the
world from the pages of their research paper, available here.

Of course, the most recent Google Image Search results are often
rubbish. Currently, when ranking images, the big search engines spend
little time examining the images themselves. Instead, they look at the
text surrounding those images.

By contrast, Google's PageRank for Product Image Search - also known
as "VisualRank" - seeks to actually understand what's pictured. But the
technology goes beyond classic image recognition, which can be time
consuming and/or expensive - and which often breaks down with anything
other than faces and a handful of other image types. In an effort to
properly identify a wider range of objects, Baluja and Jing have merged
existing image processing techniques with the sort of "link analysis"
made famous by PageRank.

"Through an iterative procedure based on the PageRank computation, a
numerical weight is assigned to each image," they explain. "This
measures its relative importance to the other images being considered."

With classic image recognition, you typically take a known image and
compare it to other images. You might use a known photo of Paris
Hilton, for instance, to find other Paris pics. But VisualRank takes a
different tack. Google's algorithm looks for "visual themes" across a
collection of images, before ranking each image based on how well it
matches those themes.
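
The iterative weighting the researchers describe can be illustrated with a
minimal sketch. This is not the authors' implementation: it assumes a
hypothetical, precomputed table of pairwise visual similarity scores (the
article does not say how similarity is measured), and it borrows the 0.85
damping factor commonly quoted for PageRank as an assumption. The function
name visual_rank and the sample scores are made up for illustration.

```python
def visual_rank(similarity, damping=0.85, iterations=100, tol=1e-9):
    """PageRank-style power iteration over an image similarity matrix.

    similarity[i][j] is a non-negative, symmetric score for how alike
    images i and j look (hypothetical input, assumed precomputed).
    Returns one weight per image; the weights sum to 1.
    """
    n = len(similarity)
    # Each image j splits its "vote" among the images it resembles,
    # so every column is divided by its own total.
    col_sums = [sum(similarity[i][j] for i in range(n)) for j in range(n)]
    ranks = [1.0 / n] * n
    for _ in range(iterations):
        new_ranks = []
        for i in range(n):
            votes = 0.0
            for j in range(n):
                if col_sums[j] > 0:
                    votes += similarity[i][j] / col_sums[j] * ranks[j]
                else:
                    # An image similar to nothing spreads its vote evenly.
                    votes += ranks[j] / n
            new_ranks.append(damping * votes + (1.0 - damping) / n)
        delta = sum(abs(a - b) for a, b in zip(new_ranks, ranks))
        ranks = new_ranks
        if delta < tol:
            break
    return ranks


# Made-up scores: images 0, 1 and 2 share a visual theme, image 3 does not.
similarity = [
    [0.0, 0.9, 0.8, 0.1],
    [0.9, 0.0, 0.7, 0.1],
    [0.8, 0.7, 0.0, 0.0],
    [0.1, 0.1, 0.0, 0.0],
]
ranks = visual_rank(similarity)
best_first = sorted(range(len(ranks)), key=lambda i: ranks[i], reverse=True)
print(best_first)
```

In this toy data the image that resembles the most other images collects
the largest weight, while the outlier ends up last, which is the sense in
which an image is ranked by how well it matches the collection's themes.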

As an example, the researchers point to an image search on the word
"McDonald's." In this case, VisualRank might identify the famous golden
arches as a theme. An image dominated by the golden arches would then be
ranked higher than a pic where the arches are tucked into the
background.

Baluja and Jing recently tested their algorithm using images
retrieved by Google's 2000 most popular product searches, and a panel
of 150 people decided that VisualRank reduced the number of irrelevant
results by 83 per cent. The question is whether this could be applied
to Google's entire database of images.

At the moment, this is just a research paper. And Google isn't the
first to toy with the idea of true image search. After launching an
online photo sharing tool that included face and character recognition,
the Silicon Valley-based Riya is now offering an image-rec shopping
engine, known as Like.com, that locates products on sale across the web.
And the transatlantic image-rec gurus at Blinkx are well on their way
with video search.

From: http://www.theregister.co.uk/