Value of PageRank in the New Algo!

September 8th, 2008

SAN FRANCISCO — Google researchers say they have a software technology intended to do for digital images on the Web what the company’s original PageRank software did for searches of Web pages.

On Thursday at the International World Wide Web Conference in Beijing, two Google scientists presented a paper describing what the researchers call VisualRank, an algorithm for blending image-recognition software methods with techniques for weighting and ranking images that look most similar.


Although image search has become popular on commercial search engines, results are usually generated today by using cues from the text that is associated with each image.

Despite decades of effort, image analysis remains a largely unsolved problem in computer science, the researchers said. For example, while progress has been made in automatic face detection in images, finding other objects such as mountains or teapots, which are instantly recognizable to humans, has lagged.

“We wanted to incorporate all of the stuff that is happening in computer vision and put it in a Web framework,” said Shumeet Baluja, a senior staff researcher at Google, who made the presentation with Yushi Jing, another Google researcher. The company’s expertise in creating vast graphs that weigh “nodes,” or Web pages, based on their “authority” can be applied to images that are the most representative of a particular query, he said.
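The idea of treating images as graph nodes and ranking them by "authority" can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the method from the paper: the similarity values are made up, the function name `rank_by_similarity` is hypothetical, and the damping factor of 0.85 is borrowed from the usual PageRank formulation.

```python
def rank_by_similarity(similarity, damping=0.85, iterations=50):
    """PageRank-style power iteration over an image-similarity matrix.

    similarity[i][j] is a hypothetical visual-similarity score between
    images i and j (0 means no similarity). Values here are illustrative.
    """
    n = len(similarity)
    # Column sums, used so each image spreads its score across its neighbours.
    col_sums = [sum(similarity[i][j] for i in range(n)) for j in range(n)]
    scores = [1.0 / n] * n
    for _ in range(iterations):
        new_scores = []
        for i in range(n):
            incoming = sum(
                similarity[i][j] / col_sums[j] * scores[j]
                for j in range(n)
                if col_sums[j] > 0
            )
            # Blend a uniform baseline with score received from similar images.
            new_scores.append((1 - damping) / n + damping * incoming)
        scores = new_scores
    return scores


# Made-up scores: images 0, 1 and 2 look alike; image 3 is an outlier.
similarity = [
    [0.0, 0.9, 0.8, 0.1],
    [0.9, 0.0, 0.7, 0.1],
    [0.8, 0.7, 0.0, 0.1],
    [0.1, 0.1, 0.1, 0.0],
]
scores = rank_by_similarity(similarity)
best = max(range(len(scores)), key=lambda i: scores[i])
```

In this toy example the three mutually similar images end up with higher scores than the outlier, which is the intuition behind picking the "most representative" image for a query.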

The research paper, “PageRank for Product Image Search,” is focused on a subset of the images that the giant search engine has cataloged because of the tremendous computing costs required to analyze and compare digital images. To do this for all of the images indexed by the search engine would be impractical, the researchers said. Google does not disclose how many images it has cataloged, but it asserts that its Google Image Search is the “most comprehensive image search on the Web.”

The company said that in its research it had concentrated on the 2,000 most popular product queries on Google’s product search, words such as iPod, Xbox and Zune. It then sorted the top 10 images both from its ranking system and the standard Google Image Search results. With a team of 150 Google employees, it created a scoring system for image “relevance.” The researchers said the retrieval returned 83 percent fewer irrelevant images.

Google is not the first into the visual product search category. Riya, a Silicon Valley start-up, introduced Like.com in 2006. The service, which refers users to shopping sites, makes it possible for a Web shopper to select a particular visual attribute, such as a certain style of brown shoes or a style of buckle, and then be presented with similar products available from competing Web merchants.

Rather than relying on a text query, the service focuses on the ability to match shapes or objects that might be hard to describe in writing, said Munjal Shah, the chief executive of Riya.

“I think what they’re trying to accomplish is largely impossible,” he said. “Our belief is, there is not large-scale solutions.”

Mr. Shah said there had been a number of technology demonstrations by Google Labs researchers, such as a project in 2005 that used machine learning techniques to recognize the gender of a person in an image. However, the company has been slow to deploy its research, he said.

Author: JOHN MARKOFF

This week I’ve noticed a number of interesting changes in the way Google ranks web pages. The following article is based on my observations and theory rather than fact. Please comment if you have noticed similar issues.

Quite a number of the queries we track have changed recently, and websites that previously ranked well have dropped a number of places. This doesn’t appear to be a penalty - just an alteration in the algorithm.


The common characteristic all the sites have is that their rankings were based very heavily on anchor text rather than on-site optimisation. The changes don’t seem to have affected major commercial queries yet, but they are visible when you search for particular people’s names.

For example, a search for “patrick” used to bring up blogstorm.co.uk in 5th place; this week it dropped down to 35th place. The sites above it all have better on-site optimisation for that keyword, but previously a few good anchor text links were enough for Blogstorm to rank.

Last December I did a study to see how a few SEO bloggers ranked for their own names, which gives a good barometer of how the algorithm has changed since then. Some blogs have moved up and some have moved down, but in general the trend is downwards, depending on whether you use google.co.uk or .com (there are geographical fluctuations going on as well, which affect the results).

Why would Google do this?
Anchor text is the biggest flaw in the Google algorithm. Google wants to show the most relevant and trusted websites at the top of the search results but anchor text has no relation to trust for most queries.

Just because a site has 5 million links with the anchor text “loans” doesn’t mean it’s a good search result for the query “loans”. Currently there are two types of sites ranking for commercial queries - ones that rank due to the TrustRank of their incoming links (links from newspaper websites and quality blogs) and ones that rank because they have thousands of paid links with keywords in the anchor text.

If I worked at Google, I would discount any links with really competitive keywords in the anchor text - nobody naturally links to a commercial site with “loans” or “car insurance” in the anchor text - they use the site’s name instead.
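To make the suggestion concrete, here is a minimal sketch of what discounting competitive anchor text could look like. Everything in it is an assumption for illustration: the phrase list, the `link_weight` and `anchor_score` helpers, and the weights are all hypothetical and say nothing about how Google actually treats links.

```python
# Hypothetical list of competitive commercial phrases; not taken from Google.
COMPETITIVE_PHRASES = {"loans", "car insurance", "cheap car insurance"}


def link_weight(anchor_text, discount=0.0):
    """Return the weight one link contributes, given its anchor text.

    Links whose anchor text is exactly a competitive phrase get the
    discounted weight; everything else (site names, URLs, generic
    phrases) counts in full.
    """
    normalised = anchor_text.strip().lower()
    return discount if normalised in COMPETITIVE_PHRASES else 1.0


def anchor_score(anchor_texts, discount=0.0):
    """Sum the weights of all incoming links for one site."""
    return sum(link_weight(text, discount) for text in anchor_texts)


# Made-up link profiles for two imaginary sites.
paid_links_site = ["loans", "Loans", "car insurance", "loans"]
natural_links_site = ["confused.com", "Confused", "this comparison site"]

paid_score = anchor_score(paid_links_site)
natural_score = anchor_score(natural_links_site)
```

With the default `discount=0.0`, the site whose links all use competitive keywords gets no credit from them, while the site linked to by name keeps full credit for every link.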

If your site name is mega-cheap-car-insurance.com and all your anchor text is “Mega Cheap Car Insurance”, does that mean you should rank higher than somebody like confused.com when a searcher is looking for “cheap car insurance”? I think trust (something which can’t be gamed) should play a much bigger role than anchor text, which until now was by far the biggest loophole in the algorithm.

Author: Patrick Altoft