
August 21, 2005

The size of the web — and other imaginary numbers


My first acquaintance with imaginary numbers in real life came in high school when I learned all about i.

The continuing debate between Yahoo and Google about who indexes more of the web misses the point: both miss most of it.

The stuff search engines find is the tip of the iceberg.

Most of what's out there is known and available only to those who know where to go in the first place.

This past Monday John Markoff, the nonpareil tech reporter for the New York Times, wrote a story about the ongoing Yahoo v. Google "mine is bigger" back-and-forth.

Yahoo two weeks ago announced that its search engine indexes 19.2 billion documents.

Google's latest number is 8.1 billion.

So Yahoo should be far more useful than Google, right?

Well, maybe not.

Sergey Brin, Google's co-founder, said that Yahoo cheats by inflating its numbers with duplicate entries.

Also, Markoff noted that some experts believe that index size may be inversely related to the quality of search results, making smaller better.

Last Sunday researchers at the National Center for Supercomputing Applications staged a kind of battle of the search engines, running a random sample of about 10,000 searches on both Yahoo and Google.

They found that Google returned 166.9% more results than Yahoo.

In only 3% of the 10,000+ searches did Yahoo return more results.

The scientists concluded that the Yahoo claim was "suspicious."
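For anyone who wants to check how far apart those numbers are, here is a rough sketch in Python. It assumes "166.9% more" means Google's result count was 2.669 times Yahoo's, and the variable names are mine, not the researchers'.

```python
# Figures quoted in the post; the names are illustrative only.
GOOGLE_EXTRA_PCT = 166.9       # "Google returned 166.9% more results than Yahoo"
YAHOO_INDEX_BILLIONS = 19.2    # Yahoo's claimed index size
GOOGLE_INDEX_BILLIONS = 8.1    # Google's latest number

# Assumption: "X% more" means Google = Yahoo * (1 + X/100).
google_per_yahoo_result = 1 + GOOGLE_EXTRA_PCT / 100

# Yahoo's results expressed as a percentage of Google's.
yahoo_share_of_google = 100 / google_per_yahoo_result

# How much bigger Yahoo says its index is.
claimed_index_ratio = YAHOO_INDEX_BILLIONS / GOOGLE_INDEX_BILLIONS

print(f"Google results per Yahoo result: {google_per_yahoo_result:.3f}")
print(f"Yahoo results as % of Google's:  {yahoo_share_of_google:.1f}%")
print(f"Claimed Yahoo/Google index size: {claimed_index_ratio:.2f}x")
```

On that reading, an index claimed to be more than twice the size returned well under half as many results, which is presumably why the claim looked suspicious.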

Well, let's look at things from a few other vantage points.

First, the same issue of the Times published a table showing the relative frequency with which various search engines were used during the four weeks between July 10 and August 6.

    The results:

    1. google.com          59.5%
    2. search.yahoo.com    28.5%
    3. search.msn.com       5.5%
    4. ask.com              3.3%
    5. search.aol.com       0.9%

Now let's get a little closer to home: how about a look at which search engines bring people to bookofjoe?

At the top of this post is a representative snapshot and it's quite clear to me that Google is overwhelmingly the search vehicle of choice.

Even if you discount the Google image search numbers it's still 292 v. 49: that's nearly a 6:1 preference, even more dominant for Google than the survey above from the Times.
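The comparison can be checked in a few lines of Python. This is only a sketch using the figures quoted above; the names are made up for illustration.

```python
# Referral counts from the bookofjoe snapshot (Google image search excluded).
bookofjoe_google = 292
bookofjoe_yahoo = 49

# Usage shares from the Times table, in percent.
times_google_pct = 59.5
times_yahoo_pct = 28.5

# Google-to-Yahoo ratio in each data set.
bookofjoe_ratio = bookofjoe_google / bookofjoe_yahoo
times_ratio = times_google_pct / times_yahoo_pct

print(f"bookofjoe: {bookofjoe_ratio:.2f} to 1")
print(f"Times:     {times_ratio:.2f} to 1")
```

The bookofjoe ratio comes out just under 6:1, against roughly 2:1 in the Times table.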

Finally, when I try searches on both Yahoo and Google, as I do from time to time, it's always a clear win for Google — by a mile.

August 21, 2005 at 02:01 PM




The reason bookofjoe.com is getting so many hits from Google is that it has a high PageRank of 6. If your PageRank were lower, like 2 or 1, you'd get far fewer hits, probably more on par with what you get from Yahoo. The Google method just seems to be better than Yahoo's at judging sites and where they should rank on a given query.

Posted by: Tamara | Aug 21, 2005 11:37:35 PM
