2006-04-30

How the Grinch stole the gift of sound judgement from the chosen people

The emerging field of Google Record Count Heuristics (GRCH) uses Google counts to assess the hidden ontology of stuff (what is there, how much of it is there, and why?), naively assuming that the Google inventory somehow reflects the real world. Yes, it's hard to count all the "Brad"s in the real world. On the other hand, how do you filter out all the fictional "Brad"s on the Web? And maybe it is always the same "Brad" they are all raving about, so what do you count? But then why is "Brad" so popular?

So let's count the houses and sort them according to height, using the Google retrieval system:


two storey house = 488000
two story house = 573000 B (1.061.000)

three storey house = 99000
three story house = 132000 B (231.000)

four storey house = 22700 A
four story house = 19800 (42.500)

five storey house = 731 A
five story house = 642 (1.373)

six storey house = 764 A
six story house = 527 (1.291)

seven storey house = 69
seven story house = 201 B (270)

eight storey house = 27
eight story house = 55 B (82)

nine storey house = 55 A
nine story house = 36 (91)

Yeah, the hyphenated "four-story-house" was counted as "four story house" as well.
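
For anyone who wants to re-check the sums in brackets and the A/B flags, here is a minimal sketch in Python. The hit counts are copied from the list above; the names `counts` and `summarize` are made up for illustration, and A is taken to mean "storey" wins while B means "story" wins.

```python
# Hit counts copied from the list above: levels -> (storey hits, story hits)
counts = {
    "two": (488000, 573000),
    "three": (99000, 132000),
    "four": (22700, 19800),
    "five": (731, 642),
    "six": (764, 527),
    "seven": (69, 201),
    "eight": (27, 55),
    "nine": (55, 36),
}


def summarize(counts):
    """Return (levels, total hits, dominance flag) for each row."""
    rows = []
    for levels, (storey, story) in counts.items():
        total = storey + story
        # Assumption: A = "storey" dominant, B = "story" dominant.
        # A tie would fall to B here; the data contains no ties.
        flag = "A" if storey > story else "B"
        rows.append((levels, total, flag))
    return rows


for levels, total, flag in summarize(counts):
    print(f"{levels:>5}: {total:>9,} {flag}")
```

Reading the flags from top to bottom gives the rhythm of the spelling battle at a glance.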


Points to learn:

People don't care whether a house has five or six levels (1.373 vs. 1.291 hits), but the difference between a four and a seven story house (42.500 vs. 270 hits) freaks them out.

People don't know whether to write "story" or "storey"; the dominant spelling is not clear-cut but strangely varies rhythmically with height (see the A/B indicator of dominance: A = "storey" wins, B = "story" wins).

If you built a tower from the blueprint of the data distribution and just took the numbers as radii in cm, you'd get a *very* strange "nine storey house" indeed: a fat-ass giant base and a flimsy needle-top 'hat'.

Maybe I should make one using plywood (vide Kliban).
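
Before buying plywood, a rough sketch of the tower's proportions, assuming one round slab per row of the list, stacked in order from "two" at the bottom to "nine" at the top, with the combined hit count as the radius in cm. The names `totals_cm` and `radii_in_metres` are invented for this illustration.

```python
# Combined hit counts per row (two .. nine levels), read as radii in cm.
totals_cm = [1061000, 231000, 42500, 1373, 1291, 270, 82, 91]


def radii_in_metres(totals_cm):
    """Convert each slab radius from centimetres to metres."""
    return [r / 100 for r in totals_cm]


radii = radii_in_metres(totals_cm)

# How many times wider the base slab is than the top slab.
base_to_top = totals_cm[0] / totals_cm[-1]

# The 'hat': the top slab is wider than the one directly beneath it.
has_hat = totals_cm[-1] > totals_cm[-2]

print(f"base radius: {radii[0] / 1000:.2f} km")
print(f"top radius:  {radii[-1]:.2f} m")
print(f"base is about {base_to_top:,.0f} times wider than the top")
print(f"hat on top:  {has_hat}")
```

So the base comes out more than ten kilometres in radius while the top is under a metre, which suggests scaling down a bit before reaching for the saw.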
