I once wrote a semi-serious spoof on some Mac user attitudes under the title "Are Mac users smarter than PC users?"
The central schtick there was to compare the complexity and correctness of
the sentences found in typical Mac and PC discussion forums using style.
Here's a key bit of the explanation:
By the early eighties most Unix releases, whether BSD or AT&T derived, came with the AT&T Writer's Workbench - a collection of useful text processing utilities. One of those was a thing called style. Style is somewhat out of style these days but is on many Linux "bonus" CDs and downloadable from gnu.org as part of the diction package.
Style produces readability metrics on text. Forget for the moment what the ratings mean and look at the numbers. For comparison here's what style says about the first 1,000 words in what is arguably the finest novel ever published in English, The Golden Bowl:
readability grades:
Kincaid: 18.2
ARI: 22.2
Coleman-Liau: 9.8
Flesch Index: 46.7
Fog Index: 21.7
Lix: 64.4 = higher than school year 11
SMOG-Grading: 13.5
Of course that's Henry James at the top of his form. For a more realistic, and interesting, baseline I collected about 2,800 lines of Slashdot discussion contributions and ran style against them to get the following ratings summary, along with a lot of detail data omitted here:
readability grades:
Kincaid: 7.7
ARI: 8.0
Coleman-Liau: 9.7
Flesch Index: 72.4
Fog Index: 10.7
Lix: 37.1 = school year 5
SMOG-Grading: 9.8
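As a rough illustration of what sits behind one of those numbers, here is a minimal Python sketch of the Automated Readability Index (the ARI line above), using the standard published formula. The function name and the crude regular-expression word and sentence splitting are my own assumptions, not how style itself tokenizes text, so the results will not match its output exactly.

```python
import re


def automated_readability_index(text: str) -> float:
    """Approximate ARI: 4.71*(chars/words) + 0.5*(words/sentences) - 21.43."""
    # Assumption: a word is any run of letters, digits, or apostrophes.
    words = re.findall(r"[A-Za-z0-9']+", text)
    # Assumption: a sentence ends at any run of '.', '!' or '?'.
    sentences = [s for s in re.split(r"[.!?]+", text) if s.strip()]
    if not words or not sentences:
        raise ValueError("text must contain at least one word and one sentence")
    # Count only letters and digits as characters.
    chars = sum(1 for w in words for c in w if c.isalnum())
    return 4.71 * chars / len(words) + 0.5 * len(words) / len(sentences) - 21.43


if __name__ == "__main__":
    # Six three-letter words in two sentences score well below grade level.
    print(round(automated_readability_index("The cat sat. The dog ran."), 1))  # -5.8
```

Longer words and longer sentences both push the figure up, which is why the Henry James sample lands so far above the Slashdot one.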
I then compared a few thousand entries from Mac discussion sites with stuff from PC forums to discover that the Mac discussions obtained significantly higher scores on measures of complexity and grammatical correctness - from which I cheerfully concluded that Mac users are smarter.
Now in reality I wouldn't argue that someone's ability to structure a sentence
provides a sufficient guide to that person's relative intelligence, but the argument
probably applies quite well to internet documents - meaning that
using style within an internet search engine's classification
makes sense, with better constructed, better expressed materials always earning a higher page rank
than poorly constructed, poorly expressed materials.
Do a search, for example, on "Amish history" and Google now places
some drek written by a government bureaucrat at the top of the
first listings page and buries the Wikipedia entry
near the middle of page two. Adopt a style-style metric, however, and
both entries end up where they belong - the Wikipedia entry at the top and
the bureaucrat buried.
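To make the ranking idea concrete, here is a minimal sketch of blending a relevance score with a readability grade. Everything in it is hypothetical: the Result record, the rerank function, the weight and max_grade parameters, and the sample numbers are invented for illustration and describe no real search engine or real pages. It also takes the premise above at face value, treating a higher grade as better writing.

```python
from dataclasses import dataclass


@dataclass
class Result:
    title: str
    relevance: float  # hypothetical engine relevance score in [0, 1]
    grade: float      # readability grade, e.g. a Kincaid figure from style


def rerank(results, weight=0.5, max_grade=20.0):
    """Sort results by a weighted blend of relevance and writing quality."""
    def blended(r):
        # Clamp the grade into [0, max_grade] and scale it to [0, 1].
        quality = min(max(r.grade, 0.0), max_grade) / max_grade
        return (1 - weight) * r.relevance + weight * quality
    return sorted(results, key=blended, reverse=True)


if __name__ == "__main__":
    # Invented scores: page A is slightly more "relevant" but poorly written.
    pages = [Result("page A", 0.9, 6.0), Result("page B", 0.8, 14.0)]
    print([r.title for r in rerank(pages)])
```

With weight set to 0 the original relevance order is preserved; raising it lets the better written page overtake the marginally more relevant one.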