[humorix] Google Unveils Filter System For Websites
August 12, 2005
Ninety percent of everything is crud.
-- Sturgeon's Law
Sturgeon was clueless. The real number is closer to 100%.
-- Anonymous Google employee
SILLYCON VALLEY -- With Yahoo rapidly expanding the size of
its search engine database, Google has decided to take a
different approach: shrinking the size of its universe by
removing the crud that no sane person (marketing weasels
excluded) ever wants to look at.
"Our name might be Google, but there's no reason to maintain
a database with over a googol different scams, schemes, and
various other species of shit," said a Google
spokesperson. "Those yodeling idiots can have their
billions of pages of penis-enlarging, mortgage-shrinking
insanity."
Google has quietly launched a filter system, dubbed
SturgeonAssassin, to eliminate cruddy websites from its
database. Of course, the system remains in Alpha testing
and can only be accessed through the undocumented "nocrud:"
operator.
The idea is quite simple. Users can enable and disable
different filters based on their personal preference, and
SturgeonAssassin will adjust the page ranking of each site
accordingly. The form looks something like this:
---
Filter sites that have the following properties:
[X] Domain has more than two hyphens (such as
fast-paced-casino-poker-action.com)
[X] Contains a BLINK tag
[X] Has a "Best viewed with Internet Explorer" disclaimer
[X] Tries to set a third-party cookie from a shady
advertising bureau
[X] Tries to set a cookie, period
[X] Repeatedly misuses the words "they're" and "their" or
"its" and "it's"
[X] Has a "last updated" notice containing a date before
1997
[X] Includes, or links to, some kind of "mission statement"
[X] Has a high buzzword concentration (at least 1 buzzword
per 25 words of text)
[X] Features "tips" about search engine optimization
[X] HTML title includes a phrase like "Title Goes Here" or
"Adobe GoLive 4.0"
[X] Code has unnecessary FONT tags
[ ] Code has unnecessary TABLE tags
[X] Webmaster obviously doesn't have a clue about the ALT
attribute in image tags
[X] Links to a copyright or legal notice that contains more
copy than the rest of the site combined
[X] Legal notice prohibits "linking" to the site
[X] URL ends with .htm
[X] Features images in .bmp format -- or worse, embedded in
Word documents
[X] Launches pop-up ads using Flash applets designed
specifically to bypass Firefox's pop-up blocker
[X] Launches pop-up ads, period
[X] Requires, or simply hints about requiring, user
registration
[X] Page contains "tag soup" obviously produced by a
Microsoft product
[X] Includes the phrase "As Seen On TV!"
[X] Features text-based ads from an advertising network
other than Google
[ ] Publishes fake news or sarcasm directed at Google's
attempts at world domination
[X] Whois record contains obviously bogus contact info, such
as "123 Fake Street"
[X] Attempts to disable the right-click context menu, hide
the back button, or perform other nefarious tricks
[X] Warns that the site is best viewed at a weird
resolution, such as "320x240" or "4096x3192"
[X] Has a serious lack of proper punctuation with run on
sentences that continue on and on the webmaster is
obviously some kind have clueless product of the
american education system either or learned english by
reading nothing but slashdot comments grammar is
important
[X] Every link points to a domain that was registered within
the last 3 hours
[X] Contains a high concentration of dollar signs (Perl
programming sites excluded)
---
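For the curious, a few of the checkboxes above are concrete enough to sketch in code. What follows is only a guess at how such checks might look, not anything Google has published: the function names, the buzzword list, and the idea of simply counting tripped filters are all assumptions made up for illustration.

```python
import re
from urllib.parse import urlparse

# Hypothetical buzzword list; the article does not name any.
BUZZWORDS = {"synergy", "paradigm", "leverage", "enterprise", "solutions"}


def too_many_hyphens(url: str) -> bool:
    """Domain has more than two hyphens."""
    host = urlparse(url).hostname or ""
    return host.count("-") > 2


def has_blink_tag(html: str) -> bool:
    """Page contains a BLINK tag (matched case-insensitively)."""
    return re.search(r"<blink\b", html, re.IGNORECASE) is not None


def ends_with_htm(url: str) -> bool:
    """URL path ends with .htm (a path ending in .html does not match)."""
    return urlparse(url).path.lower().endswith(".htm")


def buzzword_heavy(text: str) -> bool:
    """At least 1 buzzword per 25 words of text."""
    words = re.findall(r"[a-z']+", text.lower())
    if not words:
        return False
    hits = sum(1 for word in words if word in BUZZWORDS)
    return hits * 25 >= len(words)


def crud_score(url: str, html: str, text: str) -> int:
    """Count how many of the four sketched filters a page trips."""
    checks = [
        too_many_hyphens(url),
        has_blink_tag(html),
        ends_with_htm(url),
        buzzword_heavy(text),
    ]
    return sum(checks)
```

A page could then be demoted in proportion to its crud_score, though how the score would feed into page ranking is left open here, as it is in the announcement.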
"This is the next logical step in the arms race against
blighted websites," explained a random Google Ph.D.
"Moreover, the planned Beta version of the filter (due in
2010) will push the envelope by automatically assuming that
all websites are crap until proven otherwise. This approach
will more closely match the perception of average Internet
users. Less is more -- and we're not talking about
command-line pipe buffering commands."
Yahoo has quickly downplayed Google's latest innovation.
"Those eggheads are out of touch with cold reality. Most
people secretly enjoy crap. How do you think Hollywood has
managed to make so much money over the years? It's not
because of their creative or artistic triumphs! Crap is
king, and that's why Yahoo hopes to expand our index to
include everything, ranging from the best of the best (the
Yahoo homepage) to the absolute worst (Aunt Bertha's
Day-By-Day Timeline Of Her Dead Cat Princess III). More is
more!"
The staff of Humorix is also eyeing the new filter with
suspicion. "We poke fun at Google, we have dubious HTML
coding standards, and our grammar and spelling leafs much
bee two desired. If this becomes a standard feature of
Google, we're screwed!"
--
Humorix: Linux and Open Source(nontm) on a lighter note
Archive: http://mail.nl.linux.org/humorix/
Web site: http://www.i-want-a-website.com/about-linux/