ryanlee.org - Ryan Lee

I'm Ryan Lee.

And that's all you really need to know about me. So, now that we have that out of the way: I use the space here to publish photography and thoughts and software and services. There are other places nearby with design work and random data you might find useful. Aside from those main divisions, I've put up some other things: movies seen, books read, my resume.

Hope you find something fun, useful, or beautiful while you're here.

Inspecting Akismet

Akismet is a web service for classifying user-generated content as spam or non-spam. Launched with the guidance of Matt Mullenweg (of WordPress fame), Akismet is positioning itself as the best hope for defeating spam in the blogging world.

I like the idea. Requiring everybody to run an automated classification system on their simple web servers is not a practicable solution. Every other approach has been about raising the barrier to entry to keep unintelligent agents from brute-forcing their way in, which slowly makes it more and more inconvenient for some or all users to contribute to discussions. Providing a classifying service available over the wire is definitely the way to go.

There are two main things I don't like. First, Akismet appears to be a fully centralized, closed-source project. That's always an alarm no matter who's in charge. I don't mind black boxes that work, but I don't always trust their custodians to do the right thing, whether that means keeping backwards compatibility, maintaining consistent terms of service, or knowing how to run a business such that the service remains steadily available to returning users.

Second, Akismet requires some form of identification. You must obtain an API key in order to participate in the system, which in turn requires an account on wordpress.com. I've not convinced myself this is really a bad thing, but it doesn't seem like a real good thing. But I don't know if it's truly possible in a decentralized system to even have reliable identification mechanisms in place, so maybe it's just a tech-related inkling.
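
To make the identification point concrete, here is a rough sketch in Python of what a key-bound spam check might look like from the client side. It assumes the endpoint shape and parameter names (`blog`, `user_ip`, `user_agent`, `comment_content`) from Akismet's public API documentation as I understand it, so check the current docs before relying on any of it. The function names `build_comment_check` and `is_spam` are my own, and the sketch only builds the request without sending it.

```python
from urllib.parse import urlencode
from urllib.request import Request


def build_comment_check(api_key, blog_url, user_ip, user_agent, comment):
    """Build (but do not send) a spam-check request tied to one API key."""
    # Assumption: the key is part of the hostname, so every request
    # identifies the account it was issued to.
    url = f"https://{api_key}.rest.akismet.com/1.1/comment-check"
    body = urlencode({
        "blog": blog_url,            # the site the comment was posted on
        "user_ip": user_ip,          # commenter's IP address
        "user_agent": user_agent,    # commenter's browser user agent
        "comment_content": comment,  # the text to classify
    }).encode("utf-8")
    return Request(
        url,
        data=body,
        method="POST",
        headers={"Content-Type": "application/x-www-form-urlencoded"},
    )


def is_spam(response_text):
    """Assumption: the service answers with the literal text 'true' for spam."""
    return response_text.strip() == "true"
```

Sending the request would be a matter of passing it to `urllib.request.urlopen` and handing the decoded body to `is_spam`. The thing to notice is that there is no way to ask the question without saying who is asking: the key rides along with every call.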

Is it possible to make a fully decentralized, completely anonymous web service that does its job reliably, specifically, in this case, classifying text as spam? Are there lessons to be learned from trackerless BitTorrent that can be applied here? Or should we all seriously consider putting all of our eggs in Akismet's basket and, as a friend put it, watching that basket very, very carefully?