|Detailed Spam Breakdown - monday 2005-12-12 0633||last modified 2005-12-12 0710|
Unless I find something else fascinating buried in my data, this will be the last in my spate of posts on spam.
The average spam message was very close to 10KB. Altogether, the mail represented in these graphs comes to 19,660 messages, drawn from all of my real addresses, for personal and work use. I know of 13 legitimate messages that were miscategorized as spam, a rate of 0.066%. Since all my spam is kept for examination, I suppose I could pore over it at a later date to see whether I missed any more of SpamAssassin's mistakes in that direction.
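For anyone who wants to check the arithmetic, here is a minimal sketch using only the figures quoted above. The variable names are mine, and the total-size line assumes the 10KB average applies to every graphed message, so treat it as a rough estimate. Note that the denominator is the mail in the graphs, not my legitimate mail.

```python
# Figures quoted in the post; variable names are illustrative.
total_messages = 19_660   # mail represented in the graphs
false_positives = 13      # legitimate mail known to have been flagged as spam
avg_size_kb = 10          # approximate average size per message

# Known false positives as a percentage of the graphed mail.
fp_rate_pct = false_positives / total_messages * 100

# Rough total volume, assuming the average holds across all messages.
total_size_mb = total_messages * avg_size_kb / 1000  # decimal megabytes

print(f"false-positive rate: {fp_rate_pct:.3f}%")          # 0.066%
print(f"approximate total size: {total_size_mb:.1f} MB")   # 196.6 MB
```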
Spam received by week. The bottom band of the stacked chart is the mail SpamAssassin failed to categorize properly.
SpamAssassin's failure rate from week to week, as a percentage. That outlier week in October is highly skewed because I flagged a number of quasi-legitimate posts to a mailing list as spam.
What's missing is a comparison against the legitimate mail I receive. That would also let me decide whether my threshold value for spam could stand to be lowered or whether that would result in too many false positives. I hesitate to plot my present valid email data since I'm on moderated mailing lists and also tend not to keep trashed mail around; the numbers would be inaccurate, with no way to measure the error rate. Perhaps for 2006 I'll keep all of it and see how those numbers work out.
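If I do keep everything, the comparison could look something like the sketch below. It assumes each message's SpamAssassin score has been collected into one of two lists, and that a message is flagged when its score meets or exceeds the threshold (the `required_score` setting). The `sweep` helper and all the scores are made up purely for illustration; they are not my data.

```python
def sweep(spam_scores, ham_scores, thresholds):
    """For each threshold, count spam that slips through and
    legitimate mail that gets flagged (hypothetical helper)."""
    rows = []
    for t in thresholds:
        missed_spam = sum(1 for s in spam_scores if s < t)       # false negatives
        flagged_legit = sum(1 for s in ham_scores if s >= t)     # false positives
        rows.append((t, missed_spam, flagged_legit))
    return rows


# Placeholder scores, invented for this example only.
spam_scores = [3.8, 4.6, 5.2, 7.9, 12.4, 21.0]
ham_scores = [-2.1, 0.0, 0.4, 1.7, 3.1, 4.2]

for t, missed, flagged in sweep(spam_scores, ham_scores, [5.0, 4.0, 3.0]):
    print(f"threshold {t}: {missed} spam missed, {flagged} legitimate flagged")
```

With real scores from a full year, lowering the threshold would be worth it only if the missed-spam count drops faster than the flagged-legitimate count rises.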