-
April 30th, 2009, 08:05 PM
#1
spamassassin - seed bayes db
I need a file with 200+ known spam emails to seed the Bayesian filter/learning option with enough data for it to kick in.
Does anyone have a file of spam to lend me?
I know, I could set up and collect, but I'm busy (and lazy).
In God We Trust....Everything else we backup.
-
April 30th, 2009, 08:20 PM
#2
Sorry, it would take me a few days to collect that many. I have 49 in my junk folder at the moment.
-
April 30th, 2009, 08:44 PM
#3
Yeah, that's my backup plan. It will take a few days here too.
In God We Trust....Everything else we backup.
-
May 1st, 2009, 02:52 PM
#4
CSR, I get around 3,000 per day. Not sure if I can dump the report into CSV, but I'll let you know if I can.
09:F9:11:02:9D:74:E3:5B 8:41:56:C5:63:56:88:C0
-
May 1st, 2009, 04:13 PM
#5
Thanks dino. That would be awesome. It wouldn't need to be CSV; plain old text is fine. In fact, SpamAssassin can read it if it's in maildir or mbox format.
If you are able to get a file for me, PM me and I will give you an FTP account to send it to.
Thanks again. CSR.
P.S. Drinks on me at the bar on the other side.
In God We Trust....Everything else we backup.
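For anyone who ends up with a folder of plain-text messages rather than a ready-made mailbox, here is a minimal sketch of packing them into a single mbox file with Python's standard `mailbox` module. It assumes each file holds exactly one raw message; the function name `build_mbox` and the paths are made up for illustration and are not something from this thread.

```python
import mailbox
from pathlib import Path


def build_mbox(src_dir, mbox_path):
    """Append every regular file in src_dir to an mbox file; return the count added."""
    box = mailbox.mbox(str(mbox_path))  # the file is created if it does not exist
    added = 0
    try:
        for path in sorted(Path(src_dir).iterdir()):
            if not path.is_file():
                continue
            # Assumption: one raw RFC 822 message per file.
            box.add(path.read_bytes())
            added += 1
        box.flush()
    finally:
        box.close()
    return added
```

Something like `build_mbox("spam_txt", "spam.mbox")` followed by `sa-learn --spam --mbox spam.mbox` should then let sa-learn read the lot as one mailbox.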
-
May 1st, 2009, 06:04 PM
#6
Sorry mate. I can send you many charts and graphs, but the only place that lists the actual emails is a control window where you can "release" or "Mark as not spam". There is no way to copy all records or export.
I'm talking about the MXLogic SpamSoap admin console, if anyone knows a way to download the quarantine list.
09:F9:11:02:9D:74:E3:5B 8:41:56:C5:63:56:88:C0
-
May 2nd, 2009, 05:41 AM
#7
Last edited by t34b4g5; May 2nd, 2009 at 05:43 AM.
-
May 2nd, 2009, 06:46 AM
#8
Thanks, but I need the actual spam, not a list of addresses.
I am trying to implement a Bayesian filter that will "learn" what is spam vs. ham. The code needs a seed file of known spam in order to get started.
No worries. I have been capturing spam on my hotmail account. Should have enough in a few days.
In God We Trust....Everything else we backup.
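To illustrate why the filter needs seeding before it "kicks in", here is a toy naive Bayes sketch. It is not SpamAssassin's actual algorithm or code; the class `ToyBayes` and its tokenizer are invented for this example. The one detail taken from SpamAssassin is the default threshold of 200 learned messages, which applies to ham as well as spam (`bayes_min_spam_num` and `bayes_min_ham_num`).

```python
import math
import re
from collections import Counter

MIN_MESSAGES = 200  # SpamAssassin's default bayes_min_spam_num / bayes_min_ham_num


class ToyBayes:
    """Toy classifier that refuses to score until both classes are seeded."""

    def __init__(self, min_messages=MIN_MESSAGES):
        self.min_messages = min_messages
        self.tokens = {"spam": Counter(), "ham": Counter()}
        self.messages = {"spam": 0, "ham": 0}

    @staticmethod
    def tokenize(text):
        # Unique lowercase word-like tokens; a real filter is far more careful.
        return set(re.findall(r"[a-z0-9$']+", text.lower()))

    def learn(self, text, label):
        self.tokens[label].update(self.tokenize(text))
        self.messages[label] += 1

    def ready(self):
        return all(n >= self.min_messages for n in self.messages.values())

    def spam_probability(self, text):
        if not self.ready():
            return None  # not enough seed data yet
        log_odds = math.log(self.messages["spam"] / self.messages["ham"])
        for tok in self.tokenize(text):
            # Add-one smoothing so unseen tokens do not zero out the product.
            p_spam = (self.tokens["spam"][tok] + 1) / (self.messages["spam"] + 2)
            p_ham = (self.tokens["ham"][tok] + 1) / (self.messages["ham"] + 2)
            log_odds += math.log(p_spam / p_ham)
        log_odds = max(-50.0, min(50.0, log_odds))  # keep exp() in range
        return 1 / (1 + math.exp(-log_odds))
```

Until both counters reach the threshold, `spam_probability` returns `None`, which mirrors the way an unseeded Bayes database contributes nothing to scoring.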
-
May 4th, 2009, 11:19 AM
#9
Hi CSR,
I'm afraid my cats eat all the spam around here but this link might help:
http://untroubled.org/spam/
Over 10,000 for May 2009 already!
I don't know what a "Lorien" file is supposed to be, but you can view them as text.
-
May 4th, 2009, 03:51 PM
#10
Great find. Thanks Old Man.
P.S. sa-learn wasn't able to consume the folder as-is. I needed to write a script to loop through the folder list and process the files explicitly one at a time. I suspect Perl had problems with the large number of files. Easier to write a new script than debug Perl. (Lazy).
Last edited by Cheap Scotch Ron; May 4th, 2009 at 09:14 PM.
In God We Trust....Everything else we backup.
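The script itself was not posted, so the following is only a guess at what such a loop might look like, written in Python. It assumes `sa-learn` is on the PATH and that every file in the folder is a single spam message; the function name `learn_one_at_a_time` is hypothetical. `--no-sync` defers the database sync so it is not repeated for every file.

```python
import subprocess
from pathlib import Path


def learn_one_at_a_time(folder, base_cmd=("sa-learn", "--spam", "--no-sync")):
    """Run base_cmd once per regular file in folder; return names of files that failed."""
    failed = []
    for path in sorted(Path(folder).iterdir()):
        if not path.is_file():
            continue
        # One process per message, so a single bad file cannot stall the whole batch.
        result = subprocess.run([*base_cmd, str(path)], capture_output=True, text=True)
        if result.returncode != 0:
            failed.append(path.name)
    return failed
```

After a call such as `learn_one_at_a_time("2009-05")` (the folder name is just an example), running `sa-learn --sync` once should flush the journal into the Bayes database. Spawning a process per file is slow for tens of thousands of messages, but it matches the "one at a time" approach described above.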