
Thread: spamassassin - seed bayes db

  1. #1
    AO's Filibustier Cheap Scotch Ron's Avatar
    Join Date
    Nov 2008
    Location
    Swamps of Jersey
    Posts
    378

    spamassassin - seed bayes db

    I need a file with 200+ known spam emails so I can seed the Bayesian filter/learning option with enough data for it to kick in.

    Does anyone have a file of spam to lend me?

    I know, I could set up and collect, but I'm busy (and lazy).
    In God We Trust....Everything else we backup.

  2. #2
    Senior Member JPnyc's Avatar
    Join Date
    Jan 2005
    Posts
    2,734
    Sorry, it would take me a few days to collect that many. I have 49 in my junk folder at the moment.

  3. #3
    AO's Filibustier Cheap Scotch Ron's Avatar
    Join Date
    Nov 2008
    Location
    Swamps of Jersey
    Posts
    378
    Yeah, that's my backup plan. It will take a few days here too.

  4. #4
    THE Bastard Sys***** dinowuff's Avatar
    Join Date
    Jun 2003
    Location
    Third planet from the Sun
    Posts
    1,253
    CSR, I get around 3,000 per day. Not sure if I can dump the report into CSV, but I'll let you know if I can.
    09:F9:11:02:9D:74:E3:5B8:41:56:C5:63:56:88:C0

  5. #5
    AO's Filibustier Cheap Scotch Ron's Avatar
    Join Date
    Nov 2008
    Location
    Swamps of Jersey
    Posts
    378
    Thanks dino. That would be awesome. It wouldn't need to be CSV; plain old text is fine. In fact, spamassassin can read it if it's in maildir or mbox format.
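
    For anyone who ends up with a pile of plain-text messages, here is a rough sketch of packing them into a single mbox file with Python's standard mailbox module. The function name pack_into_mbox and the paths are made up for illustration, and it assumes every file in the folder holds exactly one raw message.

```python
import mailbox
from pathlib import Path


def pack_into_mbox(source_dir: str, mbox_path: str) -> int:
    """Append every regular file in source_dir to an mbox file.

    Returns the number of messages in the mbox afterwards.
    Keep mbox_path outside source_dir so the output is not re-read as input.
    """
    box = mailbox.mbox(mbox_path)  # the file is created if it does not exist
    try:
        for path in sorted(Path(source_dir).iterdir()):
            if path.is_file():
                # Assumption: each file holds exactly one raw email message.
                box.add(path.read_bytes())
        box.flush()
        return len(box)
    finally:
        box.close()
```

    The resulting file could then be handed over in one go with something like: sa-learn --spam --mbox spam.mbox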

    If you are able to get a file for me, PM me and I will give you an FTP account to send it to.

    Thanks again. CSR.

    P.S. Drinks on me at the bar on the other side.

  6. #6
    THE Bastard Sys***** dinowuff's Avatar
    Join Date
    Jun 2003
    Location
    Third planet from the Sun
    Posts
    1,253
    Sorry mate. I can send you many charts and graphs, but the only place that lists the actual emails is a control window where you can "release" or "Mark as not spam". There is no way to copy all records or export them.

    I'm talking about the MXLogic SpamSoap admin console, if anyone knows a way to download the quarantine list.

  7. #7
    Senior Member t34b4g5's Avatar
    Join Date
    Sep 2003
    Location
    Australia.
    Posts
    2,391
    Greetings.

    http://www.stopforumspam.com/downloads

    Download the CSV; maybe that will get you started.

    Also

    http://www.stopforumspam.com/spamdomainsandips
    Last edited by t34b4g5; May 2nd, 2009 at 05:43 AM.

  8. #8
    AO's Filibustier Cheap Scotch Ron's Avatar
    Join Date
    Nov 2008
    Location
    Swamps of Jersey
    Posts
    378
    Thanks, but I need the actual spam, not a list of addresses.

    I am trying to implement a Bayesian filter that will "learn" what is spam vs. ham. The code needs a seed file of known spam in order to get started.
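
    To show why the seed matters, here is a toy naive Bayes sketch. It is not SpamAssassin's actual algorithm, and the ToyBayes class and its methods are invented for illustration. With nothing learned, every token looks equally likely in spam and ham, so the score sits at 0.5 and the filter cannot decide anything. As far as I know, SpamAssassin by default also waits until it has learned at least 200 spam and 200 ham messages before using Bayes.

```python
import math
import re
from collections import Counter


class ToyBayes:
    """Toy naive Bayes spam scorer (illustration only, not SpamAssassin's method)."""

    def __init__(self):
        # Number of messages of each class in which a token was seen.
        self.tokens = {"spam": Counter(), "ham": Counter()}
        self.messages = {"spam": 0, "ham": 0}

    @staticmethod
    def tokenize(text):
        # Unique lowercase word tokens in the message.
        return set(re.findall(r"[a-z0-9']+", text.lower()))

    def learn(self, text, label):
        """Record one known message; label must be 'spam' or 'ham'."""
        self.messages[label] += 1
        self.tokens[label].update(self.tokenize(text))

    def spam_probability(self, text):
        """Combine per-token evidence as log-odds, starting from an even prior."""
        log_odds = 0.0
        for tok in self.tokenize(text):
            # Add-one smoothing keeps unseen tokens from producing zeros.
            p_spam = (self.tokens["spam"][tok] + 1) / (self.messages["spam"] + 2)
            p_ham = (self.tokens["ham"][tok] + 1) / (self.messages["ham"] + 2)
            log_odds += math.log(p_spam / p_ham)
        return 1 / (1 + math.exp(-log_odds))


untrained = ToyBayes()
print(untrained.spam_probability("cheap pills"))  # 0.5: no data, no opinion
```

    Once a few labeled messages go in through learn(), tokens seen mostly in spam push the score above 0.5 and tokens seen mostly in ham pull it below.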

    No worries. I have been capturing spam on my hotmail account. Should have enough in a few days.

  9. #9
    Senior Member nihil's Avatar
    Join Date
    Jul 2003
    Location
    United Kingdom: Bridlington
    Posts
    17,188
    Hi CSR,

    I'm afraid my cats eat all the spam around here but this link might help:

    http://untroubled.org/spam/

    Over 10,000 for May 2009 already!



    I don't know what a "Lorien" file is supposed to be, but you can view them as text.

  10. #10
    AO's Filibustier Cheap Scotch Ron's Avatar
    Join Date
    Nov 2008
    Location
    Swamps of Jersey
    Posts
    378
    Great find. Thanks Old Man.

    P.S. sa-learn wasn't able to consume the folder as-is. I needed to write a script to loop through the folder list and process the files explicitly, one at a time. I suspect Perl had problems with the large number of files. It was easier to write a new script than to debug Perl. (Lazy.)
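
    The script itself isn't posted here, so the following is only a guess at the same idea in Python: walk the corpus folder and call sa-learn once per file. The names build_commands and learn_corpus are invented for illustration, and it assumes sa-learn is on the PATH and that each file is a single message.

```python
import subprocess
from pathlib import Path


def build_commands(corpus_dir, label="spam"):
    """Return one sa-learn command line per file found under corpus_dir."""
    if label not in ("spam", "ham"):
        raise ValueError("label must be 'spam' or 'ham'")
    files = sorted(p for p in Path(corpus_dir).rglob("*") if p.is_file())
    # --no-sync defers the database sync until the single --sync call at the end.
    return [["sa-learn", "--no-sync", "--" + label, str(p)] for p in files]


def learn_corpus(corpus_dir, label="spam"):
    """Feed each file to sa-learn individually; return the paths that failed."""
    failed = []
    for cmd in build_commands(corpus_dir, label):
        result = subprocess.run(cmd, capture_output=True, text=True)
        if result.returncode != 0:
            failed.append(cmd[-1])
    # Sync the Bayes database once after all files have been processed.
    subprocess.run(["sa-learn", "--sync"], check=True)
    return failed
```

    Calling learn_corpus("/path/to/unpacked/spam") would then process the files one by one, and sa-learn --dump magic shows how many spam and ham messages have been learned so far.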
    Last edited by Cheap Scotch Ron; May 4th, 2009 at 09:14 PM.
