  1. #1
    SitePoint Addict phptek's Avatar
    Join Date
    Jun 2002
    Location
    Wellington, NZ
    Posts
    363

    Mail, PHP and spam?

    Hi there - I'm not sure this is the right forum for this post, so feel free to suggest an alternative.

    I've just written a system around the PHPMailer class with the option of sending HTML as well as plain-text mail to our clients.

    However, I'm testing it by sending mail to Outlook and to some other mail services such as yahoo.co.uk (who use SpamAssassin). The latter are rating my innocent wee emails as spam! The points awarded towards spam status are:

    Content analysis details: (5.00 points, 5 required)
    EXTRA_MPART_TYPE (0.0 points) Message with extraneous Content-type:...type= header
    UNDISC_RECIPS (1.2 points) Valid-looking To "undisclosed-recipients"
    HTML_MESSAGE (0.1 points) BODY: HTML included in message
    HTML_40_50 (1.1 points) BODY: Message is 40% to 50% HTML
    PORN_4 (2.5 points) URI: URL uses words and phrases which indicate porn (4)
    MIME_HTML_ONLY (0.1 points) Message only has text/html MIME parts
    How can I reduce the points my mails accrue, say by lowering the PORN_4 score? (FYI, there is no porn in these mails.) And what are the criteria for judging that a URL contains "...words and phrases which indicate porn"?
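
    To make the arithmetic in that report concrete, here is a rough Python sketch that adds up the rule scores and shows what the total would be with a given rule removed. The rule names and points are copied from the report; the function names are made up for illustration, and the "flagged once the score reaches the threshold" check is an assumption that merely matches this report (5.00 points, 5 required, rated as spam).

```python
from decimal import Decimal

# Rule scores copied from the SpamAssassin report quoted in this post.
REPORT = {
    "EXTRA_MPART_TYPE": Decimal("0.0"),
    "UNDISC_RECIPS": Decimal("1.2"),
    "HTML_MESSAGE": Decimal("0.1"),
    "HTML_40_50": Decimal("1.1"),
    "PORN_4": Decimal("2.5"),
    "MIME_HTML_ONLY": Decimal("0.1"),
}
REQUIRED = Decimal("5")  # "5 required" in the report


def total_score(scores, skip=()):
    """Sum the rule scores, leaving out any rule names listed in skip."""
    return sum(
        (points for rule, points in scores.items() if rule not in skip),
        Decimal("0"),
    )


def is_flagged(score, required=REQUIRED):
    # Assumption: a message is flagged once its score reaches the threshold.
    return score >= required


if __name__ == "__main__":
    for skip in [(), ("PORN_4",), ("UNDISC_RECIPS",)]:
        score = total_score(REPORT, skip)
        print(skip or "all rules", score, is_flagged(score))
```

    Decimal is used instead of float so that the tenths add up exactly. On these numbers PORN_4 alone accounts for half of the total.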

    Thanks a lot!

  2. #2
    No. Phil.Roberts's Avatar
    Join Date
    May 2001
    Location
    Nottingham, UK
    Posts
    1,142
    You would eliminate most of those by simply not using HTML emails. SpamAssassin will mark those as possible spam right from the word go.

    Also, as you're probably sending mail by Bcc-ing it to multiple recipients, the recipients themselves won't get a valid 'To' header, which would also trigger the spam filters.

  3. #3
    SitePoint Evangelist elgumbo's Avatar
    Join Date
    Nov 2002
    Location
    North West, UK
    Posts
    545
    Also, I think that when you use BCC, Hotmail automatically moves the email into the spam folder, meaning that even if your message gets through the usual spam filters it will still not get read by your client.

  4. #4
    SitePoint Member
    Join Date
    Apr 2003
    Location
    Croatia
    Posts
    14
    Here is the regex that matches the URL in your email:

    /^https?:\/\/[\w\.-]*(?:xxx|(?<!es|ba)(?<!dle|sus)sex|anal(?!og|y[sz])|slut|pussy|(?<!cir)(?<!\bdo)cum(?!ul|be?r|b?en)|nympho|suck|porn|hard-?core|taboo|whore|voyeur|lesbian|gurlpages|naughty|lolita|(?<!thir|four|eigh|nine)(?<!fif|six)(?<!seven)teen|schoolgirl|kooloffer|erotic|lust(?!(?<=illust)(?:rat|rious)|(?<=clust)er)|pant(?:y|ies))[\w-]*\./

    So try to use URLs that this regex does not match, and your mail will not be marked as spam.
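
    If you want to check your own URLs before sending, something like this Python sketch may help. It assumes the alternation in the quoted pattern opens with "(?:xxx|" and applies no case-insensitive flag, since none is shown in the rule as quoted. The hostnames are invented examples and hits_porn_4 is just an illustrative helper name.

```python
import re

# PORN_4 pattern as quoted in this thread, with the alternation assumed to
# open with "(?:xxx|". No flags are applied, as none appear in the quote.
PORN_4 = re.compile(
    r"^https?:\/\/[\w\.-]*(?:xxx|(?<!es|ba)(?<!dle|sus)sex|anal(?!og|y[sz])"
    r"|slut|pussy|(?<!cir)(?<!\bdo)cum(?!ul|be?r|b?en)|nympho|suck|porn"
    r"|hard-?core|taboo|whore|voyeur|lesbian|gurlpages|naughty|lolita"
    r"|(?<!thir|four|eigh|nine)(?<!fif|six)(?<!seven)teen|schoolgirl"
    r"|kooloffer|erotic|lust(?!(?<=illust)(?:rat|rious)|(?<=clust)er)"
    r"|pant(?:y|ies))[\w-]*\."
)


def hits_porn_4(url):
    """Return True if the pattern matches the URL (only the host part can match)."""
    return PORN_4.search(url) is not None


if __name__ == "__main__":
    # Invented hostnames, used only to exercise the pattern.
    for url in [
        "http://www.canteen-supplies.example.com/",
        "http://www.middlesex-news.example.com/",
        "http://www.analog-parts.example.com/",
        "http://www.example.com/teen-fiction",
    ]:
        print(url, hits_porn_4(url))
```

    Note that an innocent hostname containing "canteen" is caught because of the "teen" alternative, while "middlesex" and "analog" are let through by the lookarounds. A keyword in the path after the hostname does not count, because the pattern cannot cross a "/".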

  5. #5
    SitePoint Member
    Join Date
    Apr 2003
    Location
    Croatia
    Posts
    14
    Also, as Phil.Roberts said...
    you should put an entry in the "To:" header line, as many ISPs require an entry there or your "Bcc:"-only email will be considered spam. If your ISP doesn't require an entry in the "To:" line, it may automatically insert "Undisclosed Recipients" there for you, and SpamAssassin will increase the spam score for it (UNDISC_RECIPS scored 1.2 in the report above).

    You can handle this by either entering your own name there, or setting up an address such as "Friends" (where the real email address is a valid reply address - usually yours).
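
    A minimal sketch of that idea in Python (not PHPMailer, just the standard library, to show the shape of it): the message gets a real "To:" header with a "Friends" display name on your own address, and the actual recipients are passed only as envelope recipients. The helper names and all addresses are hypothetical, and smtp is assumed to be an open smtplib.SMTP connection that you supply.

```python
from email.message import EmailMessage
from email.utils import formataddr


def build_message(list_name, list_address, subject, body):
    """Build a message whose To: header is a named address of your own."""
    msg = EmailMessage()
    msg["From"] = list_address
    msg["To"] = formataddr((list_name, list_address))  # e.g. Friends <you@...>
    msg["Subject"] = subject
    msg.set_content(body)
    return msg


def deliver(msg, recipients, smtp):
    """Send msg to the given envelope recipients.

    The recipients never appear in the headers, so they stay hidden from
    each other, much like Bcc. smtp is assumed to be an smtplib.SMTP
    connection; send_message(msg, to_addrs=...) overrides the addresses
    that would otherwise be taken from the headers.
    """
    smtp.send_message(msg, to_addrs=list(recipients))


if __name__ == "__main__":
    message = build_message("Friends", "news@example.com", "Newsletter", "Hello!")
    print(message["To"])
    # To actually send (requires a reachable mail server):
    # import smtplib
    # with smtplib.SMTP("localhost") as connection:
    #     deliver(message, ["reader1@example.org", "reader2@example.org"], connection)
```

    Whether this avoids the undisclosed-recipients rule on a given server is something you would have to test; the sketch only shows that the header and the real recipient list can be kept separate.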

  6. #6
    No. Phil.Roberts's Avatar
    Join Date
    May 2001
    Location
    Nottingham, UK
    Posts
    1,142
    Quote Originally Posted by el8
    Here is the regex that matches the URL in your email:

    /^https?:\/\/[\w\.-]*(?:xxx|(?<!es|ba)(?<!dle|sus)sex|anal(?!og|y[sz])|slut|pussy|(?<!cir)(?<!\bdo)cum(?!ul|be?r|b?en)|nympho|suck|porn|hard-?core|taboo|whore|voyeur|lesbian|gurlpages|naughty|lolita|(?<!thir|four|eigh|nine)(?<!fif|six)(?<!seven)teen|schoolgirl|kooloffer|erotic|lust(?!(?<=illust)(?:rat|rious)|(?<=clust)er)|pant(?:y|ies))[\w-]*\./

    So try to use URLs that this regex does not match, and your mail will not be marked as spam.
    Nearly all my spam is for medical products. I very rarely get porn spam.

  7. #7
    SitePoint Member
    Join Date
    Apr 2003
    Location
    Croatia
    Posts
    14
    I forgot to mention that the regex above is for the PORN_4 test.

  8. #8
    SitePoint Addict phptek's Avatar
    Join Date
    Jun 2002
    Location
    Wellington, NZ
    Posts
    363
    el8:

    I sent a very similar message to my Yahoo! account today and SpamAssassin ignored it, even without a "To" header!?

    I take your points on how to reduce the PORN_4 score, though.

    Out of interest - is there a place one can go for full definitions of the criteria for PORN_4 & PLING (whatever that is)? And if there is, how come spammers haven't caught on to modifying their own headers?

    Thanks to everyone BTW

  9. #9
    SitePoint Member
    Join Date
    Apr 2003
    Location
    Croatia
    Posts
    14
    SpamAssassin is probably set not to rewrite the message with a report if the message is not spam (i.e. if the score is below 5.0).

    For a full definition of the rules you can download the conf files from here:
    http://www.spamassassin.org/dist/rules/

  10. #10
    SitePoint Author wwb_99's Avatar
    Join Date
    May 2003
    Location
    Washington, DC
    Posts
    10,635
    Having sent a fair amount of blast emails, here are some lessons learned:

    1) Address each message to an individual. E.g., send the same message to bob, then sandy, then al. Do not use mass BCCs. Some companies now block emails with too many recipients (50 or so) at the gateway.

    2) Plain text works better, but everyone wants HTML. True multipart emails seem to set off red flags though. Ugh. No real way around this one aside from keeping it fugly.

    3) You actually don't want to send things too fast--one thing spam blocking software checks for is a mass of similar emails coming from the same server. Better yet, rewrite the package to 'server hop,' speeding up sending and making it harder to block.

    4) Avoid words like 'Free', and that porn stuff, in the subject line.

    5) If possible, add in a couple random MD5 strings hidden in the message somewhere. Helps make similar emails seem different.
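
    Points 1 and 3 could look roughly like this in Python. It is only a sketch: the function names are made up, the one-second pause is a placeholder rather than a recommended rate, and send is assumed to be whatever callable actually hands a message to your mail server (for example the send_message method of an open smtplib.SMTP connection).

```python
import time
from email.message import EmailMessage


def personalised_messages(sender, subject, body, recipients):
    """Yield one complete message per recipient, each with its own To: header."""
    for address in recipients:
        msg = EmailMessage()
        msg["From"] = sender
        msg["To"] = address
        msg["Subject"] = subject
        msg.set_content(body)
        yield msg


def send_slowly(messages, send, pause_seconds=1.0, sleep=time.sleep):
    """Pass each message to send(), pausing between consecutive messages.

    Returns the number of messages handed to send(). The sleep argument
    exists so the pause can be replaced in tests.
    """
    count = 0
    for msg in messages:
        if count:
            sleep(pause_seconds)  # no pause before the first message
        send(msg)
        count += 1
    return count


if __name__ == "__main__":
    people = ["bob@example.org", "sandy@example.org", "al@example.org"]
    batch = personalised_messages("news@example.com", "Newsletter", "Hello!", people)
    # print() stands in for a real send function here.
    send_slowly(batch, send=lambda m: print("would send to", m["To"]), pause_seconds=0.1)
```

    Each recipient gets a message addressed only to them, so no Bcc is involved at all.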

    WWB

  11. #11
    SitePoint Addict phptek's Avatar
    Join Date
    Jun 2002
    Location
    Wellington, NZ
    Posts
    363
    wwb:

    Thanks for the pointers - but these are starting to sound more like sly hacks than legitimate workarounds.

    However - points noted, so thank you.

  12. #12
    SitePoint Author wwb_99's Avatar
    Join Date
    May 2003
    Location
    Washington, DC
    Posts
    10,635
    They are essentially hacks and workarounds. Then again, there is very little to differentiate legitimate bulk email (like our newsletters) from illegitimate bulk email (aka SPAM). So one is forced to take on spam-like tactics to make sure one's audience receives the message.

    WWB

  13. #13
    SitePoint Wizard samsm's Avatar
    Join Date
    Nov 2001
    Location
    Atlanta, GA, USA
    Posts
    5,011
    I think the fear is that, as spam catching technology becomes more advanced, these methods of avoiding spam catchers will actually trigger the spam catchers.

    For example, the MD5 hashes. How long before spam catchers start looking for the hash? I suppose you could cross those roads when you arrive at them, but what if you get blacklisted before you notice? I don't know if that would happen or not, there are so many variables flying around... spam is handled in 100s of different ways.

    It's a rough position for everyone, really... tricky stuff.
    Using your unpaid time to add free content to SitePoint Pty Ltd's portfolio?

  14. #14
    SitePoint Enthusiast
    Join Date
    Jun 2003
    Location
    El Toro, CA (USA)
    Posts
    32
    Quote Originally Posted by wwb_99
    5) If possible, add in a couple random MD5 strings hidden in the message somewhere. Helps make similar emails seem different.
    On a note about the MD5, you could just add an MD5 hash to the end of one of your URLs (e.g. http://www.blah.com/?id=[hash here]). I don't know if this sets any flags with SpamAssassin, but it'll make your email cleaner (no random text anywhere).
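
    Something along these lines, as a rough Python sketch. The tracking_url name, the campaign label and the choice to hash "campaign:recipient" are all my own assumptions for illustration; an MD5 of an email address is not a security measure, just a per-recipient string.

```python
import hashlib


def tracking_url(base_url, recipient, campaign):
    """Append an MD5 hex digest derived from campaign and recipient as id=..."""
    digest = hashlib.md5(f"{campaign}:{recipient}".encode("utf-8")).hexdigest()
    # Use "&" if the URL already carries a query string, otherwise "?".
    separator = "&" if "?" in base_url else "?"
    return f"{base_url}{separator}id={digest}"


if __name__ == "__main__":
    for person in ["bob@example.org", "sandy@example.org"]:
        print(tracking_url("http://www.blah.com/", person, "june-newsletter"))
```

    The same recipient and campaign always give the same id, and different recipients give different ids, so every email differs slightly without any visible filler text.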

  15. #15
    SitePoint Author wwb_99's Avatar
    Join Date
    May 2003
    Location
    Washington, DC
    Posts
    10,635
    Quote Originally Posted by samsm
    I think the fear is that, as spam catching technology becomes more advanced, these methods of avoiding spam catchers will actually trigger the spam catchers.
    That is quite a realistic fear. It is essentially an arms race. We just installed spam catching software at work--now I don't get some of my newsletters.

    The biggest fear, as you mention, is blacklisting. We are actually going to separate domains for our newsletter addresses to preclude that possibility.

    I really think the proper way around such problems as this--opt-in type lists that get hammered by SpamAssassin--is whitelisting. But that is a lot of infrastructure for a few webzines and e-newsletters.

    WWB

  16. #16
    SitePoint Author wwb_99's Avatar
    Join Date
    May 2003
    Location
    Washington, DC
    Posts
    10,635
    Quote Originally Posted by obscura
    On a note about the MD5, you could just add an MD5 hash to the end of one of your URLs (e.g. http://www.blah.com/?id=[hash here]). I don't know if this sets any flags with SpamAssassin, but it'll make your email cleaner (no random text anywhere).
    Not a bad idea. The MD5s can be hidden in HTML emails without too much trouble as well. In any case, you are correct in pointing out that it should never be directly visible to the recipient.

    WWB

  17. #17
    SitePoint Wizard samsm's Avatar
    Join Date
    Nov 2001
    Location
    Atlanta, GA, USA
    Posts
    5,011
    Quote Originally Posted by obscura
    On a note about the MD5, you could just add an MD5 hash to the end of one of your URLs (e.g. http://www.blah.com/?id=[hash here]). I don't know if this sets any flags with SpamAssassin, but it'll make your email cleaner (no random text anywhere).
    A better idea than random text, and it could actually be useful for tracking who clicks.

    On the other hand, some spam catcher might think it looks like an affiliate link.

