Spamassassin doesnt like JavaRanch private messages

Andrew Monkhouse
author and jackaroo
Marshal Commander

Joined: Mar 28, 2003
Posts: 11525
    
100

Hi everyone
Just installed spamassassin on my Linux server (it's an application which attempts to determine if an incoming email is spam based on a large number of rules). In testing it, I sent a large number of saved messages from different sources to myself.
One of the two JavaRanch private messages got tagged as probable spam. I can easily fix this in my rules (add a rule that everything from JavaRanch is desirable (of course)). However, this only fixes it for me, not for the world at large.
My question - is there anything that can be done to outgoing messages to stop the identification as possible spam?
For those curious, this is what spamassassin reported:

Regards, Andrew


The Sun Certified Java Developer Exam with J2SE 5: paper version from Amazon, PDF from Apress, Online reference: Books 24x7 Personal blog
paul wheaton
Trailboss

Joined: Dec 14, 1998
Posts: 20729
    ∞

I think that anything we might be able to do would be the same sort of thing that a spammer might be able to do. Can you find the rule that's tripping the wire for JavaRanch stuff?


permaculture Wood Burning Stoves 2.0 - 4-DVD set
Jim Yingst
Wanderer
Sheriff

Joined: Jan 30, 2000
Posts: 18671
OK, first off, some good news: what was blocked here was not the private message itself; rather it was the private message's e-mail notification. This means the recipients can still view the private message on UBB (the saloon software) by clicking on "My Account". The e-mail notification isn't necessarily required. However, most people don't routinely check their PMs that way; they depend on e-mail notification to know that there's a message at all. So while in this case the message has not been lost, it may never be read.
Can you find the rule that's tripping the wire for JavaRanch stuff?
Isn't that what the listing is? Looks like it's an accumulation of little things, mostly. The system assigns points for various things, and if a mail gets five or more points, it's assumed to be spam. The JavaRanch mail scored 5.5. Of that, the biggest two contributors are:
SPAM: NO_REAL_NAME (1.3 points) From: does not include a real name
SPAM: SPAM_PHRASE_05_08 (1.6 points) BODY: Spam phrases score is 05 to 08 (medium) [score: 6]
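The point system described above can be sketched in a few lines of Java. This is only an illustration of "add up the rule scores and compare with the threshold", not SpamAssassin's actual code. The class and method names are made up, and `OTHER_RULES_LUMPED` is a placeholder for everything else in the report: its value is simply 5.5 minus the two scores quoted above.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Illustrative sketch only: sums per-rule points and compares with a threshold.
final class SpamScoreSketch {
    // The thread states that five or more points means "probable spam".
    static final double THRESHOLD = 5.0;

    // Adds up the points of every rule that fired.
    static double total(Map<String, Double> hits) {
        double sum = 0.0;
        for (double points : hits.values()) {
            sum += points;
        }
        return sum;
    }

    static boolean isProbableSpam(Map<String, Double> hits) {
        return total(hits) >= THRESHOLD;
    }

    // Scores quoted in the thread, plus one lumped placeholder entry.
    static Map<String, Double> reportedHits() {
        Map<String, Double> hits = new LinkedHashMap<>();
        hits.put("NO_REAL_NAME", 1.3);
        hits.put("SPAM_PHRASE_05_08", 1.6);
        // Assumption: stands in for the remaining rules (5.5 - 1.3 - 1.6).
        hits.put("OTHER_RULES_LUMPED", 2.6);
        return hits;
    }

    public static void main(String[] args) {
        Map<String, Double> hits = reportedHits();
        System.out.println("total = " + total(hits));
        System.out.println("probable spam = " + isProbableSpam(hits));
    }
}
```

With these numbers, dropping either of the two big contributors would bring the total below five, which is why they are the obvious targets.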
The from field currently appears as moosesaloon@javaranch.com. We could substitute another address, but it would have to be the same address for everyone. (The system isn't set up to reveal a PM user's address - that's one of the points of using PM.) We might be able to insert a real name in addition to the moosesaloon@javaranch.com address - but whose? I don't really want mine there, even though I'm the one who receives moosesaloon@javaranch.com - I had nothing to do with the PM after all. Moosesaloon is just there as a contact address in case someone has technical difficulties. We could make up a name for inclusion here maybe, but that's dishonest, isn't it? Hurm...
As for the "SPAM_PHRASE" part - unfortunately I can't find documentation online on this part. Andrew, does your installation have any doc for this?
Here's the boilerplate included with a PM notification. Before message:
Private Message Notification
[sender name] just sent you ([recipient name]) a Private Message at JavaRanch Big Moose Saloon.
Here is the message sent by [sender name]:

And after the message:

To view this private message, click here [link].
You are being notified because you have instructed us to send you a notification each time someone sends you a private message. You can disable this automatic notification in your profile settings on the board.
Please do not reply to this email. The person sending you the private email did not send this notification. To reply to the private message, you must do so from the message board.
Contact Us | JavaRanch

I'll guess that phrases like "you are being notified" and "please do not reply to..." might be considered suspicious - it does sound a bit like a form letter. (That's what it is, after all.) Not sure how or whether to improve it though.
These are interesting:
SPAM: HTML_FONT_COLOR_CYAN (0.4 points) BODY: HTML font color is cyan
SPAM: HTML_FONT_COLOR_UNSAFE (0.3 points) BODY: HTML font color not within
safe 6x6x6 palette
We're being penalized for using the color cyan? :roll: What's this about? Though this may be the easiest thing to change if we figure out a nice "safe" color. Apparently we can shave off 0.7 points, which would take us under the threshold of Andrew's original settings.
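For reference, the "safe 6x6x6 palette" is usually understood as the 216-colour web-safe palette, where each of the red, green and blue channels is one of 00, 33, 66, 99, CC or FF. The sketch below checks a colour against that definition. It is an assumption that SpamAssassin's rule uses exactly this test, and the class name is made up for illustration.

```java
// Illustrative sketch: tests membership in the 216-colour "web-safe" palette.
final class WebSafeColorSketch {
    // True if every channel of 0xRRGGBB is a multiple of 0x33 (00, 33, ..., FF).
    static boolean isWebSafe(int rgb) {
        for (int shift = 16; shift >= 0; shift -= 8) {
            int channel = (rgb >> shift) & 0xFF;
            if (channel % 0x33 != 0) {
                return false;
            }
        }
        return true;
    }

    // Accepts "#RRGGBB" or "RRGGBB".
    static boolean isWebSafe(String hex) {
        String digits = hex.startsWith("#") ? hex.substring(1) : hex;
        if (digits.length() != 6) {
            throw new IllegalArgumentException("expected six hex digits: " + hex);
        }
        return isWebSafe(Integer.parseInt(digits, 16));
    }

    public static void main(String[] args) {
        System.out.println("#00FFFF web-safe? " + isWebSafe("#00FFFF"));
        System.out.println("#0000A0 web-safe? " + isWebSafe("#0000A0"));
    }
}
```

Note that pure cyan (#00FFFF) is itself inside this palette, so the "unsafe" penalty was presumably triggered by some other colour in the message; the thread does not show which one. On the arithmetic: 5.5 minus the two colour penalties (0.4 + 0.3) gives 4.8, just under the threshold of five.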
SpamAssassin's list of tests is interesting and sometimes amusing. Mentions of Nigeria, Viagra, and university diplomas are all bad. (Duh.) So are phrases like "up to X or more", "for only pennies a day", "take action now". But my favorites are the phrases like "cannot be considered spam", "This is not spam", "We strongly oppose the use of spam email" - which, shockingly :roll: seem to indicate the e-mail is spam.


"I'm not back." - Bill Harding, Twister
Thomas Paul
mister krabs
Ranch Hand

Joined: May 05, 2000
Posts: 13974
How does it know that moosesaloon@javaranch.com is not a real name?


Associate Instructor - Hofstra University
Amazon Top 750 reviewer - Blog - Unresolved References - Book Review Blog
Jim Yingst
Wanderer
Sheriff

Joined: Jan 30, 2000
Posts: 18671
Dunno. Maybe it's looking for something in quotes, like
"Joe Foo" <jf1234@bar.com>
We could try making our address appear as
"Big Moose Saloon" <moosesaloon@javaranch.com>
[ July 03, 2003: Message edited by: Jim Yingst ]
paul wheaton
Trailboss

Joined: Dec 14, 1998
Posts: 20729
    ∞

Can we make the sent e-mail use plain text? It seems if you make a complete URL, almost all mail programs convert that into a link.
Jim Yingst
Wanderer
Sheriff

Joined: Jan 30, 2000
Posts: 18671
Can we make the sent e-mail use plain text?
Yup. I have done so and left it that way for now - can change it back any time. Let's see what people think. Andrew, if you can set your spamassassin back to its original settings without too much trouble, we can PM you to see if the new format works better.
Here is the text of the notification I just received after sending a PM to myself:
Private Message Notification
Subject: test
Jim Yingst just sent you (Jim Yingst) a Private Message at JavaRanch
Big Moose Saloon.
Here is the message sent by Jim Yingst:
--------------------------------------------------
test
--------------------------------------------------
To view this private message, visit:
http://www.javaranch.com
You are being notified because you have instructed us to send you a
notification each time someone sends you a private message. You can
disable this automatic notification in your profile settings on the board.
Message Board: http://www.javaranch.com
Please do not reply to this email. The person sending you the private
email did not send this notification. To reply to the private message,
you must do so from the message board.

Looks good enough to me. Do we lose anything important by not using HTML here?
I also tried modifying the controls to use a support address of
"Moose Saloon" <moosesaloon@javaranch.com>
but unfortunately the system takes this and tries to stick it into mailto: links in various places where it doesn't work. So no go there, unless we hack some Perl scripts.
[ July 04, 2003: Message edited by: Jim Yingst ]
Andrew Monkhouse
author and jackaroo
Marshal Commander

Joined: Mar 28, 2003
Posts: 11525
    
100

Wow - I kind of expected to get a "talk to the makers of the BBS" type response. Then I log on this morning and find a possible solution has already been implemented. Very impressive.
OK - I have put my rules back to default installation, so if someone can send me a PM I will let you know the results.
I cannot send myself a PM - is that something that you can do Jim because you are a bartender, or do you have multiple accounts?
Regards, Andrew
Andrew Monkhouse
author and jackaroo
Marshal Commander

Joined: Mar 28, 2003
Posts: 11525
    
100

I will try and work through the code today (don't hold your breath though), and see if I can work out what the phrases were that it was complaining about, and what colours are acceptable.
Regards, Andrew
Jim Yingst
Wanderer
Sheriff

Joined: Jan 30, 2000
Posts: 18671
Not urgent, but good to know if you get a chance. Thanks.
Jim Yingst
Wanderer
Sheriff

Joined: Jan 30, 2000
Posts: 18671
Wow - I kind of expected to get a "talk to the makers of the BBS" type response. Then I log on this morning and find a possible solution has already been implemented. Very impressive.
Well, you kind of lucked out and brought up something that we could control, and hadn't already been discussed much. (There were one or two comments about text vs. HTML in the past, but the possible connection to spam filtering hadn't been mentioned that I recall.) Plus we're in an unusually proactive mood right now.
OK - I have put my rules back to default installation, so if someone can send me a PM I will let you know the results.
Done.
I cannot send myself a PM - is that something that you can do Jim because you are a bartender, or do you have multiple accounts?
I have multiple accounts, mostly so that I can test what things look like for non-bartenders. Well, and a couple others for special occasions.
[ July 04, 2003: Message edited by: Jim Yingst ]
Andrew Monkhouse
author and jackaroo
Marshal Commander

Joined: Mar 28, 2003
Posts: 11525
    
100

OK, that got through without being flagged as spam.
Looking in the headers, I can see that it is still triggering on a couple of points, probably not worth worrying about from an administrative point of view though:

That was from the headers, so the descriptions are far more terse. The normal user would not get to see them. The descriptions are:

The non real name is being triggered by the following perl regular expression:

I don't know what differences exist in perl regular expressions compared to other regular expressions. Anyone care to have a stab at decoding this? I think it is looking for a quoted name (as suggested by Jim).
The Message-Id is:

Is Verge Technologies Group the ISP for JavaRanch?
Regards, Andrew
Andrew Monkhouse
author and jackaroo
Marshal Commander

Joined: Mar 28, 2003
Posts: 11525
    
100

OK, here are the phrases it is identifying as being common in spam:

The second column is a score - not sure how it works, I assume they get added together or something? Different phrases get different scores, with the highest score going to "temple kiff" (narrowly beating "million mails").
Where we used those phrases:

Please do not reply to this email. The person sending you the private email did not send this notification. To reply to the private message, you must do so from the message board.

I guess the wording could be changed to use words that are not in the phrase list (e.g. change "you must" to "you should"). Doing so would remove the 0.758 points we always get for the disclaimer. But it may be a wasted effort - as Paul mentioned, anything we can do, a spammer could also do. So anything we change may get caught on the next round of updates.
Anyway, if people would like me to try and work out alternate wordings that don't get caught, let me know. One possibility could be:

Please do not reply via email. The person sending you the private message did not send this informational notification. To reply to the private message, you should do so from the message board.

(I also changed "private email" to "private message" because I thought it might cause confusion).
Not sure if I changed some of the meanings in doing this
By the way, the old HTML message phrases were:

The "click here" was used in:

To view this private message, click here.

The "for your" was actually part of the private message that was sent to me - there is nothing that can be done about that
The other words are still in the current text format.
Regards, Andrew
Andrew Monkhouse
author and jackaroo
Marshal Commander

Joined: Mar 28, 2003
Posts: 11525
    
100

One of these days I will put all my thoughts down before hitting send, and save myself from sending multiple posts in quick succession :roll:
Anyway, Jim it looks like the template contains the words "click here". If so, then it could be changed to "click this link" and it would not get identified as a spammer's phrase.
The "SUPERLONG_LINE" is only worth 0.009 points. I guess that is to be expected, as many normal emails would be expected to have complete paragraphs. The superlong line in our case is the "You are being notified..." paragraph.
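If anyone wanted to avoid the long-line penalty, the paragraph could be wrapped before sending. Here is a minimal word-wrap sketch. The width of 72 columns used in the example is my own assumption, since the thread does not say what length SUPERLONG_LINE actually triggers on, and the class name is hypothetical.

```java
import java.util.ArrayList;
import java.util.List;

// Illustrative sketch: greedy word wrap so that no line exceeds maxWidth.
final class LineWrapSketch {
    static List<String> wrap(String paragraph, int maxWidth) {
        List<String> lines = new ArrayList<>();
        StringBuilder current = new StringBuilder();
        for (String word : paragraph.trim().split("\\s+")) {
            // Start a new line if adding this word (plus a space) would overflow.
            if (current.length() > 0 && current.length() + 1 + word.length() > maxWidth) {
                lines.add(current.toString());
                current.setLength(0);
            }
            if (current.length() > 0) {
                current.append(' ');
            }
            // A single word longer than maxWidth is kept whole on its own line.
            current.append(word);
        }
        if (current.length() > 0) {
            lines.add(current.toString());
        }
        return lines;
    }

    public static void main(String[] args) {
        String text = "You are being notified because you have instructed us to send you a "
                + "notification each time someone sends you a private message.";
        for (String line : wrap(text, 72)) {
            System.out.println(line);
        }
    }
}
```

At 0.009 points this is hardly worth the effort for the spam score alone, but wrapped text also reads better in plain-text mail clients.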
On the question of colours, all colours in an email get a score apparently. It is just that some of them get higher scores. The scores for colours are:

Regards, Andrew
[ July 03, 2003: Message edited by: Andrew Monkhouse ]
Andrew Monkhouse
author and jackaroo
Marshal Commander

Joined: Mar 28, 2003
Posts: 11525
    
100


I don't know what differences exist in perl regular expressions compared to other regular expressions. Anyone care to have a stab at decoding this? I think it is looking for a quoted name (as suggested by Jim).

Ok, checked up on Perl regular expressions, and they seem to be the same as what we use in Java.
My interpretation is:

This will match on MooseSaloon@javaranch.com (hence our trigger) but won't match on "MooseSaloon@javaranch.com" <MooseSaloon@javaranch.com>.
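Since Perl and Java regular expressions are so similar, the behaviour described above can be tried out in Java. The pattern below is a hypothetical stand-in written for this post, NOT the actual SpamAssassin rule. It only reproduces the described behaviour: a From value consisting of nothing but a bare address matches, while one with a quoted name in front does not.

```java
import java.util.regex.Pattern;

// Illustrative sketch: flags From values that are just a bare e-mail address.
final class RealNameSketch {
    // Hypothetical pattern (an assumption, not SpamAssassin's regex):
    // optional "<", an address with no spaces, quotes or angle brackets, optional ">".
    private static final Pattern BARE_ADDRESS =
            Pattern.compile("^\\s*<?[^\\s<>\"]+@[^\\s<>\"]+>?\\s*$");

    static boolean lacksRealName(String fromValue) {
        return BARE_ADDRESS.matcher(fromValue).matches();
    }

    public static void main(String[] args) {
        String[] samples = {
            "MooseSaloon@javaranch.com",
            "\"MooseSaloon@javaranch.com\" <MooseSaloon@javaranch.com>",
            "\"Big Moose Saloon\" <moosesaloon@javaranch.com>"
        };
        for (String from : samples) {
            System.out.println(from + " -> lacks real name: " + lacksRealName(from));
        }
    }
}
```

Under this stand-in, any quoted text in front of the address is enough to avoid the match, whether or not it looks like a person's name.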
Regards, Andrew
[ July 04, 2003: Message edited by: Andrew Monkhouse ]
Jim Yingst
Wanderer
Sheriff

Joined: Jan 30, 2000
Posts: 18671
Based on your feedback Andrew, I have revised the outgoing message, replacing "You are being notified..." with the following:
Look, you asked to get e-mail whenever someone sent you a private message, and now it's happened. Deal with it. Don't go hitting the "reply" key - that just sends mail to me (not the one who sent the private message) and I really don't know anything about your message. Or care. So just follow the link, and don't bother me any more.

Well, I didn't actually, but I was tempted. Instead it's now:
Please do not reply to this notification, since that goes to the wrong address. The person who sent you the private message did not send this notification. To reply to the private message, follow the link to the message board and reply there, not by e-mail.

I also sent you another test, Andrew, if you're interested.
Can't do anything about the "not a real name" it appears. (Not without a lot more hacking than I'm inclined to for this purpose.) But hopefully the other changes will help.
Of course, there are probably a lot of different spam filters out there, and if we defeat one, so can spammers. Which will mean spam filters will have to come up with yet more clever filtering tricks... yadda, yadda. Not sure I want to play that game in the long run. But it's interesting to have an idea how these things work. Thanks for your time & effort, Andrew.
Andrew Monkhouse
author and jackaroo
Marshal Commander

Joined: Mar 28, 2003
Posts: 11525
    
100

Awww, I prefer your first version. Can we please have it? An auto message with attitude!
Getting closer, now it is only reporting one spam phrase:

The one spam phrase it finds is the "this not" from "this notification".
Regards, Andrew
Andrew Monkhouse
author and jackaroo
Marshal Commander

Joined: Mar 28, 2003
Posts: 11525
    
100

I do agree that any changes we make may be blown out of the water in a future upgrade of the spam checker (or even fail on a different spam checker). And I think there is a point where the return we get on investment of time diminishes to nothing.
I am very happy with what we now have.
But: I still prefer your alternate disclaimer. And it doesn't have any "spam phrases".
Regards, Andrew
Jim Yingst
Wanderer
Sheriff

Joined: Jan 30, 2000
Posts: 18671
Cool. I subsequently changed several "notification" instances to "notice" since that flowed better IMO, and we get the same penalty for either word. Browsing through some of the messages UBB sends, it seems a lot of them could use rewriting to remove some of the starch. Instead of "here's your new password" UBB would probably say "pursuant to your request regarding the assignment of a previously unassigned password for your personal usage, herein find a newly generated password which you may now use." :roll: I'm only partly joking, unfortunately. However many of these messages are for features we have turned off, like age verification, e-mail confirmation, registration moderation. So you don't see many of the messages. Which is probably a good thing.
Glad you liked my alternate disclaimer.
And it doesn't have any "spam phrases"
"I'm sorry, sir, but the only way we could get the message through the spam filters is by phrasing it as 'piss off, you!'. No personal insult was intended, really."
paul wheaton
Trailboss

Joined: Dec 14, 1998
Posts: 20729
    ∞

You may have danced around one spam filter only to trigger others more than you did before.
How about if we cowboy-ify the text a bit? That should slide by spam filters much better.
Jim Yingst
Wanderer
Sheriff

Joined: Jan 30, 2000
Posts: 18671
You may have danced around one spam filter only to trigger others more than you did before.
Possible. But the changes so far make sense to me as making the thing sound less like spam, and a little more readable, IMO. We'd probably get diminishing returns the more we worry about the feedback of one particular filter, but my gut feeling is we're not too near that point yet.
How about if we cowboy-ify the text a bit? That should slide by spam filters much better.
Probably true. Might also confuse some of our international readers. I'm less of a fan of the heavy cowboy accent than you are - and also I'm not as good at it. So I'm drawing a blank as to how to western it up effectively in this case. But if you or someone else wants to offer an alternate text, I can put it up for you.
paul wheaton
Trailboss

Joined: Dec 14, 1998
Posts: 20729
    ∞

Perhaps we could e-mail the text to the cowgirl along with your concerns. She's quite the master at this sort of thing.
Jim Yingst
Wanderer
Sheriff

Joined: Jan 30, 2000
Posts: 18671
I sent her a PM referencing this thread.
Andrew Monkhouse
author and jackaroo
Marshal Commander

Joined: Mar 28, 2003
Posts: 11525
    
100

FYI, just received a Book Promotion email from JavaRanch. Surprisingly, it scored even lower on the spam scale than PM notifications:

Spelling those out (Descriptions of problems found, followed by scores for these problems, followed by matching phrases):

The free trial is from: "JavaRanch is hosted by the Premium Java Web Hosting Company Ejip.Net. Free Trial!"
The spam phrases for the 2..3 matches were:
  • "this email" (You are receiving this email)
  • "with your" (with your username and password)
  • "with you" (with your username and password)
    We get charged for the same phrase twice?
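One possible explanation for the double charge, and for "this not" matching inside "this notification" earlier in the thread, is that phrases are matched as plain substrings rather than as whole words. That is only a guess; the thread does not confirm how SpamAssassin matches phrases. The sketch below shows the effect under that assumption, with a made-up class name and a phrase list limited to phrases quoted in this thread.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.Locale;

// Illustrative sketch: naive, case-insensitive substring matching of phrases.
final class PhraseMatchSketch {
    // Returns every phrase that occurs anywhere in the text, even inside a longer word.
    static List<String> matches(String text, List<String> phrases) {
        String haystack = text.toLowerCase(Locale.ROOT);
        List<String> found = new ArrayList<>();
        for (String phrase : phrases) {
            if (haystack.contains(phrase.toLowerCase(Locale.ROOT))) {
                found.add(phrase);
            }
        }
        return found;
    }

    public static void main(String[] args) {
        List<String> phrases = Arrays.asList("this email", "with your", "with you", "this not");
        System.out.println(matches("with your username and password", phrases));
        System.out.println(matches("did not send this notification", phrases));
    }
}
```

Under substring matching, "with your" always contains "with you", so any text that hits the longer phrase is counted for the shorter one as well.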
    Regards, Andrew
    [ July 08, 2003: Message edited by: Andrew Monkhouse ]
    [ July 08, 2003: Message edited by: Andrew Monkhouse ]
    Andrew Monkhouse
    author and jackaroo
    Marshal Commander

    Joined: Mar 28, 2003
    Posts: 11525
        
    100

    By the way, Spamassassin was quite happy with the book promotions email address:

    Regards, Andrew
    [ July 08, 2003: Message edited by: Andrew Monkhouse ]
    Jim Yingst
    Wanderer
    Sheriff

    Joined: Jan 30, 2000
    Posts: 18671
    Hunh - I had tried

    and it didn't work - but didn't think to try

    I've now implemented the latter, and so far it looks good. Let me know if anyone sees any ill effects from this.
    Thomas Paul
    mister krabs
    Ranch Hand

    Joined: May 05, 2000
    Posts: 13974
    Gives an excuse about why you were sent this spam
    Well, that sucks. If we put it in then the software thinks we are spam. If we leave it out then people forget that they registered and accuse us of spamming them. :roll:
    Jim Yingst
    Wanderer
    Sheriff

    Joined: Jan 30, 2000
    Posts: 18671
    This goes back to phrasing. Perhaps something like: "look, you asked for this, so don't go whining about it now!" :roll:
    Andrew Monkhouse
    author and jackaroo
    Marshal Commander

    Joined: Mar 28, 2003
    Posts: 11525
        
    100

    Hi Jim
    Your last PM looked good. We are down to 1.5 points now:

    Regards, Andrew
    Jim Yingst
    Wanderer
    Sheriff

    Joined: Jan 30, 2000
    Posts: 18671
    Cool. Thanks for all the help, Andrew.
    Anupam Sinha
    Ranch Hand

    Joined: Apr 13, 2003
    Posts: 1088
    I wonder whether these measures will also help with Yahoo Mail, since Yahoo Mail also treats the PM notification as spam. Great job, Jim and Andrew.
    [ July 09, 2003: Message edited by: Anupam Sinha ]
    Anupam Sinha
    Ranch Hand

    Joined: Apr 13, 2003
    Posts: 1088
    I created a new account at JavaRanch and sent a PM to myself. It is no longer considered spam.
    Now, how do I delete my other account?
    [ July 09, 2003: Message edited by: Anupam Sinha ]
    Marilyn de Queiroz
    Sheriff

    Joined: Jul 22, 2000
    Posts: 9053
        
      12
    A sheriff can delete it for you if you tell him which one to delete.


    JavaBeginnersFaq
    "Yesterday is history, tomorrow is a mystery, and today is a gift; that's why they call it the present." Eleanor Roosevelt
    a bcd
    Greenhorn

    Joined: Jul 09, 2003
    Posts: 1
    This is my other id please delete it
    Anupam Sinha
    Member # 48296
     