This how-to configures a public folder for spam so that Zarafa users can "train" the SpamAssassin spam filter to be more accurate. The folder is accessible via the Webmail or Outlook. I assume you already have Zarafa installed and running as per the forum thread. You will also need the Zarafa-Gateway service running.
Big thanks to the SME thread here; the steps below are derived from it:
http://wiki.contribs.org/Zarafa_Bayesian_Learning
[ol]
Configure ClearOS LDAP to use Zarafa LDAP schema
Modify webconfig php so that it recognises the Zarafa LDAP parameters
Edit /var/webconfig/api/User.class.php, add the "zarafa-user" to the core class list. Around line 210:
Create the Zarafa Public store (if you haven't already)
Create a user account via the webconfig, with mail permissions only, specifically for collecting the mail. I've used the 'SpamAdmin' account below.
Use LdapAdmin or a similar tool to give a user Zarafa admin rights and add zarafa-user to its object classes. See screenshots below.
a. First set your LDAP publish policy to enabled within the webconfig
b. Use LdapAdmin to log in using your LDAP credentials: Bind DN, Base, and Bind Password
c. Right click on your user, and add the Zarafa-user object class to the drop down list on the left hand side
d. Then change the Zarafa-Admin attribute to 1
Create ham/spam folders and assign permissions
Log in to Zarafa with an account that has admin rights as assigned above and make two new folders, LearnAsSpam and LearnAsHam, under: Public folder > Public folders. Set the permissions (right-click folder > Properties > Permissions tab) on both these new folders to:
Spam administration account
* Folder visible
* Read items
* Edit items: all
* Delete items: all
Everyone (and/or other users/groups you've added) at least need
* Folder visible
* Create items
* Edit items: none
* Delete items: none
Create a script that logs in via IMAP, checks the folders periodically, and pulls mail into the ClearOS spam training system. Note that you will need the Zarafa Gateway service running for IMAP access. You can run it on a separate port if you are still using the ClearOS IMAP server. I've assumed that Zarafa Gateway is listening on port 143.
Create /usr/bin/Zarafa-sa-learn.pl
Make sure it's executable
Create cron script to regularly pull mail from the Public folder into the Spam training system
Create /etc/cron.d/zarafa-spamassassin
Test! You should now be able to log in via webmail or Outlook and drag (move) mail that isn't being correctly recognised as SPAM into the LearnAsSpam folder, and ClearOS will 'auto-learn' it. The same applies for HAM (mail incorrectly marked as SPAM), using the LearnAsHam folder. Note that mail is deleted after it is moved into the Public Folders, so keep a copy if you need it.
[/ol]
Enjoy!
[ol]
cp /usr/share/zarafa/zarafa.schema /etc/openldap/schema/
vi /etc/openldap/slapd.conf
## add the following line after the top section
include /etc/openldap/schema/zarafa.schema
## restart LDAP
service ldap restart
Edit /var/webconfig/api/User.class.php, add the "zarafa-user" to the core class list. Around line 210:
$this->coreclasses = array(
'top',
'posixAccount',
'shadowAccount',
'inetOrgPerson',
'kolabInetOrgPerson',
'hordePerson',
'pcnAccount',
'zarafa-user',
);
zarafa-admin -s
Create /usr/bin/Zarafa-sa-learn.pl
#!/usr/bin/perl
#
# Extract mail from the IMAP shared folders 'Public folders/LearnAsSpam' & 'Public folders/LearnAsHam'
# Orig by dmz@dmzs.com - March 19, 2004
# http://www.dmzs.com/tools/files/spam.phtml
# Modified for compatibility with ClearOS spam training Jan 14, 2011, spam training handled by external script
# LGPL
use strict;
use warnings;
use Mail::IMAPClient;
my $debug=0;
my $filterdir="/var/spool/filter/training";
my $spamwebclient="$filterdir/spam-web";
my $notspamwebclient="$filterdir/notspam-web";
# The export below fails if the training directory is missing, so check it first
if (! -d $filterdir) { die "Training directory $filterdir does not exist\n"; }
# # # # # # # # # # EDIT USER AND PASSWORD (CHECK PORT) # # # # # # # # # #
my $imap = Mail::IMAPClient->new( Server=> '127.0.0.1:143',
User => 'spamadmin',
Password => 'yourpasswordhere',
Debug => $debug);
if (!defined($imap)) { die "IMAP Login Failed"; }
# If debugging, print out the total counts for each mailbox
if ($debug) {
my $spamcount = $imap->message_count('Public folders/LearnAsSpam');
print $spamcount, " Spam to process\n";
my $nonspamcount = $imap->message_count('Public folders/LearnAsHam');
print $nonspamcount, " Notspam to process\n";
}
# Process the spam mailbox
$imap->select('Public Folders/LearnAsSpam');
my @msgs = $imap->search("ALL");
for (my $i=0;$i <= $#msgs; $i++)
{
# export message to file
$imap->message_to_file($spamwebclient.".zarafa.".time().".".$i,$msgs[$i]);
# delete processed message
$imap->delete_message($msgs[$i]);
# (re)create the empty placeholder file
open(my $fh, '>', $spamwebclient) or die "Can't create placeholder $!";
close($fh);
}
$imap->expunge();
$imap->close();
# Process the not-spam mailbox
$imap->select('Public Folders/LearnAsHam');
@msgs = $imap->search("ALL");
for (my $i=0;$i <= $#msgs; $i++)
{
# export message to file
$imap->message_to_file($notspamwebclient.".zarafa.".time().".".$i,$msgs[$i]);
# delete processed message
$imap->delete_message($msgs[$i]);
# (re)create the empty placeholder file
open(my $fh, '>', $notspamwebclient) or die "Can't create placeholder $!";
close($fh);
}
$imap->expunge();
$imap->close();
$imap->logout();
Make sure it's executable
chmod a+x /usr/bin/Zarafa-sa-learn.pl
Create /etc/cron.d/zarafa-spamassassin
# Extract mail from spam folders for training every 10 minutes between 08:00 and 22:50 on weekdays
*/10 8-22 * * 1-5 root /usr/bin/Zarafa-sa-learn.pl
[/ol]
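Before relying on the cron job, it can help to check that the pieces from the steps above are actually in place. What follows is only a rough sketch: the function name `check_spam_training` is made up for illustration and is not part of ClearOS or Zarafa, and the default paths are simply the ones used in this how-to.

```shell
#!/bin/sh
# Hypothetical helper (not part of ClearOS or Zarafa): reports which of the
# pieces from the how-to are missing. Paths default to those used above.
check_spam_training() {
    script="${1:-/usr/bin/Zarafa-sa-learn.pl}"
    traindir="${2:-/var/spool/filter/training}"
    cronfile="${3:-/etc/cron.d/zarafa-spamassassin}"
    status=0

    # The export script must exist and be executable (chmod a+x)
    if [ ! -x "$script" ]; then
        echo "not executable: $script"
        status=1
    fi
    # The script writes exported mail here; the export fails if it is missing
    if [ ! -d "$traindir" ]; then
        echo "missing directory: $traindir"
        status=1
    fi
    # cron only runs the job if the cron.d entry is present
    if [ ! -f "$cronfile" ]; then
        echo "missing cron file: $cronfile"
        status=1
    fi
    return "$status"
}
```

Calling `check_spam_training` with no arguments checks the default paths. It prints nothing and returns 0 when everything is present.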
Responses (40)
Accepted Answer
Hi,
First off, thanks Tim for doing a great job on Zarafa. I have been using it for 3 years now and I love it.
I'm thinking about upgrading to ClearOS 7 and using the Home edition.
I have a question about the script. It seems to be working in 6.7, but the mail gets moved to /var/spool/filter/training and is stored there.
Is this normal?
Should I delete the files in the directory from time to time or not?
I hope someone has an answer. -
Accepted Answer
Al Catoe wrote:
[quote]Tim, will this work on a fresh install of Home 7.0? I had this working on previous 6.5 and I would like to continue using it.
Thanks![/quote]
I can confirm this works with ClearOS 7 Home - at least the script works... will see if spam reduces! -
Accepted Answer
Tim, will this work on a fresh install of Home 7.0? I had this working on previous 6.5 and I would like to continue using it.
Thanks! -
Accepted Answer
Tim: I got this to work, and it does a brilliant job! Now almost all spam goes where it should go. Only one thing: I cannot get the cronjob to work. I have to start it manually. Could there be any typo in the cronjob script? I'm not good at finding faults like that..
Accepted Answer
Thanks Tim.
I am trying to figure out why I am getting this error when the cron job runs:
binmode() on closed filehandle $fh at /usr/share/perl5/vendor_perl/Mail/IMAPClient.pm line 926.
print() on closed filehandle $fh at /usr/share/perl5/vendor_perl/Mail/IMAPClient.pm line 1736.
print() on closed filehandle $fh at /usr/share/perl5/vendor_perl/Mail/IMAPClient.pm line 1767.
Can't create placeholder No such file or directory at /usr/bin/Zarafa-sa-learn.pl line 45.
Any advice?
Thanks,
Al -
Accepted Answer
How do you modify webconfig php so that it recognises the Zarafa LDAP parameters as instructed by "Edit /var/webconfig/api/User.class.php, add the "zarafa-user" to the core class list. Around line 210:"? I know this path is legacy now, but how can you do this in 6.5 with Zarafa 7.1.7?
Thanks! -
Accepted Answer
I have been waiting for this feature too for Zarafa on ClearOS 6.x, because there are already some emails in the junk folder that should be marked as not junk (not spam).
I see it has been added to the roadmap and confirmed:
" 0000662: [app-zarafa - Zarafa Engine] Add spam training from timb80 (pbaldwin) - confirmed."
Thanks,
Ingkram -
Accepted Answer
I added the following to the bottom of the script in the first post to force sa-learn to run after the mail has been exported. I don't know if it's 100% needed but I'll see if spam gets flagged now or not.
print `sa-learn --no-sync --spam ${spamwebclient}.zarafa.*`;
unlink glob $spamwebclient.".zarafa.*";
print `sa-learn --no-sync --ham ${notspamwebclient}.zarafa.*`;
unlink glob $notspamwebclient.".zarafa.*";
print `sa-learn --sync`;
print `sa-learn --dump magic`;
Bob -
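The same learning step could also be run from a small shell script instead of from inside the Perl script. This is a sketch only, assuming `sa-learn` is on the PATH and that the exported files are named the way Zarafa-sa-learn.pl names them; `learn_exported` and the `SA_LEARN` variable are names made up for this example. Unlike the lines above, it deletes a file only after sa-learn has exited successfully for it.

```shell
#!/bin/sh
# Sketch only: feed the files exported by Zarafa-sa-learn.pl to sa-learn,
# then delete them. SA_LEARN and learn_exported are names made up here.
SA_LEARN="${SA_LEARN:-sa-learn}"

learn_exported() {
    traindir="${1:-/var/spool/filter/training}"

    # Exported spam is named spam-web.zarafa.<time>.<n>
    for f in "$traindir"/spam-web.zarafa.*; do
        [ -f "$f" ] || continue          # glob matched nothing
        "$SA_LEARN" --no-sync --spam "$f" && rm -f "$f"
    done
    # Exported ham is named notspam-web.zarafa.<time>.<n>
    for f in "$traindir"/notspam-web.zarafa.*; do
        [ -f "$f" ] || continue          # glob matched nothing
        "$SA_LEARN" --no-sync --ham "$f" && rm -f "$f"
    done
    # Sync the journal into the Bayes database once at the end
    "$SA_LEARN" --sync
}
```

Note that sa-learn updates the Bayes database of the user it runs as unless the configuration points elsewhere, so whether this is the database the ClearOS filter actually reads is something to verify on your own system.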
Accepted Answer
I got this all working a couple of weeks ago but users aren't noticing any reduction in spam. If I look at mail headers I see that SpamAssassin is running but it doesn't appear that it's "learning". How can I confirm that the mail being dumped out to /var/spool/filter/training is actually being used for training by SpamAssassin?
Bob -
Accepted Answer
I've been trying to get this working on 6.4 but I'm stuck where the IMAP login of the script isn't working. I get the following error if I turn debug on.
/Zarafa-sa-learn.pl
Started at Fri Aug 30 14:18:22 2013
Using Mail::IMAPClient version 3.33 on perl 5.010001
Connecting via IO::Socket::INET to localhost:143:143 Timeout 600
Connected to localhost:143
Read: * OK [CAPABILITY IMAP4rev1 LITERAL+ STARTTLS AUTH=PLAIN] Zarafa IMAP gateway ready
Sending: 1 LOGIN bsleys *********
Sent 24 bytes
Read: 1 NO LOGIN imap feature disabled
I've tried using the spamadmin account and my own personal account for testing. I know both accounts work in Zarafa webmail but don't know why it thinks imap is disabled.
Any hints would be appreciated.
Thanks
Bob -
Accepted Answer
The script works with 6.4!
[quote] I just had to create the folder /var/spool/filter/training...and find the old filter training script. I can't work out why the script you are running is using a file at ./tmp.pl? Are you running the right script?[/quote]
Thanks Team,
that was the only problem: Just make a directory /var/spool/filter/training. Now it works.
And sorry, tmp.pl was just the name I gave your script when I tried it out again, just to get the error message for my post.
Greetings Alex -
Accepted Answer
Hi Alex, thanks for the feedback - I did check this out for 6.3 but never got round to updating the script or the how-to. My modified script is tweaked so that it works with the existing ClearOS filtering scripts (/usr/bin/filtertraining as was in 5.2), whereas the one linked above calls sa-learn directly.
However, the script 'Zarafa-sa-learn.pl' still works fine here in 6.4 but the instructions need updating. I just had to create the folder /var/spool/filter/training...and find the old filter training script. I can't work out why the script you are running is using a file at ./tmp.pl? Are you running the right script? -
Accepted Answer
Hello,
I was trying the script with my ClearOS 6.4 Pro and Zarafa Community installed from the Marketplace. It was not working out of the box:
[root@clearos sbin]# ./spamtraining.pl
binmode() on closed filehandle $fh at /usr/share/perl5/vendor_perl/Mail/IMAPClient.pm line 899.
print() on closed filehandle $fh at /usr/share/perl5/vendor_perl/Mail/IMAPClient.pm line 1722.
Can't create placeholder Datei oder Verzeichnis nicht gefunden at ./tmp.pl line 45.
So I tried the version from here:
https://secure.kitserve.org.uk/content/zarafa-debian-how-part-2-sasl-and-autolearning-spamassassin
And it worked, without any LDAP changes etc. I just created the folders in the Public Store and the user.
Greetings Alex -
Accepted Answer
Hi Augustyr, if you inspect the mail headers of incoming mail you'll find it has been given a spam score (see the X-Spam fields).
I have a feeling that the train.spam mailbox is not compatible with Zarafa, as it's unable to resolve the local recipient 'train.spam'... I'll investigate.
In the webconfig, check what thresholds you have given to the antispam config. Set the spam threshold lower if you want more mail to be marked as [SPAM], and lower the quarantine threshold as well to automatically prevent mail being delivered to the user's inbox. To gauge what scores to use, inspect the mail headers on typical spam as mentioned above.
Most legitimate mail should get a score below zero (negative). I set my spam threshold around +4... -
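The X-Spam fields mentioned above can be pulled out of a saved message with a few lines of shell. This is a minimal sketch, assuming the message has been saved as a plain text file with Unix line endings; `spam_headers` is a name made up for this example.

```shell
#!/bin/sh
# Sketch: print the X-Spam-* headers of one saved message file.
# spam_headers is a name made up for this example.
spam_headers() {
    awk '
        /^$/ { exit }                          # blank line ends the headers
        /^[ \t]/ { if (show) print; next }     # folded continuation line
        { show = (tolower($0) ~ /^x-spam-/) }  # new header: keep only X-Spam-*
        show { print }
    ' "$1"
}
```

Running `spam_headers message.eml` on a few typical spam and ham messages gives a rough idea of which scores to use as thresholds.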
Accepted Answer
I have tried to forward the email to train.spam@mycomp.com
and got the following:
If you do so, please include this problem report. You can delete your own text from the attached returned message.
The mail system
</var/spool/filter/training/spam-mailbox@mycomp.com>: internal software error.
Command output: [26139] Failed to resolve recipient
/var/spool/filter/training/spam-mailbox
There is nothing in:/var/spool/filter/training/
I never get anything marked as spam .... -
Accepted Answer
The filter has to be trained over a length of time; it doesn't automatically block everything that instantly matches what you have copied over.
In time, it builds up a better understanding of what you consider to be spam, and the words / structures that identify it. You'll notice the spam scores will change; apart from that, there is not a lot more you can debug.
http://wiki.apache.org/spamassassin/BayesInSpamAssassin
http://en.wikipedia.org/wiki/Bayesian_spam_filtering -
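One way to watch the training build up over time is to look at the message counts that `sa-learn --dump magic` reports. The following is a sketch under the assumption that the output has the usual layout, with the count in the third column and the line ending in `nspam` or `nham`; `bayes_counts` is a name made up for this example.

```shell
#!/bin/sh
# Sketch: read "sa-learn --dump magic" output on stdin and report how many
# spam and ham messages the Bayes database has learned so far.
# bayes_counts is a name made up for this example.
bayes_counts() {
    awk '
        $NF == "nspam" { nspam = $3 }   # third column holds the count
        $NF == "nham"  { nham  = $3 }
        END { printf "nspam=%d nham=%d\n", nspam, nham }
    '
}
```

Usage would be `sa-learn --dump magic | bayes_counts`. With SpamAssassin's default settings the Bayes rules are not applied until at least 200 spam and 200 ham messages have been learned, so low counts here would explain scores that do not change yet.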
Accepted Answer
Hi Tim!
[Zarafa 7.01] Something goes wrong when I restart LDAP after adding 'zarafa.schema' to slapd.conf:
[root@mail openldap]# service ldap start
Checking configuration files for slapd: [FAILED]
/etc/openldap/schema/zarafa.schema: line 182 objectclass: AttributeType not found: "mail"
slaptest: bad configuration file!
My slapd.conf:
-----------------------------------------------------------------------------
access to dn.regex="(.*,)?cn=internal,dc=mail,dc=post88,dc=net"
by group/kolabGroupOfNames="cn=admin,cn=internal,dc=mail,dc=post88,dc=net" write
by group/kolabGroupOfNames="cn=maintainer,cn=internal,dc=mail,dc=post88,dc=net" write
by self write
by dn="cn=nobody,cn=internal,dc=mail,dc=post88,dc=net" read #The line 182
by anonymous auth stop
-----------------------------------------------------------------------------
Hope you can help, thanks! -
Accepted Answer
It seemed like this one crashed my box... :S I installed it as per the guide, tossed a few hundred mails into LearnAsSpam, sat back and waited... time went by, nothing happened, then it hit me... it's past 22:00. Oh well, I turned in and the next morning I went off to work. At 9 I wanted to check up on the folder status: was it empty or was it still full? Guess what... I could not get in contact with my server! Dang!!! I went home with thoughts of a long evening of error correcting in mind. My server is an old notebook that serves my needs, and it's always turned on (hey, it has a server function so it has to be) and as a result it gets a bit hot. Nothing alarming, just about 37 degrees (C). The day before installing from this guide, I had some other work to do on the notebook, and I forgot to close the screen after I was done. Did I mention I have a cat...? He's a one-year-old half-breed Maine Coon, and he just looooves those hot places.... He lay down on the keyboard, and by doing that also pressed down on the power button... And with a purr from the cat and a hum from the notebook, my server went offline... ;-)
I got the cat off the notebook and the notebook powered on again, and presto... the cron job empties the folder as expected, and it seems like the Bayesian learning is taking place. -