Plesk & Server Architecture Matt | 12 Nov 2008 01:55 pm
Parse Plesk Maillog, Count Emails Per Domain
I wanted to get a quick count of messages being sent to and from all my domains in plesk. I wrote this script to get the mail domains from the plesk database. Then I loop through them and grep a regex against the maillog file. Since I’m actually using zgrep to look decompress yesterday’s maillog file at the same time I get the counts, it’s a little processor intensive. It takes about 10 seconds to run through 30 domains in a 5 meg gzipped file. If I were to gunzip the file first and just grep it, this would go much quicker. But I’m fine with a 10 second run time and the bonus of not having to trash a decompressed maillog file when I’m done.
Here’s the bash script I’ve saved as mailcount.sh:
#!/bin/sh
MYSQLPASS=`cat /etc/psa/.psa.shadow`
for DOMAIN in `mysql -uadmin -p$MYSQLPASS -e "select distinct domains.name from mail inner join domains on mail.dom_id=domains.id" -B -N psa`
do
echo $DOMAIN `zgrep -c -E "(to|from)=.+@$DOMAIN" /usr/local/psa/var/log/maillog.processed.1.gz`
done
Here’s a version that copies and gunzip’s the maillog before doing any processing.
#!/bin/sh
MYSQLPASS=`cat /etc/psa/.psa.shadow`
# copy the log file before decompress
cp /usr/local/psa/var/log/maillog.processed.1.gz /usr/local/psa/var/log/maillog.processed.1-working.gz &&
# decompress the log working file
gunzip /usr/local/psa/var/log/maillog.processed.1-working.gz
for DOMAIN in `mysql -uadmin -p$MYSQLPASS -e "select distinct domains.name from mail inner join domains on mail.dom_id=domains.id" -B -N psa`
do
echo $DOMAIN `grep -c -E "(to|from)=.+@$DOMAIN" /usr/local/psa/var/log/maillog.processed.1-working`
done
# remove the working log file
rm -f /usr/local/psa/var/log/maillog.processed.1-working &&
echo "file removed"
exit;
on 21 Jan 2009 at 5:55 pm 1.Juan C Lopez said …
Hello I try your script and is very interesting the aproach (how ever I got the same number of emails for all domain why), but I see it very REDOUNDANT, why? First you are opening directly a GZiped file, this consumes a Processor time, and memory, as you said it can be done unziping it before), but the problem is that you are doing this process, every time, for each domain in the server, so for a real case (like me), the numbers are:
122 Total Sites
8-10mb maillog.gz files
Time for each site 10-15 secs
Total time: 21-25 mins
Im not a SHELL programer so I cant write the script (I did’n even know that you can access MySQL from SHELL
LOL), but the script sould do something like this:
1) Store a list of the domains (some how, an array Maybe)
2) Open the Maillog.gz file
3) run grep for all the to= and from=
4) if a domain from the domains list count it (maybe an other literal array in php like $domain['domain.com']++ )
then at the end print results.
on 21 Jan 2009 at 7:41 pm 2.Juan C Lopez said …
Me again I just seen the error in the script
you typed:
do
echo $DOMAIN `zgrep -c -E “(to|from)=.+@$domain” /usr/local/psa/var/log/maillog.processed.1.gz`
done
and sould be:
do
echo $DOMAIN `zgrep -c -E “(to|from)=.+@$DOMAIN” /usr/local/psa/var/log/maillog.processed.1.gz`
done
The second domain was lower cased, any way this is still, slow for the reasons i told you before! bye!
on 22 Jan 2009 at 9:48 am 3.Matt said …
Thanks for the bug fix Juan. I’ve posted a version of this script that copies and decompresses the log file into a working copy and then deletes it when it’s done.
You’re right, my first version is pretty bad for larger plesk installations. And I think it gets exponentially worse as the log file size increases.
-Matt