Posts or Comments 02 July 2009

Plesk & Server Architecture Matt | 12 Nov 2008 01:55 pm

Parse Plesk Maillog, Count Emails Per Domain

I wanted to get a quick count of messages being sent to and from all my domains in plesk.   I wrote this script to get the mail domains from the plesk database.  Then I loop through them and grep a regex against the maillog file.  Since I’m actually using zgrep to look decompress yesterday’s maillog file at the same time I get the counts, it’s a little processor intensive.  It takes about 10 seconds to run through 30 domains in a 5 meg gzipped file.  If I were to gunzip the file first and just grep it, this would go much quicker.  But I’m fine with a 10 second run time and the bonus of not having to trash a decompressed maillog file when I’m done.

Here’s the bash script I’ve saved as mailcount.sh:

#!/bin/sh

MYSQLPASS=`cat /etc/psa/.psa.shadow`

for DOMAIN in `mysql -uadmin -p$MYSQLPASS -e "select distinct domains.name from mail inner join domains on mail.dom_id=domains.id" -B -N psa`

do
echo $DOMAIN `zgrep -c -E "(to|from)=.+@$DOMAIN" /usr/local/psa/var/log/maillog.processed.1.gz`
done

Here’s a version that copies and gunzip’s the maillog before doing any processing.

#!/bin/sh

MYSQLPASS=`cat /etc/psa/.psa.shadow`

# copy the log file before decompress
cp /usr/local/psa/var/log/maillog.processed.1.gz /usr/local/psa/var/log/maillog.processed.1-working.gz &&

# decompress the log working file
gunzip /usr/local/psa/var/log/maillog.processed.1-working.gz

for DOMAIN in `mysql -uadmin -p$MYSQLPASS -e "select distinct domains.name from mail inner join domains on mail.dom_id=domains.id" -B -N psa`
do
echo $DOMAIN `grep -c -E "(to|from)=.+@$DOMAIN" /usr/local/psa/var/log/maillog.processed.1-working`
done

# remove the working log file
rm -f /usr/local/psa/var/log/maillog.processed.1-working &&
echo "file removed"
exit;

3 Responses to “Parse Plesk Maillog, Count Emails Per Domain”

  1. on 21 Jan 2009 at 5:55 pm 1.Juan C Lopez said …

    Hello I try your script and is very interesting the aproach (how ever I got the same number of emails for all domain why), but I see it very REDOUNDANT, why? First you are opening directly a GZiped file, this consumes a Processor time, and memory, as you said it can be done unziping it before), but the problem is that you are doing this process, every time, for each domain in the server, so for a real case (like me), the numbers are:

    122 Total Sites
    8-10mb maillog.gz files
    Time for each site 10-15 secs
    Total time: 21-25 mins

    Im not a SHELL programer so I cant write the script (I did’n even know that you can access MySQL from SHELL :) LOL), but the script sould do something like this:

    1) Store a list of the domains (some how, an array Maybe)
    2) Open the Maillog.gz file
    3) run grep for all the to= and from=
    4) if a domain from the domains list count it (maybe an other literal array in php like $domain['domain.com']++ )
    then at the end print results.

  2. on 21 Jan 2009 at 7:41 pm 2.Juan C Lopez said …

    Me again I just seen the error in the script
    you typed:

    do
    echo $DOMAIN `zgrep -c -E “(to|from)=.+@$domain” /usr/local/psa/var/log/maillog.processed.1.gz`
    done

    and sould be:
    do
    echo $DOMAIN `zgrep -c -E “(to|from)=.+@$DOMAIN” /usr/local/psa/var/log/maillog.processed.1.gz`
    done

    The second domain was lower cased, any way this is still, slow for the reasons i told you before! bye!

  3. on 22 Jan 2009 at 9:48 am 3.Matt said …

    Thanks for the bug fix Juan. I’ve posted a version of this script that copies and decompresses the log file into a working copy and then deletes it when it’s done.

    You’re right, my first version is pretty bad for larger plesk installations. And I think it gets exponentially worse as the log file size increases.

    -Matt

Trackback This Post | Subscribe to the comments through RSS Feed

Leave a Reply