Thursday, January 16, 2014

Solaris 9 shell script to find why a directory keeps filling up

We have thresholds set on certain file systems at work, and one continues to be exceeded at random intervals in the night. Rather than just bumping the threshold up, I thought I'd script something up to help understand what's happening.

The threshold was < 6GB available space on /tmp (hence the 6291456 KB in the if statement). The script compares the previous 10-minute snapshot (an ls of /tmp) with the current one and appends the difference to a file. Once space drops below 6GB, I receive an email with the last 50 lines of that difference file. It's not the cleanest code, but you get the picture:

#!/bin/bash

if [ -f /var/tmp/de.tmp.1 ]
then
        ls -lh /tmp > /var/tmp/de.tmp.2
        date >> /var/tmp/de.tmp.diff
        echo -e '\n' >> /var/tmp/de.tmp.diff
        diff /var/tmp/de.tmp.1 /var/tmp/de.tmp.2 >> /var/tmp/de.tmp.diff
        echo -e '\n-----------------------------------------------\n' >> /var/tmp/de.tmp.diff
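        # Keep the current snapshot as the baseline for the next run;
        # deleting both would give a diff only on every other run (every 20 minutes)
        mv -f /var/tmp/de.tmp.2 /var/tmp/de.tmp.1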
else
        ls -lh /tmp > /var/tmp/de.tmp.1
fi

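# Ask df about /tmp directly; grepping for /tmp would also match
# any other mount point containing /tmp, such as /var/tmp
avail=$(df -k /tmp | awk 'NR==2 {print $4}')

if [ "$avail" -lt 6291456 ]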
then
        emailMessage=$(tail -50 /var/tmp/de.tmp.diff)
        echo "$emailMessage" | mailx -s "/tmp on myserver less than 6291456 KB ($avail)" -r monitor@company.com myemail@company.com
fi
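
For anyone checking the arithmetic: the 6291456 figure is just 6 GB converted to KB (6 x 1024 x 1024), since KB is the unit df -k reports. Here is a rough sketch of that check in isolation. The helper names gb_to_kb and avail_kb are my own for illustration and are not part of the script above, and the sample df output is made up:

```shell
#!/bin/bash
# Hypothetical helpers; the names are illustrative, not part of the original script.

# Convert whole gigabytes to kilobytes (the unit df -k reports).
gb_to_kb() {
    echo $(( $1 * 1024 * 1024 ))
}

# Read `df -k <dir>` style output on stdin (one header line, one data line)
# and print the "avail" column, which is the 4th field.
avail_kb() {
    awk 'NR == 2 { print $4 }'
}

threshold=$(gb_to_kb 6)

# Made-up sample of what `df -k /tmp` might print; not real data.
sample='Filesystem            kbytes    used   avail capacity  Mounted on
swap                 8388608 3145728 5242880    38%    /tmp'

avail=$(printf '%s\n' "$sample" | avail_kb)

if [ "$avail" -lt "$threshold" ]
then
    echo "below threshold: $avail KB available, limit $threshold KB"
fi
```

On a live system the sample would be replaced by piping df -k /tmp into avail_kb.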


Once I had tested it a few times, I created a cron entry to mirror our alarm policy (run every 10 minutes):

0,10,20,30,40,50 * * * * /usr/local/bin/tmp_space_alert.sh

Now to wait for it to happen, and analyse the email and /var/tmp/de.tmp.diff file.
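
When the diff file gets long, something like the following might help with the analysis. It is only a sketch: summarise_diff is a hypothetical helper name, the sample log is invented, and it assumes file names without spaces so that the name is the 10th field of each "> " line from diff. It counts how often each file appears as new or changed:

```shell
#!/bin/bash
# Hypothetical helper: count how often each file appears as new or changed
# ("> " lines) in a diff log built from `ls -l` snapshots.
# Assumes file names contain no spaces, so the name is the 10th field.
# Reads the file(s) given as arguments, or stdin if there are none.
summarise_diff() {
    awk '$1 == ">" && NF >= 10 { count[$10]++ }
         END { for (f in count) print count[f], f }' "$@" | sort -rn
}

# Invented sample log for illustration only.
sample_log='Thu Jan 16 02:10:00 GMT 2014
3a4
> -rw-r--r--   1 app  staff  1.2G Jan 16 02:08 export.dat
-----------------------------------------------
Thu Jan 16 02:20:00 GMT 2014
4c4
< -rw-r--r--   1 app  staff  1.2G Jan 16 02:08 export.dat
---
> -rw-r--r--   1 app  staff  2.9G Jan 16 02:19 export.dat
5a6
> -rw-------   1 root root   12K Jan 16 02:15 sess_ab12'

summary=$(printf '%s\n' "$sample_log" | summarise_diff)
echo "$summary"
```

Against the real log it would be called as summarise_diff /var/tmp/de.tmp.diff. A file that keeps turning up near the top of the list is a good candidate for whatever is filling /tmp.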
