Monitoring the AIX Operating System

Project repository: https://github.com/zhangrj/Aix-Monitor

1. CPU utilization

CPU utilization = 100% - CPU idle time

CPU idle time can be taken as the average over 4 seconds (four one-second samples). For example:

root@**:/ # vmstat 1 4

System configuration: lcpu=32 mem=63488MB

kthr    memory              page              faults        cpu    
----- ----------- ------------------------ ------------ -----------
 r  b   avm   fre  re  pi  po  fr   sr  cy  in   sy  cs us sy id wa
 0  0 9247706 3838878   0   0   0   0    0   0   3  321 300  0  0 99  0
 0  0 9247706 3838876   0   0   0   0    0   0  10  145 302  0  0 99  0
 0  0 9247706 3838876   0   0   0   0    0   0   2   54 279  0  0 99  0
 0  0 9247706 3838876   0   0   0   0    0   0   3   78 285  0  0 99  0

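As a script:

#!/bin/ksh

# Take four one-second vmstat samples, keep only the numeric data rows,
# and extract column 16 (id, the CPU idle percentage).
cpu_idle_list=$(/usr/bin/vmstat 1 4 | egrep -v '[a-zA-Z]|-' | egrep '[0-9]' | awk '{print $16}')

count=0

# Sum the idle percentages of all samples.
for cpu_idle in $cpu_idle_list
do
    count=`expr $count + $cpu_idle`
done

# CPU utilization = 100 - average idle time (integer division).
cpu_used=`expr 100 - $count / 4`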

echo $cpu_used
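
The same calculation can also be done in a single awk pass. The following is only a minimal sketch, not part of the original project: the function name cpu_used_from_vmstat is hypothetical, and it assumes the 17-column vmstat layout shown above, with idle time in column 16. It reads vmstat output on standard input, so it can be tried against the sample rows without an AIX host.

```shell
#!/bin/sh
# Hypothetical helper: reads vmstat output on stdin and prints the integer
# CPU utilization (100 - average of the id column).
# Assumption: 17-column layout as in the sample above, id in column 16.
cpu_used_from_vmstat() {
    awk '
        # Data rows contain only digits and blanks; headers, the
        # "System configuration" line and the dashed separator are skipped.
        /^[ 0-9]+$/ && NF >= 17 { idle += $16; n++ }
        END {
            if (n == 0) exit 1            # no samples were parsed
            printf "%d\n", 100 - idle / n
        }
    '
}

# Sample rows copied from the vmstat output above.
sample=' r  b   avm   fre  re  pi  po  fr   sr  cy  in   sy  cs us sy id wa
 0  0 9247706 3838878   0   0   0   0    0   0   3  321 300  0  0 99  0
 0  0 9247706 3838876   0   0   0   0    0   0  10  145 302  0  0 99  0
 0  0 9247706 3838876   0   0   0   0    0   0   2   54 279  0  0 99  0
 0  0 9247706 3838876   0   0   0   0    0   0   3   78 285  0  0 99  0'

printf '%s\n' "$sample" | cpu_used_from_vmstat
```

On an AIX host the function could be fed live data instead, for example: /usr/bin/vmstat 1 4 | cpu_used_from_vmstat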

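2. Memory utilization

Memory utilization = memory in use / usable physical memory

Use the svmon command to view memory usage and read the inuse column (note that this command reports values in units of 4 KB pages):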

root@**:/ # svmon -G
               size       inuse        free         pin     virtual   mmode
memory     16252928    12416139     3836789     1637578     9249771     Ded
pg space    8388608       22955

               work        pers        clnt       other
pin         1160474           0           0      477104
in use      9249771           0     3166368

PageSize   PoolSize       inuse        pgsp         pin     virtual
s    4 KB         -    11239115       22955      735562     8072747
m   64 KB         -       73564           0       56376       73564

Use the lsattr command to view the usable physical memory:

root@**:/ # lsattr -El sys0 -a realmem
realmem 65011712 Amount of usable physical memory in Kbytes False

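As a script:

#!/bin/ksh

# Memory in use: inuse column of the "memory" row of svmon -G, in 4 KB pages.
um=`svmon -G | awk '$1 == "memory" {print $3}'`
# 256 pages of 4 KB make 1 MB.
um=`expr $um / 256`
# Usable physical memory, reported by lsattr in KB.
tm=`lsattr -El sys0 -a realmem | awk '{print $2}'`
# Convert KB to MB (1 MB = 1024 KB).
tm=`expr $tm / 1024`
# Memory utilization as an integer percentage.
PERCENTUSED=`expr $um \* 100 / $tm`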

echo $PERCENTUSED
exit 0
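
To check the arithmetic, the two figures from the sample output above (12416139 pages in use, 65011712 KB of real memory) can be plugged into a small function. This is only an illustrative sketch: the function name mem_used_pct and its argument order are hypothetical. It works in KB throughout and uses awk for the division, which avoids the intermediate rounding of two separate MB conversions.

```shell
#!/bin/sh
# Hypothetical helper: $1 = pages in use (4 KB each, from svmon -G),
#                      $2 = usable physical memory in KB (from lsattr realmem).
# Prints the memory utilization as a truncated integer percentage.
mem_used_pct() {
    awk -v pages="$1" -v realmem_kb="$2" 'BEGIN {
        if (realmem_kb <= 0) exit 1       # guard against division by zero
        used_kb = pages * 4               # 4 KB per page
        printf "%d\n", used_kb * 100 / realmem_kb
    }'
}

# Values taken from the svmon and lsattr sample output above.
mem_used_pct 12416139 65011712
```

With the sample values this is 49664556 KB in use out of 65011712 KB, which prints 76.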

3. Paging space utilization

AIX paging space is similar to Linux swap space. Use the lsps command to view paging space utilization:

root@**:/ # lsps -s
Total Paging Space   Percent Used
      32768MB               1%

As a script:

#!/bin/sh

valp=`lsps -s | tail -1 | awk '{print $2}' | cut -d "%" -f1`
echo $valp
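
For use as a monitoring check, the percentage usually has to be compared against thresholds. The sketch below is an assumption-laden example rather than part of the original project: the function names paging_used_pct and paging_status are hypothetical, and the default thresholds (50 and 80) are arbitrary placeholders. The return codes follow the Nagios plugin convention of 0 for OK, 1 for WARNING and 2 for CRITICAL.

```shell
#!/bin/sh
# Hypothetical helper: reads `lsps -s` output on stdin and prints the
# "Percent Used" value without the trailing % sign.
paging_used_pct() {
    awk 'NR > 1 && $2 ~ /%$/ { sub(/%/, "", $2); print $2; exit }'
}

# Hypothetical helper: $1 = percent used, $2 = warning threshold,
# $3 = critical threshold (the defaults 50 and 80 are placeholders).
paging_status() {
    pct=$1; warn=${2:-50}; crit=${3:-80}
    if [ "$pct" -ge "$crit" ]; then
        echo "CRITICAL: paging space ${pct}% used"; return 2
    elif [ "$pct" -ge "$warn" ]; then
        echo "WARNING: paging space ${pct}% used"; return 1
    else
        echo "OK: paging space ${pct}% used"; return 0
    fi
}

# Sample copied from the lsps -s output above.
sample='Total Paging Space   Percent Used
      32768MB               1%'

pct=$(printf '%s\n' "$sample" | paging_used_pct)
paging_status "$pct"
```

On an AIX host the sample would be replaced by the live command: pct=$(lsps -s | paging_used_pct)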

4. File system utilization

Use the df command to view file system usage:

root@**:/ # df
Filesystem    512-blocks      Free %Used    Iused %Iused Mounted on
/dev/hd4         8388608   7422768   12%    13921     2% /
/dev/hd2        16777216  10608200   37%    45265     4% /usr
/dev/hd9var      8388608   2270176   73%    10955     5% /var
/dev/hd3         8388608   5395584   36%      570     1% /tmp
/dev/hd1       167772160 128136536   24%    90561     1% /home
/dev/hd11admin    1048576   1047696    1%        5     1% /admin
/proc                  -         -    -         -     -  /proc
/dev/hd10opt    16777216  13892480   18%    17188     2% /opt
/dev/livedump    1048576   1047760    1%        4     1% /var/adm/ras/livedump
/dev/odm               0         0   -1%        6   100% /dev/odm

As a script (this script is taken from the Nagios Exchange website):

#!/bin/ksh

# Global Variables
count=0
output=""
output_ext="<br/>Current disk space usage overview<br/>"
# File system type pattern passed to grep -E; set to "jfs|vxfs" below
# when no third argument is given (jfs also matches jfs2).
mountType=""

errorHelp() {
   echo "----- ERROR -----"
   echo "- Dude! You have to pass arguments to the script, for it to work."
   echo "- Example of usage:"
   echo "       ./check_filesystems_space 90 99 \"jfs\","
   echo "        for warning 90 and critical 99, including only filesystem types that match the pattern 'jfs' (default: 'jfs|vxfs')."
   echo "-----------------"
   exit
}

# Argument Checking
if [[ $# -eq 2 ]] ;then
    mountType="jfs|vxfs"
elif [[ $# -eq 3 ]]; then
    mountType=$3
else
    errorHelp
fi
        
# Iteration of file systems.
if [ -n "$1" ] && [ -n "$2" ] && [ -n "$mountType" ]
then
        warninglimit=$2
        lowlimit=$1
        rawSysVDFResults=`/usr/sysv/bin/df -n | grep -i -E $mountType | awk -F\: '{print ""$1":"$2""}' | tr -d '\t' | tr -d ' '`

        for fs in $rawSysVDFResults
        do
                set -A array $(echo $fs | tr ':' '\n')
                fMount=${array[0]}  
                fType=${array[1]}   

                size=`df -k $fMount|grep $fMount|awk '{ print $4; }'`
                prc=`echo $size | tr -d "%"`
                output_ext=${output_ext}"$fMount $size;<br/>"
                if [ $prc -gt $warninglimit ]
                then
                        output=`echo $output "CRITICAL: $fMount usage $size;"`
                        count=`expr $count + 1`
                elif [ $prc -gt $lowlimit ]
                then
                        output=`echo $output "WARNING: $fMount usage $size;"`
                        count=`expr $count + 1`
                fi
        done
fi

#output
if [ $count -gt 0 ] 
then
        echo $output
else
        if [ -n "$1" ] && [ -n "$2" ] && [ -n "$mountType" ]
        then
                echo "OK: Filesystem space inside acceptable levels"
        fi
fi 
echo $output_ext
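
The threshold comparison at the heart of this check can also be illustrated on the plain df output shown earlier. The following is a simplified sketch, not a replacement for the Nagios Exchange script: the function name check_df is hypothetical, it does not filter by file system type, and it assumes the AIX df column layout above (%Used in column 4, mount point in column 7). Like the script, it flags a file system only when usage is strictly greater than a threshold.

```shell
#!/bin/sh
# Hypothetical helper: $1 = warning %, $2 = critical %; reads AIX `df`
# output on stdin and prints one line per file system over a threshold.
# Assumption: %Used is column 4 and the mount point is column 7.
check_df() {
    awk -v warn="$1" -v crit="$2" '
        NR == 1 { next }                    # skip the header row
        $4 !~ /^[0-9]+%$/ { next }          # skip "-" and "-1%" pseudo entries
        {
            pct = $4; sub(/%/, "", pct); pct += 0
            if (pct > crit)      print "CRITICAL: " $7 " usage " $4
            else if (pct > warn) print "WARNING: " $7 " usage " $4
        }
    '
}

# A few rows copied from the df output above.
sample='Filesystem    512-blocks      Free %Used    Iused %Iused Mounted on
/dev/hd4         8388608   7422768   12%    13921     2% /
/dev/hd9var      8388608   2270176   73%    10955     5% /var
/proc                  -         -    -         -     -  /proc
/dev/odm               0         0   -1%        6   100% /dev/odm'

printf '%s\n' "$sample" | check_df 70 90
```

With a warning threshold of 70 and a critical threshold of 90, only /var (73%) is reported, as a WARNING.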

5. Disk I/O load

Disk I/O load is reflected in the CPU I/O wait time. Use the vmstat command to view the CPU I/O wait time (the wa column):

root@**:/home/nagios/bin # vmstat  1 4

System configuration: lcpu=32 mem=63488MB

kthr    memory              page              faults        cpu    
----- ----------- ------------------------ ------------ -----------
 r  b   avm   fre  re  pi  po  fr   sr  cy  in   sy  cs us sy id wa
 0  0 9247992 3838665   0   0   0   0    0   0  17 1564 308  0  0 99  0
 1  0 9247635 3839022   0   0   0   0    0   0 234 6122 520  7  1 91  1
 0  0 9247700 3838957   0   0   0   0    0   0  17  427 315  0  0 99  0
 0  0 9247701 3838956   0   0   0   0    0   0   3  148 282  0  0 99  0

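As a script:

#!/bin/ksh

# Take four one-second vmstat samples, keep only the numeric data rows,
# and extract column 17 (wa, the CPU I/O wait percentage).
iowait_list=$(/usr/bin/vmstat 1 4 | egrep -v '[a-zA-Z]|-' | egrep '[0-9]' | awk '{print $17}')

count=0

# Sum the I/O wait percentages of all samples.
for iowait in $iowait_list
do
    count=`expr $count + $iowait`
done

# Average I/O wait time (integer division).
cpu_iowait=`expr $count / 4`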

echo $cpu_iowait
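
Because expr performs integer division, small averages are truncated: for the sample above the wa values are 0, 1, 0 and 0, so the script prints 0. The sketch below is one possible variation, offered under stated assumptions: the function name vmstat_avg is hypothetical, and it looks the column up by its header name instead of a fixed position, in case the column layout differs between systems. It is intended for uniquely named columns such as wa or id.

```shell
#!/bin/sh
# Hypothetical helper: $1 = vmstat column name (e.g. wa or id); reads vmstat
# output on stdin and prints the average of that column with two decimals.
vmstat_avg() {
    awk -v name="$1" '
        # Header row (starts with "r b"): remember where the named column is.
        $1 == "r" && $2 == "b" {
            for (i = 1; i <= NF; i++) if ($i == name) col = i
            next
        }
        # Data rows contain only digits, dots and blanks.
        col && /^[ 0-9.]+$/ && NF >= col { sum += $col; n++ }
        END {
            if (n == 0) exit 1              # column not found or no samples
            printf "%.2f\n", sum / n
        }
    '
}

# Sample rows copied from the vmstat output above.
sample=' r  b   avm   fre  re  pi  po  fr   sr  cy  in   sy  cs us sy id wa
 0  0 9247992 3838665   0   0   0   0    0   0  17 1564 308  0  0 99  0
 1  0 9247635 3839022   0   0   0   0    0   0 234 6122 520  7  1 91  1
 0  0 9247700 3838957   0   0   0   0    0   0  17  427 315  0  0 99  0
 0  0 9247701 3838956   0   0   0   0    0   0   3  148 282  0  0 99  0'

printf '%s\n' "$sample" | vmstat_avg wa
```

For the sample rows this prints 0.25, where the integer version reports 0.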
