Monitoring FastDFS storage update delay with a shell script and raising alarms


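We recently hit an unpleasant incident in production: one disk on a FastDFS storage server failed (the storage partition became read-only), but our zabbix monitoring did not cover this case, so client uploads failed. We eventually found that the partition had already gone read-only two days earlier. The stored data is redundant, so the impact was small, but it was still frustrating not to have caught the problem in time. To cover this case I wrote a small script that raises an alarm whenever a storage's update delay exceeds a given value (for example 2 minutes).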


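fdfs_monitor shows the status of every FastDFS storage, including its state and how far behind its updates are. The idea is to read the last_synced_timestamp value from its output and compare it with the current time, so that both the ACTIVE state and the delay are monitored. The full script is listed below.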
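Before the full listing, the parsing step can be illustrated on its own. The following is a minimal sketch that extracts the same three fields in a single awk pass. The sample text is an assumption: it only imitates the two line types the script relies on, inferred from the field positions it uses, and real fdfs_monitor output contains many more fields and may differ between versions. The names sample_output and parse_storages and the addresses are hypothetical.

```shell
#!/bin/bash
# Sketch: pull IP, state and last sync time from fdfs_monitor-style output.
# ASSUMPTION: this sample is made up for illustration and only mimics the
# ip_addr and last_synced_timestamp lines of real output.
sample_output() {
cat <<'EOF'
        ip_addr = 192.168.0.11 (storage1)  ACTIVE
        last_synced_timestamp = 2012-07-11 10:00:00
        ip_addr = 192.168.0.12 (storage2)  OFFLINE
        last_synced_timestamp = 2012-07-09 08:30:00
EOF
}

# parse_storages (hypothetical helper): reads monitor output on stdin and
# prints one line per storage in the form "<date> <time> <ip> <state>".
parse_storages() {
    awk '
        # Remember IP (field 3) and state (field 5) of the current storage
        /ip_addr/ && /[(]/      { ip = $3; state = $5 }
        # The timestamp line closes the record: date is field 3, time field 4
        /last_synced_timestamp/ { print $3, $4, ip, state }
    '
}

sample_output | parse_storages
```

Against real data the call would look like `$COMMAND | parse_storages`. This assumes the ip_addr line of each storage appears before its last_synced_timestamp line; doing everything in one pass also means the three columns come from the same snapshot of the monitor output.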

#!/bin/bash
# Storage synchronization delay alarm script
# Richard shen 2012/07/11
# BLOG:http://lxsym.blog.51cto.com

Basedir=$(dirname "$0")
Now_time=$(date +%s)
Active=$Basedir/active.txt
IP=$Basedir/ip.txt
Syn_time=$Basedir/syn_time.txt
Main_log=$Basedir/main.log
COMMAND="/usr/local/webserver/fdfs/bin/fdfs_monitor /usr/local/webserver/fdfs/etc/client.conf"

# Field 5 of each ip_addr line is the storage state, field 3 is the IP
$COMMAND | grep "(" | awk '/ip_addr/{print $5}' > "$Active"
$COMMAND | grep "(" | awk '/ip_addr/{print $3}' > "$IP"
# Fields 3 and 4 of each last_synced_timestamp line are the date and time
$COMMAND | grep last_synced_timestamp | awk '{print $3,$4}' > "$Syn_time"

# One row per storage: date time ip state
paste "$Syn_time" "$IP" "$Active" > "$Main_log"

while read -r day time ip active
do
    # Seconds elapsed since the last synced timestamp
    sys_time=$(date -d "$day $time" +%s)
    num=$((Now_time - sys_time))

    # Status alarm
    if [ "$active" != "ACTIVE" ]; then
        # Call the mail alarm API here (an empty "then" body is a bash syntax error)
        echo "$ip State is $active,please check."
    fi

    # Set alarm time (e.g. 2m (120s))
    if [ "$num" -gt 120 ]; then
        # Call the mail alarm API here, e.g. wget -q -O - "http://api.abc.com/sendMail.php?type=abcdG&to=<mail address>&subject=[Storage sync delay alarm: $ip is $num seconds behind, please check]&body=RT, please check, thanks" > /dev/null
        echo "$ip Update time delay $num (s)"
    fi
done < "$Main_log"

rm -f "$Active" "$IP" "$Syn_time" "$Main_log"
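
The two checks inside the loop can also be factored into a small function, which makes the threshold logic easy to try out without a running FastDFS cluster. This is only a sketch under stated assumptions: check_storage and THRESHOLD are hypothetical names introduced here, the sample values are made up, and GNU date is assumed (for date -d), as in the script itself.

```shell
#!/bin/bash
# Sketch: the state check and the delay check as one reusable function.
# ASSUMPTIONS: GNU date is available; check_storage and THRESHOLD are
# illustrative names, not part of FastDFS.

THRESHOLD=120   # seconds, the 2-minute limit used in the script

# check_storage NOW DAY TIME IP STATE
# Prints one message per problem found and prints nothing if all is well.
check_storage() {
    local now=$1 day=$2 time=$3 ip=$4 state=$5
    local synced delay

    if [ "$state" != "ACTIVE" ]; then
        echo "$ip State is $state,please check."
    fi

    # Convert the timestamp to epoch seconds; skip the delay check when
    # date cannot parse the value.
    if synced=$(date -d "$day $time" +%s 2>/dev/null); then
        delay=$((now - synced))
        if [ "$delay" -gt "$THRESHOLD" ]; then
            echo "$ip Update time delay $delay (s)"
        fi
    fi
    return 0
}

# Example: a storage whose last sync was 300 seconds before "now".
now=$(date -d "2012-07-11 10:05:00" +%s)
check_storage "$now" 2012-07-11 10:00:00 192.168.0.11 ACTIVE
```

Passing the current time in as an argument, rather than calling date inside the function, keeps the result reproducible. Since the goal is to notice a delay of about 2 minutes, one option is to run the script from cron once a minute; that schedule is a suggestion, not something the original setup specifies.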
