Monit - ( Autohealer + Monitor )






A big part of production support is making sure the application is up and running at all times and that no alert goes unhandled. What if a tool could keep an eye on the application on our behalf and fix the problem whenever an alert fires? In short, Monit takes over that headache so we can have a peaceful life. Let's see how it works.


Monit is a small Open Source utility for managing and monitoring Unix systems. Monit conducts automatic maintenance and repair and can execute meaningful causal actions in error situations.

CentOS



sudo yum update && sudo yum install epel-release
sudo yum update && sudo yum install monit

  • To enable and start the daemon in CentOS 7


sudo systemctl enable monit && sudo systemctl start monit

  • To enable and start the daemon in CentOS 6


sudo chkconfig monit on && sudo service monit start

Debian / Ubuntu


  • Debian and Ubuntu automatically start and enable Monit after installation.


sudo apt-get update && sudo apt-get upgrade
sudo apt-get install monit

Monitrc - Configuration file


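The main settings in the monitrc configuration below:


daemon - how often Monit wakes up and re-runs its checks; here one cycle is 120 seconds

include - loads additional configuration files, here everything under /etc/monit.d/, where the individual monitor and auto-heal rules live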

eventqueue - If no mail server is available, Monit can queue events in the local file-system for retry until the mail server recovers.


/etc/monitrc

set daemon 120
set logfile syslog
set statefile /var/lib/monit/state
set idfile /var/lib/monit/id
set eventqueue
  basedir /var/lib/monit/events
  slots 100
include /etc/monit.d/*
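
After editing monitrc or any file under /etc/monit.d/, it helps to check the syntax before the daemon picks up the change. The sketch below assumes monit is on the PATH and that the commands run as root; `MONIT_BIN` and `validate_and_reload` are illustrative names, not part of Monit.

```shell
#!/bin/bash
# Assumption: MONIT_BIN is a hypothetical override for a non-default binary path.
MONIT_BIN=${MONIT_BIN:-monit}

# Syntax-check the control file (and the files it includes),
# then ask the running daemon to reload only if the check passed.
validate_and_reload() {
  if "$MONIT_BIN" -t; then
    "$MONIT_BIN" reload
  else
    echo "monitrc has syntax errors; daemon left untouched" >&2
    return 1
  fi
}
```

Calling `validate_and_reload` after each change keeps a typo in one monitor file from being loaded into the running daemon.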

Example monitors


  • Monitor root disk

# cat /etc/monit.d/root_disk 

check filesystem root_disk with path /dev/sda2
  if space usage > 80% for 5 times within 15 cycles then alert

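Explanation - Monit watches the filesystem on /dev/sda2 and sends an alert when usage is above 80% in at least 5 of the last 15 cycles. Change the path to whichever device holds your root filesystem.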
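
To find out which device to put in the path above, and to see roughly what Monit will measure, the root filesystem can be inspected with df. This is a small sketch; `fs_usage` is a made-up helper name, and the percentage df prints may differ slightly from the figure Monit reports.

```shell
#!/bin/bash
# Print the device backing a mount point and its used-space percentage.
# Uses the portable (-P) df layout: column 1 is the device, column 5 the capacity.
fs_usage() {
  local mount_point=${1:-/}
  df -P "$mount_point" | awk 'NR==2 { sub(/%/, "", $5); print $1, $5 }'
}

fs_usage /
```

The first field of the output is the device name to use after "with path"; the second is the current usage to compare against the 80% threshold.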


  • Monitor ssh

# cat /etc/monit.d/sshd      

check process sshd with pidfile /var/run/sshd.pid
  start program = "/usr/sbin/service sshd start"
  stop program = "/usr/sbin/service sshd stop"

Explanation - Monit watches the sshd process through its pidfile and, when the process is not running, restarts it automatically with the start program.
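
Roughly speaking, a pidfile check comes down to reading the PID and asking whether that process still exists. The sketch below only illustrates the idea and is not Monit's actual implementation; `pid_alive` is a hypothetical helper, and it should run as root (as Monit does), because signalling another user's process is otherwise refused.

```shell
#!/bin/bash
# Return 0 if the PID stored in the given pidfile belongs to a live process.
pid_alive() {
  local pidfile=$1 pid
  [ -r "$pidfile" ] || return 1                     # missing pidfile: treat as not running
  pid=$(head -n 1 "$pidfile" | tr -d '[:space:]')   # first line, whitespace stripped
  [[ $pid =~ ^[0-9]+$ ]] || return 1                # reject empty or garbage content
  kill -0 "$pid" 2>/dev/null                        # signal 0 tests existence, sends nothing
}
```

For example, `pid_alive /var/run/sshd.pid || echo "sshd is down"` prints the message in the situation where Monit would run the start program.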


  • Monitor td-agent and send alert on slack


# cat /etc/monit.d/td-agent 

check process td-agent with pidfile /var/run/td-agent/td-agent.pid
  start program = "/usr/sbin/service td-agent start"
  stop program = "/usr/sbin/service td-agent stop"
  if 2 restarts within 3 cycles then exec "/usr/local/bin/slack.sh"

Explanation - Monit watches td-agent and restarts it whenever it is down; if it has had to restart it 2 times within 3 cycles, it runs slack.sh to send an alert to Slack.


Slack Config

# cat /usr/local/bin/slack.sh
#!/bin/bash

URL=$(cat /opt/slack-url)

COLOR=${MONIT_COLOR:-$([[ $MONIT_EVENT == *"succeeded"* ]] && echo good || echo danger)}
TEXT=$(echo -e "$MONIT_SERVICE $MONIT_EVENT: $MONIT_DESCRIPTION" | python3 -c "import json,sys;print(json.dumps(sys.stdin.read()))")
MONIT_HOST=`hostname`
PAYLOAD="{
  \"attachments\": [
    {
      \"text\": $TEXT,
      \"color\": \"$COLOR\",
      \"mrkdwn_in\": [\"text\"],
      \"fields\": [
        { \"title\": \"Date\", \"value\": \"$MONIT_DATE\", \"short\": true },
        { \"title\": \"Host\", \"value\": \"$MONIT_HOST\", \"short\": true }
      ]
    }
  ]
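}"

# Post the payload to the Slack incoming webhook
curl -s -X POST --data-urlencode "payload=$PAYLOAD" "$URL"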

Slack WebHook

# cat /opt/slack-url
https://hooks.slack.com/services/*******/******/**********************
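
Monit passes the event details to the script through the MONIT_* environment variables, so the script can be tried by hand without waiting for a real failure. This is a minimal sketch; `run_fake_event` and the values it sets are made up for the test.

```shell
#!/bin/bash
# Run any command with hand-set MONIT_* variables, similar to how Monit
# would call an exec script. All values below are fake test data.
run_fake_event() {
  env \
    MONIT_SERVICE="td-agent" \
    MONIT_EVENT="Manual test" \
    MONIT_DESCRIPTION="fake event sent by hand" \
    MONIT_DATE="$(date)" \
    "$@"
}
```

Running `run_fake_event bash /usr/local/bin/slack.sh` executes the alert script with the fake event; remember that Monit itself needs the script to be executable (`chmod +x /usr/local/bin/slack.sh`).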

Check Status


# monit status
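
In Monit 5.x most command-line actions, including `monit status`, talk to the running daemon through its HTTP interface, which the monitrc shown earlier does not enable. If the command complains that it cannot reach the daemon, a drop-in like the one below may help. It is a sketch: the file name `httpd`, the helper `write_httpd_dropin` and the choice of port 2812 on localhost are assumptions to adapt.

```shell
#!/bin/bash
# Write a drop-in that enables Monit's HTTP interface on the loopback address only.
# Assumption: the target directory is the one pulled in by "include /etc/monit.d/*".
write_httpd_dropin() {
  local dir=${1:-/etc/monit.d}
  cat > "$dir/httpd" <<'EOF'
set httpd port 2812 and
  use address localhost   # listen on loopback only
  allow localhost         # accept connections from this machine only
EOF
  chmod 600 "$dir/httpd"  # keep the file private to its owner
}
```

After running `write_httpd_dropin` as root, reload or restart the daemon and run `monit status` again; `monit summary` prints a shorter one-line-per-service view.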

Remote Hosts


Perhaps you are not in DevOps at all but a tester who works with many client sites on different hosts. Wouldn’t it be nice to respond to a site outage before the client even calls? You can configure Monit to check the status of all your client sites and alert you as soon as one of them goes down.


check host server with address www.foo.com
    if failed port 3000 protocol http with timeout 60 seconds then alert
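
Before relying on the rule, the same probe can be tried by hand with curl to confirm that the host, port and timeout are sensible. This is a sketch, not what Monit runs internally; `http_ok` is a hypothetical helper, and www.foo.com is the placeholder host from the example above.

```shell
#!/bin/bash
# Return 0 if the URL answers with a non-error HTTP status within the timeout.
# --fail makes curl exit non-zero on HTTP status codes of 400 and above.
http_ok() {
  local url=$1 timeout=${2:-60}
  curl --silent --fail --output /dev/null --max-time "$timeout" "$url"
}
```

For example, `http_ok http://www.foo.com:3000/ 60 && echo up || echo down` prints "down" in the cases where the Monit rule would raise an alert.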

I hope you liked this article and found it useful. Happy reading!







©2020 by Linux Advise