Friday, July 24, 2009

Taking on Nagios

OK, I have been searching for an easy way to try out Nagios, even looking at VMware for a prebuilt virtual machine. There is only one worthwhile link, and it is dead. It is the same link that everyone on Google refers to as well, so I am convinced there is no easy way.

Maybe when I finish, I'll post the virtual machine at VMware or keep it at my site for visitors...

I am going to start from scratch; download Fedora 11, then the Nagios tarball. Then configure....

1. Download Fedora 11 and install it to a 15G hard drive
2. Download Nagios: nagios-3.0.6.tar.gz
3. un-tar Nagios: tar xzf nagios-3.0.6.tar.gz
4. add user nagios: adduser nagios
5. make installation directory: mkdir /usr/local/nagios
6. chown the directory: chown nagios:nagios /usr/local/nagios
7. create a new group: /usr/sbin/groupadd nagcmd
8. add the web and nagios users to that group (the -a appends, so their existing groups are kept):
/usr/sbin/usermod -a -G nagcmd apache
/usr/sbin/usermod -a -G nagcmd nagios
9. run configure (include command group nagcmd):
./configure --with-command-group=nagcmd
10. now compile: make all
11. install the init script to /etc/rc.d/init.d: make install-init
12. now edit httpd.conf with the following 2 aliases so that the web pages get diverted to correct directories:

ScriptAlias /nagios/cgi-bin /usr/local/nagios/sbin

<Directory "/usr/local/nagios/sbin">
Options ExecCGI
AllowOverride None
Order allow,deny
Allow from all
AuthName "Nagios Access"
AuthType Basic
AuthUserFile /usr/local/nagios/etc/htpasswd.users
Require valid-user
</Directory>

Alias /nagios /usr/local/nagios/share

<Directory "/usr/local/nagios/share">
Options None
AllowOverride None
Order allow,deny
Allow from all
AuthName "Nagios Access"
AuthType Basic
AuthUserFile /usr/local/nagios/etc/htpasswd.users
Require valid-user
</Directory>

13. restart the web server: /etc/rc.d/init.d/httpd restart
14. OK - now when I try to log into Nagios via the website, I get AVC denials all over the place. After searching, I find my best answer for now is to just disable SELinux, kind of like disabling the junk I don't like in Vista. This is not a recommendation, just something I do to move forward; I can re-enable SELinux later when I get everything running.
15. Great, now a "Whoops" error. After fruitless "googling" gets me nowhere, as a last resort I actually read the error carefully. First, I run a nagios -v [config-file] command like it says: ./nagios -v /usr/local/nagios/etc/nagios.cfg
Everything shows fine, so now I run it without the -v and I get a weird error about not finding nagios.cmd in the /usr/local/nagios/var/rw directory. Understood, because I don't even have an rw directory. So I go ahead and create it and set its owner and group to "nagios". I run the command again and yes, Nagios now starts. I log into the web page and can now see everything. The next step will be to learn everything it can do and try the plugins...
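
For anyone hitting the same missing rw directory, here is a rough sketch of the fix from step 15 as a shell function. The function name ensure_cmd_dir and its arguments are my own invention for illustration, and the defaults simply mirror the paths and the "nagios" owner and group used in this post. As far as I can tell, the make install-commandmode target in the Nagios source tree normally creates this directory, so treat this as a fallback, not the official method.

```shell
#!/bin/sh
# Sketch only: recreate the external command directory Nagios looks for.
# ensure_cmd_dir is a hypothetical helper; prefix/owner/group are assumptions
# that default to the values used in this post.

ensure_cmd_dir() {
    prefix=${1:-/usr/local/nagios}
    owner=${2:-nagios}
    group=${3:-nagios}
    rw_dir="$prefix/var/rw"

    # Create var/rw (and var, if needed) under the install prefix
    mkdir -p "$rw_dir" || return 1
    # Owner and group may write; everyone else may only read/traverse
    chmod 775 "$rw_dir" || return 1

    # Change ownership only when running as root and the account exists
    if [ "$(id -u)" -eq 0 ] && id "$owner" >/dev/null 2>&1; then
        chown "$owner:$group" "$rw_dir" || return 1
    fi

    # Print the directory so the caller can see what was touched
    printf '%s\n' "$rw_dir"
}
```

Run as root, something like ensure_cmd_dir /usr/local/nagios nagios nagios followed by /usr/local/nagios/bin/nagios -v /usr/local/nagios/etc/nagios.cfg should get you back to where step 15 ends (the bin path is an assumption based on the install prefix above).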

OK - update two weeks later: I have configured Nagios to monitor LAN servers as well as a few off-site servers. I had to work with windows.cfg and commands.cfg in the /usr/local/nagios/etc/objects directory, and with nagios.cfg and cgi.cfg in the /usr/local/nagios/etc directory. I also needed to download NSClient++, install it on all the Windows servers, configure its .ini file, and poke holes for the listening port set in the .ini file.
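
To give an idea of what goes into windows.cfg, here is a small shell function that prints a host and one service definition. It is only a sketch: emit_windows_host, the host name, and the address are made up for illustration, and it assumes the windows-server and generic-service templates and the check_nt command from the sample config files that ship with Nagios are still in place.

```shell
#!/bin/sh
# Sketch only: print a minimal Windows host plus a CPU load service.
# emit_windows_host is a hypothetical helper, not part of Nagios.
# Assumes the sample "windows-server" and "generic-service" templates
# and the sample check_nt command definition exist in your config.

emit_windows_host() {
    host_name=$1
    address=$2
    cat <<EOF
define host{
        use             windows-server
        host_name       $host_name
        address         $address
        }

define service{
        use                     generic-service
        host_name               $host_name
        service_description     CPU Load
        check_command           check_nt!CPULOAD!-l 5,80,90
        }
EOF
}
```

For example, emit_windows_host winserver 192.168.1.2 >> /usr/local/nagios/etc/objects/windows.cfg, then re-run nagios -v against nagios.cfg before restarting. The port that check_nt talks to has to match whatever you set in the NSClient++ .ini file.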

Next was fighting with the horrible email notification documentation. I finally had to yum install ssmtp, point its conf file at my email server, and fight with the command line to get the correct from user, recipient, and subject. Funny how important the alerting is, yet the documentation is a joke on this issue. And don't try to go on the forums; all you get is some pompous advice to RTFM. Hmmm, I did read the manual and it was written horribly. That is what the forum is for: to help interpret the poorly written stuff that techies write and think is incredibly clear....
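
Since the from, to, and subject were the painful part, here is roughly the shape of what ended up working, written as a shell sketch. The function name build_notification and the example addresses are hypothetical. It relies on ssmtp reading a complete message, headers included, from standard input with the recipient given as an argument, and the /usr/sbin/ssmtp path in the comment is an assumption about where yum puts it.

```shell
#!/bin/sh
# Sketch only: build a plain-text mail message with explicit headers.
# build_notification is a hypothetical helper; all addresses are examples.

build_notification() {
    from=$1
    to=$2
    subject=$3
    body=$4
    # Headers first, then a blank line, then the message body
    printf 'From: %s\nTo: %s\nSubject: %s\n\n%s\n' \
        "$from" "$to" "$subject" "$body"
}

# Usage (commented out so that sourcing this file sends nothing):
# build_notification nagios@example.com admin@example.com \
#     "PROBLEM: web01 is DOWN" "Host web01 is DOWN" \
#     | /usr/sbin/ssmtp admin@example.com
```

If the From line keeps getting rewritten, as far as I know the FromLineOverride setting in ssmtp.conf is the thing to look at. The same pipeline can then go into the notification command_line entries in commands.cfg in place of the stock mail command.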

Well, after getting it all running, I have to say it is pretty cool, and I even went back and re-enabled all the security that I removed to get this up and running.
