Configuring Nagios to Monitor an HA Cluster (Part 2)
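Time to finally start monitoring the cluster. First, the lab environment: the HA cluster built earlier consists of 192.168.10.101 and 192.168.10.102 and runs an HTTP service. Nagios is installed on 192.168.10.1 and uses NRPE to monitor 101 and 102. Right, let's begin. As usual, I start by looking up the documentation: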
First off, we need to define what we mean by a "cluster". The simplest way to understand this is with an example. Let's say that your organization has five hosts which provide redundant DNS services to your organization. If one of them fails, it's not a major catastrophe because the remaining servers will continue to provide name resolution services. If you're concerned with monitoring the availability of DNS service to your organization, you will want to monitor five DNS servers. This is what I consider to be a
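The point of this passage is that the whole reason we build a cluster is to keep the service running even when one of its nodes goes down. The key to monitoring a cluster with Nagios is to treat the cluster itself as a Nagios service, one whose purpose is to monitor the running state of the cluster as a whole rather than to pinpoint which individual machine has failed.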
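To make the "cluster as one service" idea concrete, here is a minimal Python sketch. It is not how Nagios implements it, only an illustration: the function name `cluster_state` and the two thresholds are my own assumptions and do not come from the text. The state codes 0, 1 and 2 are the standard Nagios plugin return codes for OK, WARNING and CRITICAL.

```python
# Standard Nagios plugin return codes.
OK, WARNING, CRITICAL = 0, 1, 2


def cluster_state(element_states, warn_threshold, crit_threshold):
    """Derive one state for the whole cluster from per-element states.

    element_states: list of state codes, one per cluster element.
    warn_threshold / crit_threshold: hypothetical parameters giving the
    number of non-OK elements at which the cluster as a whole is
    considered WARNING or CRITICAL.
    """
    # Count the elements that are in any non-OK state.
    failed = sum(1 for state in element_states if state != OK)
    if failed >= crit_threshold:
        return CRITICAL
    if failed >= warn_threshold:
        return WARNING
    return OK


# The two HTTP nodes from the lab environment; the states are made up.
node_states = {
    "192.168.10.101": OK,
    "192.168.10.102": CRITICAL,
}
state = cluster_state(list(node_states.values()), warn_threshold=1, crit_threshold=2)
print(state)
```

With one of the two nodes down, the cluster service only reaches WARNING, because the HTTP service is still being served by the other node. It becomes CRITICAL only when both nodes fail.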
There are several ways you could potentially monitor service or host clusters. I'll describe the method that I believe to be the easiest. Monitoring service or host clusters involves two things:
* Monitoring individual cluster elements
* Monitoring the cluster as a collective entity
Monitoring individual host or service cluster elements is easier than you think. In fact, you're probably already doing it. For service clusters, just make sure that you are monitoring each service element of the cluster. If you've got a cluster of five DNS servers, make sure you have five separate service definitions (probably using the check_dns plugin). For host clusters, make sure you have configured appropriate host definitions for each member of the cluster (you'll also have to define at least one service to be monitored for each of the hosts).
Important: You're going to want to disable notifications for the individual cluster elements (host or service definitions). Even though no notifications will be sent about the individual elements, you'll still get a visual display of the individual host or service status in the status CGI. This will be useful for pinpointing the source of problems within the cluster in the future.
Monitoring the overall cluster can be done by using the previously cached results of cluster elements. Although you could re-check all elements of the cluster to determine the cluster's status, why waste bandwidth and resources when you already have the results cached? Where are the results cached? Cached results for cluster elements can be found in the