Introduction to Ambari Metrics


Concepts

Ambari Metrics is the component of Ambari responsible for monitoring cluster state. Its main concepts are as follows:

Terminology | Description
Ambari Metrics System (“AMS”) | The built-in metrics collection system for Ambari.
Metrics Collector | The standalone server that collects metrics, aggregates metrics, and serves metrics from the Hadoop service sinks and the Metrics Monitor.
Metrics Hadoop Sinks | Plug into the various Hadoop components to send Hadoop metrics to the Metrics Collector.
Metrics Monitor | Installed on each host in the cluster to collect system-level metrics and forward them to the Metrics Collector.

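Put simply, Ambari gathers two kinds of information and stores them on the Collector:
1. "System-level" metrics for each node
2. Metrics for each Hadoop component
The former are collected by the Metrics Monitor installed on every node (that is, the agent); the latter are collected by Sinks that target specific Hadoop components (conceptually the same as a Flume Sink).
One last point: the Collector uses HBase to store the metrics data.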
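
Once the Monitors and Sinks have delivered data to the Collector, it can be read back over HTTP. The following is a minimal sketch of how a query URL might be assembled. It assumes the Collector's timeline endpoint `/ws/v1/timeline/metrics` on port 6188 and the `metricNames`, `appId`, `hostname`, `startTime` and `endTime` parameters as I understand them from the Ambari Metrics API. None of these appear in this article, so check them against your Ambari version. The helper name `build_metrics_query` and the host names are hypothetical.

```python
from urllib.parse import urlencode


def build_metrics_query(collector_host, metric_names, app_id, hostname=None,
                        start_time=None, end_time=None, port=6188):
    """Build a GET URL for the Metrics Collector (endpoint and port are assumptions)."""
    params = {
        "metricNames": ",".join(metric_names),  # comma-separated metric names
        "appId": app_id,                        # e.g. "HOST" for system-level metrics
    }
    if hostname is not None:
        params["hostname"] = hostname           # omit to ask for cluster-wide data
    if start_time is not None:
        params["startTime"] = int(start_time)
    if end_time is not None:
        params["endTime"] = int(end_time)
    return f"http://{collector_host}:{port}/ws/v1/timeline/metrics?{urlencode(params)}"


if __name__ == "__main__":
    # Hypothetical hosts; prints the URL only, no request is sent.
    print(build_metrics_query("collector.example.com", ["cpu_user", "mem_free"],
                              "HOST", hostname="node1.example.com"))
```

The function only builds the URL, so it can be tried without a running cluster; fetching the result is left to whatever HTTP client you prefer.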

Architecture

Configuration

Configuring Ambari Metrics for distributed mode

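By default, Ambari Metrics is installed in embedded mode, in which all collected data is stored on the local disk of the Collector node, so a large volume of metrics data can take up a lot of local storage. After switching to distributed mode, the metrics data is stored on HDFS, which makes this an almost mandatory step after installing Ambari. For the detailed procedure, see: http://docs.hortonworks.com/HDPDocuments/Ambari-2.1.0.0/bk_ambari_reference_guide/content/_configuring_ambari_metrics_for_distributed_mode.html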
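
As a rough illustration of what the switch involves, the sketch below renders the settings that, to my understanding of the linked guide, change between the two modes. The property names (`timeline.metrics.service.operation.mode`, `hbase.cluster.distributed`, `hbase.rootdir`) are not spelled out in this article and should be checked against the guide. The HDFS path and NameNode address are hypothetical placeholders, and in practice these values are normally changed through the Ambari Web UI rather than by editing files by hand.

```python
import xml.etree.ElementTree as ET

# Assumed property names; verify against the Hortonworks guide linked above.
AMS_SITE = {"timeline.metrics.service.operation.mode": "distributed"}
AMS_HBASE_SITE = {
    "hbase.cluster.distributed": "true",
    # Hypothetical NameNode address and path.
    "hbase.rootdir": "hdfs://namenode.example.com:8020/apps/ams/metrics",
}


def render_properties(props):
    """Render a dict as Hadoop-style <configuration>/<property> XML."""
    root = ET.Element("configuration")
    for name, value in props.items():
        prop = ET.SubElement(root, "property")
        ET.SubElement(prop, "name").text = name
        ET.SubElement(prop, "value").text = value
    if hasattr(ET, "indent"):  # pretty-print where available (Python 3.9+)
        ET.indent(root)
    return ET.tostring(root, encoding="unicode")


if __name__ == "__main__":
    print(render_properties(AMS_SITE))
    print(render_properties(AMS_HBASE_SITE))
```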

Configuring the lifetime of metrics data

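Large volumes of metrics take up a great deal of storage space, so it is important to set a retention time (TTL) for metrics data. The parameters that control retention are in ams-site.xml; the relevant settings are listed below: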

Property | Default (seconds) | Description
timeline.metrics.host.aggregator.ttl | 86400 | 1 minute resolution data purge interval. Default is 1 day.
timeline.metrics.host.aggregator.minute.ttl | 604800 | Host based X minutes resolution data purge interval. Default is 7 days. (X = configurable interval, default interval is 2 minutes.)
timeline.metrics.host.aggregator.hourly.ttl | 2592000 | Host based hourly resolution data purge interval. Default is 30 days.
timeline.metrics.host.aggregator.daily.ttl | 31536000 | Host based daily resolution data purge interval. Default is 1 year.
timeline.metrics.cluster.aggregator.minute.ttl | 2592000 | Cluster wide minute resolution data purge interval. Default is 30 days.
timeline.metrics.cluster.aggregator.hourly.ttl | 31536000 | Cluster wide hourly resolution data purge interval. Default is 1 year.
timeline.metrics.cluster.aggregator.daily.ttl | 63072000 | Cluster wide daily resolution data purge interval. Default is 2 years.
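
The defaults above are expressed in seconds, which is easy to misread when choosing a new value. The short sketch below converts the listed defaults to days and shows how a retention period in days could be turned back into a TTL. The helper names `ttl_days` and `days_to_ttl` are illustrative only, and the 90-day figure is just an example.

```python
SECONDS_PER_DAY = 24 * 60 * 60  # 86400

# Default TTLs (in seconds) copied from the table above.
DEFAULT_TTLS = {
    "timeline.metrics.host.aggregator.ttl": 86400,
    "timeline.metrics.host.aggregator.minute.ttl": 604800,
    "timeline.metrics.host.aggregator.hourly.ttl": 2592000,
    "timeline.metrics.host.aggregator.daily.ttl": 31536000,
    "timeline.metrics.cluster.aggregator.minute.ttl": 2592000,
    "timeline.metrics.cluster.aggregator.hourly.ttl": 31536000,
    "timeline.metrics.cluster.aggregator.daily.ttl": 63072000,
}


def ttl_days(seconds):
    """Convert a TTL in seconds to days."""
    return seconds / SECONDS_PER_DAY


def days_to_ttl(days):
    """Convert a retention period in days to a TTL in seconds."""
    return int(days * SECONDS_PER_DAY)


if __name__ == "__main__":
    for name, ttl in DEFAULT_TTLS.items():
        print(f"{name}: {ttl} s = {ttl_days(ttl):g} days")
    # Example: keep hourly host data for 90 days instead of 30.
    print(days_to_ttl(90))  # 7776000
```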

