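I. Overview of Open-Falcon components
[Open-Falcon graphing components]
- Agent: deployed on each target machine to collect its monitoring metrics
- Transfer: the data receiver; forwards data to the Graph and Judge back ends
- Graph: stores monitoring data in rrd files
- Query: queries the Graph instances and exposes a unified HTTP query interface
- Dashboard: web front end for viewing historical trend graphs
- Task: runs scheduled jobs such as full index updates, stale index cleanup, and monitoring of the components themselves
[Open-Falcon alarm components]
- Sender: alarm sending module; controls concurrency and provides a buffer queue for outgoing messages
- UIC (FE): user and group management, single sign-on
- Portal: web front end for configuring alarm strategies and managing host groups
- HBS: HeartBeat Server
- Judge: alarm evaluation module
- Links: web front end used for alarm merging; stores alarm details
- Alarm: alarm event handler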
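[Open-Falcon architecture diagrams]
Official architecture diagram: (image not included)
Community-drawn diagram: (image not included)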
II. Preparation
1. Install Redis
http://www.cnblogs.com/xialiaoliao0911/p/7523952.html
2. Install MySQL
http://www.cnblogs.com/xialiaoliao0911/p/7523931.html
3. Open-Falcon download
Binary release: https://pan.baidu.com/s/1jOb6z-HRJ7i6nSFxf7I5Bg
4. Initialize the MySQL schema
# No open-falcon component needs to be started as root; installing under a regular account is recommended for better security.
# Here a regular account, work, is used to install and deploy all components (the rest of this article uses the account falcon).
# Installing some dependency libraries with yum still requires root privileges.
git clone https://github.com/open-falcon/scripts.git
cd ./scripts/
mysql -h localhost -u root --password="" < db_schema/graph-db-schema.sql
mysql -h localhost -u root --password="" < db_schema/dashboard-db-schema.sql
mysql -h localhost -u root --password="" < db_schema/portal-db-schema.sql
mysql -h localhost -u root --password="" < db_schema/links-db-schema.sql
mysql -h localhost -u root --password="" < db_schema/uic-db-schema.sql
5. Extract open-falcon.tar.gz
# Create the user falcon (run as root)
useradd falcon
# Switch to falcon and create a temporary directory tmp
su - falcon
cd /home/falcon
mkdir tmp
# Extract
tar -zxf of-release-v0.1.0.tar.gz -C ./tmp/
for x in `find ./tmp/ -name "*.tar.gz"`;do \
app=`echo $x|cut -d '-' -f2`; \
mkdir -p $app; \
tar -zxf $x -C $app; \
done
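To make the loop above easier to follow, here is a minimal Python sketch of what `cut -d '-' -f2` pulls out of each archive path. The archive names are hypothetical and only there for illustration; the real file names inside of-release-v0.1.0.tar.gz may differ.

```python
def component_name(archive_path):
    """Mimic `echo $x | cut -d '-' -f2`: return the second '-'-separated field."""
    fields = archive_path.split("-")
    if len(fields) == 1:
        # cut prints the whole line when the delimiter does not occur
        return archive_path
    return fields[1]


# Hypothetical archive names, used only for illustration.
for path in ["./tmp/falcon-agent-5.1.0.tar.gz", "./tmp/falcon-graph-0.5.6.tar.gz"]:
    print(path, "->", component_name(path))
```

Because the split runs over the whole path, a directory name containing a hyphen would shift the fields, which is one reason to keep the extraction directory named plainly (tmp in this article).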
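III. Installing the Open-Falcon graphing components
1. Agent
An agent must be deployed on every machine. It automatically collects a predefined set of metrics and pushes them to transfer every 60 seconds.
cd $WORKSPACE/agent/
mv cfg.example.json cfg.json
vim cfg.json
- Set enabled under the transfer section to true to turn on pushing data to transfer
- Set addrs under the transfer section to ["127.0.0.1:8433"] (the listen address of the transfer component; it is a list, so several transfer instances can be given, separated by commas)
# By default (all components on one server) cfg.json can be left unchanged
# For the individual settings in cfg.json, see https://github.com/open-falcon/agent/blob/master/README.md
# Start
./control start
# View the log
./control tail
# Once started, open it in a browser
http://192.168.102.141:1988/
[Configuration file]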
/home/falcon/tmp/agent/cfg.json
[falcon@open-falcon-demo agent]$ more cfg.json
{
"debug": false,
"hostname": "open-falcon-demo",
"ip": "192.168.102.141",
"plugin": {
"enabled": false,
"dir": "./plugin",
"git": "https://github.com/open-falcon/plugin.git",
"logs": "./logs"
},
"heartbeat": {
"enabled": true,
"addr": "127.0.0.1:6030",
"interval": 60,
"timeout": 1000
},
"transfer": {
"enabled": true,
"addrs": [
"127.0.0.1:8433",
"127.0.0.1:8433"
],
"interval": 60,
"timeout": 1000
},
"http": {
"enabled": true,
"listen": ":1988",
"backdoor": false
},
"collector": {
"ifacePrefix": ["eth", "em"]
},
"ignore": {
"cpu.busy": true,
"df.bytes.free": true,
"df.bytes.total": true,
"df.bytes.used": true,
"df.bytes.used.percent": true,
"df.inodes.total": true,
"df.inodes.free": true,
"df.inodes.used": true,
"df.inodes.used.percent": true,
"mem.memtotal": true,
"mem.memused": true,
"mem.memused.percent": true,
"mem.memfree": true,
"mem.swaptotal": true,
"mem.swapused": true,
"mem.swapfree": true
}
}
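As a quick sanity check on the transfer section of the agent configuration, the following Python sketch reads a cfg.json-style document and lists the transfer addresses the agent would push to. The function name `transfer_targets` is made up for this example, and how the agent itself treats a repeated address is not covered in this article; the sketch simply drops duplicates so they are easy to spot.

```python
import json


def transfer_targets(cfg_text):
    """Return the distinct transfer addresses, or [] when pushing is disabled."""
    cfg = json.loads(cfg_text)
    transfer = cfg.get("transfer", {})
    if not transfer.get("enabled"):
        return []
    targets = []
    for addr in transfer.get("addrs", []):
        if addr not in targets:  # keep first occurrence, preserve order
            targets.append(addr)
    return targets


# A trimmed-down copy of the transfer section from the listing above.
sample = """
{"transfer": {"enabled": true,
              "addrs": ["127.0.0.1:8433", "127.0.0.1:8433"],
              "interval": 60, "timeout": 1000}}
"""
print(transfer_targets(sample))
```

Run against the real file with `transfer_targets(open("cfg.json").read())`; an empty list means the enabled switch is still off.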
The page shown in the browser: (screenshot not included)

2. Aggregator
cd $WORKSPACE/aggregator/
mv cfg.example.json cfg.json
[Configuration file]
/home/falcon/tmp/aggregator/cfg.json
[falcon@open-falcon-demo aggregator]$ more cfg.json
{
"debug": false,
"http": {
"enabled": true,
"listen": "0.0.0.0:6055"
},
"database": {
"addr": "root:mysql@tcp(127.0.0.1:3306)/falcon_portal?loc=Local&parseTime=true",
"idle": 10,
"ids": [1, -1],
"interval": 55
},
"api": {
"hostnames": "http://127.0.0.1:5050/api/group/%s/hosts.json",
"push": "http://127.0.0.1:6060/api/push",
"graphLast": "http://127.0.0.1:9966/graph/last"
}
}
3. Transfer
transfer listens on port 8433 by default; agents push data to it over JSON-RPC.
cd $WORKSPACE/transfer/
mv cfg.example.json cfg.json
# By default (all components on one server) cfg.json can be left unchanged
# For the individual settings in cfg.json, see https://github.com/open-falcon/transfer/blob/master/README.md
# Adjust cfg.json if necessary
# Start transfer
./control start
# Verify the service, assuming its HTTP listener is on port 6060. A response of ok means the service started correctly.
curl -s "http://127.0.0.1:6060/health"
# View the log
./control tail
# Stop transfer
./control stop
[falcon@open-falcon-demo transfer]$ more cfg.json
{
"debug": false,
"minStep": 30,
"http": {
"enabled": true,
"listen": "0.0.0.0:6060"
},
"rpc": {
"enabled": true,
"listen": "0.0.0.0:8433"
},
"socket": {
"enabled": false,
"listen": "0.0.0.0:4444",
"timeout": 3600
},
"judge": {
"enabled": true,
"batch": 200,
"connTimeout": 1000,
"callTimeout": 5000,
"maxConns": 32,
"maxIdle": 32,
"replicas": 500,
"cluster": {
"judge-00" : "127.0.0.1:6080"
}
},
"graph": {
"enabled": true,
"batch": 200,
"connTimeout": 1000,
"callTimeout": 5000,
"maxConns": 32,
"maxIdle": 32,
"replicas": 500,
"cluster": {
"graph-00" : "127.0.0.1:6070"
}
},
"tsdb": {
"enabled": false,
"batch": 200,
"connTimeout": 1000,
"callTimeout": 5000,
"maxConns": 32,
"maxIdle": 32,
"retry": 3,
"address": "127.0.0.1:8088"
}
}
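The curl health check used for transfer (and later for graph) can also be scripted. Below is a minimal Python sketch, assuming only what the article states: a healthy service answers GET /health with ok. The helper names `is_healthy` and `check_health` are invented for this example, and the `fetch` parameter exists so the logic can be exercised without a running service.

```python
from urllib.request import urlopen


def is_healthy(body):
    """A service counts as healthy when its /health body is exactly 'ok'."""
    return body.strip() == "ok"


def check_health(url, fetch=None, timeout=3):
    """Return True if GET url answers 'ok'; any network error counts as unhealthy."""
    if fetch is None:
        def fetch(target):
            with urlopen(target, timeout=timeout) as resp:
                return resp.read().decode("utf-8", "replace")
    try:
        return is_healthy(fetch(url))
    except OSError:
        # URLError, HTTPError and socket timeouts are all OSError subclasses
        return False
```

With the ports used in this article, `check_health("http://127.0.0.1:6060/health")` would cover transfer and `check_health("http://127.0.0.1:6071/health")` would cover graph.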
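4. Graph
The graph component stores graphing data and historical data. transfer forwards the data it receives to graph.
cd $WORKSPACE/graph/
mv cfg.example.json cfg.json
# Create the graph data storage directory
mkdir -p /home/falcon/data/6070
# By default (all components on one server) cfg.json can be left unchanged
# For the individual settings in cfg.json, see https://github.com/open-falcon/graph/blob/master/README.md
# Start
./control start
# View the log
./control tail
# Verify the service, assuming its HTTP listener is on port 6071. A response of ok means the service started correctly.
curl -s "http://127.0.0.1:6071/health"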
[falcon@open-falcon-demo graph]$ more cfg.json
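{
"pid": "/home/falcon/open-falcon/graph/var/app.pid", #change to the actual directory on this machine (the # notes in this listing are annotations; JSON has no comment syntax, so leave them out of the real file)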
"log": "info",
"debug": false,
"http": {
"enabled": true,
"listen": "0.0.0.0:6071"
},
"rpc": {
"enabled": true,
"listen": "0.0.0.0:6070"
},
"rrd": {
"storage": "/home/falcon/data/6070" #graph data storage directory; it must be created by hand
},
"db": {
"dsn": "root:mysql@tcp(127.0.0.1:3306)/graph?loc=Local&parseTime=true", #the "mysql" after "root:" is the MySQL root password
"maxIdle": 4
},
"callTimeout": 5000,
"migrate": {
"enabled": false,
"concurrency": 2,
"replicas": 500,
"cluster": {
"graph-00" : "127.0.0.1:6070"
}
}
}
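Since the same DSN shape appears in several of the configuration files in this article, a small Python sketch may help show which part is which. It handles only the `user:password@tcp(host:port)/database?params` form used here; the regular expression and the name `parse_dsn` are my own for illustration, and passwords containing `@` are deliberately out of scope.

```python
import re

# Matches only the user:password@tcp(host:port)/database?params form.
DSN_RE = re.compile(
    r"^(?P<user>[^:@]+):(?P<password>[^@]*)"
    r"@tcp\((?P<host>[^:)]+):(?P<port>\d+)\)"
    r"/(?P<database>[^?]+)(?:\?(?P<params>.*))?$"
)


def parse_dsn(dsn):
    """Split a DSN into its named parts; raise ValueError on any other shape."""
    match = DSN_RE.match(dsn)
    if match is None:
        raise ValueError("unrecognised DSN: %r" % dsn)
    parts = match.groupdict()
    parts["port"] = int(parts["port"])
    return parts


parts = parse_dsn("root:mysql@tcp(127.0.0.1:3306)/graph?loc=Local&parseTime=true")
for key in ("user", "password", "host", "port", "database", "params"):
    print(key, "=", parts[key])
```

When adapting the configs, the password and database fields are the ones that normally need changing.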
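5. Query
The query component is the query interface for graphing data. When it receives a user request, it fetches the matching data from the graph back ends, aggregates it, and returns the result to the user.
cd $WORKSPACE/query/
mv cfg.example.json cfg.json
# In the query directory, create graph_backends.txt and fill in the graph back ends; the values come from migrate > cluster in graph's cfg.json
cd /home/falcon/tmp/query
vi graph_backends.txt
graph-00 127.0.0.1:6070
# By default (all components on one server) cfg.json can be left unchanged
# For the individual settings in cfg.json, see https://github.com/open-falcon/query/blob/master/README.md
# Start
./control start
# View the log
./control tail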
[falcon@open-falcon-demo query]$ more cfg.json
{
"log_level": "info",
"slowlog": 2000,
"debug": "false",
"http": {
"enabled": true,
"listen": "0.0.0.0:9966"
},
"graph": {
"backends": "./graph_backends.txt",
"reload_interval": 60,
"connTimeout": 1000,
"callTimeout": 5000,
"maxConns": 32,
"maxIdle": 32,
"replicas": 500,
"cluster": {
"graph-00": "127.0.0.1:6070"
}
},
"api": {
"query": "http://127.0.0.1:9966",
"dashboard": "http://127.0.0.1:8081",
"max": 500
}
}
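The graph_backends.txt file and the cluster section of cfg.json are supposed to describe the same graph back ends, so a mismatch between them is an easy mistake to make. Here is a rough Python sketch that parses the one-entry-per-line "name address" layout shown in this article and compares it with a cluster mapping. The exact syntax query accepts is an assumption beyond that single example line, and both function names are hypothetical.

```python
def parse_backends(text):
    """Parse 'name address' lines into a dict, skipping blank lines."""
    backends = {}
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        name, addr = line.split(None, 1)  # ValueError if the address is missing
        backends[name] = addr.strip()
    return backends


def mismatches(backends, cluster):
    """Names whose address differs, or that appear on one side only."""
    names = set(backends) | set(cluster)
    return sorted(n for n in names if backends.get(n) != cluster.get(n))


backends = parse_backends("graph-00 127.0.0.1:6070\n")
cluster = {"graph-00": "127.0.0.1:6070"}  # from graph.cluster in cfg.json
print(backends, mismatches(backends, cluster))
```

An empty mismatch list means the two places agree.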
6. Dashboard
dashboard is the user-facing query interface. Here users can see all the data pushed to graph and view its trend graphs.
# Install dependencies
# Configure the EPEL repository and install virtualenv
rpm -ivh http://dl.fedoraproject.org/pub/epel/6/x86_64/epel-release-6-8.noarch.rpm
yum install -y python-pip
pip install virtualenv
# Create two symbolic links that match the actual MySQL installation path
ln -s /usr/local/mysql/lib/libmysqlclient.so.20 /usr/lib/libmysqlclient.so.20
ln -s /usr/local/mysql/lib/libmysqlclient.so.20 /usr/lib64/libmysqlclient.so.20
# Remove the mysql-python line from pip_requirements.txt and install it separately with easy_install
# Create and enter the virtualenv environment
[falcon@open-falcon-demo dashboard]$ virtualenv env
[falcon@open-falcon-demo dashboard]$ source env/bin/activate
# Install mysql-python
(env)[falcon@open-falcon-demo dashboard]$ easy_install mysql-python
# Find the line ./env/bin/pip install -r pip_requirements.txt -i http://pypi.douban.com/simple in the README file and run it
(env)[falcon@open-falcon-demo dashboard]$ ./env/bin/pip install -r pip_requirements.txt -i http://pypi.douban.com/simple
# Start Dashboard
(env)[falcon@open-falcon-demo dashboard]$ ./control start
# Check the Dashboard status
(env)[falcon@open-falcon-demo dashboard]$ ./control status
# View the log
(env)[falcon@open-falcon-demo dashboard]$ ./control tail
# Leave the virtualenv environment
(env)[falcon@open-falcon-demo dashboard]$ deactivate
# Once started, open it in a browser
http://192.168.102.141:8081/
[Configuration file]
/home/falcon/tmp/dashboard/rrd/config.py
[falcon@open-falcon-demo rrd]$ more config.py
#-*-coding:utf8-*-
import os
#-- dashboard db config --
DASHBOARD_DB_HOST = "127.0.0.1"
DASHBOARD_DB_PORT = 3306
DASHBOARD_DB_USER = "root"
DASHBOARD_DB_PASSWD = "mysql"
DASHBOARD_DB_NAME = "dashboard"
#-- graph db config --
GRAPH_DB_HOST = "127.0.0.1"
GRAPH_DB_PORT = 3306
GRAPH_DB_USER = "root"
GRAPH_DB_PASSWD = "mysql"
GRAPH_DB_NAME = "graph"
#-- app config --
DEBUG = True
SECRET_KEY = "secret-key"
SESSION_COOKIE_NAME = "open-falcon"
PERMANENT_SESSION_LIFETIME = 3600 * 24 * 30
SITE_COOKIE = "open-falcon-ck"
#-- query config --
QUERY_ADDR = "http://127.0.0.1:9966"
#BASE_DIR = "/home/falcon/open-falcon/dashboard/"
BASE_DIR="/home/falcon/data/6070" #same as the data storage directory created for graph
LOG_PATH = os.path.join(BASE_DIR,"log/")
try:
from rrd.local_config import *
except:
pass
7. Task
cd /home/falcon/tmp/task
mv cfg.example.json cfg.json
# Edit the configuration file
[falcon@open-falcon-demo task]$ more cfg.json
{
"debug": false,
"http": {
"enable": true,
"listen": "0.0.0.0:8002"
},
"index": {
"enable": true,
"dsn": "root:mysql@tcp(127.0.0.1:3306)/graph?loc=Local&parseTime=true", #the "mysql" after "root:" is the MySQL root password
"maxIdle": 4,
"autoDelete": false,
"cluster":{
"test.hostname01:6071" : "0 0 0 ? * 0-5",
"test.hostname02:6071" : "0 30 0 ? * 0-5"
}
},
"collector" : {
"enable": true,
"destUrl" : "http://127.0.0.1:1988/v1/push",
"srcUrlFmt" : "http://%s/statistics/all",
"cluster" : [
"transfer,test.hostname:6060",
"graph,test.hostname:6071",
"task,test.hostname:8001"
]
}
}
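The collector section pairs a list of "module,host:port" entries with a URL template, so it is worth seeing which URLs that combination yields. The Python sketch below expands them. That task fetches exactly these URLs is an inference from the key names `srcUrlFmt` and `cluster`, not something confirmed in this article, and the function name is hypothetical.

```python
def collector_urls(cluster, src_url_fmt):
    """Expand 'module,host:port' entries into (module, statistics URL) pairs."""
    urls = []
    for entry in cluster:
        module, host_port = entry.split(",", 1)
        urls.append((module.strip(), src_url_fmt % host_port.strip()))
    return urls


# Values copied from the collector section of the listing above.
cluster = [
    "transfer,test.hostname:6060",
    "graph,test.hostname:6071",
    "task,test.hostname:8001",
]
for module, url in collector_urls(cluster, "http://%s/statistics/all"):
    print(module, url)
```

The test.hostname placeholders would need to be replaced with real host names. The sample also lists task on port 8001 while http.listen above is 8002, which may need aligning.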
# Start task
[falcon@open-falcon-demo task]$ ./control start
# Check the status
[falcon@open-falcon-demo task]$ ./control status
# View the log
[falcon@open-falcon-demo task]$ ./control tail
# Restart
[falcon@open-falcon-demo task]$ ./control restart
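IV. Installing the Open-Falcon alarm components
1. Sender
sender reads mail and SMS messages from Redis and sends them, at a configured level of concurrency, by calling the mail-provider and sms-provider that each company supplies. alarm only has to write the alarm SMS and mail messages it generates into Redis; sender does the sending.
cd $WORKSPACE/sender/
mv cfg.example.json cfg.json
# vi cfg.json
# redis: must be the same Redis instance that alarm and judge use later on
# queue: keep the defaults
# worker: the maximum number of threads calling the SMS and mail sending APIs at the same time
# api: the endpoint addresses of the sms-provider and mail-provider
./control start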
[falcon@open-falcon-demo sender]$ more cfg.json
{
"debug": false,
"http": {
"enabled": true,
"listen": "0.0.0.0:6066"
},
"redis": {
"addr": "127.0.0.1:6379",
"maxIdle": 5
},
"queue": {
"sms": "/sms",
"mail": "/mail"
},
"worker": {
"sms": 10,
"mail": 50
},
"api": {
"sms": "http://11.11.11.11:8000/sms",
"mail": "http://11.11.11.11:9000/mail"
}
}
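The worker setting caps how many sends run at once. To illustrate the idea (not sender's actual implementation), here is a small Python sketch in which an in-memory list stands in for the Redis queue /sms and `RecordingSender` is a hypothetical stub for the HTTP call to the sms-provider.

```python
import threading
from concurrent.futures import ThreadPoolExecutor


def drain(messages, send, workers):
    """Call send(message) for every message with at most `workers` calls in flight."""
    with ThreadPoolExecutor(max_workers=workers) as pool:
        return list(pool.map(send, messages))  # results keep input order


class RecordingSender:
    """Hypothetical stand-in for the provider call; tracks peak concurrency."""

    def __init__(self):
        self._lock = threading.Lock()
        self._in_flight = 0
        self.peak = 0

    def __call__(self, message):
        with self._lock:
            self._in_flight += 1
            self.peak = max(self.peak, self._in_flight)
        try:
            return "sent:" + message
        finally:
            with self._lock:
                self._in_flight -= 1


sender = RecordingSender()
results = drain(["alert-%d" % i for i in range(100)], sender, workers=10)
print(len(results), "messages sent, peak concurrency", sender.peak)
```

With workers=10, mirroring worker.sms in the config above, the observed peak never exceeds 10 however long the queue is.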
2. UIC (FE)
cd $WORKSPACE/fe/
mv cfg.example.json cfg.json
# Adjust the settings as needed, starting from cfg.example.json
# Start
./control start
# View the log
./control tail
# Stop the service
./control stop
[falcon@open-falcon-demo fe]$ more cfg.json
{
"log": "debug",
"company": "MI",
"http": {
"enabled": true,
"listen": "0.0.0.0:1234"
},
"cache": {
"enabled": true,
"redis": "127.0.0.1:6379",
"idle": 10,
"max": 1000,
"timeout": {
"conn": 10000,
"read": 5000,
"write": 5000
}
},
"salt": "",
"canRegister": true,
"ldap": {
"enabled": false,
"addr": "ldap.example.com:389",
"baseDN": "dc=example,dc=com",
"bindDN": "cn=mananger,dc=example,dc=com",
"bindPasswd": "12345678",
"userField": "uid",
"attributes": ["sn","mail","telephoneNumber"]
},
"uic": {
"addr": "root:mysql@tcp(127.0.0.1:3306)/uic?charset=utf8&loc=Asia%2FChongqing", #the "mysql" after "root:" is the MySQL root password
"idle": 10,
"max": 100
},
"shortcut": {
"falconPortal": "http://192.168.102.141:5050/", #Portal URL
"falconDashboard": "http://192.168.102.141:8081/", #Dashboard URL
"falconAlarm": "http://192.168.102.141:9912/" #Alarm URL
}
}
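3. Portal
portal is where alarm strategies are configured.
yum install -y python-virtualenv # run as root
cd $WORKSPACE/portal/
virtualenv ./env
./env/bin/pip install -r pip_requirements.txt
# vi frame/config.py
# 1. Update the DB settings
# 2. Set SECRET_KEY to a random string
# 3. UIC_ADDRESS has two entries. internal is the intranet address of the FE module; portal usually sits on the same network segment as UIC,
#    and intranet access between them is fast. external is the UIC address that end users reach from their browsers, and it matters a great deal!
# 4. The remaining settings can keep their defaults
./control start
portal listens on port 5050 by default; open it in a browser.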
more /home/falcon/tmp/portal/frame/config.py
# -*- coding:utf-8 -*-
__author__ = 'Ulric Qin'
# -- app config --
DEBUG = True
# -- db config --
DB_HOST = "127.0.0.1"
DB_PORT = 3306
DB_USER = "root"
DB_PASS = "mysql" #database password
DB_NAME = "falcon_portal"
# -- cookie config --
SECRET_KEY = "4e.5tyg8-u9ioj"
SESSION_COOKIE_NAME = "falcon-portal"
PERMANENT_SESSION_LIFETIME = 3600 * 24 * 30
UIC_ADDRESS = {
'internal': 'http://127.0.0.1:1234',
'external': 'http://192.168.102.141:1234', #address reachable from a browser
}
UIC_TOKEN = ''
MAINTAINERS = ['root']
CONTACT = 'ulric.qin@gmail.com'
COMMUNITY = True
try:
from frame.local_config import *
except Exception, e:
print "[warning] %s" % e
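The setup notes ask for SECRET_KEY to be a random string, while the listing above still carries a fixed value. One way to generate a replacement is sketched below using Python 3's standard `secrets` module; the helper name `new_secret_key` is made up for this example. Portal itself runs on Python 2 here, so this would be run separately and the result pasted into frame/config.py.

```python
import secrets


def new_secret_key(nbytes=32):
    """Return a URL-safe random string built from nbytes of OS randomness."""
    return secrets.token_urlsafe(nbytes)


# Prints a line that can be pasted into frame/config.py.
print('SECRET_KEY = "%s"' % new_secret_key())
```

The URL-safe alphabet contains no quote characters, so the value can sit inside the double quotes without escaping.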
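4. HBS
The heartbeat server; it depends only on Portal's database.
cd $WORKSPACE/hbs/
mv cfg.example.json cfg.json
# vi cfg.json and point the database setting at portal's DB
./control start
If the graphing components were installed before the alarm components, the agent is already in place. Once started, hbs listens on one HTTP port and one RPC port. For the agent to talk to hbs, edit the agent's cfg.json again: set enabled under heartbeat to true and set addr to the RPC address of hbs, then restart the agent with ./control restart. After that the agent sends heartbeats to hbs.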
[falcon@open-falcon-demo hbs]$ more cfg.json
{
"debug": true,
"database": "root:mysql@tcp(127.0.0.1:3306)/falcon_portal?loc=Local&parseTime=true",
"hosts": "",
"maxIdle": 100,
"listen": ":6030",
"trustable": [""],
"http": {
"enabled": true,
"listen": "0.0.0.0:6031"
}
}
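5. Judge
The alarm evaluation module. judge depends on HBS, so HBS has to be set up first.
cd $WORKSPACE/judge/
mv cfg.example.json cfg.json
# vi cfg.json
# remain: how many points judge keeps in memory per series, for example the maximum number of cpu.idle values kept for the host host01.
#   When configuring an alarm such as all(#3), the number after # must not exceed remain-1
# hbs: the address of hbs; interval defaults to 60s, meaning strategies are pulled from hbs every 60s
# alarm: alarm events are written to the Redis configured under alarm; minInterval is the minimum number of seconds between two consecutive alarms, and the default is fine
./control start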
[falcon@open-falcon-demo judge]$ more cfg.json
{
"debug": true,
"debugHost": "nil",
"remain": 11,
"http": {
"enabled": true,
"listen": "0.0.0.0:6081"
},
"rpc": {
"enabled": true,
"listen": "0.0.0.0:6080"
},
"hbs": {
"servers": ["127.0.0.1:6030"],
"timeout": 300,
"interval": 60
},
"alarm": {
"enabled": true,
"minInterval": 300,
"queuePattern": "event:p%v",
"redis": {
"dsn": "127.0.0.1:6379",
"maxIdle": 5,
"connTimeout": 5000,
"readTimeout": 5000,
"writeTimeout": 5000
}
}
}
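The relation between remain and expressions such as all(#3) is easier to see with a toy model. The Python sketch below keeps at most remain points per series and evaluates an all(#n)-style condition over the latest n of them. It is a simplified illustration under the rule stated in the comments above (n must not exceed remain-1), not judge's real code, and the class and method names are hypothetical.

```python
from collections import deque


class History:
    """Keep at most `remain` recent values for one series (a simplified model)."""

    def __init__(self, remain):
        self.remain = remain
        self.points = deque(maxlen=remain)  # oldest points fall off automatically

    def push(self, value):
        self.points.append(value)

    def all_last(self, n, predicate):
        """True when each of the last n points satisfies predicate, as in all(#n)."""
        if n > self.remain - 1:
            raise ValueError("n=%d exceeds remain-1=%d" % (n, self.remain - 1))
        if len(self.points) < n:
            return False  # not enough data collected yet
        return all(predicate(v) for v in list(self.points)[-n:])


history = History(remain=11)  # remain as in the config above
for idle in [40.0, 4.0, 3.0, 2.0]:
    history.push(idle)
# all(#3) with the condition "cpu.idle < 5" over the three newest points
print(history.all_last(3, lambda v: v < 5))
```

With remain set to 11 as above, the largest usable value after # is 10.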
6. Links
What links does: when several alarms are merged into a single alarm message, the SMS carries an HTTP link to the alarm details so that users can look them up.
# yum install -y python-virtualenv
$ cd $WORKSPACE/links/
$ virtualenv ./env
$ ./env/bin/pip install -r pip_requirements.txt
./control start
./control status
./control tail
cd /home/falcon/tmp/links/frame
[falcon@open-falcon-demo frame]$ more config.py
# -*- coding:utf-8 -*-
__author__ = 'Ulric Qin'
# -- app config --
DEBUG = True
# -- db config --
DB_HOST = "127.0.0.1"
DB_PORT = 3306
DB_USER = "root"
DB_PASS = "mysql"
DB_NAME = "falcon_links"
# -- cookie config --
SECRET_KEY = "4e.5tyg8-u9ioj"
SESSION_COOKIE_NAME = "falcon-links"
PERMANENT_SESSION_LIFETIME = 3600 * 24 * 30
try:
from frame.local_config import *
except Exception, e:
print "[warning] %s" % e
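7. Alarm
The alarm module processes alarm events: judge writes the events it produces into Redis and alarm reads them from there. This module has become messy under business-specific requirements, and each company can rewrite it to suit its own needs.
cd $WORKSPACE/alarm/
mv cfg.example.json cfg.json
# vi cfg.json
# Point redis at the same instance that judge uses
./control start
Note: in the current version of alarm, neither highQueues nor lowQueues may be empty. This is a bug that is due to be fixed. As a workaround, put event:p0 through event:p5 in highQueues and event:p6 in lowQueues.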
[falcon@open-falcon-demo alarm]$ more cfg.json
{
"debug": true,
"uicToken": "",
"http": {
"enabled": true,
"listen": "0.0.0.0:9912"
},
"queue": {
"sms": "/sms",
"mail": "/mail"
},
"redis": {
"addr": "127.0.0.1:6379",
"maxIdle": 5,
"highQueues": [
"event:p0",
"event:p1",
"event:p2",
"event:p3",
"event:p4",
"event:p5"
],
"lowQueues": [
"event:p6"
],
"userSmsQueue": "/queue/user/sms",
"userMailQueue": "/queue/user/mail"
},
"api": {
"portal": "http://192.168.102.141:5050",
"uic": "http://127.0.0.1:1234",
"links": "http://192.168.102.141:5090"
}
}
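The queue names in highQueues and lowQueues line up with judge's queuePattern of event:p%v, where the priority fills in the placeholder. The short Python sketch below shows that naming and which list a given priority lands in. What alarm does differently for the two lists is not described in this article, so the sketch stops at classification; the function names are hypothetical and `%d` stands in for Go's `%v`.

```python
QUEUE_PATTERN = "event:p%d"  # Python counterpart of judge's "event:p%v"

HIGH_QUEUES = ["event:p%d" % p for p in range(6)]  # event:p0 .. event:p5
LOW_QUEUES = ["event:p6"]


def queue_for(priority):
    """Name of the Redis queue an event of this priority is written to."""
    return QUEUE_PATTERN % priority


def queue_class(priority, high=HIGH_QUEUES, low=LOW_QUEUES):
    """Return 'high', 'low', or None when no configured list contains the queue."""
    name = queue_for(priority)
    if name in high:
        return "high"
    if name in low:
        return "low"
    return None


for priority in (0, 5, 6, 7):
    print(priority, queue_for(priority), queue_class(priority))
```

A result of None points at a priority whose queue alarm is not configured to read.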
PS: In this example open-falcon was installed as the user falcon.
The home directory of falcon is /home/falcon
An archive of all the finished configuration files is available here: https://pan.baidu.com/s/1ii6r0-iJYYt4Mn_WzHcfcw
[agent]
http://192.168.102.141:1988/
[dashboard]
http://192.168.102.141:8081/
[uic/fe]
http://192.168.102.141:1234/
[Portal]
http://192.168.102.141:5050/
[alarm]
http://192.168.102.141:9912/
Manually trigger a full index update on graph
curl -s "http://127.0.0.1:6071/index/updateAll"
Source: https://www.cnblogs.com/xialiaoliao0911/p/8546211.html