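Source: https://testerhome.com/topics/3026?locale=zh-cn (published on TesterHome). Author: htmlbiji (超愛fitnesse)
Overview of log management tools
First, here is the list of log aggregation tools from the article "Recommended! A Comprehensive List of Sysadmin Resources Compiled by Overseas Programmers":
Log management tools: collection, parsing, visualization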
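- Elasticsearch - a Lucene-based document store, mainly used for log indexing, storage and analysis.
- Fluentd - log collection and forwarding
- Flume - distributed log collection and aggregation system
- Graylog2 - pluggable log and event analysis server with alerting options
- Heka - stream processing system that can be used for log aggregation
- Kibana - visualization of logs and timestamped data
- Logstash - tool for managing events and logs
- Octopussy - log management solution (visualization / alerting / reporting)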
Graylog compared with ELK
- ELK: Logstash -> Elasticsearch -> Kibana
- Graylog: Graylog Collector -> Graylog Server (wraps Elasticsearch) -> Graylog Web
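I previously tried a Fluentd + Elasticsearch + Kibana setup and found several drawbacks:
- It cannot handle multi-line logs, such as MySQL slow query logs or the Java exception stack traces printed by Tomcat/Jetty applications.
- It cannot keep the original log line. The line is stored only as separate fields, so search results are a pile of JSON text that is hard to read.
- Log lines that do not match the regular expression are all discarded.
I went looking for an alternative that would address these three drawbacks.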
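The first candidate was Splunk, a commercial log tool billed as the Google of logs for its full-text log search. It addresses all three drawbacks and adds attractive features such as highlighting of search terms and color-coding of log lines by error level. However, the free version has a 500 MB limit and the paid version reportedly costs about 30,000 US dollars, so I gave up on it and kept looking.
Finally I found Graylog. At first glance it looked like nothing more than a collector for syslog system logs, and it did not appeal to me at all. After digging deeper, though, I found that Graylog is essentially an open-source Splunk.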
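What I find attractive about Graylog:
- It is an all-in-one solution that is easy to install, without the integration problems ELK has between its three separate systems.
- It collects the original log line, and fields such as http_status_code and response_time can be added afterwards.
- You can write your own log collection scripts and send the logs to the Graylog server with curl/nc. The message format is Graylog's own GELF, and both Fluentd and Logstash have output plugins that emit GELF messages. Writing your own collector gives a lot of freedom: all that is needed is to watch the log file for modify events with inotifywait and send the newly appended lines to the Graylog server with curl/netcat.
- Search results are highlighted, just like Google.
- The search syntax is simple, for example:
source:mongo AND response_time_ms:>5000
which avoids typing Elasticsearch's JSON query syntax directly.
- Search conditions can be exported as Elasticsearch JSON query text, which makes it easy to write search scripts that call the Elasticsearch REST API directly.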
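To make the GELF point above concrete, here is a minimal Python sketch of building one GELF message. It assumes GELF 1.1, where version, host and short_message are mandatory and every additional field name starts with an underscore. The helper name build_gelf and the sample values are illustrative, not part of Graylog.

```python
import json


def build_gelf(short_message, host, level=None, **extra):
    """Return a GELF 1.1 message as a JSON string.

    Keyword arguments become additional fields; GELF requires their
    names to be prefixed with an underscore.
    """
    message = {"version": "1.1", "host": host, "short_message": short_message}
    if level is not None:
        message["level"] = level  # syslog severity: 0 (emergency) to 7 (debug)
    for name, value in extra.items():
        message["_" + name] = value
    return json.dumps(message)


# Hypothetical values, only to show the shape of the payload
print(build_gelf("GET /index.html", "nginx", level=6,
                 http_status_code=200, response_time=0.012))
```

The resulting string is what a collector script would pass to curl -d or write to a GELF input.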
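Graylog illustrated
Website of the open-source edition of Graylog: https://www.graylog.org/
A few screenshots from the official site:
1. Architecture diagram
2. Screenshots
3. Deployment diagrams
Minimal installation:
Production installation: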
Installing the Graylog server
The installation consists of four components:
- mongodb
- elasticsearch
- graylog-server
- graylog-web
The environment below is CentOS 6.6, the server IP is 10.0.0.11, and jre-1.7.0-openjdk is already installed.
1. mongodb
http://docs.mongodb.org/manual/tutorial/install-mongodb-on-red-hat
[root@logserver yum.repos.d]# vim /etc/yum.repos.d/mongodb-org-3.0.repo
---
[mongodb-org-3.0]
name=MongoDB Repository
baseurl=http://repo.mongodb.org/yum/redhat/$releasever/mongodb-org/3.0/x86_64/
gpgcheck=0
enabled=1
---
[root@logserver yum.repos.d]# yum install -y mongodb-org
[root@logserver yum.repos.d]# vi /etc/yum.conf
Add at the end of the file:
---
exclude=mongodb-org,mongodb-org-server,mongodb-org-shell,mongodb-org-mongos,mongodb-org-tools
---
[root@logserver yum.repos.d]# service mongod start
[root@logserver yum.repos.d]# chkconfig mongod on
[root@logserver yum.repos.d]# vi /etc/security/limits.conf
Add at the end of the file:
---
* soft nproc 65536
* hard nproc 65536
mongod soft nproc 65536
* soft nofile 131072
* hard nofile 131072
---
[root@logserver ~]# vi /etc/init.d/mongod
Insert before the "ulimit -f unlimited" line:
---
if test -f /sys/kernel/mm/transparent_hugepage/enabled; then
echo never > /sys/kernel/mm/transparent_hugepage/enabled
fi
if test -f /sys/kernel/mm/transparent_hugepage/defrag; then
echo never > /sys/kernel/mm/transparent_hugepage/defrag
fi
---
[root@logserver ~]# /etc/init.d/mongod restart
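After the restart it can be useful to confirm that transparent huge pages are really off. The kernel marks the active setting in these files with square brackets, for example "always madvise [never]". The sketch below only parses that format; the function names are illustrative.

```python
def active_thp_setting(text):
    """Return the bracketed (active) value from a transparent_hugepage file."""
    for word in text.split():
        if word.startswith("[") and word.endswith("]"):
            return word[1:-1]
    return None


def read_thp(path="/sys/kernel/mm/transparent_hugepage/enabled"):
    """Read the active setting, or None if the file is missing or unreadable."""
    try:
        with open(path) as f:
            return active_thp_setting(f.read())
    except OSError:
        return None


print(active_thp_setting("always madvise [never]"))  # prints "never"
```

On the server, read_thp() should return "never" for both the enabled and the defrag file once the init script change has taken effect.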
2. elasticsearch
The latest Elasticsearch release is 1.6.0; the repository configured below provides the 1.5.x packages.
https://www.elastic.co/guide/en/elasticsearch/reference/current/setup-repositories.html
[root@logserver ~]# rpm --import https://packages.elastic.co/GPG-KEY-elasticsearch
[root@logserver ~]# vi /etc/yum.repos.d/elasticsearch.repo
---
[elasticsearch-1.5]
name=Elasticsearch repository for 1.5.x packages
baseurl=http://packages.elastic.co/elasticsearch/1.5/centos
gpgcheck=1
gpgkey=http://packages.elastic.co/GPG-KEY-elasticsearch
enabled=1
---
[root@logserver ~]# yum install elasticsearch
[root@logserver ~]# chkconfig --add elasticsearch
[root@logserver ~]# vi /etc/elasticsearch/elasticsearch.yml
32 cluster.name: graylog
[root@logserver ~]# /etc/init.d/elasticsearch start
[root@logserver ~]# curl localhost:9200
3. graylog
The latest Graylog release is 1.1.4, with download links as follows (the commands further down install 1.0.2):
https://packages.graylog2.org/repo/el/6Server/1.1/x86_64/graylog-server-1.1.4-1.noarch.rpm
https://packages.graylog2.org/repo/el/6Server/1.1/x86_64/graylog-web-1.1.4-1.noarch.rpm
[root@logserver ~]# wget https://packages.graylog2.org/repo/el/6Server/1.0/x86_64/graylog-server-1.0.2-1.noarch.rpm
[root@logserver ~]# wget https://packages.graylog2.org/repo/el/6Server/1.0/x86_64/graylog-web-1.0.2-1.noarch.rpm
[root@logserver ~]# rpm -ivh graylog-server-1.0.2-1.noarch.rpm
[root@logserver ~]# rpm -ivh graylog-web-1.0.2-1.noarch.rpm
[root@logserver ~]# /etc/init.d/graylog-server start
Starting graylog-server: [OK]
The service failed to start!
[root@logserver ~]# cat /var/log/graylog-server/server.log
2015-05-22T15:53:14.962+08:00 INFO [CmdLineTool] Loaded plugins: []
2015-05-22T15:53:15.032+08:00 ERROR [Server] No password secret set. Please define password_secret in your graylog2.conf.
2015-05-22T15:53:15.033+08:00 ERROR [CmdLineTool] Validating configuration file failed - exiting.
[root@logserver ~]# yum install pwgen
[root@logserver ~]# pwgen -N 1 -s 96
zzzzzzzzzzzzzzzzzzzzzzzzzzzzzzzzzzzzzzzzzzzzzzzzzzzzzzzzzzzzzzzzzzzzzzzzzzzzzzzzzzzzzzzzzzzzzzzzzzzzzzzz
[root@logserver ~]# echo -n 123456 | sha256sum
xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx -
[root@logserver ~]# vi /etc/graylog/server/server.conf
11 password_secret = zzzzzzzzzzzzzzzzzzzzzzzzzzzzzzzzzzzzzzzzzzzzzzzzzzzzzzzzzzzzzzzzzzzzzzzzzzzzzzzzzzzzzzzzzzzzzzzzzzzzzzzz
...
22 root_password_sha2 = xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx
...
152 elasticsearch_cluster_name = graylog
[root@logserver ~]# /etc/init.d/graylog-server restart
The service started successfully!
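For hosts without pwgen, the two values can also be produced with the Python standard library. This is a sketch under the assumption that any long random alphanumeric string is acceptable as password_secret; the function names are illustrative. Note that a SHA-256 digest is always 64 hexadecimal characters.

```python
import hashlib
import secrets
import string


def generate_secret(length=96):
    """Random alphanumeric string, comparable to `pwgen -N 1 -s 96`."""
    alphabet = string.ascii_letters + string.digits
    return "".join(secrets.choice(alphabet) for _ in range(length))


def sha256_hex(password):
    """Hex SHA-256 digest, equivalent to `echo -n PASSWORD | sha256sum`."""
    return hashlib.sha256(password.encode("utf-8")).hexdigest()


print("password_secret = " + generate_secret())
print("root_password_sha2 = " + sha256_hex("123456"))
```

The two printed lines can be pasted into /etc/graylog/server/server.conf in place of the placeholder values.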
[root@logserver ~]# /etc/init.d/graylog-web start
Starting graylog-web: [OK]
The service failed to start!
[root@logserver ~]# cat /var/log/graylog-web/application.log
2015-05-22T15:53:22.960+08:00 - [ERROR] - from lib.Global in main
Please configure application.secret in your conf/graylog-web-interface.conf
2015-05-22T16:25:55.343+08:00 - [ERROR] - from lib.Global in main
Please configure application.secret in your conf/graylog-web-interface.conf
[root@logserver ~]# pwgen -N 1 -s 96
yyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyy
[root@logserver ~]# vi /etc/graylog/web/web.conf
---
2 graylog2-server.uris="http://127.0.0.1:12900/"
12 application.secret="yyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyy"
---
Note: the graylog2-server.uris value in /etc/graylog/web/web.conf must match rest_listen_uri in /etc/graylog/server/server.conf:
---
36 rest_listen_uri = http://127.0.0.1:12900/
---
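The consistency requirement between the two files can be checked mechanically. The sketch below assumes both files use simple key = value lines and that graylog2-server.uris holds a single URI (it may be a comma-separated list in multi-node setups, which this does not handle). The function names are illustrative.

```python
def read_setting(text, key):
    """Return the value of `key` from `key = value` lines, or None."""
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        name, value = line.split("=", 1)
        if name.strip() == key:
            return value.strip().strip('"')
    return None


def uris_match(server_conf_text, web_conf_text):
    """True if web.conf points at the URI the server listens on."""
    rest = read_setting(server_conf_text, "rest_listen_uri")
    web = read_setting(web_conf_text, "graylog2-server.uris")
    return rest is not None and rest == web


server_conf = "rest_listen_uri = http://127.0.0.1:12900/\n"
web_conf = 'graylog2-server.uris="http://127.0.0.1:12900/"\n'
print(uris_match(server_conf, web_conf))  # prints True
```

In practice the two strings would come from reading /etc/graylog/server/server.conf and /etc/graylog/web/web.conf.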
[root@logserver ~]# /etc/init.d/graylog-web start
Open http://10.0.0.11:9000/ in a browser to reach the Graylog login page.
Administrator account/password: admin/123456
4. Adding log inputs
Log in to http://10.0.0.11:9000/ as admin.
4.1 Go to System > Inputs > Inputs in Cluster > Raw/Plaintext TCP | Launch new input
Name it "tcp 5555" and finish creating the input.
On any Linux machine with nc installed, run:
echo `date` | nc 10.0.0.11 5555
On the home page shown after logging in to http://10.0.0.11:9000/, click the green search button on the third row; a new message appears:
Timestamp Source Message
2015-05-22 08:49:15.280 10.0.0.157 2015年 05月 22日 星期五 16:48:28 CST
This shows that the installation succeeded!
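The same test can be done without nc. The following Python sketch sends newline-terminated lines over one TCP connection, assuming the Raw/Plaintext TCP input keeps its default newline delimiter and listens on port 5555 as configured above. The function names are illustrative.

```python
import socket


def frame_lines(lines):
    """Encode log lines as newline-terminated UTF-8 bytes, one message per line."""
    return b"".join(line.rstrip("\n").encode("utf-8") + b"\n" for line in lines)


def send_raw(host, port, lines, timeout=3.0):
    """Open one TCP connection and send all lines to a Raw/Plaintext TCP input."""
    with socket.create_connection((host, port), timeout=timeout) as conn:
        conn.sendall(frame_lines(lines))


# Example call (needs the "tcp 5555" input to be running):
# send_raw("10.0.0.11", 5555, ["hello, graylog"])
```

Sending several lines over one connection avoids the cost of starting a new nc process for every log line.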
4.2 Go to System > Inputs > Inputs in Cluster > GELF HTTP | Launch new input
Name it "http 12201" and finish creating the input.
On any Linux machine with curl installed, run:
curl -XPOST http://10.0.0.11:12201/gelf -p0 -d '{"short_message":"Hello there", "host":"example.org", "facility":"test", "_foo":"bar"}'
On the home page shown after logging in to http://10.0.0.11:9000/, click the green search button on the third row; a new message appears:
Timestamp Source Message
2015-05-22 08:49:15.280 10.0.0.157 Hello there
This shows that the GELF HTTP input is set up correctly!
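As a rough Python equivalent of the curl call, the sketch below posts one GELF message with urllib from the standard library. The URL matches the "http 12201" input created above; the function names are illustrative, and the status code returned depends on the Graylog version.

```python
import json
import urllib.request


def build_gelf_request(url, message):
    """Build a POST request carrying one GELF message (a dict) as JSON."""
    body = json.dumps(message).encode("utf-8")
    return urllib.request.Request(
        url,
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def post_gelf(url, message, timeout=3.0):
    """Send the message and return the HTTP status code of the response."""
    request = build_gelf_request(url, message)
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return response.status


# Example call (needs the GELF HTTP input to be running):
# post_gelf("http://10.0.0.11:12201/gelf",
#           {"short_message": "Hello there", "host": "example.org", "_foo": "bar"})
```

Building the body with json.dumps also takes care of escaping quotes and backslashes inside the log line, which the shell scripts have to do by hand with sed.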
5. Time zone and highlighting settings
Time zone of the admin account:
[root@logserver ~]# vi /etc/graylog/server/server.conf
---
30 root_timezone = Asia/Shanghai
---
[root@logserver ~]# /etc/init.d/graylog-server restart
Default time zone for other accounts:
[root@logserver ~]# vi /etc/graylog/web/web.conf
---
18 timezone="Asia/Shanghai"
---
[root@logserver ~]# /etc/init.d/graylog-web restart
Enable highlighting of search results:
[root@logserver ~]# vi /etc/graylog/server/server.conf
---
147 allow_highlighting = true
---
[root@logserver ~]# /etc/init.d/graylog-server restart
Sending logs to the Graylog server
Over HTTP:
http://docs.graylog.org/en/1.1/pages/sending_data.html#gelf-via-http
curl -XPOST http://graylog.example.org:12202/gelf -p0 -d '{"short_message":"Hello there", "host":"example.org", "facility":"test", "_foo":"bar"}'
Over TCP:
http://docs.graylog.org/en/1.1/pages/sending_data.html#raw-plaintext-inputs
echo "hello, graylog" | nc graylog.example.org 5555
Collecting nginx logs with inotifywait
gather-nginx-log.sh
#!/bin/bash
# Watch the nginx log and forward each new line to a Graylog GELF HTTP input.
app=nginx
node=$HOSTNAME
log_file=/var/log/nginx/nginx.log
graylog_server_ip=10.0.0.11
graylog_server_port=12201
# First run: no saved offset yet, so start from the current end of the log
if [ ! -f ${app}.size ]; then
    stat -c%s $log_file > ${app}.size
fi
while inotifywait -e modify $log_file; do
    last_size=`cat ${app}.size`
    curr_size=`stat -c%s $log_file`
    echo $curr_size > ${app}.size
    count=`echo "$curr_size-$last_size" | bc`
    # Escape double quotes for JSON; the extra backslashes are consumed by "read" below
    python read_log.py $log_file ${last_size} $count | sed 's/"/\\\\\"/g' > ${app}.new_lines
    while read line
    do
        # Only forward lines that start with a date (20YY-MM-DD)
        if echo "$line" | grep "^20[0-9][0-9]-[0-1][0-9]-[0-3][0-9]" > /dev/null; then
            seconds=`echo "$line" | cut -d ' ' -f 6`
            spend_ms=`echo "${seconds}*1000/1" | bc`
            http_status=`echo "$line" | cut -d ' ' -f 2`
            echo "http_status -- $http_status"
            # Map the first digit of the HTTP status to a syslog level
            prefix_number=${http_status:0:1}
            if [ "$prefix_number" == "5" ]; then
                level=3 #ERROR
            elif [ "$prefix_number" == "4" ]; then
                level=4 #WARNING
            elif [ "$prefix_number" == "3" ]; then
                level=5 #NOTICE
            elif [ "$prefix_number" == "2" ]; then
                level=6 #INFO
            elif [ "$prefix_number" == "1" ]; then
                level=7 #DEBUG
            fi
            echo "level -- $level"
            curl -XPOST http://${graylog_server_ip}:${graylog_server_port}/gelf -p0 -d "{\"short_message\":\"$line\", \"host\":\"${app}\", \"level\":${level}, \"_node\":\"${node}\", \"_spend_msecs\":${spend_ms}, \"_http_status\":${http_status}}"
            echo "gathered -- $line"
        fi
    done < ${app}.new_lines
done
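The field extraction and level mapping in the script can be sketched in Python as follows. It assumes the same custom nginx log format the script relies on (space-separated, HTTP status in field 2, request time in seconds in field 6). The sample line, the node name web01 and the function name are made up for illustration.

```python
from decimal import Decimal

# First digit of the HTTP status -> syslog level, as in the shell script
LEVELS = {"5": 3, "4": 4, "3": 5, "2": 6, "1": 7}


def nginx_line_to_gelf(line, app="nginx", node="web01"):
    """Turn one log line into a GELF message dict."""
    fields = line.split(" ")
    http_status = fields[1]
    # Decimal avoids float rounding such as 0.029 * 1000 == 28.999...
    spend_ms = int(Decimal(fields[5]) * 1000)
    return {
        "short_message": line,
        "host": app,
        "level": LEVELS.get(http_status[:1], 6),  # default INFO is an assumption
        "_node": node,
        "_spend_msecs": spend_ms,
        "_http_status": int(http_status),
    }


# Hypothetical log line in the assumed format
sample = "2015-05-22T16:48:28+08:00 502 GET /api/orders 512 0.029"
print(nginx_line_to_gelf(sample))
```

Unlike the shell version, an unexpected status falls back to a default level here instead of reusing the level of the previous line.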
read_log.py
#!/usr/bin/python
#coding=utf-8
# Print "count" bytes of a log file, starting at byte offset "print_from".
import sys
import os

# Diagnostics go to stderr so they never end up in the collected log lines
if len(sys.argv) < 4:
    sys.stderr.write("Usage: %s /path/of/log/file print_from count\n" % (sys.argv[0]))
    sys.stderr.write("Example: %s /var/log/syslog 90000 100\n" % (sys.argv[0]))
    sys.exit(1)

filename = sys.argv[1]
if not os.path.isfile(filename):
    sys.stderr.write("%s does not exist!\n" % (filename))
    sys.exit(1)

filesize = os.path.getsize(filename)
position = int(sys.argv[2])
if filesize < position:
    # The file is smaller than the saved offset, so it was probably rotated
    sys.stderr.write("log file may have been cut by logrotate (offset %d > size %d), printing from the beginning\n" % (position, filesize))
    position = 0

count = int(sys.argv[3])
fo = open(filename, "r")
fo.seek(position, 0)
content = fo.read(count)
print(content.strip())
# Close opened file
fo.close()
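The offset bookkeeping that the shell script and read_log.py share (save the last size, read only the new bytes, restart after rotation) can also live in one place. This is a sketch of that idea, not the author's code. The function name and the state file are illustrative, and unlike the shell script it reads the whole file when no saved offset exists.

```python
import os


def read_new_bytes(log_path, state_path):
    """Return the bytes appended to log_path since the offset saved in state_path."""
    try:
        with open(state_path) as f:
            last_size = int(f.read().strip() or 0)
    except (OSError, ValueError):
        last_size = 0  # no usable saved offset yet

    curr_size = os.path.getsize(log_path)
    if curr_size < last_size:
        last_size = 0  # file shrank: assume rotation and start over

    with open(log_path, "rb") as f:
        f.seek(last_size)
        data = f.read(curr_size - last_size)

    # Remember how far we have read for the next call
    with open(state_path, "w") as f:
        f.write(str(curr_size))
    return data


# Example call:
# new_data = read_new_bytes("/var/log/nginx/nginx.log", "nginx.size")
```

Reading in binary mode keeps the byte offsets exact even when the log contains multi-byte characters.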
Collecting iotop output every 5 seconds to find processes with heavy disk reads and writes
#!/bin/bash
app=iotop
node=$HOSTNAME
graylog_server_ip=10.0.0.11
graylog_server_port=12201
while true; do
    sudo /usr/sbin/iotop -b -o -t -k -q -n2 | sed 's/"/\\\\\"/g' > /dev/shm/graylog_client.${app}.new_lines
    while read line; do
        if echo "$line" | grep "^[0-2][0-9]:[0-5][0-9]:[0-5][0-9]" > /dev/null; then
            read -a WORDS <<< $line
            epoch_seconds=`date --date="${WORDS[0]}" +%s.%N`
            pid=${WORDS[1]}
            read_float_kps=${WORDS[4]}
            read_int_kps=${read_float_kps%.*}
            write_float_kps=${WORDS[6]}
            write_int_kps=${write_float_kps%.*}
            command=${WORDS[12]}
            if [ "$command" == "bash" ] && (( ${#WORDS[*]} > 13 )); then
                pname=${WORDS[13]}
            elif [ "$command" == "java" ] && (( ${#WORDS[*]} > 13 )); then
                arg0=${WORDS[13]}
                pname=${arg0#*=}
            else
                pname=$command
            fi
            curl --connect-timeout 1 -s -XPOST http://${graylog_server_ip}:${graylog_server_port}/gelf -p0 -d "{\"timestamp\":$epoch_seconds, \"short_message\":\"${line::200}\", \"full_message\":\"$line\", \"host\":\"${app}\", \"_node\":\"${node}\", \"_pid\":${pid}, \"_read_kps\":${read_int_kps}, \"_write_kps\":${write_int_kps}, \"_pname\":\"${pname}\"}"
        fi
    done < /dev/shm/graylog_client.${app}.new_lines
    sleep 4
done
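The word positions used by the script (1 for the PID, 4 and 6 for the read and write rates, 12 and 13 for the command and its first argument) are easier to follow on a concrete line. The sketch below applies the same positions in Python. The sample line is invented to match the layout the script expects from iotop -b -o -t -k -q, and the function name is illustrative.

```python
def parse_iotop_line(line):
    """Extract pid, read/write KB/s and a process name from one iotop batch line."""
    words = line.split()
    if len(words) < 13:
        return None  # not a per-process line

    command = words[12]
    pname = command
    if len(words) > 13:
        if command == "bash":
            pname = words[13]  # script name passed to bash
        elif command == "java":
            # keep the part after the first "=", e.g. -Dapp.name=orders -> orders
            pname = words[13].split("=", 1)[-1]

    return {
        "time": words[0],
        "pid": int(words[1]),
        "read_kps": int(float(words[4])),   # integer part, like ${var%.*}
        "write_kps": int(float(words[6])),
        "pname": pname,
    }


# Hypothetical iotop output line
sample = "16:48:28  2345 be/4 tomcat 0.00 K/s 1523.47 K/s 0.00 % 3.21 % java -Dapp.name=orders -Xmx2g"
print(parse_iotop_line(sample))
```

Returning None for short lines plays the role of the grep filter in the shell script, which skips headers and summary lines.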
Collecting Android app logs
device.env
export device=4b13c85c
export app=com.tencent.mm
export filter="\( I/ServerAsyncTask2(\| W/\| E/\)"
export graylog_server_ip=10.0.0.11
export graylog_server_port=12201
adblog.sh
#!/bin/bash
. ./device.env
adb -s $device logcat -v time *:I | tee -a adb.log
gather-androidapp-log.sh
#!/bin/bash
. ./device.env
log_file=./adb.log
node=$device
if [ ! -f $log_file ]; then
    echo $log_file not exist!!
    echo 0 > ${app}.size
    exit 1
fi
if [ ! -f ${app}.size ]; then
    curr_size=`stat -c%s $log_file`
    echo $curr_size > ${app}.size
fi
while inotifywait -qe modify $log_file > /dev/null; do
    last_size=`cat ${app}.size`
    curr_size=`stat -c%s $log_file`
    echo $curr_size > ${app}.size
    pids=`./getpids.py $app $device`
    if [ "$pids" == "" ]; then
        continue
    fi
    count=`echo "$curr_size-$last_size" | bc`
    python read_log.py $log_file ${last_size} $count | grep "$pids" | sed 's/"/\\\\\"/g' | sed 's/\t/ /g' > ${app}.new_lines
    #echo "${app}.new_lines lines: `wc -l ${app}.new_lines`"
    while read line
    do
        if echo "$line" | grep "$filter" > /dev/null; then
            priority=${line:19:1}
            if [ "$priority" == "F" ]; then
                level=1 #ALERT
            elif [ "$priority" == "E" ]; then
                level=3 #ERROR
            elif [ "$priority" == "W" ]; then
                level=4 #WARNING
            elif [ "$priority" == "I" ]; then
                level=6 #INFO
            fi
            #echo "level -- $level"
            curl -XPOST http://${graylog_server_ip}:${graylog_server_port}/gelf -p0 -d "{\"short_message\":\"$line\", \"host\":\"${app}\", \"level\":${level}, \"_node\":\"${node}\"}"
            echo "GATHERED -- $line"
        #else
            #echo "ignored -- $line"
        fi
    done < ${app}.new_lines
done
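The expression ${line:19:1} works because logcat -v time prints a fixed-width prefix of the form "MM-DD HH:MM:SS.mmm ", which puts the priority letter at index 19. A small Python sketch of the same mapping, with an invented sample line and an illustrative function name:

```python
# logcat priority letter -> syslog level, as in the shell script
LEVELS = {"F": 1, "E": 3, "W": 4, "I": 6}


def logcat_level(line, default=None):
    """Return the syslog level for a `logcat -v time` line, or default."""
    if len(line) <= 19:
        return default
    return LEVELS.get(line[19], default)


# Hypothetical logcat line: "05-22 16:48:28.123 " is 19 characters long
sample = "05-22 16:48:28.123 E/AndroidRuntime( 1234): FATAL EXCEPTION: main"
print(logcat_level(sample))  # prints 3
```

Passing an explicit default avoids sending a message with a stale or empty level when the priority is one the table does not list, such as D or V.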
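getpids.py
#!/usr/bin/python
# Python 2 script (it uses the "commands" module).
# Prints a grep pattern matching the "( PID)" part of logcat lines
# for every process of the given package.
import sys
import commands

if __name__ == "__main__":
    if len(sys.argv) != 3:
        print(sys.argv[0] + " packageName device")
        sys.exit()
    device = sys.argv[2]
    # Columns 11-15 of the "adb shell ps" output hold the PID
    cmd = "adb -s " + device + " shell ps | grep " + sys.argv[1] + " | cut -c11-15"
    output = commands.getoutput(cmd)
    if output == "":
        sys.exit()
    originpids = output.split("\n")
    strippids = map((lambda pid: int(pid, 10)), originpids)
    # Right-align each PID to 5 characters, as logcat prints it
    pids = map((lambda pid: "%5d" % pid), strippids)
    pattern = "\((" + ")\|(".join(pids) + ")\)"
    print(pattern)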
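The commands module used above no longer exists in Python 3. For readers on Python 3, the pattern-building step can be sketched on its own as below. The function name is illustrative, and the PID list would come from running adb, for example through subprocess.

```python
def build_pid_pattern(pids):
    """Build a basic-regex grep pattern matching "( PID)" for any of the PIDs.

    Each PID is right-aligned to 5 characters, as in the original script.
    """
    padded = ["%5d" % int(pid) for pid in pids]
    return r"\((" + r")\|(".join(padded) + r")\)"


print(build_pid_pattern(["1234", " 987"]))
```

In grep's basic syntax \( and \| are the grouping and alternation operators while a bare parenthesis is a literal character, so the pattern matches the literal text "( 1234)" or "(  987)" in a log line.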