When you need to build a logging system, the ELK stack is probably the first thing that comes to mind.
ELK stands for Elasticsearch, Logstash, and Kibana. Together they form a complete logging solution that is easy to put to use:
- Logstash collects, filters, and forwards the logs of each service.
- Elasticsearch stores the structured data shipped by Logstash and serves it to Kibana.
- Kibana provides a web UI for displaying and analyzing the data, for example as charts.
When collecting with Logstash, a single simple configuration file is enough to start a logstash-agent that gathers logs:
input {
  file {
    type => "test"
    path => ["/home/hadoop/hao.zhang/logstashTest/1.txt"]
    sincedb_path => "/home/hadoop/elk/logstash-agent/conf/datafall/test_sincedb"
  }
}
output {
  stdout {
  }
}
The example above simply collects logs from a file and prints them to the console.
Now start the logstash-agent: `bin/logstash -f conf/datafall/test.config`
Then push some data into the log file:
echo "hello world" >> 1.txt
and you will see the console output:
[172.20.33.5:hadoop@sz-pg-dm-it-001:/home/hadoop/elk/logstash-agent]$ bin/logstash -f conf/datafall/test.config
Settings: Default pipeline workers: 40
Pipeline main started
2019-05-07T12:17:37.113Z sz-pg-dm-it-001.tendcloud.com hello world
Up to this point, everything works as expected.
Now the mis-operation begins:
If, however, we open the log file with `vim` and append a line `happy` at the end, the console outputs:
[172.20.33.5:hadoop@sz-pg-dm-it-001:/home/hadoop/elk/logstash-agent]$ bin/logstash -f conf/datafall/test.config
Settings: Default pipeline workers: 40
Pipeline main started
2019-05-07T12:17:37.113Z sz-pg-dm-it-001.tendcloud.com hello world
2019-05-07T12:22:58.313Z sz-pg-dm-it-001.tendcloud.com hello world
2019-05-07T12:22:58.314Z sz-pg-dm-it-001.tendcloud.com happy
You will notice that Logstash re-reads the file from the beginning, so the data is collected twice.
Why does this happen?
Logstash has an interesting component called sincedb. This file stores the offset of each log file the logstash-agent is currently collecting. In `test.config` above we configured the sincedb location explicitly; if you don't, Logstash by default creates a file whose name starts with `.sincedb` in the home directory of the current user.
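To see where the offsets are actually being kept, a quick check like the following helps (the explicit path is the one from `test.config`; the glob for the default location is an assumption about the generated file name, which normally carries a per-path suffix):

```
# sincedb_path was set explicitly in test.config, so the offsets live here:
cat /home/hadoop/elk/logstash-agent/conf/datafall/test_sincedb

# Without sincedb_path, look for hidden .sincedb_* files in the home
# directory of the user running Logstash (the suffix varies per watched path):
ls -la ~/.sincedb*
```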
The concrete contents of the sincedb file:
[172.20.33.5:hadoop@sz-pg-dm-it-001:/home/hadoop/elk/logstash-agent/conf/datafall]$ cat test_sincedb
4306020249 0 2052 12
4306020236 0 2052 18
Each line represents one log file:
- Column 1 is the inode of the collected log file.
- Columns 2 and 3 are the major and minor numbers of the device the file lives on (not important here).
- Column 4 is the offset already collected from that log file, in bytes.
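A quick sanity check of these columns against the file from the example (a sketch; the inode value will of course differ on another machine):

```
# Column 1 should match the inode of the watched file:
ls -i /home/hadoop/hao.zhang/logstashTest/1.txt
# -> 4306020249 /home/hadoop/hao.zhang/logstashTest/1.txt

# Column 4 is the byte offset already read; once the whole file has been
# consumed it equals the file size -- "hello world\n" is 12 bytes:
wc -c /home/hadoop/hao.zhang/logstashTest/1.txt
# -> 12 /home/hadoop/hao.zhang/logstashTest/1.txt
```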
Looking at the output above, you may wonder: we only collected one log file, so why is there an extra record?
It comes down to what happens when you edit the file with `vim` and save it: this effectively creates a file with a brand-new inode. Logstash then sees a file whose name is the same but whose inode is different, and it still goes ahead and collects it.
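The inode change is easy to reproduce without Logstash at all; a minimal sketch using the file from the example (the inode numbers shown are the ones that appear in the debug log below):

```
stat -c %i /home/hadoop/hao.zhang/logstashTest/1.txt   # -> 4306020236
vim /home/hadoop/hao.zhang/logstashTest/1.txt          # append "happy", then :wq
stat -c %i /home/hadoop/hao.zhang/logstashTest/1.txt   # -> 4305990263, a new inode
```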
Debug mode shows the details:
each: new inode: /home/hadoop/hao.zhang/logstashTest/1.txt: old inode was ["4306020236", 0, 2052], new is ["4305990263", 0, 2052] {:level=>:debug, :file=>"filewatch/watch.rb", :line=>"245", :method=>"each"}
:delete for /home/hadoop/hao.zhang/logstashTest/1.txt, closing file {:level=>:debug, :file=>"filewatch/observing_tail.rb", :line=>"52", :method=>"subscribe"}
_open_file: /home/hadoop/hao.zhang/logstashTest/1.txt: opening {:level=>:debug, :file=>"filewatch/tail_base.rb", :line=>"86", :method=>"_open_file"}
Received line {:path=>"/home/hadoop/hao.zhang/logstashTest/1.txt", :text=>"hello world", :level=>:debug, :file=>"logstash/inputs/file.rb", :line=>"306", :method=>"log_line_received"}
Pushing flush onto pipeline {:level=>:debug, :file=>"logstash/pipeline.rb", :line=>"458", :method=>"flush"}
Received line {:path=>"/home/hadoop/hao.zhang/logstashTest/1.txt", :text=>"happy", :level=>:debug, :file=>"logstash/inputs/file.rb", :line=>"306", :method=>"log_line_received"}
Received line {:path=>"/home/hadoop/hao.zhang/logstashTest/1.txt", :text=>"happy", :level=>:debug, :file=>"logstash/inputs/file.rb", :line=>"306", :method=>"log_line_received"}
writing sincedb (delta since last write = 1557233399) {:level=>:debug, :file=>"filewatch/observing_tail.rb", :line=>"102", :method=>"observe_read_file"}
filter received {:event=>{"message"=>"hello world", "@version"=>"1", "@timestamp"=>"2019-05-07T12:49:59.108Z", "path"=>"/home/hadoop/hao.zhang/logstashTest/1.txt", "host"=>"sz-pg-dm-it-001.tendcloud.com", "type"=>"test"}, :level=>:debug, :file=>"(eval)", :line=>"17", :method=>"filter_func"}
filter received {:event=>{"message"=>"happy", "@version"=>"1", "@timestamp"=>"2019-05-07T12:49:59.116Z", "path"=>"/home/hadoop/hao.zhang/logstashTest/1.txt", "host"=>"sz-pg-dm-it-001.tendcloud.com", "type"=>"test"}, :level=>:debug, :file=>"(eval)", :line=>"17", :method=>"filter_func"}
filter received {:event=>{"message"=>"happy", "@version"=>"1", "@timestamp"=>"2019-05-07T12:49:59.118Z", "path"=>"/home/hadoop/hao.zhang/logstashTest/1.txt", "host"=>"sz-pg-dm-it-001.tendcloud.com", "type"=>"test"}, :level=>:debug, :file=>"(eval)", :line=>"17", :method=>"filter_func"}
output received {:event=>{"message"=>"hello world", "@version"=>"1", "@timestamp"=>"2019-05-07T12:49:59.108Z", "path"=>"/home/hadoop/hao.zhang/logstashTest/1.txt", "host"=>"sz-pg-dm-it-001.tendcloud.com", "type"=>"test"}, :level=>:debug, :file=>"(eval)", :line=>"22", :method=>"output_func"}
output received {:event=>{"message"=>"happy", "@version"=>"1", "@timestamp"=>"2019-05-07T12:49:59.116Z", "path"=>"/home/hadoop/hao.zhang/logstashTest/1.txt", "host"=>"sz-pg-dm-it-001.tendcloud.com", "type"=>"test"}, :level=>:debug, :file=>"(eval)", :line=>"22", :method=>"output_func"}
output received {:event=>{"message"=>"happy", "@version"=>"1", "@timestamp"=>"2019-05-07T12:49:59.118Z", "path"=>"/home/hadoop/hao.zhang/logstashTest/1.txt", "host"=>"sz-pg-dm-it-001.tendcloud.com", "type"=>"test"}, :level=>:debug, :file=>"(eval)", :line=>"22", :method=>"output_func"}
2019-05-07T12:49:59.108Z sz-pg-dm-it-001.tendcloud.com hello world
2019-05-07T12:49:59.116Z sz-pg-dm-it-001.tendcloud.com happy
2019-05-07T12:49:59.118Z sz-pg-dm-it-001.tendcloud.com happy
Pushing flush onto pipeline {:level=>:debug, :file=>"logstash/pipeline.rb", :line=>"458", :method=>"flush"}
The log above shows that Logstash notices the file now has a new inode; since there is no record for that inode in the existing sincedb file, Logstash starts collecting the log file from the very beginning again.
當(dāng)我們?cè)谑褂胠ogstash收集日志文件時(shí)空繁,盡量不要用Vim、vi
命令去打開日志文件,盡量使用cat钳榨、more
這之類的。
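As a rule of thumb, commands that only read the file, or that append through a shell redirect, leave the inode untouched; a short sketch (assuming GNU coreutils):

```
# Read-only viewing -- none of these touch the inode:
cat  /home/hadoop/hao.zhang/logstashTest/1.txt
more /home/hadoop/hao.zhang/logstashTest/1.txt

# Appending with >> writes into the existing inode, so Logstash simply
# advances the offset in sincedb instead of re-reading the whole file:
echo "another line" >> /home/hadoop/hao.zhang/logstashTest/1.txt
stat -c %i /home/hadoop/hao.zhang/logstashTest/1.txt   # unchanged
```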
Notes:
inode: In an operating system, file data is stored in "blocks", but we also need somewhere to store a file's metadata, such as its creator, its creation date, its size, and so on. The area that stores this metadata is called the inode, literally an "index node".
An inode holds the file's metadata, specifically:
- the file size in bytes
- the User ID of the file's owner
- the file's Group ID
- the file's read, write, and execute permissions
- the file's timestamps, of which there are three: ctime, when the inode itself last changed; mtime, when the file content last changed; atime, when the file was last opened
- the link count, i.e. how many file names point to this inode
- the location of the file's data blocks
Normally, file names and inode numbers are in one-to-one correspondence: each inode number corresponds to one file name.
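Most of the metadata listed above can be inspected directly; `stat` prints the inode number alongside size, owner, permissions, link count, and the three timestamps (output layout varies slightly between platforms):

```
# Full metadata (size, owner, permissions, links, atime/mtime/ctime, inode):
stat /home/hadoop/hao.zhang/logstashTest/1.txt

# Just the filename -> inode mapping:
ls -i /home/hadoop/hao.zhang/logstashTest/1.txt
```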
vim/vi: why does editing produce a new inode? When vim opens a file, it loads the contents into a buffer (in memory) and works on that. When you save, it effectively replaces the original file, so a file with a new inode is produced.
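If editing such a file with vim is sometimes unavoidable, vim's `backupcopy` option can be set so that it overwrites the original file in place instead of writing a new one and renaming it, which keeps the inode stable; a hedged sketch (verify the behaviour on your own vim build before relying on it):

```
# Force in-place writes for a single editing session:
vim -c 'set backupcopy=yes' /home/hadoop/hao.zhang/logstashTest/1.txt

# Or make it the default for the current user:
echo 'set backupcopy=yes' >> ~/.vimrc
```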