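Basic concepts of grok
grok is a Logstash filter plugin that parses and filters log messages. For details, see the official grok documentation.
Its syntax has the form %{SYNTAX:SEMANTIC}, where SYNTAX is the name of a pattern and SEMANTIC is the field name assigned to the matched text, for example: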
%{NUMBER:duration} %{IP:client}
grok is built on regular expression matching, so it poses little difficulty if you are already familiar with regex.
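As a rough illustration of the syntax, the sketch below uses the pattern above in a standalone grok filter. It assumes the event's message field holds a line such as `0.043 55.3.244.1`, which is a made-up sample and not taken from a real log.

```
filter {
  grok {
    # NUMBER and IP are stock patterns from grok-patterns;
    # duration and client are the field names the captures are stored under.
    match => { "message" => "%{NUMBER:duration} %{IP:client}" }
  }
}
```

With the hypothetical input above, the event would gain the fields duration ("0.043") and client ("55.3.244.1"). Captures are stored as strings by default; appending a type, as in %{NUMBER:duration:float}, converts the value. If the line does not match, grok tags the event with _grokparsefailure.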
Custom grok patterns can be debugged with Grok Debugger.
The default patterns can be found in the $logstash/vendor/bundle/jruby/1.9/gems/logstash-patterns-core-4.0.0/patterns/ directory.
The basic definitions are in the grok-patterns file, and we can use them directly. Not all of them fit the nginx fields, though. In that case we define our own patterns and load them by setting patterns_dir.
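A minimal sketch of loading custom patterns through patterns_dir might look like the following. The directory /etc/logstash/patterns, the file name custom, the MYPATTERN definition, and the my_field name are all placeholders chosen for illustration, not values from this setup.

```
# Assumed contents of /etc/logstash/patterns/custom (hypothetical file):
#   MYPATTERN [0-9A-F]{10,11}

filter {
  grok {
    # Pattern files in the listed directories are loaded
    # in addition to the bundled default patterns.
    patterns_dir => ["/etc/logstash/patterns"]
    match => { "message" => "%{MYPATTERN:my_field}" }
  }
}
```

Pattern files placed directly in the default patterns directory should be picked up without patterns_dir, which is the approach taken in the sections that follow.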
Parsing the nginx access.log with grok
First, let's look at the nginx access log format:
log_format access '$remote_addr - $remote_user [$time_local] "$request" '
'$status $body_bytes_sent "$http_referer" '
'"$http_user_agent" $http_x_forwarded_for';
grok needs matching rules written against this access log format, so we create a file named nginx_access in the $logstash/vendor/bundle/jruby/1.9/gems/logstash-patterns-core-4.0.0/patterns/ directory with the following patterns:
NGUSERNAME [a-zA-Z\.\@\-\+_%]+
METHOD (OPTIONS|GET|HEAD|POST|PUT|DELETE|TRACE|CONNECT)
NGUSER %{NGUSERNAME}
NGINXACCESS %{IPORHOST:client_ip} - %{NGUSER:remote_user} \[%{HTTPDATE:timestamp}\] "(?:%{METHOD:verb} %{NOTSPACE:request}(?: HTTP/%{NUMBER:http_version})?|%{DATA:rawrequest})" %{NUMBER:response} (?:%{NUMBER:bytes}|-) %{QS:referrer} %{QS:agent} %{NOTSPACE:http_x_forwarded_for}
Parsing the nginx error log with grok
Create a file named nginx_error in the $logstash/vendor/bundle/jruby/1.9/gems/logstash-patterns-core-4.0.0/patterns/ directory with the following content:
ERRORDATE %{YEAR}/%{MONTHNUM}/%{MONTHDAY} %{TIME}
METHOD (OPTIONS|GET|HEAD|POST|PUT|DELETE|TRACE|CONNECT)
NGINXERROR %{ERRORDATE:timestamp} \[%{LOGLEVEL:severity}\] %{POSINT:pid}#%{NUMBER}: %{GREEDYDATA:errormessage}(?:, client: (?<remote_addr>%{IP}|%{HOSTNAME}))(?:, server: %{IPORHOST:server})(?:, request: "%{METHOD:verb} %{NOTSPACE:request}( HTTP/%{NUMBER:http_version})")?(?:, upstream: "%{NOTSPACE:upstream}",)?(?: host: "%{HOSTNAME:host_domain}")?(?:, referrer: "%{NOTSPACE:referrer}")?
Configuring the filter block in the Logstash conf.d configuration
filter {
  # [filename] is assumed to be set upstream (for example by the log shipper)
  # to tell the two log types apart.
  if [filename] == "nginx_access" {
    grok {
      match => { "message" => "%{NGINXACCESS}" }
    }
    # Use the request time from the log line as the event's @timestamp.
    date {
      match => [ "timestamp" , "dd/MMM/yyyy:HH:mm:ss Z" ]
      target => "@timestamp"
      remove_field => "timestamp"
    }
    geoip {
      source => "client_ip"
    }
    useragent {
      source => "agent"
      target => "useragent"
      remove_field => "agent"
    }
  }
  if [filename] == "nginx_error" {
    grok {
      match => { "message" => "%{NGINXERROR}" }
    }
    # The error log timestamp has no zone offset, so set one explicitly.
    date {
      match => [ "timestamp" , "yyyy/MM/dd HH:mm:ss" ]
      timezone => "Asia/Shanghai"
      target => "@timestamp"
      remove_field => "timestamp"
    }
  }
}
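Because the timestamp in the error log does not carry a time zone, timezone has to be set explicitly, here to Asia/Shanghai.
geoip and useragent are also filter plugins; they analyze the client IP and the user agent, respectively.
An output template can also be specified at the output stage; see this document for details.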
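As a hedged sketch of what specifying a template can look like, the fragment below assumes the events are sent to Elasticsearch through the elasticsearch output plugin, which the text above does not state. The host, index name, template path, and template name are hypothetical placeholders.

```
output {
  elasticsearch {
    # Assumed Elasticsearch address; adjust to your cluster.
    hosts => ["localhost:9200"]
    # Hypothetical daily index name.
    index => "nginx-%{+YYYY.MM.dd}"
    # Hypothetical path to a custom index template file.
    template => "/etc/logstash/templates/nginx.json"
    template_name => "nginx"
    # Replace a template of the same name already stored in Elasticsearch.
    template_overwrite => true
  }
}
```

The template file itself defines the index mappings, for instance to map the location field produced by geoip as a geo_point.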