Grafana Loki: the official introduction

Introduction

This blog post is a companion piece for my talk at https://devopsdaysindia.org. I will discuss the motivations, architecture, and the future of logging in Grafana! Let’s get right down to it. You can see the slides for the talk here: https://speakerdeck.com/gouthamve/devopsdaysindia-2018-loki-prometheus-but-for-logs

Motivation

Grafana is the de facto dashboarding solution for time-series data. It supports over 40 data sources (as of this writing), and the dashboarding story has matured considerably with new features, including the addition of teams and folders. We now want to move on from being a dashboarding solution to being an observability platform, the go-to place when you need to debug systems on fire.

Full Observability

Observability. There are a lot of definitions out there as to what that means. Observability, to me, is visibility into your systems and how they are behaving and performing. I quite like the model where observability is split into three parts (or pillars): metrics, logs, and traces; each complementing the others to help you figure out what’s wrong quickly.

The following example illustrates how I tackle incidents at my job:
[Image: how I tackle incidents]

Prometheus sends me an alert that something is wrong and I open the relevant dashboard for the service. If I find a panel or graph anomalous, I’ll open the query in Grafana’s new Explore UI for a deeper dive. For example, if I find that one of the services is throwing 500 errors, I’ll try to figure out if a particular handler/route is throwing that error or if all instances are throwing the error, etc.

Next up, once I have a vague mental model of what is going wrong or where it is going wrong, I’ll look at logs. Pre-Loki, I used kubectl to get the relevant logs to see what the error was and whether I could do something about it. This works great for errors, but sometimes I get paged due to high latency. In that situation, traces give me more information about what is slow and which method/operation/function is slow. We use Jaeger to get the traces.

While these didn’t always directly tell me what was wrong, they usually got me close enough to look at the code and figure out what was going on. Then I could either scale up the service (if it was overloaded) or deploy a fix.

Logging

Prometheus works great, Jaeger is getting there, and kubectl was decent. The label model was powerful enough for me to get to the bottom of erroring services. If I found that the ingester service was erroring, I’d do: kubectl --namespace prod logs -l name=ingester | grep XXX to get the relevant logs and grep through them.

If I found a particular instance was erroring or if I wanted to tail the logs of a service, I’d have to use the individual pod for tailing as kubectl doesn’t let you tail based on label selectors. This is not ideal, but works for most use-cases.

This worked as long as the pod wasn’t crashing or being replaced. If the pod or node is terminated, the logs are lost forever. Also, kubectl only gives us recent logs, so we’re blind when we want logs from the day before or earlier. Further, having to jump from Grafana to the CLI and back again wasn’t ideal. We needed a solution that reduced context switching, and many of the solutions we explored were super pricey or didn’t scale very well.

This was expected as they do waaaay more than select + grep, which is essentially what we needed. After looking at existing solutions, we decided to build our own.

Loki

Not happy with any of the open-source solutions, we started speaking to people and noticed that A LOT of people had the same issues. In fact, I’ve come to realise that lots of developers still SSH into machines and grep/tail the logs even today! The solutions they were using were either too pricey or not stable enough. In fact, people were being asked to log less, which we think is an anti-pattern for logs. We thought we could build something that we, and the wider open-source community, could use. We had one main goal:

  • Keep it simple. Just support grep!

This tweet from @alicegoldfuss is not an endorsement and only serves to illustrate the problem Loki is attempting to solve.

We also aimed for other things:

  • Logs should be cheap. Nobody should be asked to log less.
  • Easy to operate and scale.
  • Metrics, logs (and traces later) need to work together.

The final point was important. We were already collecting metadata from Prometheus for the metrics, and we wanted to use that for log correlation. For example, Prometheus tags each metric with the namespace, service name, instance IP, etc. When I get an alert, I use that metadata to figure out where to look for logs. If we manage to tag the logs with the same metadata, we can seamlessly switch between metrics and logs. You can see the internal design doc we wrote here. See a demo video of Loki in action below:

Video: Loki - Prometheus-inspired, open source logging for cloud natives.

https://www.youtube.com/embed/7n342UsAMo0
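To make the correlation idea above concrete, here is a minimal Go sketch, not Grafana's or Loki's actual API, showing how a single label set can drive both a metrics query and a log-stream selector. The metric name and the selector syntax are purely illustrative.

```go
package main

import (
	"fmt"
	"sort"
	"strings"
)

// selector renders a Prometheus/Loki-style label selector,
// e.g. {name="ingester", namespace="prod"}.
func selector(labels map[string]string) string {
	keys := make([]string, 0, len(labels))
	for k := range labels {
		keys = append(keys, k)
	}
	sort.Strings(keys)
	parts := make([]string, 0, len(keys))
	for _, k := range keys {
		parts = append(parts, fmt.Sprintf("%s=%q", k, labels[k]))
	}
	return "{" + strings.Join(parts, ", ") + "}"
}

func main() {
	// The metadata Prometheus already attaches to this service's metrics.
	labels := map[string]string{"namespace": "prod", "name": "ingester"}
	sel := selector(labels)

	// The same selector drives a metrics view...
	fmt.Println("metrics:", "rate(http_requests_total"+sel+"[5m])")
	// ...and a log view, which is what lets us jump between the two.
	fmt.Println("logs:   ", sel)
}
```

Because both sides agree on the labels, jumping from an anomalous graph to the corresponding logs is just a matter of reusing the selector.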

Architecture

With our experience building and running Cortex (the horizontally scalable, distributed version of Prometheus that we run as a service), we came up with the following architecture:

[Image: Logging architecture]

Matching metadata between metrics and logs is critical for us, and we initially decided to target just Kubernetes. The idea is to run a log-collection agent on each node, collect logs with it, talk to the Kubernetes API to figure out the right metadata for the logs, and send them to a central service which we can use to show the collected logs inside Grafana.

The agent supports the same configuration (relabelling rules) as Prometheus to make sure the metadata matches. We called this agent promtail.
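For illustration, here is a rough Go sketch of what the agent conceptually ships for each stream: a label set matching what Prometheus would attach to that pod's metrics, plus timestamped log lines. The types and field names are hypothetical and not promtail's real wire format.

```go
package main

import (
	"fmt"
	"time"
)

// Entry is a single log line with its timestamp.
type Entry struct {
	Timestamp time.Time
	Line      string
}

// Stream groups entries that share the exact same label set, i.e. the
// labels Prometheus attaches to metrics scraped from that pod.
type Stream struct {
	Labels  map[string]string // e.g. namespace, name, instance
	Entries []Entry
}

func main() {
	s := Stream{
		Labels: map[string]string{
			"namespace": "prod",
			"name":      "ingester",
			"instance":  "10.0.0.12",
		},
		Entries: []Entry{{Timestamp: time.Now(), Line: `level=error msg="oops"`}},
	}
	fmt.Printf("pushing %d entries for %v\n", len(s.Entries), s.Labels)
}
```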

Enter Loki, the scalable log collection engine.
[Image: Logging architecture]

The write path and the read path (query) are pretty decoupled from each other, and it helps to talk about each separately.
[Image: Loki architecture]

Distributor

Once promtail collects the logs and sends them to Loki, the distributor is the first component to receive them. Now, we could be receiving millions of writes per second, and we wouldn’t want to write them to a database as they come in. That would kill any database out there. We need to batch and compress the data as it comes in.

We do this by building compressed chunks of the data, gzipping logs as they come in. The ingester component is a stateful component in charge of building and later flushing the chunks. We have multiple ingesters, and the logs belonging to each stream should always end up in the same ingester so that all the relevant entries end up in the same chunk. We do this by building a ring of ingesters and using consistent hashing. When an entry comes in, the distributor hashes the labels of the log stream and then looks up which ingester to send the entry to based on the hash value.
[Image: Loki distributor]

Further, for redundancy and resilience, we replicate each entry n times (3 by default).
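Here is a simplified Go sketch of that routing step, assuming a token ring, an FNV hash over the sorted label set, and a replication factor of 3. The production ring code in Cortex/Loki is considerably more involved.

```go
package main

import (
	"fmt"
	"hash/fnv"
	"sort"
)

type ingester struct {
	addr  string
	token uint32 // position on the ring
}

type ring struct{ ingesters []ingester } // kept sorted by token

// pick returns the replicationFactor ingesters that own the stream whose
// labels hash to streamHash, walking clockwise around the ring.
func (r *ring) pick(streamHash uint32, replicationFactor int) []ingester {
	i := sort.Search(len(r.ingesters), func(i int) bool {
		return r.ingesters[i].token >= streamHash
	})
	out := make([]ingester, 0, replicationFactor)
	for n := 0; n < replicationFactor && n < len(r.ingesters); n++ {
		out = append(out, r.ingesters[(i+n)%len(r.ingesters)])
	}
	return out
}

// hashLabels hashes a canonical rendering of the stream's label set, so
// entries for the same stream always land on the same ingesters.
func hashLabels(labels map[string]string) uint32 {
	keys := make([]string, 0, len(labels))
	for k := range labels {
		keys = append(keys, k)
	}
	sort.Strings(keys)
	h := fnv.New32a()
	for _, k := range keys {
		h.Write([]byte(k + "=" + labels[k] + ";"))
	}
	return h.Sum32()
}

func main() {
	r := &ring{ingesters: []ingester{
		{"ingester-0", 1 << 30}, {"ingester-1", 2 << 30}, {"ingester-2", 3 << 30},
	}}
	labels := map[string]string{"namespace": "prod", "name": "ingester"}
	fmt.Println(r.pick(hashLabels(labels), 3))
}
```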

Ingester

Now the ingester will receive the entries and start building chunks.
[Image: Loki ingester]

This is basically gzipping the logs and appending them. Once the chunk “fills up”, we flush it to the database. We use separate databases for the chunks (ObjectStorage) and the index, as the type of data they store is different.


After flushing a chunk, the ingester then creates a new empty chunk and adds the new entries into that chunk.
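A minimal Go sketch of the chunk-building idea, assuming a simple "gzip lines until a size threshold, then flush" policy; Loki's actual chunk format and flush triggers are more involved than this.

```go
package main

import (
	"bytes"
	"compress/gzip"
	"fmt"
)

const maxChunkBytes = 256 * 1024 // illustrative flush threshold

type chunk struct {
	buf bytes.Buffer
	gz  *gzip.Writer
}

func newChunk() *chunk {
	c := &chunk{}
	c.gz = gzip.NewWriter(&c.buf)
	return c
}

// append adds one log line to the open chunk and reports whether the
// compressed chunk is "full" and should be flushed to object storage.
func (c *chunk) append(line string) (full bool, err error) {
	if _, err := c.gz.Write([]byte(line + "\n")); err != nil {
		return false, err
	}
	if err := c.gz.Flush(); err != nil { // make the size check honest
		return false, err
	}
	return c.buf.Len() >= maxChunkBytes, nil
}

// close finalises the gzip stream so the chunk can be written out.
func (c *chunk) close() ([]byte, error) {
	if err := c.gz.Close(); err != nil {
		return nil, err
	}
	return c.buf.Bytes(), nil
}

func main() {
	c := newChunk()
	full, _ := c.append(`level=error msg="something broke"`)
	data, _ := c.close()
	fmt.Printf("full=%v, chunk is %d compressed bytes\n", full, len(data))
}
```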

Querier

The read path is quite simple and has the querier doing most of the heavy lifting. Given a time-range and label selectors, it looks at the index to figure out which chunks match, and greps through them to give you the results. It also talks to the ingesters to get the recent data that has not been flushed yet.

Note that, right now, for each query a single querier greps through all the relevant logs for you. We’ve implemented query parallelisation in Cortex using a frontend, and the same can be extended to Loki to give a distributed grep, which will make even large queries snappy enough.

[Image: Loki, a look at the querier]
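The read path can be sketched in a few lines of Go, assuming a toy index that maps a label selector to chunk IDs and a regexp standing in for grep. Everything here is a hypothetical stand-in for the real index database, object store, and the merge with unflushed data from the ingesters.

```go
package main

import (
	"fmt"
	"regexp"
	"time"
)

type entry struct {
	ts   time.Time
	line string
}

// query greps the chunks that the index lists for this selector. Recent,
// unflushed data would be fetched from the ingesters and merged the same way.
func query(index map[string][]string, chunks map[string][]entry,
	selector string, from, through time.Time, filter *regexp.Regexp) []entry {
	var out []entry
	for _, id := range index[selector] {
		for _, e := range chunks[id] {
			if e.ts.Before(from) || e.ts.After(through) {
				continue
			}
			if filter == nil || filter.MatchString(e.line) {
				out = append(out, e)
			}
		}
	}
	return out
}

func main() {
	now := time.Now()
	index := map[string][]string{`{name="ingester"}`: {"chunk-1"}}
	chunks := map[string][]entry{"chunk-1": {
		{now.Add(-time.Minute), `level=error msg="oops"`},
		{now.Add(-time.Minute), `level=info msg="fine"`},
	}}
	hits := query(index, chunks, `{name="ingester"}`,
		now.Add(-time.Hour), now, regexp.MustCompile("level=error"))
	fmt.Println(len(hits), "matching lines")
}
```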

Scalability

Now let’s see if this scales.

  1. We’re putting the chunks into an object store and that scales.
  2. We put the index into Cassandra/Bigtable/DynamoDB, which again scales.
  3. The distributors and queriers are stateless components that you can scale horizontally.

The ingester is a stateful component, but we’ve built the full sharding and resharding lifecycle into it. When a rollout happens or when ingesters are scaled up or down, the ring topology changes and the ingesters redistribute their chunks to match the new topology. This is mostly code taken from Cortex, which has been running in production for more than two years.
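As a very rough illustration of the resharding idea, the sketch below re-hashes each stream a leaving ingester holds and records which ingester should own it under the new topology; the actual hand-off lifecycle in Cortex/Loki is considerably richer.

```go
package main

import "fmt"

// owner is a stand-in for "hash the stream's labels and look up the ring".
type owner func(streamLabels string) string

// handOff walks the streams an ingester holds in memory and reports where
// each one should live under the new ring topology.
func handOff(localStreams []string, me string, ownerOf owner) map[string]string {
	moves := map[string]string{}
	for _, stream := range localStreams {
		if newOwner := ownerOf(stream); newOwner != me {
			moves[stream] = newOwner // transfer the open chunks for this stream
		}
	}
	return moves
}

func main() {
	// Pretend the new ring assigns everything to ingester-1.
	ownerOf := func(string) string { return "ingester-1" }
	fmt.Println(handOff([]string{`{name="ingester"}`, `{name="querier"}`}, "ingester-0", ownerOf))
}
```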

Caveats

While all of this works conceptually, we expect to hit new issues and limitations as we grow. It should be super cheap to run, given all the data will be sitting in an object store like S3. But you would only be able to grep through the data. That might not be suitable for other use cases like alerting or building dashboards, which you’re better off doing with metrics.

Conclusion

Loki is very much alpha software and should not be used in production environments. We wanted to announce and release Loki as soon as possible to get feedback and contributions from the community and find out what’s working and what needs improvement. We believe this will help us deliver a higher quality and more on-point production release next year.

Loki can be run on-prem or as a free demo on Grafana Cloud. We urge you to give it a try and drop us a line and let us know what you think. Visit the Loki homepage to get started today.
