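This post is my reading notes on Chapter 2 of "Go in Action".
Chapter 2 walks through a sample application written in Go.
Q: What does the application do?
A: In short, it reads RSS feed sources from a configuration file, fetches the content of each feed, searches that content for a term, and displays the results.
Q: What is the file layout?
A: The files are laid out as follows: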
sample/                # directory layout
├── data
│   └── data.json      # RSS feed addresses, stored as JSON
├── main.go            # program entry point
├── matchers           # matchers; rss is one feed type, more can be added later
│   └── rss.go
└── search             # core logic
    ├── default.go
    ├── feed.go
    ├── match.go
    └── search.go
Each file is analyzed in turn below.
main.go
First, the contents:
package main
import (
"log"
"os"
_ "sample/matchers"
"sample/search"
)
func init() {
log.SetOutput(os.Stdout)
}
// main is the entry point for the program
func main() {
search.Run("president")
}
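A few points:
- Every executable has an entry point, which here is the main function. It must be declared in package main; in any other package the code is not built into an executable.
- The _ in front of the "sample/matchers" import means the package is imported without being referenced explicitly. It is imported only so that its init function runs.
- init functions run before main.
- The log package writes to stderr by default; the init function here switches it to stdout.
- In GOPATH mode, which this sample uses, the compiler looks for imported packages under the directories named by the GOROOT and GOPATH environment variables.
- main calls search.Run, passing "president" as the search term.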
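The ordering and logging points above can be checked in a single file. This is only a minimal sketch, not the book's code: the events variable and the captureLog helper are names made up for illustration, and the log output goes to an in-memory buffer instead of os.Stdout so the result can be inspected.

```go
package main

import (
	"bytes"
	"fmt"
	"log"
)

// events records the order in which the runtime calls our functions.
// (Hypothetical variable, not part of the sample application.)
var events []string

// init runs automatically after package-level variables are initialized
// and before main starts.
func init() {
	events = append(events, "init")
}

// captureLog temporarily redirects the standard logger to a buffer,
// returns what log.Print wrote, and restores the previous settings.
func captureLog(msg string) string {
	var buf bytes.Buffer
	prevOut, prevFlags := log.Writer(), log.Flags()
	defer func() {
		log.SetOutput(prevOut)
		log.SetFlags(prevFlags)
	}()
	log.SetOutput(&buf) // the sample makes the same call with os.Stdout
	log.SetFlags(0)     // drop the timestamp so the output is predictable
	log.Print(msg)
	return buf.String()
}

func main() {
	events = append(events, "main")
	fmt.Println(events)
	fmt.Printf("%q\n", captureLog("hello"))
}
```

Because init always runs first, "init" is recorded before "main" without any explicit call. The same mechanism is what lets a blank import register a matcher before main starts.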
data.json
This file holds the addresses and names of the RSS feeds:
[
{
"site" : "npr",
"link" : "http://www.npr.org/rss/rss.php?id=1001",
"type" : "rss"
},
{
"site" : "npr",
"link" : "http://www.npr.org/rss/rss.php?id=1008",
"type" : "rss"
},
{
"site" : "npr",
"link" : "http://www.npr.org/rss/rss.php?id=1006",
"type" : "rss"
},
{
"site" : "npr",
"link" : "http://www.npr.org/rss/rss.php?id=1007",
"type" : "rss"
},
{
"site" : "npr",
"link" : "http://www.npr.org/rss/rss.php?id=1057",
"type" : "rss"
},
{
"site" : "npr",
"link" : "http://www.npr.org/rss/rss.php?id=1021",
"type" : "rss"
},
{
"site" : "npr",
"link" : "http://www.npr.org/rss/rss.php?id=1012",
"type" : "rss"
},
{
"site" : "npr",
"link" : "http://www.npr.org/rss/rss.php?id=1003",
"type" : "rss"
},
{
"site" : "npr",
"link" : "http://www.npr.org/rss/rss.php?id=2",
"type" : "rss"
},
{
"site" : "npr",
"link" : "http://www.npr.org/rss/rss.php?id=3",
"type" : "rss"
}
]
It is a JSON array in which each element has three fields: site, link, and type.
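As a rough illustration of how such an array maps onto Go values, the sketch below decodes a shortened two-entry copy of the file from a string. The Feed struct mirrors the one in feed.go (covered next); the sample constant and the decodeFeeds helper are assumptions made for this example, and the real program reads from a file rather than a string.

```go
package main

import (
	"encoding/json"
	"fmt"
	"strings"
)

// Feed mirrors the struct in feed.go: the tags map JSON keys to fields.
type Feed struct {
	Name string `json:"site"`
	URI  string `json:"link"`
	Type string `json:"type"`
}

// sample is a shortened, two-entry copy of data.json (for illustration).
const sample = `[
  {"site": "npr", "link": "http://www.npr.org/rss/rss.php?id=1001", "type": "rss"},
  {"site": "npr", "link": "http://www.npr.org/rss/rss.php?id=1008", "type": "rss"}
]`

// decodeFeeds parses a JSON array of feed objects from a string.
// (Hypothetical helper; RetrieveFeeds in the sample reads a file instead.)
func decodeFeeds(data string) ([]*Feed, error) {
	var feeds []*Feed
	err := json.NewDecoder(strings.NewReader(data)).Decode(&feeds)
	return feeds, err
}

func main() {
	feeds, err := decodeFeeds(sample)
	if err != nil {
		fmt.Println("decode error:", err)
		return
	}
	for _, f := range feeds {
		fmt.Println(f.Name, f.Type, f.URI)
	}
}
```

Each JSON object becomes one *Feed, with the site key landing in Name and the link key in URI because of the struct tags.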
feed.go
A Feed represents a single RSS source. Here is the source:
package search
import (
"encoding/json"
"log"
"os"
)
const dataFile = "data/data.json"
// Feed contains information we need to process a feed.
type Feed struct {
Name string `json:"site"`
URI string `json:"link"`
Type string `json:"type"`
}
// RetrieveFeeds reads and unmarshals the feed data file
func RetrieveFeeds() ([]*Feed, error) {
file, err := os.Open(dataFile)
if err != nil {
return nil, err
}
defer file.Close()
var feeds []*Feed
err = json.NewDecoder(file).Decode(&feeds)
log.Printf("Retrieve feeds result: %v\n", feeds)
return feeds, err
}
Notes:
- The package name is search, matching the directory name.
- encoding/json is imported to parse JSON.
- os is imported to read the file.
- const declares a constant; note that it uses =, not :=.
- The Feed type starts with an uppercase letter, so it is exported and usable from other packages.
- Every field of Feed has a tag that tells the json package which JSON property maps to that field.
- RetrieveFeeds loads the feeds. It takes no arguments and returns a slice of Feed pointers plus an error.
- os.Open opens the file.
- defer schedules file.Close to run when the function returns. The book puts it this way:
The keyword defer is used to schedule a function call to be executed right after a function returns. It’s our responsibility to close the file once we’re done with it. By using the keyword defer to schedule the call to the close method, we can guarantee that the method will be called. This will happen even if the function panics and terminates unexpectedly.
In other words, the deferred call still runs when the function terminates abnormally.
- json.NewDecoder(file) creates a Decoder, and its Decode method writes the values from the JSON file into feeds.
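The claim that a deferred call runs even when the function panics is easy to verify. The following is a small sketch under stated assumptions: process is a hypothetical function, and the strings "opened" and "closed" stand in for real file operations.

```go
package main

import "fmt"

// process simulates opening a resource, working with it, and closing it.
// (Hypothetical function; the strings stand in for real file operations.)
func process(fail bool) (steps []string) {
	// Registered first, so it runs last: turns a panic into a normal return.
	defer func() {
		if r := recover(); r != nil {
			steps = append(steps, "recovered")
		}
	}()
	// Registered second, so it runs first: the stand-in for file.Close().
	defer func() {
		steps = append(steps, "closed")
	}()

	steps = append(steps, "opened")
	if fail {
		panic("something went wrong")
	}
	steps = append(steps, "done")
	return steps
}

func main() {
	fmt.Println(process(false))
	fmt.Println(process(true))
}
```

The first call yields opened, done, closed; the second yields opened, closed, recovered. In both cases "closed" is recorded, which is the guarantee the book describes. Deferred calls run in last-in, first-out order, which is why the close step runs before the recover step.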
default.go
The source first:
package search
// defaultMatcher implements the default matcher.
type defaultMatcher struct{}
func init() {
var matcher defaultMatcher
Register("default", matcher)
}
// Search implements the behavior for the default matcher.
func (m defaultMatcher) Search(feed *Feed, searchTerm string) ([]*Result, error) {
return nil, nil
}
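Notes:
- The file sits in the search directory, so its package is still search.
- The defaultMatcher type starts with a lowercase letter, so it is not exported.
- The init function calls Register to register a matcher under the name default.
- Register is called without an import because both live in the same package.
- Search is defined as a method on defaultMatcher. Its signature matches the method in the Matcher interface declared in match.go, so defaultMatcher implements Matcher.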
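Go has no implements keyword, so the last point deserves a small demonstration. The sketch below is a simplified stand-in, not the sample's code: the Matcher interface here takes only a term, and echoMatcher and lookup are hypothetical names added to show registration in init and the fallback to the default matcher.

```go
package main

import "fmt"

// Matcher is a simplified stand-in for the interface in match.go.
type Matcher interface {
	Search(term string) []string
}

// defaultMatcher has no fields; it exists only to carry the method.
type defaultMatcher struct{}

// Search gives defaultMatcher the method set that Matcher requires.
func (defaultMatcher) Search(term string) []string { return nil }

// echoMatcher is a second, hypothetical implementation for comparison.
type echoMatcher struct{}

func (echoMatcher) Search(term string) []string { return []string{term} }

// Compile-time check: this line fails to build if the method is missing.
var _ Matcher = defaultMatcher{}

// matchers is a package-level registry keyed by feed type.
var matchers = map[string]Matcher{}

// init registers both matchers before main runs.
func init() {
	matchers["default"] = defaultMatcher{}
	matchers["echo"] = echoMatcher{}
}

// lookup returns the matcher registered for feedType, or the default one.
func lookup(feedType string) Matcher {
	if m, ok := matchers[feedType]; ok {
		return m
	}
	return matchers["default"]
}

func main() {
	fmt.Println(lookup("echo").Search("go"))
	fmt.Println(lookup("unknown").Search("go"))
}
```

Neither type mentions Matcher in its declaration; having a Search method with the right signature is enough. An unknown feed type falls back to the default matcher, whose Search returns nothing.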
match.go
The source first:
package search
import "log"
// Result contains the result of a search
type Result struct {
Field string
Content string
}
// Matcher defines the behavior required by types that want
// to implement a new search type
type Matcher interface {
Search(feed *Feed, searchTerm string) ([]*Result, error)
}
// Match is launched as a goroutine for each individual feed to run
// searches concurrently
func Match(matcher Matcher, feed *Feed, searchTerm string, results chan<- *Result) {
searchResults, err := matcher.Search(feed, searchTerm)
if err != nil {
log.Println(err)
return
}
for _, result := range searchResults {
results <- result
}
}
// Display writes results to the console window as they
// are received by the individual goroutines
func Display(results chan *Result) {
// The channel blocks until a result is written to the channel.
// Once the channel is closed the for loop terminates.
for result := range results {
log.Printf("%s:\n%s\n\n", result.Field, result.Content)
}
}
Notes:
- Result is the result type; it has two string fields.
- The Matcher interface defines the search behavior. Its Search method takes a feed and a search term and returns a slice of Result pointers plus an error. Why a slice? Because the term may appear in several places within one feed.
- Match calls Search on the matcher it receives, then loops over the returned slice and sends each Result to the results channel.
- Display ranges over the results channel and logs each result.
- Note the := operator: it declares a variable and initializes it in one step.
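Match receives a send-only channel (chan<-) and Display ranges over a channel until it is closed. The sketch below isolates those two ideas with plain strings. The names produce, collect, and run are made up for this example, and unlike the sample, the single producer closes the channel itself; in the sample the close happens in search.go.

```go
package main

import "fmt"

// produce sends each word on a send-only channel and then closes it.
// (In the sample, Match only sends; the close happens elsewhere.)
func produce(words []string, out chan<- string) {
	for _, w := range words {
		out <- w // blocks until the receiver is ready (unbuffered channel)
	}
	close(out)
}

// collect receives from a receive-only channel until it is closed.
func collect(in <-chan string) []string {
	var got []string
	for w := range in { // the loop ends when the channel is closed
		got = append(got, w)
	}
	return got
}

// run wires one producer goroutine to a collector in the caller.
func run(words []string) []string {
	ch := make(chan string) // unbuffered, like results in the sample
	go produce(words, ch)
	return collect(ch)
}

func main() {
	fmt.Println(run([]string{"a", "b", "c"}))
}
```

With one producer the values arrive in the order they were sent. The direction markers in the signatures let the compiler reject a receive inside produce or a send inside collect.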
search.go
The source first:
package search
import (
"log"
"sync"
)
var matchers = make(map[string]Matcher)
// Run performs the search logic.
func Run(searchTerm string) {
feeds, err := RetrieveFeeds()
if err != nil {
log.Fatal(err)
}
// Create an unbuffered channel to receive match results to display
results := make(chan *Result)
// Setup a wait group so we can process all the feeds
var waitGroup sync.WaitGroup
// Set the number of go routines we need to wait for while
// they process the individual feeds.
waitGroup.Add(len(feeds))
// Launch a goroutine for each feed to find the results.
for _, feed := range feeds {
// Retrieve a matcher for the search.
matcher, exists := matchers[feed.Type]
if !exists {
matcher = matchers["default"]
}
// Launch the goroutine to perform the search
go func(matcher Matcher, feed *Feed) {
Match(matcher, feed, searchTerm, results)
waitGroup.Done()
}(matcher, feed)
}
// Launch a goroutine to monitor when all the work is done.
go func() {
waitGroup.Wait()
//Close the channel to signal to the Display
// function that we can exit the program
close(results)
}()
Display(results)
}
// Register is called to register a matcher for use by the program.
func Register(feedType string, matcher Matcher) {
if _, exists := matchers[feedType]; exists {
log.Fatalln(feedType, "Matcher already registered")
}
log.Println("Register", feedType, "matcher")
matchers[feedType] = matcher
}
Notes:
- var matchers = make(map[string]Matcher) creates a map with string keys and Matcher values. Matcher is the interface defined in match.go. matchers is declared outside any function, so it is a package-level variable; Register adds key/value pairs to it.
- Next comes Run, the function that main calls.
- RetrieveFeeds is called to load the feeds.
- log.Fatal logs the message and then terminates the program.
- results := make(chan *Result) creates an unbuffered channel of Result pointers.
- A WaitGroup is created to count goroutines: each goroutine decrements the counter by one when it finishes.
- The loop over feeds looks up the Matcher for each feed's type in the matchers map, falling back to the default matcher when the type is not registered.
- go func(){}() starts a goroutine, one per feed. The function is anonymous and acts as a closure; all of these closures share the same results variable (as well as searchTerm and waitGroup).
- Each goroutine calls Match from match.go to run the search, then calls waitGroup.Done().
- One more goroutine blocks in waitGroup.Wait() and closes the results channel once Wait returns.
- Display is called with the results channel. Display is defined in match.go.
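The concurrency shape of Run (one goroutine per input, a WaitGroup, and a separate goroutine that closes the channel) can be reduced to a few lines. This is a sketch under simplifying assumptions: searchAll is a hypothetical function, the inputs are plain strings instead of feeds, and strings.Contains replaces the real matcher.

```go
package main

import (
	"fmt"
	"sort"
	"strings"
	"sync"
)

// searchAll starts one goroutine per text and collects every text that
// contains term. (Hypothetical stand-in for Run, Match, and Display.)
func searchAll(texts []string, term string) []string {
	results := make(chan string) // unbuffered, shared by all goroutines
	var wg sync.WaitGroup
	wg.Add(len(texts))

	for _, text := range texts {
		// Pass text as an argument so each goroutine gets its own copy.
		go func(text string) {
			defer wg.Done()
			if strings.Contains(text, term) {
				results <- text
			}
		}(text)
	}

	// Close the channel only after every worker has called Done.
	go func() {
		wg.Wait()
		close(results)
	}()

	// Receiving in the caller keeps it blocked until the channel closes.
	var found []string
	for r := range results {
		found = append(found, r)
	}
	sort.Strings(found) // arrival order is not deterministic
	return found
}

func main() {
	texts := []string{"president elected", "weather", "vice president"}
	fmt.Println(searchAll(texts, "president"))
}
```

The closing goroutine is needed because the workers cannot know which of them is last, and the receiver cannot call Wait itself without risking a deadlock: the workers block on the unbuffered send until someone receives.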
rss.go
Source:
package matchers
import (
"encoding/xml"
"errors"
"fmt"
"log"
"net/http"
"regexp"
"sample/search"
)
type (
// item defines the fields associated with the item tag
// in the rss document.
item struct {
XMLName xml.Name `xml:"item"`
PubDate string `xml:"pubDate"`
Title string `xml:"title"`
Description string `xml:"description"`
Link string `xml:"link"`
GUID string `xml:"guid"`
GeoRssPoint string `xml:"georss:point"`
}
// image defines the fields associated with the image tag
// in the rss document.
image struct {
XMLName xml.Name `xml:"image"`
URL string `xml:"url"`
Title string `xml:"title"`
Link string `xml:"link"`
}
// channel defines the fields associated with the channel tag
// in the rss document.
channel struct {
XMLName xml.Name `xml:"channel"`
Title string `xml:"title"`
Description string `xml:"description"`
Link string `xml:"link"`
PubDate string `xml:"pubDate"`
LastBuildDate string `xml:"lastBuildDate"`
TTL string `xml:"ttl"`
Language string `xml:"language"`
ManagingEditor string `xml:"managingEditor"`
WebMaster string `xml:"webMaster"`
Image image `xml:"image"`
Item []item `xml:"item"`
}
// rssDocument defines the fields associated with the rss document.
rssDocument struct {
XMLName xml.Name `xml:"rss"`
Channel channel `xml:"channel"`
}
)
// rssMatcher implements the Matcher interface
type rssMatcher struct{}
// init registers the matcher with the program
func init() {
var matcher rssMatcher
log.Println("register rss matcher")
search.Register("rss", matcher)
}
func (m rssMatcher) Search(feed *search.Feed, searchTerm string) ([]*search.Result, error) {
var results []*search.Result
log.Printf("Search Feed Type[%s] Site[%s] For URI[%s]\n", feed.Type, feed.Name, feed.URI)
// Retrieve the data to search.
document, err := m.retrieve(feed)
if err != nil {
return nil, err
}
for _, channelItem := range document.Channel.Item {
// Check the title for the search term.
matched, err := regexp.MatchString(searchTerm, channelItem.Title)
if err != nil {
return nil, err
}
// If we found a match save the result
if matched {
results = append(results, &search.Result{
Field: "Title",
Content: channelItem.Title, // note the trailing comma here; it is easy to forget
})
}
// Check the description for the search Item
matched, err = regexp.MatchString(searchTerm, channelItem.Description)
if err != nil {
return nil, err
}
if matched {
results = append(results, &search.Result{
Field: "Description",
Content: channelItem.Description,
})
}
}
return results, nil
}
func (m rssMatcher) retrieve(feed *search.Feed) (*rssDocument, error) {
if feed.URI == "" {
return nil, errors.New("No rss feed uri provided")
}
resp, err := http.Get(feed.URI)
if err != nil {
return nil, err
}
defer resp.Body.Close()
if resp.StatusCode != 200 {
return nil, fmt.Errorf("HTTP Response Error %d", resp.StatusCode)
}
var document rssDocument
err = xml.NewDecoder(resp.Body).Decode(&document)
return &document, err
}
Notes:
- A type ( ... ) block declares four types: rssDocument, channel, image, and item. rssDocument contains a channel, and channel contains an image and a slice of items.
- rssMatcher is declared next; it implements Matcher through its Search method.
- The init function calls Register to register the matcher.
- retrieve starts with a lowercase letter, so it is not exported.
- Search first calls retrieve, which sends the HTTP request and decodes the response into an rssDocument. It then loops over the items in the channel and matches the title and description of each against the search term as a regular expression. Every match becomes a Result that is appended to the results slice.
- Finally the results slice is returned.
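The decode-then-match steps described above can be tried without any network access. The sketch below is an approximation rather than the sample's code: the structs are trimmed to the fields it needs, feedXML is an invented two-item document, matchTitles is a hypothetical helper that checks titles only, and the pattern is compiled once with regexp.Compile instead of calling regexp.MatchString per item.

```go
package main

import (
	"encoding/xml"
	"fmt"
	"regexp"
	"strings"
)

// Trimmed versions of the types in rss.go, keeping only what is needed.
type item struct {
	Title       string `xml:"title"`
	Description string `xml:"description"`
}

type channel struct {
	Items []item `xml:"item"`
}

type rssDocument struct {
	XMLName xml.Name `xml:"rss"`
	Channel channel  `xml:"channel"`
}

// feedXML is an invented document standing in for an HTTP response body.
const feedXML = `<rss version="2.0"><channel>
  <title>Example</title>
  <item><title>President speaks</title><description>A speech.</description></item>
  <item><title>Weather</title><description>Rain expected.</description></item>
</channel></rss>`

// matchTitles decodes doc and returns the item titles matching pattern.
// (Hypothetical helper; the sample also checks descriptions.)
func matchTitles(doc, pattern string) ([]string, error) {
	var d rssDocument
	if err := xml.NewDecoder(strings.NewReader(doc)).Decode(&d); err != nil {
		return nil, err
	}
	re, err := regexp.Compile(pattern) // compile once, reuse for every item
	if err != nil {
		return nil, err
	}
	var titles []string
	for _, it := range d.Channel.Items {
		if re.MatchString(it.Title) {
			titles = append(titles, it.Title)
		}
	}
	return titles, nil
}

func main() {
	titles, err := matchTitles(feedXML, "President")
	fmt.Println(titles, err)
}
```

Since the search term is treated as a regular expression, matching is case-sensitive unless the pattern says otherwise (for example with a (?i) prefix), and an invalid pattern such as a lone ( produces an error instead of a result.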
Summary
A few takeaways:
- All the files live in $GOPATH/src/sample. Only then can they be imported as sample/...
- To run the program, cd into the sample directory and use go run .
- Every file under the search directory belongs to package search.
- The overall flow is:
  - When each package is initialized, its init function calls Register in search.go, which stores the matcher in a map.
  - main then calls Run in search.go.
  - Run loads the feeds from data.json.
  - It loops over the feeds and starts one goroutine per feed.
  - Each goroutine runs the rss matcher and writes its results to the results channel.
  - Display in match.go shows the results.
- Display ranges over the channel, which blocks while no result is available, so the goroutine running main blocks too and the program does not exit right away. The range loop ends only when the channel is closed, and then main returns. If the main goroutine did not block, main would return immediately and every other goroutine would be terminated along with it.