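We are introducing a new TensorFlow-based Rasa NLU pipeline. The new pipeline addresses two main problems that chatbot developers face:
- How do you go beyond the limitations of pre-trained embeddings?
- How do you build a chatbot that can understand multiple intents?
In this post, we take a thorough look at how the TensorFlow-based pipeline helps with the second problem: multiple intents. The result of this tutorial will be a very simple chatbot that can recommend meetups to attend in Berlin.
If you want to follow along, the code and dataset used in this tutorial can be found here.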
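What is the new TensorFlow pipeline?
The processing pipeline is the building block of any Rasa NLU model. The pipeline defines how user input is parsed and tokenized and how features are extracted. The components of the pipeline matter because they directly affect how the NLU model performs. Compared with the regular Rasa NLU pipelines, the new TensorFlow pipeline can train models that assign two or more intents to a single input message. For example, when a user says "Yes, make a booking. Can you also book me a taxi from the airport to the hotel?" there are two intents - a confirmation that the booking should be made and an additional request to book a taxi. We can model such inputs by assigning them a multi-intent label, which for the example above would be confirm+book_taxi.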
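To make the labelling convention concrete, here is a minimal Python sketch of how a combined label such as confirm+book_taxi breaks down into its component intents. The function name split_intent is hypothetical, and the snippet only illustrates the naming idea; it is not Rasa's internal code.

```python
from typing import List


def split_intent(label: str, split_symbol: str = "+") -> List[str]:
    """Split a combined intent label into its component intents."""
    # Empty fragments (for example from a trailing "+") are dropped.
    return [part.strip() for part in label.split(split_symbol) if part.strip()]


print(split_intent("confirm+book_taxi"))  # ['confirm', 'book_taxi']
print(split_intent("greet"))              # ['greet']
```

A single-intent label simply yields a one-element list, so the same helper works for both kinds of label.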
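Let's see how this is done in practice.
Building a meetup chatbot
I recently moved to Berlin, and I figured that joining meetups is the best way to meet new people in the area. That is why in this tutorial I decided to build a small chatbot that can recommend cool meetups to attend in Berlin. One disclaimer - for the sake of reproducibility, I am not going to use any fancy APIs, but I encourage you to play with the code, implement custom actions, connect to live meetup, location, or other APIs, and make this chatbot more fun!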
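- Defining the pipeline
Let's start with what this tutorial is all about - the pipeline. The code block below contains the pipeline configuration I will use for the chatbot (see the config.yml file). It consists of a processing component, CountVectorsFeaturizer, which defines how the model features are extracted (you can read more about its parameters here), and a second component, EmbeddingIntentClassifier, which states that we will use TensorFlow embeddings for intent classification. By setting the flag intent_tokenization_flag: true, we tell the model to split intent labels into tokens, which means the model will know which intents are multi-intents. With intent_split_symbol we define which character the labels are split on, in this case +.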
# Configuration for Rasa NLU.
# https://rasa.com/docs/rasa/nlu/components/
language: en
pipeline:
  - name: "CountVectorsFeaturizer"
  - name: "EmbeddingIntentClassifier"
    intent_tokenization_flag: true
    intent_split_symbol: "+"
# Configuration for Rasa Core.
# https://rasa.com/docs/rasa/core/policies/
policies:
  - name: MemoizationPolicy
    max_history: 5
  - name: KerasPolicy
    batch_size: 50
    epochs: 200
    max_training_samples: 300
  - name: MappingPolicy
  - name: FormPolicy
  - fallback_action_name: action_default_fallback
    name: FallbackPolicy
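To illustrate what intent_tokenization_flag and intent_split_symbol do to the labels, the following Python sketch tokenizes intent labels on the split symbol and builds a bag-of-tokens vector for each one. It is a simplified illustration under stated assumptions, not Rasa's actual implementation: the helper names and the sorted vocabulary ordering are made up for the example.

```python
from typing import Dict, List

INTENT_SPLIT_SYMBOL = "+"  # mirrors intent_split_symbol in config.yml


def build_vocabulary(labels: List[str]) -> Dict[str, int]:
    """Map every intent token to a column index (sorted for determinism)."""
    tokens = sorted(
        {tok for label in labels for tok in label.split(INTENT_SPLIT_SYMBOL)}
    )
    return {tok: idx for idx, tok in enumerate(tokens)}


def encode_label(label: str, vocabulary: Dict[str, int]) -> List[int]:
    """Return a multi-hot vector with a 1 for each token in the label."""
    vector = [0] * len(vocabulary)
    for tok in label.split(INTENT_SPLIT_SYMBOL):
        vector[vocabulary[tok]] = 1
    return vector


labels = [
    "affirm", "ask_transport", "affirm+ask_transport",
    "thanks", "goodbye", "thanks+goodbye",
]
vocabulary = build_vocabulary(labels)
for label in labels:
    print(f"{label:22s} {encode_label(label, vocabulary)}")
```

With this kind of encoding, affirm+ask_transport shares tokens with both affirm and ask_transport, which gives some intuition for why a tokenized label can be treated as a combination rather than as an unrelated class.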
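- NLU training data
What does the training data look like for a model that uses the TensorFlow pipeline? It is not much different from the regular approach - the only addition is that we have to add examples of multi-intent inputs and assign them the corresponding multi-intent labels. Below is a snippet of the training data I will use to train the NLU model (see the data/nlu_data.md file). As you can see, I have some regular examples with one intent per input, as well as examples with multiple intents assigned. For example, the input "Can you suggest any cool meetups in Berlin area?" has only one intent - the user is asking for meetup suggestions - which is why it has a single intent assigned to it. On the other hand, the input "Sounds good. Do you know how I could get there from home?" means two things - a confirmation that the user wants to join the meetup and a query about transport to the venue - which is why such examples carry the combined affirm+ask_transport intent.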
## intent: greet
- Hi
- hey
- heya
- Hello
- What's up
- Heya
- Greetings
- Good morning
- Good afternoon
- Good evening
- Hey sir
- Hi person
- Hey robot
- Hello bot
## intent: goodbye
- Bye
- Goodbye
- Talk to you later
- See you
- See you later
- Bye bye
- Bye for now
- Goodbye bot
## intent: affirm
- yes
- yup
- yes, that sounds good
- sure
- definitely
- absolutely
- please do
- yes, please
- yes for sure
## intent: deny
- no
- nope
- No, I don't think so
- Maybe not
- Not today
- No, I'm good
- No thanks
## intent: thanks
- thanks
- thank you
- Thanks a lot
- Thanks a bunch
- Thank you very much
- Thank you so much
## intent: thanks+goodbye
- Awesome. Talk to you later!
- Thanks. Bye for now
- Awesome, bye bye
- That's great. Goodbye.
- Perfect. Talk to you later
- Thanks! Goodbye
- Thank you very much. Talk to you later
- Thanks a lot. Bye for now
- Thanks bot. Goodbye
## intent: meetup
- I am new to the area. What meetups I could join in Berlin?
- I have just moved to Berlin. Can you suggest any cool meetups for me?
- I have moved to the area and would like to join some tech meetups. Any suggestions?
- I am new to London. Can you suggest any cool meetups I could attend?
- I have just arrived in Berlin. What meetups I could attend here?
- Looking for a tech meetup in the area.
- Are there any good AI meetups in Berlin?
## intent: ask_transport
- How do I get there?
- Can you tell me what is the easiest way to get to the venue?
- Tell me how should I get to the venue from home.
- Do you know how I should get to the venue of the meetup?
- Can you tell me how to get to the venue?
## intent: affirm+ask_transport
- Yes. How do I get there?
- Sounds good. Do you know how I could get there from home?
- Yes, please. How do I get there from work?
- Yes, sure! Can you also tell me what is the best way to get to the venue?
- Sure, sounds good. Can you tell the best way to get to the venue?
- Yes! Can you also recommend me the best way to get to the venue?
- Yes, sure. Also, can you tell me what is the fastest way to get to the venue?
- Sure. Can you suggests how should get there?
- Yes, thanks. I wonder, how should I get to the venue from work?
- Yes, definitely. I wonder how should I get to the venue?
- Definitely. Can you tell me how could I get to the venue of the meetup?
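Before training, it can be useful to check how many examples each label has and which labels are multi-intents. The sketch below is a small stand-alone parser for the Markdown layout shown above, written only for illustration: it does not use Rasa's own data loader, and the function name parse_nlu_markdown and the shortened SAMPLE text are assumptions.

```python
from typing import Dict, List

# A shortened copy of the training data format shown above.
SAMPLE = """\
## intent: thanks
- thanks
- thank you
## intent: affirm+ask_transport
- Yes. How do I get there?
- Sounds good. Do you know how I could get there from home?
"""


def parse_nlu_markdown(text: str) -> Dict[str, List[str]]:
    """Collect the example sentences listed under each '## intent:' header."""
    examples: Dict[str, List[str]] = {}
    current = None
    for raw_line in text.splitlines():
        line = raw_line.strip()
        if line.startswith("## intent:"):
            current = line[len("## intent:"):].strip()
            examples.setdefault(current, [])
        elif line.startswith("- ") and current is not None:
            examples[current].append(line[2:].strip())
    return examples


data = parse_nlu_markdown(SAMPLE)
multi_intents = [name for name in data if "+" in name]
for name, sentences in data.items():
    print(f"{name}: {len(sentences)} examples")
print("multi-intent labels:", multi_intents)
```

Running the same function over the full data/nlu_data.md contents would show at a glance whether each multi-intent label has enough examples of its own.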
- Training and testing the NLU model
Once the NLU data is ready, we can train the model by running the command below.
rasa train nlu
It calls the Rasa NLU training function, passes it the pipeline configuration and the data files, and prints out the training results.
Once the model is trained, we can test how it performs on various inputs. To do that, run:
rasa shell nlu
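Below is the model output for the input message "Yes. Can you give me suggestions on how to get there?". We can see that the input was classified as the multi-intent affirm+ask_transport, which, based on the training data, is what we would expect for this example.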
Next message:
Yes. Can you give me suggestions on how to get there?
{
  "intent": {
    "name": "affirm+ask_transport",
    "confidence": 0.9121424555778503
  },
  "entities": [],
  "intent_ranking": [
    {
      "name": "affirm+ask_transport",
      "confidence": 0.9121424555778503
    },
    {
      "name": "ask_transport",
      "confidence": 0.22044242918491364
    },
    {
      "name": "thanks+goodbye",
      "confidence": 0.022139914333820343
    },
    {
      "name": "thanks",
      "confidence": 0.0
    },
    {
      "name": "goodbye",
      "confidence": 0.0
    },
    {
      "name": "deny",
      "confidence": 0.0
    },
    {
      "name": "affirm",
      "confidence": 0.0
    },
    {
      "name": "greet",
      "confidence": 0.0
    },
    {
      "name": "meetup",
      "confidence": 0.0
    }
  ],
  "text": "Yes. Can you give me suggestions on how to get there?"
}
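If you want to work with such a result in your own code, the following Python sketch reads an abbreviated copy of the output above, splits the winning label into its tokens, and computes the gap to the runner-up. The helper names are hypothetical, and the snippet only handles the JSON structure shown here.

```python
import json
from typing import List, Optional

# Abbreviated copy of the parse result printed by the shell above.
RESULT = json.loads("""
{
  "intent": {"name": "affirm+ask_transport", "confidence": 0.9121424555778503},
  "entities": [],
  "intent_ranking": [
    {"name": "affirm+ask_transport", "confidence": 0.9121424555778503},
    {"name": "ask_transport", "confidence": 0.22044242918491364},
    {"name": "thanks+goodbye", "confidence": 0.022139914333820343}
  ],
  "text": "Yes. Can you give me suggestions on how to get there?"
}
""")


def top_intent_tokens(result: dict, split_symbol: str = "+") -> List[str]:
    """Split the winning intent label into its component intents."""
    return result["intent"]["name"].split(split_symbol)


def runner_up(result: dict) -> Optional[dict]:
    """Return the second-ranked intent, or None if there is only one."""
    ranking = sorted(
        result["intent_ranking"], key=lambda item: item["confidence"], reverse=True
    )
    return ranking[1] if len(ranking) > 1 else None


tokens = top_intent_tokens(RESULT)
second = runner_up(RESULT)
margin = RESULT["intent"]["confidence"] - second["confidence"]
print("component intents:", tokens)
print("runner-up:", second["name"], "margin:", round(margin, 3))
```

A large margin between the first and second candidates, as in this example, suggests the model clearly prefers the multi-intent label over the single ask_transport intent.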
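- Defining the domain and training data
To demonstrate how all the pieces fit together, we build a dialogue management model that uses a few templates as responses (as mentioned earlier, for the sake of reproducibility and simplicity, we will not use any live APIs or databases). The domain file contains the templates that the dialogue management model will use to respond to the user (see the domain.yml file):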
intents:
  - greet
  - goodbye
  - affirm
  - thanks
  - thanks+goodbye
  - meetup
  - affirm+ask_transport
  - ask_transport
  - deny
templates:
  utter_greet:
    - text: "Hey, how can I help you?"
  utter_goodbye:
    - text: "Talk to you later!"
    - text: "Goodbye :("
    - text: "Bye!"
    - text: "Have a great day!"
  utter_confirm:
    - text: "Done - I have just booked you a spot at the Bots Berlin meetup."
    - text: "Great, just made an RSVP for you."
  utter_meetup:
    - text: "Rasa Bots Berlin meetup is definitely worth checking out! They are having an event today at Behrenstraße 42. Would you like to join?"
  utter_affirm_suggest_transport:
    - text: "Great, I have just booked a spot for you. The venue is close to the Berlin Friedrichstraße station, you can get there by catching U-Bahn U6."
  utter_suggest_transport:
    - text: "The venue is close to the Berlin Friedrichstraße station, so the best option is to catch a U-Bahn U6."
  utter_thanks:
    - text: "You are very welcome."
    - text: "Glad I could help!"
  utter_deny:
    - text: "That's a shame. Let me know if you change your mind."
actions:
  - utter_greet
  - utter_goodbye
  - utter_confirm
  - utter_meetup
  - utter_affirm_suggest_transport
  - utter_suggest_transport
  - utter_thanks
  - utter_deny
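These templates will be used as responses to user inputs, depending on how they are used when the story data is created. We will look at that in more detail in the next section.
Before moving on, I want to point out that templates like utter_goodbye have more than one possible response. Adding alternatives like this is a good way to make the chatbot more interesting and to keep it from repeating the same answer in every conversation.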
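The effect of listing several variations can be pictured with a few lines of Python. This is only an illustration of the idea, assuming a variation is drawn at random for each turn; the function name pick_response is hypothetical and the snippet is not taken from Rasa's code.

```python
import random
from typing import List

# The four variations defined for utter_goodbye in domain.yml.
UTTER_GOODBYE: List[str] = [
    "Talk to you later!",
    "Goodbye :(",
    "Bye!",
    "Have a great day!",
]


def pick_response(variations: List[str], rng: random.Random) -> str:
    """Pick one of the available variations for a template."""
    return rng.choice(variations)


rng = random.Random()  # pass a seed, e.g. random.Random(0), for repeatable runs
for _ in range(3):
    print(pick_response(UTTER_GOODBYE, rng))
```

Each run may print a different mix of goodbyes, which is exactly the variety the extra template lines are meant to provide.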
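- Generating stories
As usual, to train the dialogue management model we need some stories. The new TensorFlow pipeline does not require any special format for the story data - we can use the multi-intents and single intents defined earlier together with the corresponding actions. In the table below you can find two very similar stories that will be used for our model - one with multi-intents and one with single intents (see the data/stories.md file).
The first story has two multi-intents - affirm+ask_transport, which corresponds to a user saying "Yes, book me a spot at the meetup. Also, can you tell me how should I get to the venue?", and thanks+goodbye, which corresponds to the user saying "Thank you. Talk to you later". The second story represents a very similar conversation but uses only single intents. Compared with the second story, the first one reflects a more organic, human-like conversation.
Another thing to emphasize is that there are several different ways to write stories with multi-intents. The table below shows different representations of the same conversation:
Story find_meetup_01 uses a dedicated action, utter_affirm_suggest_transport, as the response to the multi-intent affirm+ask_transport. Alternatively, as in find_meetup_03, we can write the story with two separate templates - utter_confirm and utter_suggest_transport - which can also be used as responses to single-intent inputs. Another important note about stories with multi-intents is that it is not necessary to have an action for every token of a multi-intent. For example, story find_meetup_03 has two actions as the response to the multi-intent thanks+goodbye, but, as in story find_meetup_04, it is perfectly fine to skip the action for one of the tokens.
Deciding which approach is best depends heavily on the domain and logic of the chatbot - in some cases there is no need at all to create separate actions for multi-intents, and you can use the same actions as responses to both multi-intents and single intents. Before building a chatbot, it is always good practice to determine which actions the chatbot really needs in order to hold a natural conversation.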
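Whichever style you choose, every intent and action used in a story has to exist in the domain. The Python sketch below encodes only the story turns described in the text and checks them against the intents and actions from domain.yml. It is a hypothetical helper, not part of Rasa, and the two actions listed for thanks+goodbye are an assumption based on the templates defined in the domain; the complete stories live in data/stories.md.

```python
from typing import Dict, List, Set, Tuple

Turn = Tuple[str, List[str]]  # (user intent, bot actions that follow it)

DOMAIN_INTENTS: Set[str] = {
    "greet", "goodbye", "affirm", "thanks", "thanks+goodbye",
    "meetup", "affirm+ask_transport", "ask_transport", "deny",
}
DOMAIN_ACTIONS: Set[str] = {
    "utter_greet", "utter_goodbye", "utter_confirm", "utter_meetup",
    "utter_affirm_suggest_transport", "utter_suggest_transport",
    "utter_thanks", "utter_deny",
}

# Only the turns discussed in the text, not the full stories.
STORY_TURNS: Dict[str, List[Turn]] = {
    "find_meetup_01": [
        ("affirm+ask_transport", ["utter_affirm_suggest_transport"]),
    ],
    "find_meetup_03": [
        ("affirm+ask_transport", ["utter_confirm", "utter_suggest_transport"]),
        # Assumed pair of actions for this multi-intent.
        ("thanks+goodbye", ["utter_thanks", "utter_goodbye"]),
    ],
}


def check_stories(
    stories: Dict[str, List[Turn]], intents: Set[str], actions: Set[str]
) -> List[str]:
    """Return a message for every intent or action missing from the domain."""
    problems: List[str] = []
    for name, turns in stories.items():
        for intent, turn_actions in turns:
            if intent not in intents:
                problems.append(f"{name}: unknown intent '{intent}'")
            for action in turn_actions:
                if action not in actions:
                    problems.append(f"{name}: unknown action '{action}'")
    return problems


problems = check_stories(STORY_TURNS, DOMAIN_INTENTS, DOMAIN_ACTIONS)
print(problems or "all story turns are consistent with the domain")
```

Both representations of the affirm+ask_transport turn pass the check, which reflects the point made above: a combined action and two separate actions are equally valid as long as the domain declares them.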
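- Training the dialogue management model and testing the bot
To train the models, we can use the command below, which trains both the NLU and Core models and saves them as a single tar.gz file. You can specify training parameters, such as the number of epochs, in the config.yml file.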
rasa train
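Once training is finished, it is time for the most exciting part - testing your bot. We can start it from the command line with the command below, which loads the dialogue management model and the NLU model and launches the bot in the terminal.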
rasa shell
Here is an actual conversation with the chatbot:
Bot loaded. Type a message and press enter (use '/stop' to exit):
Your input -> Hello
Hey, how can I help you?
Your input -> I am new to Berlin and I would like to join some meetups. Any suggestions?
Rasa Bots Berlin meetup is definitely worth checking out! They are having an event today at Behrenstraße 42. Would you like to join?
Your input -> Sure! Can you also tell me how can I get to the venue?
Done - I have just booked you a spot at the Bots Berlin meetup.
The venue is close to the Berlin Friedrichstraße station, so the best option is to catch a U-Bahn U6.
Your input -> Thanks a lot. Talk to you later.
You are very welcome.
Talk to you later!
Your input -> /stop
2019-09-19 16:13:33 INFO root - Killing Sanic server now.