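More Complex User Input
This chapter is about analyzing user input, which has a bit of an artificial-intelligence flavor to it, haha.
When a user types a command, open door and open the door should mean the same thing. Now we leave that judgment to the program.
First, look at how an English sentence is put together:
A sentence is made up of words.
Words are separated from one another by spaces.
Words include verbs, nouns, modifiers, numbers, and so on.
The meaning of a sentence is governed by its grammar.
So to analyze a sentence, first split it into words, then work out the type of each word, and finally reassemble the result into a command.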
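As a rough illustration of why splitting into words helps, the sketch below shows the two commands reducing to the same word list once filler words are dropped. The names STOP_WORDS and normalize are hypothetical and are not part of the exercise.

```python
# Hypothetical stop-word set, used only for this illustration.
STOP_WORDS = {"the", "in", "of"}


def normalize(command):
    """Split a command on whitespace and drop the stop words."""
    words = command.split()
    return [w for w in words if w not in STOP_WORDS]


print(normalize("open door"))      # ['open', 'door']
print(normalize("open the door"))  # ['open', 'door']
```

Both calls produce the same list, so a program comparing the two commands after this step would treat them as equal.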
Get the user input and split it into words
stuff = raw_input('> ')
words = stuff.split()  # returns a list
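A small sketch of how split() behaves when called without arguments, using a fixed string in place of live input so it can be run anywhere. Note that raw_input exists only in Python 2; in Python 3 the equivalent call is input.

```python
# A fixed string stands in for what the user might type.
stuff = "  go   north\tnow "

# With no argument, split() breaks on any run of whitespace
# and ignores leading and trailing whitespace.
words = stuff.split()
print(words)       # ['go', 'north', 'now']

# An empty line yields an empty list rather than [''].
print("".split())  # []
```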
Analyze the word types
Use (type, word) tuples to store each word together with its type.
first_word = ('direction','north')
second_word = ('verb','go')
sentence = [first_word,second_word]
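One way such a list might be consumed later, shown as a minimal sketch: each tuple unpacks into its type and its word, which makes filtering by type straightforward. The variable names word_type and verbs are chosen here for illustration.

```python
first_word = ('direction', 'north')
second_word = ('verb', 'go')
sentence = [first_word, second_word]

# Unpack each (type, word) pair while iterating.
for word_type, word in sentence:
    print("%s is a %s" % (word, word_type))

# Collect only the words tagged as verbs.
verbs = [word for word_type, word in sentence if word_type == 'verb']
print(verbs)  # ['go']
```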
Unit tests
The book provides the test cases:
from nose.tools import *
from EX48 import lexicon

def test_directions():
    assert_equal(lexicon.scan("north"), [('direction', 'north')])
    result = lexicon.scan("north south east")
    assert_equal(result, [('direction', 'north'),
                          ('direction', 'south'),
                          ('direction', 'east')])

def test_verbs():
    assert_equal(lexicon.scan("go"), [('verb', 'go')])
    result = lexicon.scan("go kill eat")
    assert_equal(result, [('verb', 'go'),
                          ('verb', 'kill'),
                          ('verb', 'eat')])

def test_stops():
    assert_equal(lexicon.scan("the"), [('stop', 'the')])
    result = lexicon.scan("the in of")
    assert_equal(result, [('stop', 'the'),
                          ('stop', 'in'),
                          ('stop', 'of')])

def test_nouns():
    assert_equal(lexicon.scan("bear"), [('noun', 'bear')])
    result = lexicon.scan("bear princess")
    assert_equal(result, [('noun', 'bear'),
                          ('noun', 'princess')])

def test_numbers():
    assert_equal(lexicon.scan('1234'), [('number', 1234)])
    result = lexicon.scan("3 91234")
    assert_equal(result, [('number', 3),
                          ('number', 91234)])

def test_errors():
    assert_equal(lexicon.scan('ASDFADFASDF'), [('error', 'ASDFADFASDF')])
    result = lexicon.scan("bear IAS princess")
    assert_equal(result, [('noun', 'bear'),
                          ('error', 'IAS'),
                          ('noun', 'princess')])
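Write the lexicon scanner based on the test cases.
The assert_equal calls tell us the following:
lexicon has a scan function that takes a string argument.
The word types are 'direction', 'number', 'noun', 'stop', 'verb', and 'error'.
Add one more type named 'unkown' (the spelling used in the code) to collect words that are not in the predefined vocabulary.
The return value of scan is a list whose elements are (type, word) tuples.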
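The lexicon scanner
There should be predefined lists that hold the common words for each type.
After getting the user input, split it into words, compare each word against the predefined lists to find its type, and return a list of (type, word) tuples.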
def scan(stuff):
    sentence = []
    directions = ['north','south','east']
    verbs = ['go','kill','eat']
    stops = ['in','of','the']
    nouns = ['bear','princess']
    numbers = [3,91234,1234]
    errors = ['IAS','ASDFADFASDF']
    words = stuff.split()
    for word in words:
        if word in directions:
            sentence.append(('direction',word))
        elif word in verbs:
            sentence.append(('verb',word))
        elif word in stops:
            sentence.append(('stop',word))
        elif word in nouns:
            sentence.append(('noun',word))
        elif word in errors:
            sentence.append(('error',word))
        elif int(word) in numbers:
            sentence.append(('number',int(word)))
        else:
            sentence.append(('unkown',word))
    return sentence
Run nosetests
damao@damao:~/Documents/ex48$ nosetests
.........
----------------------------------------------------------------------
Ran 9 tests in 0.005s

OK
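This scanner can be improved. As written, int(word) raises ValueError for any word that is neither in the lists nor a number, such as open. Trying the integer conversion first and catching ValueError avoids that: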
def scan(stuff):
    sentence = []
    directions = ['north','south','east']
    verbs = ['go','kill','eat']
    stops = ['in','of','the']
    nouns = ['bear','princess']
    numbers = [3,91234,1234]
    errors = ['IAS','ASDFADFASDF']
    words = stuff.split()
    for word in words:
        try:
            intword = int(word)
            sentence.append(('number',intword))
        except ValueError:
            if word in directions:
                sentence.append(('direction',word))
            elif word in verbs:
                sentence.append(('verb',word))
            elif word in stops:
                sentence.append(('stop',word))
            elif word in nouns:
                sentence.append(('noun',word))
            elif word in errors:
                sentence.append(('error',word))
            else:
                sentence.append(('unkown',word))
    return sentence

print scan("go north")
print scan("kill the princess")
print scan("eat the bear")
print scan("open the door and smack the bear in the nose")
print scan("open 1234 door")
Output when the file is run on its own
damao@damao:~/Documents/ex48/EX48$ python lexicon.py
[('verb', 'go'), ('direction', 'north')]
[('verb', 'kill'), ('stop', 'the'), ('noun', 'princess')]
[('verb', 'eat'), ('stop', 'the'), ('noun', 'bear')]
[('unkown', 'open'), ('stop', 'the'), ('unkown', 'door'), ('unkown', 'and'), ('unkown', 'smack'), ('stop', 'the'), ('noun', 'bear'), ('stop', 'in'), ('stop', 'the'), ('unkown', 'nose')]
[('unkown', 'open'), ('number', 1234), ('unkown', 'door')]
The list of tuples is printed correctly.
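A possible further refinement, sketched under assumptions: the if/elif chain could be replaced by a single dictionary that maps each known word to its type. The names WORD_TYPES and scan_with_table are hypothetical. Unlike the scanner above, this sketch has no hard-coded 'error' list, so every word missing from the table is tagged 'unkown'.

```python
# Hypothetical lookup table; these names are not from the original code.
WORD_TYPES = {
    'north': 'direction', 'south': 'direction', 'east': 'direction',
    'go': 'verb', 'kill': 'verb', 'eat': 'verb',
    'the': 'stop', 'in': 'stop', 'of': 'stop',
    'bear': 'noun', 'princess': 'noun',
}


def scan_with_table(stuff):
    """Classify each word with one dictionary lookup per word."""
    sentence = []
    for word in stuff.split():
        try:
            # Anything int() accepts is recorded as a number.
            sentence.append(('number', int(word)))
        except ValueError:
            # Words missing from the table are tagged 'unkown',
            # keeping the spelling used by the scanner above.
            sentence.append((WORD_TYPES.get(word, 'unkown'), word))
    return sentence


print(scan_with_table("open 1234 door"))
# [('unkown', 'open'), ('number', 1234), ('unkown', 'door')]
```

Adding a new word then means adding one dictionary entry instead of editing a list and a branch.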
The project was generated from the skeleton directory as a new project named EX48.