1. Elements are removed from the list while the loop is still iterating over it, so the list gets shorter and the index eventually goes out of range.
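import nltk

english_punctuations = [',', '.', ':', ';', '?', '(', ')', '[', ']', '&', '!', '*', '@', '#', '$', '%', '...']
sentence = '''@justinbieber Thank u Justin for this amazing'''
line = nltk.word_tokenize(sentence)
# range(len(line)) is evaluated once, using the ORIGINAL length of the list
for i in range(len(line)):
    if line[i] == '@':
        line[i+1] = 'name'
    if line[i] == '%':
        line[i] = 'percentage'
    if line[i].isdigit():
        line[i] = 'number'
    if line[i] in english_punctuations:
        line.pop(i)  # the list shrinks here, but i still runs up to the old length
    if len(line[i]) < 2:
        line.pop(i)  # same problem: every later element shifts left by one
    # once enough elements have been popped, line[i] raises IndexError
    line[i] = line[i].lower()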
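
One common way to avoid this is to leave the original list untouched and build a new one instead. The following is only a minimal sketch of that idea, not the only possible fix. The function name `normalize_tokens` is made up for illustration, and the hard-coded `tokens` list is an assumption about what `nltk.word_tokenize` returns for the sentence above, used so the example runs without NLTK installed.

```python
# Same punctuation list as above, stored as a set for fast membership tests.
ENGLISH_PUNCTUATIONS = {',', '.', ':', ';', '?', '(', ')', '[', ']',
                        '&', '!', '*', '@', '#', '$', '%', '...'}


def normalize_tokens(tokens):
    """Return a new, cleaned token list; the input list is never mutated.

    Hypothetical helper written for this example.
    """
    cleaned = []
    prev = None  # the raw token seen just before the current one
    for tok in tokens:
        if prev == '@':
            out = 'name'          # the token right after '@' is a user name
        elif tok == '%':
            out = 'percentage'
        elif tok.isdigit():
            out = 'number'
        else:
            out = tok
        prev = tok
        # Skip punctuation and one-character tokens instead of popping them.
        if out in ENGLISH_PUNCTUATIONS or len(out) < 2:
            continue
        cleaned.append(out.lower())
    return cleaned


# Assumed tokenizer output for the example sentence (not produced by NLTK here).
tokens = ['@', 'justinbieber', 'Thank', 'u', 'Justin', 'for', 'this', 'amazing']
result = normalize_tokens(tokens)
print(result)
```

Because nothing is removed from the list being iterated, no index can go stale and no element is skipped after a deletion.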