Build a BERT-Based NER Model in Five Minutes

A Brief Introduction to BERT

BERT is a pre-trained language model that Google released in 2018. It broke records on a wide range of NLP tasks, and its arrival was a milestone for the field. A pre-trained language model acquires a great deal of syntactic and semantic knowledge through unsupervised learning, which makes subsequent downstream NLP tasks much easier. When BERT tackles a supervised downstream task, it is like a student who walks into class fully prepared: the results come at half the effort. Earlier word-embedding techniques such as word2vec and GloVe also gave models some basic linguistic knowledge through unsupervised training, but neither the capacity of those pre-trained models (think of it as learning ability) nor the difficulty of their unsupervised training objectives comes close to BERT's.

The Model

First, BERT uses a 12-layer or 24-layer bidirectional Transformer encoder as its feature extractor, as shown in the figure below. Bear in mind that in NLP, feature-extraction power roughly ranks Transformer > RNN > CNN. (Readers unfamiliar with the Transformer can refer to my earlier article on it.) Stacking a full 12 of these layers pushed NLP a real step toward genuinely deep models.

[Figure: bert base architecture]
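
For concreteness, here are the key hyper-parameters of the two released model sizes, taken from the BERT paper (the dict names below are just for illustration):

BERT_BASE  = {"layers": 12, "hidden_size": 768,  "attention_heads": 12}   # ~110M parameters
BERT_LARGE = {"layers": 24, "hidden_size": 1024, "attention_heads": 16}   # ~340M parameters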

Pre-training Tasks

To give the model a solid grasp of natural language, BERT is pre-trained on the following two tasks:
1. Masked word prediction, as shown in the figure below:
15% of the input tokens are randomly masked, the corrupted text is fed to the model, and the model must predict what the masked words were. In effect this is a cloze test in which every word in the vocabulary is a candidate answer; think of how hard exam cloze questions can be even with only four options. This task teaches the model a great deal of syntax and even semantics. (A minimal sketch of the masking step follows the figure.)

[Figure: mask word prediction]
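
To make the masking step concrete, here is a minimal sketch. It is illustrative only: real BERT additionally leaves 10% of the selected tokens unchanged and replaces another 10% with random words, details this sketch omits.

import random

def mask_tokens(tokens, mask_rate=0.15, mask_token="[MASK]"):
    # Illustrative sketch of BERT-style masking (simplified; see lead-in).
    masked, labels = [], []
    for tok in tokens:
        if random.random() < mask_rate:
            masked.append(mask_token)
            labels.append(tok)      # the model must recover this token
        else:
            masked.append(tok)
            labels.append(None)     # nothing to predict at this position
    return masked, labels

print(mask_tokens(list("我愛荊州")))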

2. Next sentence prediction
As the figure below shows, the model receives two sentences A and B and must judge whether B is the sentence that follows A. The goal is to teach the model relationships between sentences, further strengthening its understanding of natural language. (A sketch of how such training pairs can be built follows the figure.)
[Figure: next sentence prediction]
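
A minimal sketch of building such pairs; per the BERT paper, a true next sentence is sampled 50% of the time (in the real setup the negative sentence comes from a different document, which this sketch simplifies):

import random

def make_nsp_pairs(sentences):
    # Returns (A, B, is_next) triples for next-sentence-prediction training.
    pairs = []
    for i in range(len(sentences) - 1):
        if random.random() < 0.5:
            pairs.append((sentences[i], sentences[i + 1], True))            # true next sentence
        else:
            pairs.append((sentences[i], random.choice(sentences), False))   # random sentence
    return pairs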

These two demanding pre-training tasks give the model a fairly deep understanding of natural language before it ever sees a downstream task, and that knowledge is enormously helpful there. Of course, pre-training a model to this level takes a huge corpus, massive compute, and a great deal of time. But it only has to be done once and can then be reused indefinitely, and that kind of train-once, benefit-forever work is well worth doing.

Hands-On NER with BERT

Let me first introduce the kashgari framework (GitHub link here). Its author wanted to make the fancier techniques in NLP easy to call, so that anyone can run experiments quickly. kashgari wraps a BERT embedding model, an LSTM-CRF entity-recognition model, and several classic text-classification networks. With it, I completed BERT-based NER on my own dataset in five minutes.

Loading the Data
with open("train_data","rb") as f:
     data = f.read().decode("utf-8")
train_data = data.split("\n\n")
train_data = [token.split("\n") for token in train_data]
train_data = [[j.split() for j in i ] for i in train_data]
train_data.pop()
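
For reference, the parsing above assumes train_data is in the usual character-per-line format: each line holds one "char label" pair and a blank line separates sentences, e.g. (the second sentence below is illustrative):

我 O
愛 O
荊 B_LOC
州 I_LOC

北 B_LOC
京 I_LOC
很 O
大 O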
Preprocessing
train_x = [[token[0] for token in sen] for sen in train_data]   # the characters
train_y = [[token[1] for token in sen] for sen in train_data]   # their labels

Here train_x and train_y are both nested lists:
train_x: [[char_seq1], [char_seq2], [char_seq3], ...]
train_y: [[label_seq1], [label_seq2], [label_seq3], ...]
where char_seq1 = ["我", "愛", "荊", "州"]
and the corresponding label_seq1 = ["O", "O", "B_LOC", "I_LOC"].
All the preprocessing you need is one label per character; convenient, isn't it? kashgari already wraps the modules that convert text and labels into integer IDs and vectors, so you no longer have to worry about numericalization yourself. One point worth stressing: Google's open-source Chinese pre-trained BERT uses character-level input, so this preprocessing step must split the text into individual characters.
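
A quick sanity check on the parsed data never hurts:

print(train_x[0])   # e.g. ['我', '愛', '荊', '州', ...]
print(train_y[0])   # e.g. ['O', 'O', 'B_LOC', 'I_LOC', ...]
assert all(len(x) == len(y) for x, y in zip(train_x, train_y))   # one label per character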

Loading BERT

The following three lines are all it takes to load the BERT model.

from kashgari.embeddings import BERTEmbedding
from kashgari.tasks.seq_labeling import BLSTMCRFModel
embedding = BERTEmbedding("bert-base-chinese", 200)  # 200 is the max sequence length

When this runs, the code automatically downloads the pre-trained weights from where the BERT models are hosted. Google has already pre-trained BERT for us, so all we need to do is apply it to our downstream task.


[Figure: downloading the pre-trained Chinese BERT model]
Building and Training the Model

Use kashgari's wrapped LSTM+CRF model, feed it the data, set the batch size, and training is underway. Doesn't the whole process feel like less than five minutes? (Unless your connection is slow, in which case downloading the pre-trained BERT model alone may take longer than that.)

model = BLSTMCRFModel(embedding)
model.fit(train_x, train_y, epochs=1, batch_size=100)
__________________________________________________________________________________________________
Layer (type)                    Output Shape         Param #     Connected to                     
==================================================================================================
Input-Token (InputLayer)        (None, 200)          0                                            
__________________________________________________________________________________________________
Input-Segment (InputLayer)      (None, 200)          0                                            
__________________________________________________________________________________________________
Embedding-Token (TokenEmbedding [(None, 200, 768), ( 16226304    Input-Token[0][0]                
__________________________________________________________________________________________________
Embedding-Segment (Embedding)   (None, 200, 768)     1536        Input-Segment[0][0]              
__________________________________________________________________________________________________
Embedding-Token-Segment (Add)   (None, 200, 768)     0           Embedding-Token[0][0]            
                                                                 Embedding-Segment[0][0]          
__________________________________________________________________________________________________
Embedding-Position (PositionEmb (None, 200, 768)     153600      Embedding-Token-Segment[0][0]    
__________________________________________________________________________________________________
Embedding-Dropout (Dropout)     (None, 200, 768)     0           Embedding-Position[0][0]         
__________________________________________________________________________________________________
Embedding-Norm (LayerNormalizat (None, 200, 768)     1536        Embedding-Dropout[0][0]          
__________________________________________________________________________________________________
Encoder-1-MultiHeadSelfAttentio (None, 200, 768)     2362368     Embedding-Norm[0][0]             
__________________________________________________________________________________________________
Encoder-1-MultiHeadSelfAttentio (None, 200, 768)     0           Encoder-1-MultiHeadSelfAttention[
__________________________________________________________________________________________________
Encoder-1-MultiHeadSelfAttentio (None, 200, 768)     0           Embedding-Norm[0][0]             
                                                                 Encoder-1-MultiHeadSelfAttention-
__________________________________________________________________________________________________
Encoder-1-MultiHeadSelfAttentio (None, 200, 768)     1536        Encoder-1-MultiHeadSelfAttention-
__________________________________________________________________________________________________
Encoder-1-FeedForward (FeedForw (None, 200, 768)     4722432     Encoder-1-MultiHeadSelfAttention-
__________________________________________________________________________________________________
Encoder-1-FeedForward-Dropout ( (None, 200, 768)     0           Encoder-1-FeedForward[0][0]      
__________________________________________________________________________________________________
Encoder-1-FeedForward-Add (Add) (None, 200, 768)     0           Encoder-1-MultiHeadSelfAttention-
                                                                 Encoder-1-FeedForward-Dropout[0][
__________________________________________________________________________________________________
Encoder-1-FeedForward-Norm (Lay (None, 200, 768)     1536        Encoder-1-FeedForward-Add[0][0]  
__________________________________________________________________________________________________
        ... (Encoder-2 through Encoder-12 repeat the same block structure as Encoder-1) ...
__________________________________________________________________________________________________
non_masking_layer_4 (NonMasking (None, 200, 768)     0           Encoder-12-FeedForward-Norm[0][0]
__________________________________________________________________________________________________
bidirectional_3 (Bidirectional) (None, 200, 512)     2099200     non_masking_layer_4[0][0]        
__________________________________________________________________________________________________
dense_3 (Dense)                 (None, 200, 128)     65664       bidirectional_3[0][0]            
__________________________________________________________________________________________________
crf_3 (CRF)                     (None, 200, 10)      1410        dense_3[0][0]                    
==================================================================================================
Total params: 103,603,714
Trainable params: 2,166,274
Non-trainable params: 101,437,440
__________________________________________________________________________________________________
Epoch 1/1
506/506 [==============================] - 960s 2s/step - loss: 0.0377 - crf_accuracy: 0.9892 - acc: 0.7759

The model summary clearly shows BERT's 12-layer Transformer structure and its parameter counts. The author of kashgari has run experiments comparing BERT-based NER against other NER approaches and found that BERT does indeed beat them. Pre-trained language models really do display astonishing power.
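
Once training finishes, tagging new text is a single call. Here is a minimal usage sketch, assuming the predict/save interface of the kashgari version used above (the predicted labels in the comment and the save path are illustrative):

new_text = list("我愛荊州")
print(model.predict([new_text]))   # e.g. [['O', 'O', 'B_LOC', 'I_LOC']]
model.save("./bert_ner_model")     # illustrative path; persists the model for later use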

Conclusion

BERT is to NLP what ImageNet is to computer vision: demanding pre-training tasks and a powerful network let the model learn domain knowledge in advance before it tackles downstream tasks. Compared with throwing a naive model straight at a deep-learning task, BERT is like the kid in class who starts ahead of the starting line, bound to finish well ahead of everyone else. Feel BERT's power yet? Go give it a try.

Note

The author of kashgari has since rewritten the entire framework on TensorFlow 2.0, so some of the interfaces have changed and the code above will break. To run BERT-based NER with the current version, please head to https://github.com/BrikerMan/Kashgari

References

https://jalammar.github.io/illustrated-bert/
https://eliyar.biz/nlp_chinese_bert_ner/
https://github.com/BrikerMan/Kashgari
