Part 1: Flair-1
Model training code
The pipeline has two ingredients: one fetches the training corpus (the ore), the other builds the Embedding + SequenceTagger (the mining rig); with the help of a trainer object the two then get to work (mining).
PS: downsample() is a debugging lifesaver.
The next part took me forever to find: for text classification, use make_label_dictionary() instead. I'm still such a noob.
from flair.data import Corpus
from flair.datasets import WNUT_17
from flair.embeddings import TokenEmbeddings, FlairEmbeddings, StackedEmbeddings
from typing import List
# fetch the training corpus
corpus = WNUT_17()
down_sample = corpus.downsample(0.1)  # keep 10%: handy for quick debugging
print(down_sample)
# build the tag dictionary -> used by the SequenceTagger
tag_type = 'ner'
tag_dictionary = corpus.make_tag_dictionary(tag_type=tag_type)
# for text classification, use make_label_dictionary() instead
print(tag_dictionary)
# initialize the embeddings -> used by the SequenceTagger
flair_embedding_forward = FlairEmbeddings('news-forward')
embeddings: StackedEmbeddings = StackedEmbeddings(embeddings=[
    flair_embedding_forward,
])
# SequenceTagger
from flair.models import SequenceTagger
tagger: SequenceTagger = SequenceTagger(hidden_size=256,
                                        embeddings=embeddings,
                                        tag_dictionary=tag_dictionary,
                                        tag_type=tag_type,
                                        use_crf=True)
# training: set the hyperparameters
from flair.trainers import ModelTrainer
trainer: ModelTrainer = ModelTrainer(tagger, corpus)
trainer.train('resources/taggers/example-ner',
              learning_rate=0.1,
              mini_batch_size=32,
              max_epochs=150)
# plot training curves and weights
from flair.visual.training_curves import Plotter
plotter = Plotter()
plotter.plot_weights('resources/taggers/example-ner/weights.txt')
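The same Plotter can also chart the loss curve from the loss.tsv that training writes out; a small sketch using the Plotter API of this Flair version:
# plot the training/dev curves recorded in loss.tsv during training
plotter.plot_training_curves('resources/taggers/example-ner/loss.tsv')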
Console output
2020-05-11 14:32:51,537 ----------------------------------------------------------------------------------------------------
2020-05-11 14:32:51,537 Corpus: "Corpus: 3394 train + 1009 dev + 1287 test sentences"
2020-05-11 14:32:51,538 ----------------------------------------------------------------------------------------------------
2020-05-11 14:32:51,538 Parameters:
2020-05-11 14:32:51,538 - learning_rate: "0.1"
2020-05-11 14:32:51,538 - mini_batch_size: "32"
2020-05-11 14:32:51,538 - patience: "3"
2020-05-11 14:32:51,538 - anneal_factor: "0.5"
2020-05-11 14:32:51,538 - max_epochs: "150"
2020-05-11 14:32:51,538 - shuffle: "True"
2020-05-11 14:32:51,538 - train_with_dev: "False"
2020-05-11 14:32:51,539 - batch_growth_annealing: "False"
2020-05-11 14:32:51,539 ----------------------------------------------------------------------------------------------------
2020-05-11 14:32:51,539 Model training base path: "resources/taggers/example-ner"
2020-05-11 14:32:51,539 ----------------------------------------------------------------------------------------------------
2020-05-11 14:32:51,539 Device: cpu
2020-05-11 14:32:51,539 ----------------------------------------------------------------------------------------------------
2020-05-11 14:32:51,539 Embeddings storage mode: cpu
2020-05-11 14:32:51,540 ----------------------------------------------------------------------------------------------------
2020-05-11 14:33:06,177 epoch 1 - iter 10/107 - loss 27.19113178 - samples/sec: 21.87
Interrupting the code above still triggers the automatic test-set evaluation, which is neat.
Open question: I haven't figured out how to get this onto the GPU yet.
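For what it's worth, a hedged sketch: Flair selects its device through the module-level flair.device, which defaults to CUDA when available, so overriding it before building the trainer should move everything to the GPU. Treat this as an assumption to verify against your Flair version:
import torch
import flair
# assumption: Flair reads this module-level setting for all models/tensors
flair.device = torch.device('cuda:0')  # requires a CUDA-capable GPU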
Parameters of the train() function:
Hyperparameter | Options/default | Purpose |
---|---|---|
embeddings_storage_mode | ['cpu', 'gpu', 'none'] | keep embeddings in CPU RAM, on the GPU, or recompute them on demand |
learning_rate | 0.1 | learning rate |
mini_batch_size | 32 | batch size |
patience | 3 | epochs without dev improvement before the learning rate anneals |
anneal_factor | 0.5 | factor the learning rate is multiplied by when it anneals |
max_epochs | 150 | maximum number of epochs |
shuffle | True | shuffle the training data each epoch |
train_with_dev | False | also train on the dev split; save it for the final run |
batch_growth_annealing | False | grow the mini-batch size whenever the learning rate anneals |
tag_type | ['ner', 'upos', 'pos', ''] | not set for text classification |
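Putting a few of these together in one call (the values are just the defaults from the table above), as a usage sketch:
# pass the table's hyperparameters explicitly to trainer.train()
trainer.train('resources/taggers/example-ner',
              learning_rate=0.1,
              mini_batch_size=32,
              patience=3,
              anneal_factor=0.5,
              max_epochs=150,
              shuffle=True,
              train_with_dev=False,
              embeddings_storage_mode='cpu')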
Creating a checkpoint and resuming training from it
trainer: ModelTrainer = ModelTrainer(tagger, corpus)
# 7. start training
trainer.train('resources/taggers/example-ner',
              learning_rate=0.1,
              mini_batch_size=32,
              max_epochs=150,
              checkpoint=True)
# 8. stop training at any point
# 9. continue trainer at later point
from pathlib import Path
checkpoint = 'resources/taggers/example-ner/checkpoint.pt'
trainer = ModelTrainer.load_checkpoint(checkpoint, corpus)
trainer.train('resources/taggers/example-ner',
              learning_rate=0.1,
              mini_batch_size=32,
              max_epochs=150,
              checkpoint=True)
Model testing code
from flair.data import Sentence
from flair.models import SequenceTagger
# load the model you trained
model = SequenceTagger.load('resources/taggers/example-ner/final-model.pt')
# create example sentence
sentence = Sentence('I love Berlin')
# predict tags and print
model.predict(sentence)
print(sentence.to_tagged_string())
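To get the recognized entities as structured spans instead of one tagged string, a small follow-up sketch using sentence.get_spans():
# iterate over the predicted entity spans with their labels and scores
for entity in sentence.get_spans('ner'):
    print(entity)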
Output files
File | Purpose |
---|---|
best-model.pt | best model on the dev set |
final-model.pt | model at the point training stopped |
loss.tsv | loss-curve data |
test.tsv | predictions on the test set |
training.log | training log |
weights.png | all weights, visualized |
weights.txt | all weights |
The hyperparameter alchemy furnace
from hyperopt import hp
from flair.embeddings import WordEmbeddings, FlairEmbeddings
from flair.hyperparameter.param_selection import SearchSpace, Parameter
# define your search space
search_space = SearchSpace()
search_space.add(Parameter.EMBEDDINGS, hp.choice, options=[
[ WordEmbeddings('en') ],
[ FlairEmbeddings('news-forward'), FlairEmbeddings('news-backward') ]
])
search_space.add(Parameter.HIDDEN_SIZE, hp.choice, options=[32, 64, 128])
search_space.add(Parameter.RNN_LAYERS, hp.choice, options=[1, 2])
search_space.add(Parameter.DROPOUT, hp.uniform, low=0.0, high=0.5)
search_space.add(Parameter.LEARNING_RATE, hp.choice, options=[0.05, 0.1, 0.15, 0.2])
search_space.add(Parameter.MINI_BATCH_SIZE, hp.choice, options=[8, 16, 32])
PS, attention: you should always add your embeddings to the search space (as shown above). If you don't want to test different kinds of embeddings, simply pass a single embedding option to the search space; it will then be used in every test run.
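A search space on its own does nothing; in the Flair API of this era it is handed to a parameter selector that runs the trials. A minimal sketch, assuming SequenceTaggerParamSelector and its argument order from Flair's tutorial 8 of the time (verify against your version):
from flair.hyperparameter.param_selection import SequenceTaggerParamSelector
# create the parameter selector for the NER corpus from above
param_selector = SequenceTaggerParamSelector(corpus,
                                             'ner',
                                             'resources/results',
                                             max_epochs=50,
                                             training_runs=3)
# run the search: each evaluation trains and scores one configuration
param_selector.optimize(search_space, max_evals=100)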
Learning rate tuning
Why pull the learning rate out on its own? Because it is the most important and most changeable hyperparameter. As the Flair docs put it: "The learning rate is one of the most important hyper parameters and it fundamentally depends on the topology of the loss landscape via the architecture of your model and the training data it consumes." They recommend the approach from "Cyclical Learning Rates for Training Neural Networks": start with a very small learning rate, raise it exponentially every batch, and after a point let it fall again; the main goal is to find a good starting learning rate.
1. Cyclical Learning Rates
learning_rate_tsv = trainer.find_learning_rate('resources/taggers/example-ner',
                                               'learning_rate.tsv')
# afterwards you can also plot the curve (plotter from the Plotter above)
plotter.plot_learning_rate(learning_rate_tsv)
2. Use Adam or another optimizer
from torch.optim.adam import Adam
trainer = ModelTrainer(tagger, corpus,
                       optimizer=Adam)
trainer.train(
    "resources/taggers/example",
    weight_decay=1e-4
)
Tutorial roadmap
- Tutorial 1: Basics
- Tutorial 2: Tagging your Text
- Tutorial 3: Embedding Words
- Tutorial 4: List of All Word Embeddings
- Tutorial 5: Embedding Documents
- Tutorial 6: Loading your own Corpus
- Tutorial 7: Training your own Models
- Tutorial 8: Optimizing your Models
- Tutorial 9: Training your own Flair Embeddings
PS
Training your own language model (embeddings)
Skipping for now; not much use to me.