For all of the author's short PyTorch tutorials, see: http://www.reibang.com/nb/48831659
PyTorch Tutorial 8: A Worked Example of Using TensorBoard
TensorBoard is a visualization tool that is very helpful for inspecting model training, data processing, and more. All of these visualizations could be produced in code with excellent plotting libraries such as matplotlib, but TensorBoard makes the job much simpler.
A short official tutorial: https://pytorch.org/tutorials/intermediate/tensorboard_tutorial.html
We start from the same simple example as before, the CIFAR10 classification task: load the data, build the model, and create the optimizer and loss function:
```python
import torch
import torchvision
import torchvision.transforms as transforms
import torch.nn as nn
import torch.nn.functional as F
import torch.optim as optim

transform = transforms.Compose([
    transforms.ToTensor(),
    transforms.Normalize((0.5, 0.5, 0.5), (0.5, 0.5, 0.5))
])

trainset = torchvision.datasets.CIFAR10(root='./data', train=True, download=True, transform=transform)
trainloader = torch.utils.data.DataLoader(trainset, batch_size=4, shuffle=True, num_workers=2)
testset = torchvision.datasets.CIFAR10(root='./data', train=False, download=True, transform=transform)
testloader = torch.utils.data.DataLoader(testset, batch_size=4, shuffle=False, num_workers=2)

classes = ('plane', 'car', 'bird', 'cat', 'deer', 'dog', 'frog', 'horse', 'ship', 'truck')

class Net(nn.Module):
    def __init__(self):
        super(Net, self).__init__()
        self.conv1 = nn.Conv2d(3, 6, 5)
        self.pool = nn.MaxPool2d(2, 2)
        self.conv2 = nn.Conv2d(6, 16, 5)
        self.fc1 = nn.Linear(16 * 5 * 5, 120)
        self.fc2 = nn.Linear(120, 84)
        self.fc3 = nn.Linear(84, 10)

    def forward(self, x):
        x = self.pool(F.relu(self.conv1(x)))
        x = self.pool(F.relu(self.conv2(x)))
        x = x.view(-1, 16 * 5 * 5)
        x = F.relu(self.fc1(x))
        x = F.relu(self.fc2(x))
        x = self.fc3(x)
        return x

net = Net()
loss_function = nn.CrossEntropyLoss()
optimizer = optim.SGD(net.parameters(), lr=0.001, momentum=0.9)
```
Connecting to TensorBoard
The tensorboard package lives under torch.utils. We first create a SummaryWriter instance to serve as our interface to TensorBoard:
```python
from torch.utils.tensorboard import SummaryWriter

summaryWriter = SummaryWriter("./runs/")
```
Here we set the runs folder as the location where the log records are stored.
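Not part of the original tutorial, but a common convention (an assumption here) is to give every training run its own timestamped subdirectory under runs, so that TensorBoard can overlay multiple runs for comparison. A minimal sketch:

```python
from datetime import datetime
from torch.utils.tensorboard import SummaryWriter

# one subfolder per run, e.g. ./runs/2021-01-30_12-00-00 (timestamp is illustrative)
run_name = datetime.now().strftime("%Y-%m-%d_%H-%M-%S")
writer = SummaryWriter("./runs/" + run_name)
print(writer.log_dir)  # the directory TensorBoard will read for this run
```

Each writer created this way appears as a separate run in the TensorBoard UI.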
If importing tensorboard raises the error No module named 'torch.utils.tensorboard', you need to install tensorboard manually; with pip, for example:

```shell
pip install tensorboard
```
Launching TensorBoard
TensorBoard is launched from the command line; its logdir argument is the folder where we store the logs:
```shell
tensorboard --logdir=runs

TensorFlow installation not found - running with reduced feature set.
Serving TensorBoard on localhost; to expose to the network, use a proxy or pass --bind_all
TensorBoard 2.4.0 at http://localhost:6006/ (Press CTRL+C to quit)
```
Once TensorBoard has started, open http://localhost:6006 or http://127.0.0.1:6006/ in a browser (6006 is its default port). At this point it will report that there is no data yet, because we have not written any log files to the runs folder.
Writing data to TensorBoard
Writing data to TensorBoard is simple: the methods of SummaryWriter do all the work. For example, here is how to display the first batch of images in TensorBoard:
```python
trainloader_iterator = iter(trainloader)
# note: next(iterator), not iterator.next(), which was removed in Python 3
images, labels = next(trainloader_iterator)

# create grid of images
img_grid = torchvision.utils.make_grid(images)

# write to tensorboard
summaryWriter.add_image("A Batch of Image Samples", img_grid)
```
After running this, open or refresh the TensorBoard page and the images appear. (The colors of these tensors look wrong because of the Normalize transform; to display them correctly the images would need to be un-normalized first. We skip that here since the goal is just to demonstrate TensorBoard.)
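The un-normalization mentioned above is a one-liner. Normalize((0.5, 0.5, 0.5), (0.5, 0.5, 0.5)) maps each pixel x to (x - 0.5) / 0.5, so the inverse is x * 0.5 + 0.5. A small sketch (the helper name unnormalize is my own):

```python
import torch

def unnormalize(img_tensor):
    # inverse of Normalize((0.5, 0.5, 0.5), (0.5, 0.5, 0.5)),
    # which mapped each value x to (x - 0.5) / 0.5
    return img_tensor * 0.5 + 0.5

x = torch.rand(3, 32, 32)            # a fake CIFAR10-sized image in [0, 1]
normalized = (x - 0.5) / 0.5         # what the transform pipeline produces
restored = unnormalize(normalized)
print(torch.allclose(restored, x))   # True
```

Passing unnormalize(img_grid) to add_image would then show the images in their natural colors.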
As shown, add_image writes a picture to TensorBoard. SummaryWriter provides many more methods, listed below; for detailed parameter descriptions and examples see:
https://pytorch.org/docs/stable/tensorboard.html#torch-utils-tensorboard
- `add_scalar(tag, scalar_value, global_step=None, walltime=None)`: add scalar data
- `add_scalars(main_tag, tag_scalar_dict, global_step=None, walltime=None)`: add several scalars at once
- `add_histogram(tag, values, global_step=None, bins='tensorflow', walltime=None, max_bins=None)`: add a histogram
- `add_image(tag, img_tensor, global_step=None, walltime=None, dataformats='CHW')`: add a single image
- `add_images(tag, img_tensor, global_step=None, walltime=None, dataformats='NCHW')`: add multiple images
- `add_figure(tag, figure, global_step=None, close=True, walltime=None)`: render a matplotlib figure and add it to TensorBoard
- `add_video(tag, vid_tensor, global_step=None, fps=4, walltime=None)`: add a video
- `add_audio(tag, snd_tensor, global_step=None, sample_rate=44100, walltime=None)`: add audio
- `add_text(tag, text_string, global_step=None, walltime=None)`: add text
- `add_graph(model, input_to_model=None, verbose=False)`: add a model graph
- `add_embedding(mat, metadata=None, label_img=None, global_step=None, tag='default', metadata_header=None)`: add an embedding projection; a nice example is mapping high-dimensional data into three-dimensional space for intuitive visualization
- `add_pr_curve(tag, labels, predictions, global_step=None, num_thresholds=127, weights=None, walltime=None)`: add a precision-recall curve
- `add_custom_scalars(layout)`: add a user-defined scalar layout
- `add_mesh(tag, vertices, colors=None, faces=None, config_dict=None, global_step=None, walltime=None)`: add a 3D mesh
- `add_hparams(hparam_dict, metric_dict, hparam_domain_discrete=None, run_name=None)`: add a set of tunable hyperparameters and their resulting metrics
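As a quick sketch of how a few of these methods feel in practice (the tag names and values below are made up for illustration, and a throwaway log directory is used instead of runs):

```python
import tempfile
import torch
from torch.utils.tensorboard import SummaryWriter

log_dir = tempfile.mkdtemp()              # throwaway log directory for this demo
writer = SummaryWriter(log_dir)

# add_scalar: one point per call, indexed by global_step
for step in range(10):
    writer.add_scalar("demo/loss", 1.0 / (step + 1), step)

# add_histogram: the distribution of a tensor at a given step
writer.add_histogram("demo/weights", torch.randn(1000), 0)

# add_text: free-form notes attached to the run
writer.add_text("demo/notes", "lr=0.001, momentum=0.9", 0)

writer.close()                            # flush the event file to disk
```

Every call appends events to the same event file, and TensorBoard groups them into tabs (Scalars, Histograms, Text) by method.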
Displaying the network model in TensorBoard
TensorBoard can visualize not only data but also models. The add_graph method writes a model: its first argument is the model itself, and its second is the data to feed to the model, in this case a batch of images:
```python
images, labels = next(trainloader_iterator)
summaryWriter.add_graph(net, images)
```
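The add_embedding method from the list above is also worth a small illustration. The sketch below uses random feature vectors and made-up labels, purely hypothetical; in practice the features could be flattened images or activations from an intermediate layer of the network:

```python
import torch
from torch.utils.tensorboard import SummaryWriter

writer = SummaryWriter("./runs/")

# 100 hypothetical 64-dimensional feature vectors with fake class labels
features = torch.randn(100, 64)
labels = ["class_{}".format(i % 10) for i in range(100)]
thumbnails = torch.rand(100, 3, 32, 32)   # optional per-point thumbnail images

writer.add_embedding(features, metadata=labels, label_img=thumbnails, tag="demo_embedding")
writer.close()
```

TensorBoard's Projector tab then lets you rotate and explore these points in 3D, colored by label.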
Logging the training process with TensorBoard
我們?cè)谥暗睦又姓故具^(guò)卦方,對(duì)于訓(xùn)練過(guò)程羊瘩,我們把訓(xùn)練的loss
打印到了控制臺(tái)上,當(dāng)然如果為了能夠更加直觀的展示loss
的變化過(guò)程,我們可以使用一個(gè)list保存這些loss
尘吗,等待訓(xùn)練完成后使用matplotlib
等工具對(duì)其進(jìn)行可視化逝她。TensorBoard提供了更簡(jiǎn)單的方式,我們可以直接將loss
寫(xiě)到TensorBoard中睬捶,這樣更加的簡(jiǎn)單黔宛,只要對(duì)之前訓(xùn)練的代碼做小小的修改即可:
```python
epochs = 2

# running_loss accumulates the losses between logging steps
running_loss = 0.0
for epoch in range(epochs):
    for i, data in enumerate(trainloader, 0):
        # get input images and their labels
        inputs, labels = data

        # set optimizer buffer to 0
        optimizer.zero_grad()

        # forwarding
        outputs = net(inputs)

        # computing loss
        loss = loss_function(outputs, labels)

        # loss backward
        loss.backward()

        # update parameters using optimizer
        optimizer.step()

        running_loss += loss.item()
        # every 1000 mini-batches, print and log the average loss
        # (i % 1000 == 999 rather than == 0, so the first log covers a full window)
        if i % 1000 == 999:
            print("epoch {} - iteration {}: average loss {:.3f}".format(epoch + 1, i + 1, running_loss / 1000))
            summaryWriter.add_scalar("training_loss", running_loss / 1000, epoch * len(trainloader) + i)
            running_loss = 0.0

print("Training Finished!")
```
The evolving loss curve then shows up in TensorBoard, where you can adjust further settings such as the horizontal-axis format and the smoothing applied to the curve.
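One detail worth adding (not covered above): SummaryWriter buffers events in memory, so when logging is finished it is good practice to flush or close the writer; otherwise the tail of the loss curve may not appear until the process exits. A minimal sketch with a throwaway log directory:

```python
import tempfile
from torch.utils.tensorboard import SummaryWriter

writer = SummaryWriter(tempfile.mkdtemp())
writer.add_scalar("final/metric", 0.87, 0)  # a hypothetical final value

writer.flush()   # push buffered events to disk so TensorBoard can read them now
writer.close()   # release the event file; call once, when logging is done
```

Calling flush() periodically during a long run also works if you want to watch the curves live without waiting for close().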