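1 What DataLoader does
Simply put, DataLoader is a data loader: it combines a dataset with a sampler and can use multiple worker processes to load the data. During model training it splits the training data into mini-batches and yields one batch per iteration until the whole dataset has been consumed. In other words, it prepares the data for the training loop.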
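To make the "dataset plus sampler, one batch at a time" idea concrete, here is a minimal pure-Python sketch. It is not how PyTorch implements DataLoader; the function name iterate_batches and the toy dataset are hypothetical and exist only for illustration.

```python
import random


def iterate_batches(dataset, batch_size, shuffle=False, seed=None):
    """Yield lists of samples, batch_size at a time, until the data runs out."""
    # "Sampler" step: decide the order in which indices are visited.
    indices = list(range(len(dataset)))
    if shuffle:
        random.Random(seed).shuffle(indices)
    # "Batching" step: group the indices and look each one up in the dataset.
    for start in range(0, len(indices), batch_size):
        yield [dataset[i] for i in indices[start:start + batch_size]]


# Toy dataset of (sample, label) pairs; the values are made up for illustration.
toy_dataset = [(f"img{i}", i % 2) for i in range(10)]
batches = list(iterate_batches(toy_dataset, batch_size=4))
print([len(b) for b in batches])  # [4, 4, 2]
```

The real DataLoader additionally collates each batch into tensors and can hand the lookups to worker processes, which this sketch leaves out.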
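In practice, data loading is often the performance bottleneck of training, especially when the model is simple or the compute hardware is fast. A convenient feature of PyTorch's DataLoader is that it can load data with multiple processes; the num_workers argument sets how many worker processes are used.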
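One way to see whether loading is the bottleneck is to time a full pass over the loader with different num_workers values. The sketch below assumes a hypothetical SlowDataset that fakes slow reads with time.sleep; the class, the helper time_one_pass, and the delay value are illustrative only, and the actual timings depend on the machine.

```python
import time

import torch
from torch.utils.data import DataLoader, Dataset


class SlowDataset(Dataset):
    """Hypothetical dataset that simulates slow per-sample reading."""

    def __init__(self, n=64, delay=0.01):
        self.n = n
        self.delay = delay

    def __len__(self):
        return self.n

    def __getitem__(self, idx):
        time.sleep(self.delay)  # stand-in for disk or network latency
        return torch.tensor([float(idx)]), idx % 2


def time_one_pass(num_workers):
    """Iterate once over the dataset; return (number of batches, seconds)."""
    loader = DataLoader(SlowDataset(), batch_size=8, num_workers=num_workers)
    start = time.perf_counter()
    n_batches = sum(1 for _ in loader)
    return n_batches, time.perf_counter() - start


if __name__ == "__main__":
    # The guard matters when worker processes are started with "spawn" (e.g. on Windows).
    for workers in (0, 2):
        n_batches, seconds = time_one_pass(workers)
        print(f"num_workers={workers}: {n_batches} batches in {seconds:.2f}s")
```

Starting worker processes has its own cost, so for small or fast datasets num_workers=0 may still be the quicker option.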
2 Using DataLoader
from torch.utils.data import DataLoader
test_loader = DataLoader(dataset=test_data,batch_size=64,shuffle=True,num_workers=0,drop_last=False)
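Parameters:
dataset: the dataset to load from; each item usually returns an img and a label
batch_size: how many samples from dataset are packed into each batch
shuffle: whether to shuffle the data (when True, the order is reshuffled at every epoch)
num_workers: how many subprocesses to use for loading; the default is 0, meaning the data is loaded in the main process
*Note: num_workers can occasionally cause problems on Windows; if you hit a worker-related error, consider setting num_workers to 0
drop_last: whether to drop the last batch when it is smaller than batch_size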
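The effect of batch_size and drop_last is easy to check on a tiny dataset. The sketch below uses a made-up 10-sample TensorDataset (toy_data and batch_sizes are illustrative names, not part of the original example) and reports the size of every batch.

```python
import torch
from torch.utils.data import DataLoader, TensorDataset

# 10 toy samples: a (10, 1) feature tensor and a (10,) label tensor.
features = torch.arange(10, dtype=torch.float32).unsqueeze(1)
labels = torch.arange(10)
toy_data = TensorDataset(features, labels)


def batch_sizes(drop_last):
    """Return the number of samples in each batch for the given drop_last."""
    loader = DataLoader(toy_data, batch_size=4, shuffle=False,
                        num_workers=0, drop_last=drop_last)
    return [len(y) for _, y in loader]


print(batch_sizes(drop_last=False))  # [4, 4, 2]
print(batch_sizes(drop_last=True))   # [4, 4]
```

With 10 samples and batch_size=4 the last batch holds only 2 samples, and drop_last=True discards it.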
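import torchvision
from torch.utils.data import DataLoader
from torch.utils.tensorboard import SummaryWriter

# CIFAR10 test split; pass download=True on the first run if ../dataset is empty
test_data = torchvision.datasets.CIFAR10("../dataset", train=False, transform=torchvision.transforms.ToTensor())
test_loader = DataLoader(dataset=test_data, batch_size=64, shuffle=True, num_workers=0, drop_last=False)

# A single sample is an (image tensor, class index) pair
img, target = test_data[0]

writer = SummaryWriter("../logs/P11_logs")
step = 0
for data in test_loader:
    # imgs has shape (batch, 3, 32, 32); targets has shape (batch,)
    imgs, targets = data
    writer.add_images("test_data", imgs, step)
    step = step + 1
writer.close()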
Result:
If we loop over several epochs and set shuffle to False and drop_last to True:
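import torchvision
from torch.utils.data import DataLoader
from torch.utils.tensorboard import SummaryWriter

test_data = torchvision.datasets.CIFAR10("../dataset", train=False, transform=torchvision.transforms.ToTensor())
test_loader = DataLoader(dataset=test_data, batch_size=64, shuffle=False, num_workers=0, drop_last=True)
img, target = test_data[0]

writer = SummaryWriter("../logs/P11_logs")
for epoch in range(2):
    step = 0  # restart the step counter for each epoch
    for data in test_loader:
        imgs, targets = data
        # One tag per epoch so the two passes can be compared in TensorBoard
        writer.add_images("epoch:{}".format(epoch), imgs, step)
        step = step + 1
writer.close()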
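The difference between the two shuffle settings can also be checked without TensorBoard by recording the order in which samples come out in each epoch. This is a small sketch on a made-up 8-sample dataset; toy_data and epoch_orders are illustrative names.

```python
import torch
from torch.utils.data import DataLoader, TensorDataset

# Each sample is just its own index, so the batch contents reveal the order.
toy_data = TensorDataset(torch.arange(8))


def epoch_orders(shuffle, epochs=2):
    """Return, for each epoch, the list of sample values in the order they were yielded."""
    loader = DataLoader(toy_data, batch_size=4, shuffle=shuffle, num_workers=0)
    orders = []
    for _ in range(epochs):
        order = []
        for (batch,) in loader:
            order.extend(batch.tolist())
        orders.append(order)
    return orders


print(epoch_orders(shuffle=False))  # same order 0..7 in both epochs
print(epoch_orders(shuffle=True))   # a new random permutation for each epoch
```

With shuffle=False every epoch sees the images in the same order, so the two epochs look identical in TensorBoard; with shuffle=True the loader draws a fresh permutation each time it is iterated.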
Result after setting shuffle to True:
References:
1.https://www.bilibili.com/video/BV1hE411t7RN?p=15&spm_id_from=pageDriver
2.https://blog.csdn.net/csdn_of_ding/article/details/109138049
3.https://zhuanlan.zhihu.com/p/234825890