DCGAN Tutorial: learning PyTorch by building a project
@(PyTorch)[GANs]
[TOC]
A hands-on walkthrough of writing GANs in PyTorch.
## Loading data with `torch.utils.data.DataLoader`
`dset.ImageFolder` handles loading and preprocessing the data in one place. Its first argument, `root`, is the directory containing the raw image files; its second argument, `transform`, takes a `transforms.Compose` object that specifies how each image is processed. `Compose` accepts a list in which each entry is one processing step:

- `transforms.Resize(image_size)` accepts either a tuple `(a, b)`, resizing the image to `(a, b)`, or a single integer `a`, which resizes the shorter side to `a` while keeping the aspect ratio.
- `transforms.CenterCrop(image_size)` center-crops the image to `(a, b)` for a tuple, or to `(a, a)` for a single value; like `Resize` it accepts either a tuple or a number.
- `transforms.ToTensor()` converts the image into a tensor.
- `transforms.Normalize((0.5, 0.5, 0.5), (0.5, 0.5, 0.5))` normalizes the image: the first tuple is the per-channel (R, G, B) mean and the second the per-channel standard deviation, applied as `normalized_image = (image - mean) / std`.

The dataset is then handed to `torch.utils.data.DataLoader`, which loads the data and lets you set the `batch_size`, whether to `shuffle`, and how many worker processes to load with (`num_workers`). `iter` turns an object (such as a list, or the dataloader) into an iterator, and `next` fetches the iterator's next element, so `real_batch` is one fully preprocessed batch of real images.
```python
dataset = dset.ImageFolder(root=dataroot,
                           transform=transforms.Compose([
                               transforms.Resize(image_size),
                               transforms.CenterCrop(image_size),
                               transforms.ToTensor(),
                               transforms.Normalize((0.5, 0.5, 0.5), (0.5, 0.5, 0.5)),
                           ]))

# Create the dataloader
dataloader = torch.utils.data.DataLoader(dataset, batch_size=batch_size,
                                         shuffle=True, num_workers=workers)

real_batch = next(iter(dataloader))
```
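To verify the pipeline, a batch can be displayed as an image grid, much as the tutorial does; a minimal sketch, assuming `matplotlib` and `torchvision.utils` are imported (`normalize=True` rescales the normalized images back into [0, 1] for display):

```python
import matplotlib.pyplot as plt
import numpy as np
import torchvision.utils as vutils

# Arrange the first 64 images of the batch into a grid and show it.
grid = vutils.make_grid(real_batch[0][:64], padding=2, normalize=True)
plt.figure(figsize=(8, 8))
plt.axis("off")
plt.title("Training Images")
plt.imshow(np.transpose(grid.numpy(), (1, 2, 0)))  # CHW -> HWC for matplotlib
plt.show()
```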
## Initializing the network weights
Initialization matters for GANs: the weights must not all start at zero; the parameters should be initialized to small values near zero. The `weights_init` function takes a freshly constructed model and re-initializes all of its layers: convolution weights are drawn from a Gaussian with mean 0 and standard deviation 0.02, and BatchNorm weights from a Gaussian with mean 1.0 and standard deviation 0.02. `nn.init.normal_(a, b, c)` initializes tensor `a` from a Gaussian with mean `b` and standard deviation `c`; `m.__class__.__name__` returns `m`'s class name (the name used when the class was defined); `nn.init.constant_(a, b)` fills `a` with the constant `b`. Here `m` is a module: `m.weight.data` is the tensor of the layer's weights `W`, and `m.bias.data` is its bias (a small usage sketch follows the function below).
```python
def weights_init(m):
    classname = m.__class__.__name__
    if classname.find('Conv') != -1:
        nn.init.normal_(m.weight.data, 0.0, 0.02)
    elif classname.find('BatchNorm') != -1:
        nn.init.normal_(m.weight.data, 1.0, 0.02)
        nn.init.constant_(m.bias.data, 0)
```
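For context, `model.apply(fn)` walks every submodule recursively and calls `fn` on each, which is why matching on the class name works. A throwaway sketch with a hypothetical two-layer net:

```python
import torch.nn as nn

net = nn.Sequential(nn.Conv2d(3, 8, 4), nn.BatchNorm2d(8))
for m in net.modules():
    print(m.__class__.__name__)   # Sequential, Conv2d, BatchNorm2d
net.apply(weights_init)           # re-initializes Conv2d and BatchNorm2d parameters in place
```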
## Generator
- The generator maps a latent-space vector z into data space. `nz` is the length of z (here 100), `ngf` relates to the size of the feature maps carried through the generator (here 64), and `nc` is the number of channels in the output image (3 for RGB).
- `super(Generator, self).__init__()` initializes the attributes inherited from the parent class `nn.Module`, using the parent's own initializer. `nn.ConvTranspose2d(in_channels, out_channels, kernel_size, stride=1, padding=0, output_padding=0, groups=1, bias=True, dilation=1)` is a transposed convolution ("deconvolution"): once the kernel size, stride, and padding are specified, the size of the output feature map is fully determined, since each output patch grows out of a single pixel of the input feature map (an animation makes this much clearer). `nn.Sequential(*args)` is a sequential container: modules are added to it in order and together form a model, `model = nn.Sequential(*args)`. Taking the first transposed convolution as an example, a vector of length `nz` is expanded by a 4 x 4 kernel with stride (1, 1) and no padding into a 4 x 4 x (64 * 8) feature map. In `nn.BatchNorm2d(c)`, `c` is the number of input feature maps, i.e. the channel count; every point of each feature map has the mean subtracted and is divided by the standard deviation, where both statistics are computed per channel over the mini-batch ("The mean and standard-deviation are calculated per-dimension over the mini-batches"); it operates on a 4-D tensor of size (N, C, H, W), i.e. (batch_size, channels, height, width). You can reason about a transposed convolution by running the matching convolution backwards: for `nn.ConvTranspose2d(ngf * 8, ngf * 4, 4, 2, 1, bias=False)`, the forward convolution would take an 8 x 8 feature map, pad it to 10 x 10, and with a 4 x 4 kernel at stride (2, 2) produce 4 x 4; read in reverse, that is exactly the transposed convolution (4 x 4 mapped to 8 x 8). Finally, `nn.Tanh` squashes the generator's output (feature maps) back into the range [-1, 1], matching the range of the input vector z. The output-size arithmetic is sketched below.
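To make the size bookkeeping concrete: with dilation 1, PyTorch computes a `ConvTranspose2d` output dimension as H_out = (H_in - 1) * stride - 2 * padding + kernel_size + output_padding. A purely illustrative helper, checked against the layers below:

```python
def convtranspose2d_out(h_in, kernel, stride, padding, output_padding=0):
    # PyTorch's ConvTranspose2d size formula (dilation = 1):
    # H_out = (H_in - 1) * stride - 2 * padding + kernel + output_padding
    return (h_in - 1) * stride - 2 * padding + kernel + output_padding

print(convtranspose2d_out(1, kernel=4, stride=1, padding=0))  # 4: z (1 x 1) -> 4 x 4
print(convtranspose2d_out(4, kernel=4, stride=2, padding=1))  # 8: 4 x 4 -> 8 x 8
print(convtranspose2d_out(8, kernel=4, stride=2, padding=1))  # 16, and so on up to 64
```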
```python
class Generator(nn.Module):
    def __init__(self, ngpu):
        super(Generator, self).__init__()
        self.ngpu = ngpu
        self.main = nn.Sequential(
            # input is Z, going into a convolution
            nn.ConvTranspose2d(nz, ngf * 8, 4, 1, 0, bias=False),
            nn.BatchNorm2d(ngf * 8),
            nn.ReLU(True),
            # state size. (ngf*8) x 4 x 4
            nn.ConvTranspose2d(ngf * 8, ngf * 4, 4, 2, 1, bias=False),
            nn.BatchNorm2d(ngf * 4),
            nn.ReLU(True),
            # state size. (ngf*4) x 8 x 8
            nn.ConvTranspose2d(ngf * 4, ngf * 2, 4, 2, 1, bias=False),
            nn.BatchNorm2d(ngf * 2),
            nn.ReLU(True),
            # state size. (ngf*2) x 16 x 16
            nn.ConvTranspose2d(ngf * 2, ngf, 4, 2, 1, bias=False),
            nn.BatchNorm2d(ngf),
            nn.ReLU(True),
            # state size. (ngf) x 32 x 32
            nn.ConvTranspose2d(ngf, nc, 4, 2, 1, bias=False),
            nn.Tanh()
            # state size. (nc) x 64 x 64
        )

    def forward(self, input):
        return self.main(input)
```
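A quick shape sanity check for the generator; this sketch assumes the globals defined later (nz = 100, ngf = 64, nc = 3):

```python
import torch

netG = Generator(ngpu=1)
z_batch = torch.randn(16, nz, 1, 1)  # a batch of 16 latent vectors
print(netG(z_batch).shape)           # torch.Size([16, 3, 64, 64])
```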
## Discriminator
- The discriminator is a binary classifier: its input is an image and its output is the probability that the image is real. Pooling layers are avoided because the authors argue that strided convolution effectively lets the network learn its own pooling ("the DCGAN paper mentions it is a good practice to use strided convolution rather than pooling to downsample because it lets the network learn its own pooling function"). Leaky ReLU is used for a similar reason: it speeds up gradient propagation and helps training ("Also batch norm and leaky relu functions promote healthy gradient flow which is critical for the learning process of both G and D.").
```python
class Discriminator(nn.Module):
    def __init__(self, ngpu):
        super(Discriminator, self).__init__()
        self.ngpu = ngpu
        self.main = nn.Sequential(
            # input is (nc) x 64 x 64
            nn.Conv2d(nc, ndf, 4, 2, 1, bias=False),
            nn.LeakyReLU(0.2, inplace=True),
            # state size. (ndf) x 32 x 32
            nn.Conv2d(ndf, ndf * 2, 4, 2, 1, bias=False),
            nn.BatchNorm2d(ndf * 2),
            nn.LeakyReLU(0.2, inplace=True),
            # state size. (ndf*2) x 16 x 16
            nn.Conv2d(ndf * 2, ndf * 4, 4, 2, 1, bias=False),
            nn.BatchNorm2d(ndf * 4),
            nn.LeakyReLU(0.2, inplace=True),
            # state size. (ndf*4) x 8 x 8
            nn.Conv2d(ndf * 4, ndf * 8, 4, 2, 1, bias=False),
            nn.BatchNorm2d(ndf * 8),
            nn.LeakyReLU(0.2, inplace=True),
            # state size. (ndf*8) x 4 x 4
            nn.Conv2d(ndf * 8, 1, 4, 1, 0, bias=False),
            nn.Sigmoid()
        )

    def forward(self, input):
        return self.main(input)
```
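And the mirror-image check for the discriminator (same assumptions, with ndf = 64): each (nc) x 64 x 64 image collapses to a single probability:

```python
import torch

netD = Discriminator(ngpu=1)
imgs = torch.randn(16, nc, 64, 64)
print(netD(imgs).view(-1).shape)     # torch.Size([16]); values in (0, 1) after the Sigmoid
```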
## Loss Functions and Optimizers
- The loss function is binary cross entropy (BCELoss). Real images are labeled 1 and generated images 0. We then define two separate optimizers; following the paper, both are Adam with learning rate 0.0002 and beta1 = 0.5. The z vectors are drawn from a Gaussian, and during training we periodically feed the generator the same fixed batch of noise, so that we can watch images gradually take shape out of noise.
- `torch.randn(n, m, e, v, device=device)` creates a random tensor of size n x m x e x v. `optim.Adam(netD.parameters(), lr=lr, betas=(beta1, beta2))` takes three arguments: the first is the optimization target, the network's parameters, obtained with `netD.parameters()`; the second is the learning rate; and the third holds Adam's beta coefficients.
```python
# Initialize BCELoss function
criterion = nn.BCELoss()

# Create batch of latent vectors that we will use to visualize
# the progression of the generator
fixed_noise = torch.randn(64, nz, 1, 1, device=device)

# Establish convention for real and fake labels during training
real_label = 1
fake_label = 0

# Setup Adam optimizers for both G and D
optimizerD = optim.Adam(netD.parameters(), lr=lr, betas=(beta1, 0.999))
optimizerG = optim.Adam(netG.parameters(), lr=lr, betas=(beta1, 0.999))
```
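As a sanity check on the label convention: BCELoss computes -[y*log(x) + (1 - y)*log(1 - x)], averaged over the batch, so label 1 penalizes low discriminator outputs and label 0 penalizes high ones. A tiny numeric example:

```python
import math
import torch
import torch.nn as nn

criterion = nn.BCELoss()
output = torch.tensor([0.9, 0.2])            # D's predicted probabilities
label = torch.tensor([1.0, 0.0])             # real = 1, fake = 0
print(criterion(output, label).item())       # mean of -log(0.9) and -log(0.8), ~0.1643
print((-math.log(0.9) - math.log(0.8)) / 2)  # the same value, computed by hand
```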
## Training

- Training GANs calls for care: badly set hyperparameters can make the model collapse. Real and generated images are fed in separate mini-batches, and each iteration has two parts: first update the discriminator's parameters, then update the generator's.
- Training the discriminator: the goal is to maximize the probability of classifying inputs correctly. Following Goodfellow's paper, we update the discriminator by stochastic gradient steps on the loss -log(D(x)) - log(1 - D(G(z))). Since real and fake images go through in separate batches, this happens in two passes: first a batch of real images is run forward and backward to compute the loss -log(D(x)); then a batch of generated images is run forward and backward to compute -log(1 - D(G(z))); finally the two losses are added.
- Training the generator: the aim is to minimize log(1 - D(G(z))), i.e. to push the score D(G(z)) of generated images toward 1 (plot log(1 - x): it plunges as x approaches 1) so G produces more convincing fakes. But, per Goodfellow's paper, this objective provides too little gradient early in training, so training is slow or even fails to converge; we therefore minimize -log(D(G(z))) instead. (Intuitively one can see some quantity must be minimized either way; the two objectives point in the same direction, yet merely changing the mathematical form turns something that doesn't work into something that does, so the math really matters, as the sketch after this list shows.) Concretely, G's loss is computed on fake images with real labels. Training the generator on fake images with real labels sounds counterintuitive, but it is exactly what pushes G's parameters the right way, and in practice it works. Training can be considered finished when the discriminator's output probability of "real" is very close to 0.5.
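The sketch below makes the gradient argument concrete. Early in training D(G(z)) is near 0, where log(1 - x) is almost flat (slope about -1) while -log(x) is very steep (slope -1/x):

```python
import torch

x = torch.tensor([0.01], requires_grad=True)  # D(G(z)) early on: D easily spots fakes
saturating = torch.log(1 - x)                 # the original minimax term for G
saturating.backward()
print(x.grad)                                 # ~ -1.01: hardly any signal for G

x.grad = None
non_saturating = -torch.log(x)                # the -log(D(G(z))) reformulation
non_saturating.backward()
print(x.grad)                                 # -100: a much stronger training signal
```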
- `epoch` is the number of training rounds: how many times the whole training set is seen. Because the data are reshuffled each time, the different mini-batch combinations help training. `netD.zero_grad()` resets the model's gradients to zero. `data[0].to(device)` moves the batch onto the device; no device was specified when the data were loaded, so they must be moved at training time. `real_cpu.size(0)` is the batch size (dimension 0). `torch.full(size, fill_value, ...)` produces a tensor of the given size filled with `fill_value`; here it serves as the tensor of real labels, all 1.
- `output = netD(real_cpu).view(-1)` implicitly calls `forward`; `.view(-1)` flattens the result to one dimension. This is the forward pass over the real images. `criterion(output, label)` computes the loss for the mini-batch; note that `output` and `label` both have mini-batch-size elements. `errD_real.backward()` runs a backward pass to compute gradients. `output.mean().item()` takes the mean of a tensor and converts it to a Python number. `label.fill_(fake_label)` fills the tensor `label` with the value `fake_label`.
- In `netD(fake.detach()).view(-1)`, `tensor.detach()` severs a tensor from the graph so no gradients are computed through it ("Returns a new Tensor, detached from the current graph."). The D and G networks are coupled, and since we want to train D on its own here, anything passed from G to D goes through `.detach()`. `tensor.view(*args)` reshapes a tensor to the indicated shape; a dimension given as -1 is inferred from the others, so `.view(-1)` here reshapes to a 1-D vector whose length is deduced. Note that `errD_fake` is the loss tensor itself, and `errD_fake.backward()` backpropagates from it. (In fact `tensor.backward()` can be called on any tensor; it computes gradients only for the tensors that feed into it. The idea is easier to see by drawing the graph of tensors: only nodes upstream of that tensor get their gradients updated.) `optimizerD.step()` then uses the gradients already computed to update the parameters with the chosen optimization algorithm.
- Minimizing -log(D(G(z))) simply drives the score D(G(z)) toward 1, making G's images more realistic. Every 50 iterations the losses are printed; every 500 iterations, and on the last batch of the last epoch, the generator's output on the fixed noise is saved. `with torch.no_grad():` states that no tensor created inside the context needs gradients, which avoids wasted computation ("...will have requires_grad=False, even when the inputs have requires_grad=True"). `tensor.cpu()` copies a tensor from GPU memory to host memory (returns a copy). A minimal `.detach()` / `no_grad` illustration follows this list.
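A minimal illustration of `.detach()` and `torch.no_grad()`; the tensors here are throwaway examples:

```python
import torch

a = torch.tensor([2.0], requires_grad=True)
b = (a * 3).detach()       # b carries the same values but is cut out of the graph
print(b.requires_grad)     # False: calling backward through b can never reach a

with torch.no_grad():      # the same effect for every op inside the context
    c = a * 3
print(c.requires_grad)     # False, even though a has requires_grad=True
```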
```python
# Training Loop

# Lists to keep track of progress
img_list = []
G_losses = []
D_losses = []
iters = 0

print("Starting Training Loop...")
# For each epoch
for epoch in range(num_epochs):
    # For each batch in the dataloader
    for i, data in enumerate(dataloader, 0):

        ############################
        # (1) Update D network: maximize log(D(x)) + log(1 - D(G(z)))
        ###########################
        ## Train with all-real batch
        netD.zero_grad()
        # Format batch
        real_cpu = data[0].to(device)
        b_size = real_cpu.size(0)
        label = torch.full((b_size,), real_label, dtype=torch.float, device=device)
        # Forward pass real batch through D
        output = netD(real_cpu).view(-1)
        # Calculate loss on all-real batch
        errD_real = criterion(output, label)
        # Calculate gradients for D in backward pass
        errD_real.backward()
        D_x = output.mean().item()

        ## Train with all-fake batch
        # Generate batch of latent vectors
        noise = torch.randn(b_size, nz, 1, 1, device=device)
        # Generate fake image batch with G
        fake = netG(noise)
        label.fill_(fake_label)
        # Classify all fake batch with D
        output = netD(fake.detach()).view(-1)
        # Calculate D's loss on the all-fake batch
        errD_fake = criterion(output, label)
        # Calculate the gradients for this batch
        errD_fake.backward()
        D_G_z1 = output.mean().item()
        # Add the gradients from the all-real and all-fake batches
        errD = errD_real + errD_fake
        # Update D
        optimizerD.step()

        ############################
        # (2) Update G network: maximize log(D(G(z)))
        ###########################
        netG.zero_grad()
        label.fill_(real_label)  # fake labels are real for generator cost
        # Since we just updated D, perform another forward pass of all-fake batch through D
        output = netD(fake).view(-1)
        # Calculate G's loss based on this output
        errG = criterion(output, label)
        # Calculate gradients for G
        errG.backward()
        D_G_z2 = output.mean().item()
        # Update G
        optimizerG.step()

        # Output training stats
        if i % 50 == 0:
            print('[%d/%d][%d/%d]\tLoss_D: %.4f\tLoss_G: %.4f\tD(x): %.4f\tD(G(z)): %.4f / %.4f'
                  % (epoch, num_epochs, i, len(dataloader),
                     errD.item(), errG.item(), D_x, D_G_z1, D_G_z2))

        # Save Losses for plotting later
        G_losses.append(errG.item())
        D_losses.append(errD.item())

        # Check how the generator is doing by saving G's output on fixed_noise
        if (iters % 500 == 0) or ((epoch == num_epochs-1) and (i == len(dataloader)-1)):
            with torch.no_grad():
                fake = netG(fixed_noise).detach().cpu()
            img_list.append(vutils.make_grid(fake, padding=2, normalize=True))

        iters += 1
```
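Since `G_losses` and `D_losses` are collected above, the training curves can be plotted afterwards, as the tutorial does; a small sketch assuming matplotlib:

```python
import matplotlib.pyplot as plt

plt.figure(figsize=(10, 5))
plt.title("Generator and Discriminator Loss During Training")
plt.plot(G_losses, label="G")
plt.plot(D_losses, label="D")
plt.xlabel("iterations")
plt.ylabel("loss")
plt.legend()
plt.show()
```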
## Global variable initialization
```python
# Set random seed for reproducibility
manualSeed = 999
#manualSeed = random.randint(1, 10000) # use if you want new results
print("Random Seed: ", manualSeed)
random.seed(manualSeed)
torch.manual_seed(manualSeed)

# Root directory for dataset
dataroot = "/home/ubuntu/facebook/datasets/celeba"
# Number of workers for dataloader
workers = 4
# Batch size during training
batch_size = 128
# Spatial size of training images. All images will be resized to this
# size using a transformer.
image_size = 64
# Number of channels in the training images. For color images this is 3
nc = 3
# Size of z latent vector (i.e. size of generator input)
nz = 100
# Size of feature maps in generator
ngf = 64
# Size of feature maps in discriminator
ndf = 64
# Number of training epochs
num_epochs = 5
# Learning rate for optimizers
lr = 0.0002
# Beta1 hyperparam for Adam optimizers
beta1 = 0.5
# Number of GPUs available. Use 0 for CPU mode.
ngpu = 1
```
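The snippets above also use a `device` object that these globals do not define; in the tutorial it is derived from `ngpu`:

```python
# Decide which device we want to run on
device = torch.device("cuda:0" if (torch.cuda.is_available() and ngpu > 0) else "cpu")
```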
## Instantiating the models
`.to(device)` places the model in GPU memory; with a GPU present, computation runs against that memory. `device.type` returns the device's type. `netG = nn.DataParallel(netG, [0, 2, 3])` would place the model on GPUs 0, 2, and 3: DataParallel automatically splits the data across the GPUs and, once each replica finishes its work, gathers and merges the results. If you have several idle cards, this move is recommended to cut training time. `print(netG)` is quite handy: printing a model instance reports which layers it has and how each is configured.
```python
# Create the generator
netG = Generator(ngpu).to(device)

# Handle multi-gpu if desired
if (device.type == 'cuda') and (ngpu > 1):
    netG = nn.DataParallel(netG, list(range(ngpu)))

# Apply the weights_init function to randomly initialize all weights
# to mean=0, stdev=0.02.
netG.apply(weights_init)

# Print the model
print(netG)

# Create the Discriminator
netD = Discriminator(ngpu).to(device)

# Handle multi-gpu if desired
if (device.type == 'cuda') and (ngpu > 1):
    netD = nn.DataParallel(netD, list(range(ngpu)))

# Apply the weights_init function to randomly initialize all weights
# to mean=0, stdev=0.02.
netD.apply(weights_init)

# Print the model
print(netD)
```
The notes above come together in the standalone script below, launched with the visible GPUs chosen on the command line:

```bash
CUDA_VISIBLE_DEVICES=1,3 python face_gans.py
```
```python
#!/usr/bin/env python
from __future__ import print_function
import argparse
import os
import random
import torch
import torch.nn as nn
import cv2
import torch.backends.cudnn as cudnn
import torchvision.transforms as transforms
import torch.utils.data
import torchvision.datasets as dset
import torch.optim as optim
import numpy as np
from tensorboardX import SummaryWriter

ROOT_PATH = '/home/zhaoliang/project/face_GANs'
SEED = 123
IMAGESIZE = 64
BATCH_SIZE = 512
WORKER_NUM = 2  # used both as the DataLoader worker count and as the number of GPUs below
z = 100         # the length of the latent vector fed to the Generator
lr = 0.0002     # learning rate
epochs_num = 5  # all pictures are trained 5 times

def dataloader():
    print("image folder:", os.path.join(ROOT_PATH, "img_align_celeba"))
    dataset = dset.ImageFolder(root=ROOT_PATH, transform=transforms.Compose(
        [transforms.Resize(IMAGESIZE), transforms.CenterCrop(IMAGESIZE), transforms.ToTensor(),
         transforms.Normalize((0.5, 0.5, 0.5), (0.5, 0.5, 0.5)), ]))
    dataloader = torch.utils.data.DataLoader(dataset, batch_size=BATCH_SIZE, shuffle=True, num_workers=WORKER_NUM)
    print(" %d images were found there! " % len(dataset))  # print the number of images
    return dataloader

def weight_init(model):
    classname = model.__class__.__name__
    print("Initializing parameters of %s !" % classname)
    if classname.find("Conv") != -1:
        nn.init.normal_(model.weight.data, 0.0, 0.02)
    elif classname.find("BatchNorm") != -1:
        nn.init.normal_(model.weight.data, 1.0, 0.02)  # BatchNorm weights around 1, as in the DCGAN paper
        nn.init.constant_(model.bias.data, 0.0)
    print("finished parameters initialization!")

class Generator(nn.Module):
    def __init__(self):
        super(Generator, self).__init__()
        self.main = nn.Sequential(
            nn.ConvTranspose2d(in_channels=z, out_channels=IMAGESIZE * 8, kernel_size=4, stride=1, padding=0),  # 4*4
            nn.BatchNorm2d(IMAGESIZE * 8),
            nn.ReLU(),
            nn.ConvTranspose2d(in_channels=IMAGESIZE * 8, out_channels=IMAGESIZE * 4, kernel_size=4, stride=2, padding=1),  # 8*8
            nn.BatchNorm2d(IMAGESIZE * 4),
            nn.ReLU(),
            nn.ConvTranspose2d(in_channels=IMAGESIZE * 4, out_channels=IMAGESIZE * 2, kernel_size=4, stride=2, padding=1),  # 16*16
            nn.BatchNorm2d(IMAGESIZE * 2),
            nn.ReLU(),
            nn.ConvTranspose2d(in_channels=IMAGESIZE * 2, out_channels=IMAGESIZE, kernel_size=4, stride=2, padding=1),  # 32*32
            nn.BatchNorm2d(IMAGESIZE),
            nn.ReLU(),
            nn.ConvTranspose2d(in_channels=IMAGESIZE, out_channels=3, kernel_size=4, stride=2, padding=1),  # 64*64
            # nn.BatchNorm2d(3)  # There is a question: why no BatchNorm here? (DCGAN applies none to the generator's output layer)
            nn.Tanh()
        )

    def forward(self, noise):  # generate an image from a noise vector
        return self.main(noise)

class Discriminator(nn.Module):
    def __init__(self):
        super(Discriminator, self).__init__()
        self.main = nn.Sequential(
            nn.Conv2d(in_channels=3, out_channels=IMAGESIZE, kernel_size=4, stride=2, padding=1, bias=False),
            nn.LeakyReLU(0.2, inplace=True),
            nn.Conv2d(IMAGESIZE, IMAGESIZE * 2, 4, 2, 1, bias=False),
            nn.BatchNorm2d(IMAGESIZE * 2),
            nn.LeakyReLU(0.2, inplace=True),
            nn.Conv2d(IMAGESIZE * 2, IMAGESIZE * 4, 4, 2, 1, bias=False),
            nn.BatchNorm2d(IMAGESIZE * 4),
            nn.LeakyReLU(0.2, inplace=True),
            nn.Conv2d(IMAGESIZE * 4, IMAGESIZE * 8, 4, 2, 1, bias=False),
            nn.BatchNorm2d(IMAGESIZE * 8),
            nn.LeakyReLU(0.2, inplace=True),
            nn.Conv2d(IMAGESIZE * 8, 1, 4, 1, 0),  # 1*1
            nn.Sigmoid()
        )

    def forward(self, image):
        return self.main(image)

if __name__ == '__main__':  # define loss function and optimizer and train the networks
    random.seed(SEED)
    torch.manual_seed(SEED)
    device = torch.device("cuda:0" if torch.cuda.is_available() else "cpu")
    criterion = nn.BCELoss()
    noise = torch.randn(64, z, 1, 1, device=device)  # the batch size of generated pictures is 64
    real_label = 1
    fake_label = 0
    netG = Generator().to(device)
    if device.type == 'cuda' and WORKER_NUM > 0:
        netG = nn.DataParallel(netG, list(range(WORKER_NUM)))
    netG.apply(weight_init)
    netD = Discriminator().to(device)
    if device.type == 'cuda' and WORKER_NUM > 0:
        netD = nn.DataParallel(netD, list(range(WORKER_NUM)))
    netD.apply(weight_init)
    print("Generative NetWork:")
    print(netG)
    print("")
    print("Discriminative NetWork:")
    print(netD)
    optimizerD = optim.Adam(netD.parameters(), lr=lr, betas=(0.5, 0.999))
    optimizerG = optim.Adam(netG.parameters(), lr=lr, betas=(0.5, 0.999))
    print("OptimizerD:")
    print(optimizerD)
    print("OptimizerG:")
    print(optimizerG)
    iters = 0
    writer = SummaryWriter(log_dir=os.path.join(ROOT_PATH, 'logs'))
    dataloader = dataloader()
    print("Starting training loop...")
    for epoch in range(epochs_num):
        for i, data in enumerate(dataloader):
            netD.zero_grad()
            data = data[0].to(device)
            #print("there are %d images in this batch." % data.size(0))
            # update D network
            # batch sizes can differ between batches, so the real-label tensor is rebuilt inside the loop
            label = torch.full((data.size(0),), real_label, dtype=torch.float, device=device)
            output = netD(data).view(-1)  # flatten the 4-dimensional output to a 1-dimensional vector
            lossD_real = criterion(output, label)
            lossD_real.backward()
            label.fill_(fake_label)
            noise = torch.randn(data.size(0), z, 1, 1, device=device)
            fakeImage = netG(noise)
            """
            G is held fixed while D is updated, so we detach fakeImage from netG:
            we don't call optimizerG.step() here, and fakeImage.detach() avoids
            computing G's gradients, which does not affect the update of D's parameters.
            """
            lossD_fake = criterion(netD(fakeImage.detach()).view(-1), label)  # this pass teaches D to reject fakes, hence fake_label
            lossD_fake.backward()
            lossD = lossD_fake + lossD_real
            writer.add_scalar('DLoss/iter', lossD.item(), iters)
            optimizerD.step()
            # update G network
            """
            Here fakeImage cannot be detached from G, because this time the goal
            is to update G's parameters.
            """
            netG.zero_grad()
            label.fill_(real_label)
            output = netD(fakeImage).view(-1)
            lossG = criterion(output, label)
            writer.add_scalar('GLoss/iter', lossG.item(), iters)
            lossG.backward()  # this computes gradients for both D and G, but that's fine: we only step G's optimizer
            optimizerG.step()
            if i % 50 == 0:
                print('[%d/%d][%d/%d]\tLoss_D: %.4f\tLoss_G: %.4f.' % (epoch, epochs_num, i, len(dataloader), lossD.item(), lossG.item()))
            if (iters % 500 == 0) or ((epoch == epochs_num - 1) and (i == len(dataloader) - 1)):
                with torch.no_grad():
                    fake = netG(noise).detach().cpu()[0]
                # map Tanh output from [-1, 1] to [0, 255]; reorder CHW -> HWC and RGB -> BGR for cv2
                cv2.imwrite('%d.jpg' % iters, np.transpose((fake.numpy() * 0.5 + 0.5) * 255, (1, 2, 0))[:, :, ::-1])
                print("image %d.jpg has been saved." % iters)
            iters += 1
    torch.save(netG, os.path.join(ROOT_PATH, 'face_GANs_Generator_180928.pkl'))
    torch.save(netD, os.path.join(ROOT_PATH, "face_GANs_Discriminator_180928.pkl"))
    print("all done!")
```