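I. Preface

Please do not repost without permission, thanks~

PyTorch has released version 0.4, and quite a lot has changed compared with 0.3, so I went through the material on the official site and am collecting my notes here. This should help me keep my own future code consistent, and it can serve as a reference for everyone else.

I have only picked out the things I consider important or currently useful rather than covering everything. For the complete version, scroll to the bottom of the article and follow the reference~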

II. Core Changes

2.1 Major changes

Tensor and Variable have been merged
Some operations can now return 0-dimensional tensors (scalars)
The volatile flag has been deprecated

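1 Merging the Tensor and Variable classes

I think this is the biggest and most important change.

torch.autograd.Variable and torch.Tensor are now the same class.
More precisely, torch.Tensor now includes everything Variable did, so we can simply call everything a Tensor.
Variable still works as before, but the type it returns is torch.Tensor.
This means you no longer need to wrap values in Variable in your code.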
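As a minimal sketch of what the merge means in practice (assuming PyTorch 0.4 or later; the variable names are purely illustrative), a tensor created directly can take part in autograd, and wrapping it in Variable simply gives back a Tensor:

```python
import torch
from torch.autograd import Variable  # still importable, but no longer needed

# 0.3 style: wrap a tensor in Variable so autograd can track it
old_style = Variable(torch.ones(2), requires_grad=True)

# 0.4 style: create the tensor directly with requires_grad=True
new_style = torch.ones(2, requires_grad=True)

# Both objects are plain torch.Tensor instances
print(isinstance(old_style, torch.Tensor))
print(isinstance(new_style, torch.Tensor))

# Autograd works on the unwrapped tensor
(new_style * 3).sum().backward()
print(new_style.grad)
```

When migrating old code, the Variable(...) wrappers can usually just be deleted, keeping requires_grad=True on the tensors that need gradients.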

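2 The way to get a Tensor's type has changed

Before: type(x)
Now: x.type()
type(x) now only reports torch.Tensor, so it no longer tells you the data type. isinstance() also works, as shown below: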

```python
>>> import torch
>>> x = torch.DoubleTensor([1, 1, 1])
>>> print(type(x))  # was torch.DoubleTensor
<class 'torch.Tensor'>
>>> print(x.type())  # OK: 'torch.DoubleTensor'
torch.DoubleTensor
>>> print(isinstance(x, torch.DoubleTensor))  # OK: True
True
```

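3 Changes to how autograd is used

requires_grad, the flag that controls automatic gradient computation, is now an attribute of Tensor.
The following example shows how it is used: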

```python
>>> x = torch.ones(1)  # create a tensor with requires_grad=False (default)
>>> x.requires_grad
False
>>> y = torch.ones(1)  # another tensor with requires_grad=False
>>> z = x + y
>>> # both inputs have requires_grad=False. so does the output
>>> z.requires_grad
False
>>> # then autograd won't track this computation. let's verify!
>>> z.backward()
RuntimeError: element 0 of tensors does not require grad and does not have a grad_fn
>>>
>>> # now create a tensor with requires_grad=True
>>> w = torch.ones(1, requires_grad=True)
>>> w.requires_grad
True
>>> # add to the previous result that has requires_grad=False
>>> total = w + z
>>> # the total sum now requires grad!
>>> total.requires_grad
True
>>> # autograd can compute the gradients as well
>>> total.backward()
>>> w.grad
tensor([ 1.])
>>> # and no computation is wasted to compute gradients for x, y and z, which don't require grad
>>> z.grad == x.grad == y.grad == None
True
```

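From the example above we can see that:

Tensors are created with requires_grad=False by default;
If none of the inputs require gradients, the output does not require gradients either;
To have gradients computed for a tensor, pass requires_grad=True when creating it;
If at least one input requires gradients, the output requires gradients as well;
Tensors that were not created with requires_grad=True are skipped during backpropagation, so no gradients are computed for them.

Besides setting it at creation time, you can also call requires_grad_() to set the requires_grad attribute of an existing tensor in place.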

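```python
>>> existing_tensor = torch.zeros(2)      # stand-in for any tensor you already have
>>> _ = existing_tensor.requires_grad_()  # in-place; returns the same tensor
>>> existing_tensor.requires_grad
True
>>> my_tensor = torch.zeros(3, 4, requires_grad=True)
>>> my_tensor.requires_grad
True
```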
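One place where flipping the flag on existing tensors comes in handy is freezing part of a model. The sketch below is only an illustration under my own assumptions (the two-layer model and its sizes are made up); it uses requires_grad_(False) to switch gradients off for the first layer:

```python
import torch
import torch.nn as nn

# A small hypothetical model used only for illustration
model = nn.Sequential(nn.Linear(4, 8), nn.ReLU(), nn.Linear(8, 2))

# Freeze the first layer: requires_grad_() flips the flag in place
for param in model[0].parameters():
    param.requires_grad_(False)

out = model(torch.randn(3, 4)).sum()
out.backward()

# Frozen parameters receive no gradient; the others do
print(model[0].weight.grad)
print(model[2].weight.grad is not None)
```

Backpropagation still works here because the last layer's parameters require gradients; only the frozen layer is skipped.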

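4 About `.data`

Previously, .data was used to get the underlying Tensor out of a Variable.
After the merge it still works in a similar way: y = x.data gives a tensor y that shares the same data as x, is detached from x's computation history, and has requires_grad=False.
In some cases, however, .data is unsafe. Changes made through x.data are not tracked by autograd, so if x is needed in the backward pass the computed gradients can be silently wrong.
The safer alternative is x.detach(). It also returns a tensor that shares data with x and does not require gradients, but in-place changes made to it are still seen by autograd, which raises an error if x is needed in the backward pass.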
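The difference is easiest to see side by side. The following sketch (tensor values chosen arbitrarily) zeroes the output of sigmoid, which autograd needs for the backward pass, once through .data and once through .detach():

```python
import torch

# Case 1: modify the result through .data (autograd does not notice)
a = torch.tensor([1.0, 2.0, 3.0], requires_grad=True)
out = a.sigmoid()
out.data.zero_()          # silently overwrites values needed by backward
out.sum().backward()
print(a.grad)             # gradient is computed from the zeroed values

# Case 2: modify the result through .detach() (autograd notices)
b = torch.tensor([1.0, 2.0, 3.0], requires_grad=True)
out = b.sigmoid()
out.detach().zero_()
try:
    out.sum().backward()
    detach_error = None
except RuntimeError as err:
    detach_error = err    # the in-place modification is reported
print(type(detach_error).__name__)
```

In the first case the gradient comes out as all zeros with no warning, which is wrong; in the second case backward() fails loudly, which is why .detach() is the safer habit.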

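5 Support for 0-dimensional (scalar) tensors

Previously, PyTorch expanded every scalar into a 1-dimensional tensor of shape (1,).
Scalars are now supported directly.
The example below shows how scalars and vectors behave: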

```python
>>> torch.tensor(3.1416)         # create a scalar directly
tensor(3.1416)
>>> torch.tensor(3.1416).size()  # scalar is 0-dimensional
torch.Size([])
>>> torch.tensor([3]).size()     # compare to a vector of size 1
torch.Size([1])
>>>
>>> vector = torch.arange(2, 6)  # this is a vector
>>> vector
tensor([ 2.,  3.,  4.,  5.])
>>> vector.size()
torch.Size([4])
>>> vector[3]                    # indexing into a vector gives a scalar
tensor(5.)
>>> vector[3].item()             # .item() gives the value as a Python number
5.0
>>> sum = torch.tensor([2, 3]).sum()
>>> sum
tensor(5)
>>> sum.size()
torch.Size([])
```

So from now on, when accumulating the loss while training a neural network, change total_loss += loss.data[0] to total_loss += loss.item().
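To put that one-line change in context, here is a rough sketch of a training loop that accumulates the loss with .item(). The model, data, and hyperparameters are placeholders I made up for the example, not anything from the official guide:

```python
import torch
import torch.nn as nn

torch.manual_seed(0)

# Hypothetical model and data, only to have a loss to accumulate
model = nn.Linear(3, 1)
criterion = nn.MSELoss()
optimizer = torch.optim.SGD(model.parameters(), lr=0.1)
batches = [(torch.randn(5, 3), torch.randn(5, 1)) for _ in range(4)]

total_loss = 0.0
for inputs, targets in batches:
    optimizer.zero_grad()
    loss = criterion(model(inputs), targets)  # 0-dimensional tensor
    loss.backward()
    optimizer.step()
    # .item() extracts a Python float, so no autograd history is kept alive
    total_loss += loss.item()

print(type(total_loss), total_loss / len(batches))
```

Accumulating loss itself instead of loss.item() would make total_loss a tensor that carries autograd history, which tends to waste memory over a long run.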

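6 The `volatile` flag is deprecated

Previously, setting volatile=True on a Variable stopped autograd from tracking the computation.
This flag is now deprecated, and setting it has no effect.
PyTorch replaces it with more flexible context managers, used as follows: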

```python
>>> x = torch.zeros(1, requires_grad=True)
>>> with torch.no_grad():
...     y = x * 2
>>> y.requires_grad
False
>>>
>>> is_train = False
>>> with torch.set_grad_enabled(is_train):
...     y = x * 2
>>> y.requires_grad
False
>>> torch.set_grad_enabled(True)  # this can also be used as a function
>>> y = x * 2
>>> y.requires_grad
True
>>> torch.set_grad_enabled(False)
>>> y = x * 2
>>> y.requires_grad
False
```
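In real code, the most common replacement for volatile=True is probably wrapping inference in torch.no_grad(). A small sketch, assuming a made-up model and an evaluate helper of my own naming:

```python
import torch
import torch.nn as nn

model = nn.Linear(3, 2)   # hypothetical model for illustration
inputs = torch.randn(4, 3)

def evaluate(model, inputs):
    """Run a forward pass without recording the graph (replaces volatile=True)."""
    model.eval()
    with torch.no_grad():
        return model(inputs)

preds = evaluate(model, inputs)
print(preds.requires_grad)            # output produced under no_grad
print(model(inputs).requires_grad)    # output produced with tracking on
```

The context manager only affects code inside the with block, so training code elsewhere keeps tracking gradients as usual.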

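Emmm, that covers the most important changes. Some others are not included here; if you are interested, see the original linked at the bottom. There was nothing else that particularly needed explaining, and a pure translation felt like a bit of a waste of time~

References

PyTorch v0.4.0 release notes: Trade-off memory for compute, Windows support, 24 distributions with cdf, variance etc., dtypes, zero-dimensional Tensors, Tensor-Variable merge, faster distributed, perf and bug fixes, CuDNN 7.1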

PyTorch | An Incomplete 0.3 to 0.4 Migration Guide
