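How should PyTorch code change after Variable and Tensor are merged?

Latest · 04-28

Yesterday (April 25), Facebook released PyTorch 0.4.0. This release brings many updates and changes, such as Windows support and the merging of Variable and Tensor. For a detailed introduction, see the article "Pytorch 重磅更新" (A Major PyTorch Update).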

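This article is a migration guide. It covers the code changes needed when moving from earlier versions to the new release:

Merging Tensors and Variables

Support for zero-dimensional (scalar) Tensors

Deprecation of the volatile flag

dtypes, devices, and NumPy-style Tensor creation functions

Writing device-agnostic code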

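▌Merging the Tensor and Variable classes

In the new release, torch.autograd.Variable and torch.Tensor are the same class. More precisely, torch.Tensor can track history and behaves like the old Variable. Wrapping with Variable still works as before, but the returned object is of type torch.Tensor. This means your code no longer needs the Variable wrapper.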
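The following is a minimal sketch of what the merge means in practice, assuming PyTorch 0.4.0 or later is installed. The names plain and wrapped are illustrative only; the point is that wrapping a tensor in Variable hands back an ordinary torch.Tensor.

```python
import torch
from torch.autograd import Variable  # still importable for backward compatibility

plain = torch.ones(2)
wrapped = Variable(plain)  # legacy style; no separate wrapper type is created

# Both objects are instances of the same class after the merge.
wrapped_is_tensor = isinstance(wrapped, torch.Tensor)
same_class = type(wrapped) is type(plain)

print(wrapped_is_tensor, same_class)
```

Existing code that still wraps its inputs should therefore keep running, and the wrapper calls can be removed gradually.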

The type() of a Tensor has changed

Note that the type() of a Tensor no longer reflects its data type. Use isinstance() or x.type() instead:

>>> x = torch.DoubleTensor([1, 1, 1])
>>> print(type(x))  # was torch.DoubleTensor
<class 'torch.Tensor'>
>>> print(x.type())  # OK: 'torch.DoubleTensor'
torch.DoubleTensor
>>> print(isinstance(x, torch.DoubleTensor))  # OK: True
True

When does autograd start tracking history?

requires_grad, the central flag of autograd, is now an attribute of Tensors. autograd applies the same rules that were previously used for Variables: it starts tracking history as soon as any input Tensor of an operation has requires_grad=True. For example:

>>> x = torch.ones(1)  # create a tensor with requires_grad=False (default)
>>> x.requires_grad
False
>>> y = torch.ones(1)  # another tensor with requires_grad=False
>>> z = x + y
>>> # both inputs have requires_grad=False, so does the output
>>> z.requires_grad
False
>>> # then autograd won't track this computation. let's verify!
>>> z.backward()
RuntimeError: element 0 of tensors does not require grad and does not have a grad_fn
>>>
>>> # now create a tensor with requires_grad=True
>>> w = torch.ones(1, requires_grad=True)
>>> w.requires_grad
True
>>> # add to the previous result that has requires_grad=False
>>> total = w + z
>>> # the total sum now requires grad!
>>> total.requires_grad
True
>>> # autograd can compute the gradients as well
>>> total.backward()
>>> w.grad
tensor([1.])
>>> # and no computation is wasted to compute gradients for x, y and z, which don't require grad
>>> z.grad == x.grad == y.grad == None
True

Manipulating the requires_grad flag

Besides setting the attribute directly, you can change this flag in place with my_tensor.requires_grad_(requires_grad=True), or, as in the example above, pass it as an argument at creation time (the default is False):

>>> existing_tensor.requires_grad_()
>>> existing_tensor.requires_grad
True
>>> my_tensor = torch.zeros(3, 4, requires_grad=True)
>>> my_tensor.requires_grad
True

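What about .data?

.data was the primary way to get the underlying Tensor from a Variable. After the merge, calling y = x.data still has similar semantics: y is a Tensor that shares its data with x, has requires_grad=False, and is unrelated to the computation history of x.

However, .data can be unsafe in some cases. Changes made through x.data are not tracked by autograd, so if x is needed in the backward pass, the computed gradients will be silently incorrect. A safer alternative is x.detach(), which also returns a Tensor that shares data with x and has requires_grad=False, but whose in-place changes are reported by autograd if x is needed in the backward pass.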
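The difference can be seen with a small sketch. It assumes an operation whose backward pass reuses its own output (sigmoid is used here); the variable names are illustrative only.

```python
import torch

# Case 1: an in-place change made through .data is invisible to autograd.
a = torch.tensor([1.0, 2.0, 3.0], requires_grad=True)
out = a.sigmoid()          # sigmoid's backward pass reuses the values of `out`
out.data.zero_()           # silently overwrites values autograd still needs
out.sum().backward()
silently_wrong_grad = a.grad.clone()  # all zeros instead of sigmoid'(a)

# Case 2: the same in-place change through .detach() is detected.
b = torch.tensor([1.0, 2.0, 3.0], requires_grad=True)
out = b.sigmoid()
out.detach().zero_()       # shares data with `out`, and the change is recorded
try:
    out.sum().backward()
    detach_raised = False
except RuntimeError as err:
    detach_raised = True
    print("autograd reported:", err)
```

In the first case the gradient is computed without complaint but is wrong; in the second case the backward pass fails loudly, which is usually the preferable outcome.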

▌Working with zero-dimensional Tensors

In earlier versions, indexing into a Tensor vector (a 1-dimensional tensor) returned a Python number, but indexing into a Variable vector returned a vector of size (1,). Reduction functions behaved in a similarly inconsistent way: tensor.sum() returned a Python number, but variable.sum() returned a vector of size (1,).

Fortunately, the new release introduces proper scalar (0-dimensional tensor) support in PyTorch! Scalars can be created with the new torch.tensor function (explained in more detail later; for now, think of it as the PyTorch equivalent of numpy.array):

>>> torch.tensor(3.1416)  # create a scalar directly
tensor(3.1416)
>>> torch.tensor(3.1416).size()  # scalar is 0-dimensional
torch.Size([])
>>> torch.tensor([3]).size()  # compare to a vector of size 1
torch.Size([1])
>>>
>>> vector = torch.arange(2, 6)  # this is a vector
>>> vector
tensor([2., 3., 4., 5.])
>>> vector.size()
torch.Size([4])
>>> vector[3]  # indexing into a vector gives a scalar
tensor(5.)
>>> vector[3].item()  # .item() gives the value as a Python number
5.0
>>> mysum = torch.tensor([2, 3]).sum()
>>> mysum
tensor(5)
>>> mysum.size()
torch.Size([])

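Accumulating losses

Consider the pattern total_loss += loss.data[0], which was widely used before PyTorch 0.4.0. There, loss was a Variable wrapping a tensor of size (1,), but in 0.4.0 loss is a 0-dimensional scalar. Indexing into a scalar makes no sense (the current version gives a warning, and it will become a hard error in 0.5.0). Use loss.item() to get the Python number from a scalar.

Note that if you do not convert to a Python number when accumulating losses, the memory usage of your program may increase. This is because the right-hand side of the expression above used to be a Python float, whereas it is now a zero-dimensional Tensor. The total loss therefore accumulates Tensors together with their gradient history, which can keep large autograd graphs alive for much longer than necessary.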
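As a minimal sketch of the recommended accumulation pattern, the loop below uses a toy linear model and randomly generated batches; both are made up purely for illustration and are not part of the original guide.

```python
import torch

torch.manual_seed(0)
model = torch.nn.Linear(4, 1)  # hypothetical toy model
optimizer = torch.optim.SGD(model.parameters(), lr=0.01)
# Made-up data: five batches of (inputs, targets).
batches = [(torch.randn(8, 4), torch.randn(8, 1)) for _ in range(5)]

total_loss = 0.0
for inputs, targets in batches:
    optimizer.zero_grad()
    loss = torch.nn.functional.mse_loss(model(inputs), targets)
    loss.backward()
    optimizer.step()
    # `loss` is a 0-dimensional tensor attached to the autograd graph;
    # .item() extracts a plain Python float, so no graph is kept alive.
    total_loss += loss.item()

print(type(total_loss).__name__, loss.dim())
```

Writing total_loss += loss instead would turn total_loss into a Tensor that keeps a reference to the graph of every iteration.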

▌Deprecation of volatile

In the new release the volatile flag is deprecated and has no effect. Previously, any computation involving a Variable with volatile=True was not tracked by autograd. This has been replaced by a set of more flexible context managers, including torch.no_grad(), torch.set_grad_enabled(grad_mode), and others:

>>> x = torch.zeros(1, requires_grad=True)
>>> with torch.no_grad():
...     y = x * 2
>>> y.requires_grad
False
>>>
>>> is_train = False
>>> with torch.set_grad_enabled(is_train):
...     y = x * 2
>>> y.requires_grad
False
>>> torch.set_grad_enabled(True)  # this can also be used as a function
>>> y = x * 2
>>> y.requires_grad
True
>>> torch.set_grad_enabled(False)
>>> y = x * 2
>>> y.requires_grad
False

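▌dtypes, devices, and NumPy-style Tensor creation functions

The new release introduces the torch.dtype, torch.device, and torch.layout classes, which allow these properties to be managed more conveniently through NumPy-style creation functions.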

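torch.dtype

The available torch.dtypes (data types) and their corresponding tensor types are: torch.float32 or torch.float (FloatTensor), torch.float64 or torch.double (DoubleTensor), torch.float16 or torch.half (HalfTensor), torch.uint8 (ByteTensor), torch.int8 (CharTensor), torch.int16 or torch.short (ShortTensor), torch.int32 or torch.int (IntTensor), and torch.int64 or torch.long (LongTensor).

Use torch.set_default_dtype and torch.get_default_dtype to manipulate the default dtype for floating-point tensors.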
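The correspondence between dtypes and legacy tensor types can be checked directly. The sketch below runs on the CPU only; the dictionary name legacy_type and the choice of float64 as a temporary default are assumptions made for illustration.

```python
import torch

dtypes = [torch.float32, torch.float64, torch.float16, torch.uint8,
          torch.int8, torch.int16, torch.int32, torch.int64]

# Map each dtype to the legacy type string reported by Tensor.type().
legacy_type = {dt: torch.empty(0, dtype=dt).type() for dt in dtypes}
for dt, name in legacy_type.items():
    print(dt, "->", name)

# Temporarily change the default floating-point dtype, then restore it.
previous_default = torch.get_default_dtype()
try:
    torch.set_default_dtype(torch.float64)
    dtype_after_change = torch.tensor([1.0, 2.0]).dtype
finally:
    torch.set_default_dtype(previous_default)
```

Restoring the previous default in a finally block keeps the change from leaking into unrelated code.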

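torch.device

A torch.device contains a device type ('cpu' or 'cuda') and an optional device ordinal (id). It can be initialized with torch.device('{device_type}') or torch.device('{device_type}:{device_ordinal}').

The device a Tensor lives on can be obtained through its device attribute.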
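A short sketch of constructing and inspecting torch.device objects follows. Creating a device object does not by itself require a GPU, so this should run on a CPU-only machine; the ordinal 1 is an arbitrary choice for illustration.

```python
import torch

cpu = torch.device("cpu")
second_gpu = torch.device("cuda:1")  # type and ordinal in one string
same_gpu = torch.device("cuda", 1)   # type and ordinal as two arguments
any_gpu = torch.device("cuda")       # no ordinal given

print(second_gpu.type, second_gpu.index)
print(any_gpu.index)

# Every tensor exposes the device it lives on.
x = torch.zeros(2)
print(x.device)
```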

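torch.layout

torch.layout represents the data layout of a Tensor. The new release supports torch.strided (dense tensors) and torch.sparse_coo (sparse tensors in COO format).

The data layout of a Tensor can be obtained through its layout attribute.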
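As a small illustration, the sketch below compares the layout of a dense tensor with that of a sparse COO tensor. It assumes a PyTorch release that provides torch.sparse_coo_tensor (more recent than 0.4.0, as far as I know); the indices and values are arbitrary.

```python
import torch

dense = torch.zeros(2, 3)

# Two non-zero entries, at positions (0, 2) and (1, 0).
indices = torch.tensor([[0, 1],   # row indices
                        [2, 0]])  # column indices
values = torch.tensor([3.0, 4.0])
sparse = torch.sparse_coo_tensor(indices, values, (2, 3))

print(dense.layout, sparse.layout)
```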

Creating Tensors

In the new release, methods that create a Tensor also accept dtype, device, layout, and requires_grad options to specify the desired attributes of the returned Tensor. For example:

>>> device = torch.device("cuda:1")
>>> x = torch.randn(3, 3, dtype=torch.float64, device=device)
>>> x
tensor([[-0.6344,  0.8562, -1.2758],
        [ 0.8414,  1.7962,  1.0589],
        [-0.1369, -1.0462, -0.4373]], dtype=torch.float64, device='cuda:1')
>>> x.requires_grad  # default is False
False
>>> x = torch.zeros(3, requires_grad=True)
>>> x.requires_grad
True

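torch.tensor(data, …)

torch.tensor is one of the newly added tensor creation methods. It accepts array-like data of all kinds and copies the contained values into a new Tensor. As mentioned earlier, torch.tensor is the PyTorch equivalent of NumPy's numpy.array constructor. Unlike the torch.*Tensor methods, it can also create zero-dimensional Tensors (also known as scalars), because the torch.*Tensor methods treat a single Python number as a size. In addition, if no dtype argument is given, it infers a suitable dtype from the data. It is the recommended way to create a tensor from existing data such as a Python list. For example:

>>> cuda = torch.device("cuda")
>>> torch.tensor([[1], [2], [3]], dtype=torch.half, device=cuda)
tensor([[1.],
        [2.],
        [3.]], dtype=torch.float16, device='cuda:0')
>>> torch.tensor(1)  # scalar
tensor(1)
>>> torch.tensor([1, 2.3]).dtype  # type inference
torch.float32
>>> torch.tensor([1, 2]).dtype  # type inference
torch.int64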
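The distinction between a value and a size can be made concrete with a CPU-only sketch. The variable names are illustrative; torch.Tensor(3) is used here as a representative of the older constructor style.

```python
import torch

scalar = torch.tensor(3)  # the number is the *value*: a 0-dimensional tensor
sized = torch.Tensor(3)   # the number is the *size*: 3 uninitialized elements

print(scalar.dim(), scalar.item())
print(sized.dim(), sized.size())

# torch.tensor copies its input, so later changes to the list do not leak in.
data = [1, 2, 3]
copied = torch.tensor(data)
data[0] = 100
print(copied)
```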

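We have also added more tensor creation methods, some of which have torch.*_like or tensor.new_* variants.

1. torch.*_like takes an input Tensor instead of a shape. Unless specified otherwise, it returns a Tensor with the same attributes as the input Tensor. For example, torch.zeros_like(x) has the shape, dtype, and device of x.

2. tensor.new_* can also create Tensors with the same attributes as tensor, but it always takes a shape argument. For example, x.new_ones(2) is a Tensor of size 2 with the dtype and device of x.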
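The two variants described above can be sketched as follows. The tensor x and the chosen shapes are illustrative; only zeros_like and new_ones are shown, as representatives of the two families.

```python
import torch

x = torch.randn(3, dtype=torch.float64)

# torch.*_like: shape, dtype, and device are taken from x unless overridden.
zeros_same = torch.zeros_like(x)
zeros_int = torch.zeros_like(x, dtype=torch.int)

# tensor.new_*: dtype and device are taken from x, but a shape must be given.
ones_same = x.new_ones(2)
ones_int = x.new_ones(4, dtype=torch.int)

print(zeros_same.dtype, zeros_int.dtype)
print(ones_same.shape, ones_int.shape)
```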

To specify the desired shape, in most cases you can use either a tuple (e.g., torch.zeros((2, 3))) or variable arguments (e.g., torch.zeros(2, 3)).

Note: torch.from_numpy only accepts a NumPy ndarray as its input argument.
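A quick sketch of both points, assuming NumPy is installed alongside PyTorch. The array contents are arbitrary.

```python
import numpy as np
import torch

# A tuple and variable arguments describe the same shape.
from_tuple = torch.zeros((2, 3))
from_varargs = torch.zeros(2, 3)

# torch.from_numpy accepts an ndarray; the dtype follows the array.
array = np.array([1.0, 2.0, 3.0])
from_array = torch.from_numpy(array)

# A plain Python list is rejected by torch.from_numpy.
try:
    torch.from_numpy([1.0, 2.0, 3.0])
    list_rejected = False
except TypeError:
    list_rejected = True

# torch.tensor is the method to use for lists.
from_list = torch.tensor([1.0, 2.0, 3.0])
```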

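▌Writing device-agnostic code

Earlier versions of PyTorch made it difficult to write device-agnostic code (that is, code that runs unmodified on both CUDA-enabled and CPU-only machines).

PyTorch 0.4.0 makes this easier in two ways:

The device attribute of a Tensor gives the torch.device for all Tensors (get_device only works for CUDA tensors)

The to method of Tensors and Modules can be used to move objects to different devices easily (instead of having to call cpu() or cuda() depending on the context)

We recommend creating a single torch.device object at the start of the script and moving every new Tensor or Module to it with to(device).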
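The recommended pattern can be sketched as below. The Linear module and the random batch are stand-ins chosen for illustration; in real code they would be your own model and data.

```python
import torch

# At the beginning of the script: pick the device once.
device = torch.device("cuda:0" if torch.cuda.is_available() else "cpu")

# Whenever a new Tensor or Module appears, move it with .to(device).
# .to() does not copy if the object is already on the requested device.
model = torch.nn.Linear(4, 2).to(device)  # stand-in for a real model
data = torch.randn(8, 4)                  # stand-in for a loaded batch
inputs = data.to(device)

# new_* creation methods inherit device and dtype from an existing tensor.
hidden = inputs.new_zeros(8, 2)
output = model(inputs) + hidden

print(output.device)
```

Because every later tensor is derived from device or from an existing tensor, the same script should run on a GPU machine and on a CPU-only machine without edits.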

▌Code samples

To get a feel for the overall change in code patterns in 0.4.0, let's look at an example of common code in both 0.3.1 and 0.4.0:

0.3.1 (old version)

model = MyRNN()
if use_cuda:
    model = model.cuda()

# train
total_loss = 0
for input, target in train_loader:
    input, target = Variable(input), Variable(target)
    hidden = Variable(torch.zeros(*h_shape))  # init hidden
    if use_cuda:
        input, target, hidden = input.cuda(), target.cuda(), hidden.cuda()
    ...  # get loss and optimize
    total_loss += loss.data[0]

# evaluate
for input, target in test_loader:
    input = Variable(input, volatile=True)
    if use_cuda:
        ...
    ...

0.4.0 (new version)

# torch.device object used throughout this script
device = torch.device("cuda" if use_cuda else "cpu")

model = MyRNN().to(device)

# train
total_loss = 0
for input, target in train_loader:
    input, target = input.to(device), target.to(device)
    hidden = input.new_zeros(*h_shape)  # has the same device & dtype as `input`
    ...  # get loss and optimize
    total_loss += loss.item()  # get Python number from 1-element Tensor

# evaluate
with torch.no_grad():  # operations inside don't track history
    for input, target in test_loader:
        ...

Author: The PyTorch Team

http://pytorch.org/2018/04/22/0_4_0-migration-guide.html

AI科技大本營

WeChat public account ID: rgznai100
