PyTorch

torch

Tensors

torch.numel

torch.numel(input)->int

Returns the total number of elements in the input tensor.

>>> a = torch.randn(1,2,3,4,5)
>>> torch.numel(a)
120
>>> a = torch.zeros(4,4)
>>> torch.numel(a)
16

Creation Ops

torch.arange

torch.arange(start, end, step=1, out=None) → Tensor

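Returns a 1-D tensor of length ⌈(end − start)/step⌉ containing values from start (inclusive) to end (exclusive), spaced by step (default 1).
Parameters:
- start (float) – the starting value of the sequence
- end (float) – the end value of the sequence (exclusive)
- step (float) – the gap between adjacent values
- out (Tensor, optional) – the output tensor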
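
A minimal sketch of how the length works out in practice, assuming an integer range chosen purely for illustration. The end value is excluded, so the number of elements is ceil((end - start) / step).

```python
import math
import torch

# Values start at 1 and advance by 2; the end value 10 is never included.
a = torch.arange(1, 10, 2)

# Length predicted by ceil((end - start) / step).
expected_len = math.ceil((10 - 1) / 2)

print(a.tolist())               # [1, 3, 5, 7, 9]
print(a.numel(), expected_len)  # 5 5
```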

Indexing, Slicing, Joining, Mutating Ops

torch.cat

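torch.cat(tensors, dim=0)

Concatenates the given sequence of tensors along the given dimension. All tensors must have the same shape except in the concatenating dimension.

torch.cat() can be seen as the inverse of torch.split() and torch.chunk(). The examples below show how cat() behaves.

Parameters:

tensors (sequence of Tensors) – any Python sequence of tensors of the same type

dim (int, optional) – the dimension along which the tensors are concatenated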

>>> x = torch.randn(2, 3)
>>> x
 0.5983 -0.0341  2.4918
 1.5981 -0.5265 -0.8735
[torch.FloatTensor of size 2x3]
>>> torch.cat((x, x, x), 0)
 0.5983 -0.0341  2.4918
 1.5981 -0.5265 -0.8735
 0.5983 -0.0341  2.4918
 1.5981 -0.5265 -0.8735
 0.5983 -0.0341  2.4918
 1.5981 -0.5265 -0.8735
[torch.FloatTensor of size 6x3]
>>> torch.cat((x, x, x), 1)
 0.5983 -0.0341  2.4918  0.5983 -0.0341  2.4918  0.5983 -0.0341  2.4918
 1.5981 -0.5265 -0.8735  1.5981 -0.5265 -0.8735  1.5981 -0.5265 -0.8735
[torch.FloatTensor of size 2x9]
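
The same idea can be checked by looking only at shapes. This is a small sketch with made-up tensors x and y; it also shows that the inputs only need to agree in the dimensions that are not being concatenated.

```python
import torch

x = torch.zeros(2, 3)
y = torch.ones(4, 3)

# Along dim 0 the row counts add up (2 + 4); the column count must match.
rows = torch.cat((x, y), 0)

# Along dim 1 the column counts add up (3 + 3 + 3); the row count must match.
cols = torch.cat((x, x, x), 1)

print(rows.shape)  # torch.Size([6, 3])
print(cols.shape)  # torch.Size([2, 9])
```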

torch.split

torch.split(tensor, split_size, dim=0)

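Splits the input tensor into chunks of equal size (when possible). If the size of the tensor along the given dimension is not divisible by split_size, the last chunk is smaller than the others.

Parameters:

tensor (Tensor) – the tensor to split

split_size (int) – the size of a single chunk

dim (int) – the dimension along which to split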
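
A short sketch of the uneven case, using an illustrative 5x2 tensor: five rows split with a chunk size of 2 give chunks of 2, 2, and 1 rows, and torch.cat along the same dimension puts them back together.

```python
import torch

x = torch.arange(10).reshape(5, 2)

# 5 rows in chunks of 2: the last chunk holds the single leftover row.
chunks = torch.split(x, 2, dim=0)
sizes = [c.shape[0] for c in chunks]
print(sizes)  # [2, 2, 1]

# Concatenating the chunks along the same dimension restores the original.
restored = torch.cat(chunks, 0)
print(torch.equal(restored, x))  # True
```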

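torch.squeeze

torch.squeeze(input, dim=None) → Tensor

Removes dimensions of size 1 from the input tensor. When dim is given, only that dimension is squeezed, and only if its size is 1.

>>> x = torch.zeros(2,1,2,1,2)
>>> x.size()
torch.Size([2, 1, 2, 1, 2])
>>> y = torch.squeeze(x)
>>> y.size()
torch.Size([2, 2, 2])
>>> y = torch.squeeze(x, 0)
>>> y.size()
torch.Size([2, 1, 2, 1, 2])
>>> y = torch.squeeze(x, 1)
>>> y.size()
torch.Size([2, 2, 1, 2])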

torch.stack

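torch.stack(tensors, dim=0)

Concatenates a sequence of tensors along a new dimension. All tensors in the sequence must have the same shape.
Parameters:
- tensors (sequence of Tensors) – the sequence of tensors to join
- dim (int) – the dimension to insert. Must be between 0 and the number of dimensions of the tensors being joined (inclusive).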
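
To make the difference from torch.cat concrete, here is a small sketch with two illustrative 2x3 tensors. stack adds a dimension, while cat grows an existing one.

```python
import torch

a = torch.zeros(2, 3)
b = torch.ones(2, 3)

s0 = torch.stack((a, b), 0)  # new dimension inserted at the front
s2 = torch.stack((a, b), 2)  # new dimension inserted at the end (dim == a.dim())
c0 = torch.cat((a, b), 0)    # no new dimension, dim 0 just grows

print(s0.shape)  # torch.Size([2, 2, 3])
print(s2.shape)  # torch.Size([2, 3, 2])
print(c0.shape)  # torch.Size([4, 3])
```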

Math operations

PointWise Ops

torch.tanh

torch.tanh(input, out=None) → Tensor

Returns a new tensor containing the hyperbolic tangent of each element of input.
Parameters:
- input (Tensor) – the input tensor
- out (Tensor, optional) – the output tensor
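
As a sanity check, the sketch below compares torch.tanh with the closed form (e^x − e^−x)/(e^x + e^−x) on a few arbitrary sample values.

```python
import torch

x = torch.tensor([-2.0, 0.0, 0.5, 3.0])
y = torch.tanh(x)

# Closed-form definition of the hyperbolic tangent, applied elementwise.
manual = (torch.exp(x) - torch.exp(-x)) / (torch.exp(x) + torch.exp(-x))

print(torch.allclose(y, manual))  # True
```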

Reduction Ops

torch.mean

torch.mean(input) → float

Returns the mean of all elements in the input tensor.

A lesson learned: when input.numel() is 0, torch.mean returns nan.

Parameters: input (Tensor) – the input tensor

>>> a = torch.randn(1, 3)
>>> a
-0.2946 -0.9143  2.1809
[torch.FloatTensor of size 1x3]
>>> torch.mean(a)
0.32398951053619385
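
The empty-tensor case is easy to reproduce. The sketch below shows the nan result and one possible guard; safe_mean is a hypothetical helper written for this note, not part of PyTorch.

```python
import torch

empty = torch.empty(0)
m = torch.mean(empty)
print(torch.isnan(m).item())  # True


def safe_mean(t, default=0.0):
    """Hypothetical helper: return `default` instead of nan for empty tensors."""
    if t.numel() == 0:
        return torch.tensor(default, dtype=t.dtype)
    return t.mean()


print(safe_mean(empty).item())                     # 0.0
print(safe_mean(torch.tensor([1.0, 3.0])).item())  # 2.0
```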

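torch.mean(input, dim, keepdim=False, out=None) → Tensor

Returns the mean of each row of the input tensor along the given dimension dim.

With keepdim=True the output has the same shape as the input except that dimension dim has size 1. With the default keepdim=False that dimension is removed. Early releases always kept the reduced dimension, which is why the example output here has size 4x1.
Parameters:
- input (Tensor) – the input tensor
- dim (int) – the dimension to reduce
- keepdim (bool, optional) – whether the output keeps dim as a dimension of size 1
- out (Tensor, optional) – the output tensor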

>>> a = torch.randn(4, 4)
>>> a
-1.2738 -0.3058  0.1230 -1.9615
 0.8771 -0.5430 -0.9233  0.9879
 1.4107  0.0317 -0.6823  0.2255
-1.3854  0.4953 -0.2160  0.2435
[torch.FloatTensor of size 4x4]
>>> torch.mean(a, 1)
-0.8545
 0.0997
 0.2464
-0.2157
[torch.FloatTensor of size 4x1]
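
In current PyTorch releases the reduced dimension is dropped unless keepdim=True is passed. A minimal sketch with a deterministic 3x4 tensor (chosen so the row means are easy to verify by hand):

```python
import torch

a = torch.arange(12, dtype=torch.float32).reshape(3, 4)

row_means = torch.mean(a, 1)                     # reduced dimension removed
row_means_kept = torch.mean(a, 1, keepdim=True)  # reduced dimension kept with size 1

print(row_means.tolist())    # [1.5, 5.5, 9.5]
print(row_means.shape)       # torch.Size([3])
print(row_means_kept.shape)  # torch.Size([3, 1])
```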

torch.Tensor

expand(*sizes)

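Returns a new view of the tensor with singleton dimensions expanded to a larger size. The tensor can also be expanded to a larger number of dimensions; the new dimensions are added at the front. Expanding a tensor does not allocate new memory. It only creates a new view of the existing tensor in which a dimension of size 1 is expanded to a larger size by setting its stride to 0. Any dimension of size 1 can be expanded to an arbitrary size without allocating new memory.
Parameters:
- sizes (torch.Size or int...) – the desired expanded size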
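
The zero-stride behaviour can be observed directly. The sketch below uses a small illustrative column vector and checks that the expanded view shares storage with the original.

```python
import torch

x = torch.tensor([[1.0], [2.0], [3.0]])  # shape (3, 1)

# Expand the singleton dimension from 1 to 4; no data is copied.
y = x.expand(3, 4)
print(y.shape)                       # torch.Size([3, 4])
print(y.stride())                    # (1, 0)
print(y.data_ptr() == x.data_ptr())  # True

# Expanding to more dimensions: the new dimension is added at the front.
z = x.expand(2, 3, 4)
print(z.shape)                       # torch.Size([2, 3, 4])
```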

torch.nn

torch.nn.Conv2d

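torch.nn.Conv2d(in_channels, out_channels, kernel_size, stride=1, padding=0, dilation=1, groups=1, bias=True)

Parameters:

in_channels (int) – number of channels in the input
out_channels (int) – number of channels produced by the convolution
kernel_size (int or tuple) – size of the convolution kernel
stride (int or tuple, optional) – stride of the convolution
padding (int or tuple, optional) – amount of zero padding added to each side of the input
dilation (int or tuple, optional) – spacing between kernel elements
groups (int, optional) – number of blocked connections from input channels to output channels
bias (bool, optional) – if True, adds a learnable bias to the output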
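
The parameters above determine the output size. The sketch below assumes the standard formula out = floor((in + 2*padding − dilation*(kernel_size − 1) − 1)/stride + 1); conv_out_size is a hypothetical helper for this note, and the layer sizes are arbitrary.

```python
import math
import torch
import torch.nn as nn


def conv_out_size(size, kernel_size, stride=1, padding=0, dilation=1):
    """Hypothetical helper: spatial output size of a convolution along one axis."""
    return math.floor(
        (size + 2 * padding - dilation * (kernel_size - 1) - 1) / stride + 1
    )


conv = nn.Conv2d(in_channels=3, out_channels=8, kernel_size=3, stride=2, padding=1)
x = torch.randn(1, 3, 32, 32)  # (N, C, H, W)
y = conv(x)

print(y.shape)                                    # torch.Size([1, 8, 16, 16])
print(conv_out_size(32, 3, stride=2, padding=1))  # 16
```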

torch.nn.normalization

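Norm: normalization layers

How BN, LN, IN, and GN differ, in academic terms:
BatchNorm: normalizes along the batch direction, computing statistics over N, H, W for each channel. Its main drawback is sensitivity to batch size: because the mean and variance are computed over one batch at a time, a batch that is too small yields statistics that do not represent the overall data distribution.
LayerNorm: normalizes along the channel direction, computing statistics over C, H, W for each sample. It is mainly effective for RNNs.
InstanceNorm: normalizes within a single channel, computing statistics over H*W. It is used in style transfer: the generated result depends mainly on the individual image instance, so normalizing over the whole batch is unsuitable, and normalization is done over H and W instead. This can speed up convergence and keeps image instances independent of each other.
GroupNorm: splits the channels into groups and normalizes within each group, computing statistics over (C//G), H, W. It is therefore independent of batch size.
SwitchableNorm: combines BN, LN, and IN with learned weights, letting the network learn which normalization to use.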
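
The four schemes differ only in which axes the mean and variance are taken over. The sketch below spells that out with a hypothetical normalize helper and compares each result against the corresponding torch.nn layer (affine parameters disabled, training mode); the tensor sizes are arbitrary.

```python
import torch
import torch.nn as nn

torch.manual_seed(0)
N, C, H, W, G = 4, 6, 5, 5, 3
x = torch.randn(N, C, H, W)
eps = 1e-5


def normalize(t, dims):
    """Hypothetical helper: zero mean, unit (biased) variance over `dims`."""
    mean = t.mean(dim=dims, keepdim=True)
    var = t.var(dim=dims, unbiased=False, keepdim=True)
    return (t - mean) / torch.sqrt(var + eps)


bn = normalize(x, (0, 2, 3))     # BatchNorm: over N, H, W for each channel
ln = normalize(x, (1, 2, 3))     # LayerNorm: over C, H, W for each sample
inorm = normalize(x, (2, 3))     # InstanceNorm: over H, W for each sample and channel
gn = normalize(                  # GroupNorm: over (C//G), H, W for each sample and group
    x.reshape(N, G, C // G, H, W), (2, 3, 4)
).reshape(N, C, H, W)

ref_bn = nn.BatchNorm2d(C, affine=False)(x)
ref_ln = nn.LayerNorm([C, H, W], elementwise_affine=False)(x)
ref_in = nn.InstanceNorm2d(C)(x)
ref_gn = nn.GroupNorm(G, C, affine=False)(x)

for name, mine, ref in [("BN", bn, ref_bn), ("LN", ln, ref_ln),
                        ("IN", inorm, ref_in), ("GN", gn, ref_gn)]:
    print(name, torch.allclose(mine, ref, atol=1e-5))
```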

nn.GroupNorm

torch.nn.GroupNorm(num_groups, num_channels, eps=1e-05, affine=True, device=None, dtype=None)

num_groups (int) – number of groups to separate the channels into

num_channels (int) – number of channels expected in input

eps – a value added to the denominator for numerical stability. Default: 1e-5

affine – a boolean value that when set to True, this module has learnable per-channel affine parameters initialized to ones (for weights) and zeros (for biases). Default: True.
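
A brief usage sketch with arbitrary sizes. It shows that the affine parameters are per channel and that num_channels has to be divisible by num_groups; depending on the PyTorch version the failure may surface at construction or at the first forward call, so the sketch catches both.

```python
import torch
import torch.nn as nn

x = torch.randn(2, 6, 4, 4)  # (N, C, H, W) with C = 6

gn = nn.GroupNorm(num_groups=3, num_channels=6)
y = gn(x)
print(y.shape)          # torch.Size([2, 6, 4, 4])
print(gn.weight.shape)  # torch.Size([6])
print(gn.bias.shape)    # torch.Size([6])

# 6 channels cannot be split evenly into 4 groups.
rejected = False
try:
    nn.GroupNorm(num_groups=4, num_channels=6)(x)
except (ValueError, RuntimeError) as err:
    rejected = True
    print("rejected:", err)
```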

nn.BatchNorm

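torch.nn.BatchNorm1d(num_features, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
torch.nn.BatchNorm2d(num_features, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
torch.nn.BatchNorm3d(num_features, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)

BatchNorm2d applies Batch Normalization to a 4D input made up of a mini-batch of 3D data.
$y = \frac{x - \mathrm{mean}[x]}{\sqrt{\mathrm{Var}[x] + \epsilon}} \cdot \gamma + \beta$
The mean and standard deviation are computed per channel over each mini-batch. $\gamma$ and $\beta$ are learnable parameter vectors of size C, where C is the number of input features (num_features).

During training, the layer computes the mean and variance of each input batch and keeps a running average of them. The default momentum for the running average is 0.1.

During evaluation, the running mean and variance obtained in training are used to normalize the data.

num_features: the number of features of the expected input, whose size is 'batch_size x num_features [x width]'
eps: a value added to the denominator for numerical stability, so that it cannot reach or approach 0. Default: 1e-5.
momentum: the momentum used for the running mean and running variance. Default: 0.1.
affine: boolean; when True, the layer has learnable affine parameters.
track_running_stats: boolean; when True, the layer tracks the running mean and variance during training.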
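
The running-average update can be verified by hand. This is a sketch under the assumption that the update is running = (1 − momentum) * running + momentum * batch_statistic, with the running variance fed the unbiased batch variance; the input data is synthetic.

```python
import torch
import torch.nn as nn

torch.manual_seed(0)
bn = nn.BatchNorm2d(3)               # running_mean starts at 0, running_var at 1
x = torch.randn(8, 3, 4, 4) * 2 + 5  # synthetic batch with non-trivial statistics

bn.train()
_ = bn(x)                            # one training-mode forward pass updates the buffers

momentum = bn.momentum               # 0.1 by default
batch_mean = x.mean(dim=(0, 2, 3))
batch_var = x.var(dim=(0, 2, 3), unbiased=True)
expected_mean = (1 - momentum) * torch.zeros(3) + momentum * batch_mean
expected_var = (1 - momentum) * torch.ones(3) + momentum * batch_var

print(torch.allclose(bn.running_mean, expected_mean, atol=1e-5))  # True
print(torch.allclose(bn.running_var, expected_var, atol=1e-5))    # True

# In eval mode the stored running statistics are used instead of batch statistics.
bn.eval()
y_eval = bn(x).detach()
manual = (x - bn.running_mean[None, :, None, None]) / torch.sqrt(
    bn.running_var[None, :, None, None] + bn.eps
)
print(torch.allclose(y_eval, manual, atol=1e-4))                  # True
```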

nn.InstanceNorm2d

torch.nn.InstanceNorm2d(num_features, eps=1e-05, momentum=0.1, affine=False, track_running_stats=False, device=None, dtype=None)

num_features – C from an expected input of size (N,C,H,W)

eps – a value added to the denominator for numerical stability. Default: 1e-5

momentum – the value used for the running_mean and running_var computation. Default: 0.1

affine – a boolean value that when set to True, this module has learnable affine parameters, initialized the same way as done for batch normalization. Default: False.

track_running_stats – a boolean value that when set to True, this module tracks the running mean and variance, and when set to False, this module does not track such statistics and always uses batch statistics in both training and eval modes. Default: False
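
A small sketch of the default behaviour (track_running_stats=False) on random data: the layer gives the same output in training and eval mode, and every (sample, channel) plane ends up with a mean close to zero.

```python
import torch
import torch.nn as nn

torch.manual_seed(0)
x = torch.randn(2, 3, 4, 4)
inorm = nn.InstanceNorm2d(3)  # defaults: affine=False, track_running_stats=False

inorm.train()
y_train = inorm(x)
inorm.eval()
y_eval = inorm(x)

# No running statistics are kept, so both modes normalize with instance statistics.
print(torch.allclose(y_train, y_eval))  # True

# Largest absolute per-plane mean after normalization (close to zero).
max_abs_mean = y_eval.mean(dim=(2, 3)).abs().max().item()
print(max_abs_mean < 1e-5)              # True
```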

torch.cuda.amp

torch.cuda.amp.autocast

Enables automatic mixed precision.

torch.cuda.amp.autocast(enabled=True)

from torch.cuda.amp import autocast

# Run the forward pass (model + loss) under autocast
with autocast():
    output = model(input)
    loss = loss_fn(output, target)
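
For context, autocast is usually paired with torch.cuda.amp.GradScaler in a training step. The following is a sketch only: the model, data, and optimizer are placeholders invented for illustration, and mixed precision is switched off when no GPU is available so the snippet still runs on CPU.

```python
import torch
import torch.nn as nn

use_amp = torch.cuda.is_available()  # fall back to full precision on CPU
device = torch.device("cuda" if use_amp else "cpu")

model = nn.Linear(16, 4).to(device)  # placeholder model
loss_fn = nn.MSELoss()
optimizer = torch.optim.SGD(model.parameters(), lr=0.1)
scaler = torch.cuda.amp.GradScaler(enabled=use_amp)

inputs = torch.randn(8, 16, device=device)  # made-up batch
targets = torch.randn(8, 4, device=device)

optimizer.zero_grad()

# Only the forward pass and the loss computation run under autocast.
with torch.cuda.amp.autocast(enabled=use_amp):
    output = model(inputs)
    loss = loss_fn(output, targets)

# Backward pass and optimizer step happen outside the autocast context.
scaler.scale(loss).backward()  # scale the loss to avoid fp16 gradient underflow
scaler.step(optimizer)         # unscales gradients, then calls optimizer.step()
scaler.update()                # adjust the scale factor for the next iteration

print(loss.item())
```

Recent PyTorch releases also expose the same functionality as torch.autocast and torch.amp.GradScaler.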