Basic CNN
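Basic Concepts
- Convolution: preserves the spatial structure of the input.
- Downsampling: leaves the number of channels unchanged and shrinks the height and width of the feature map, which reduces the amount of data.
- Fully connected layer: maps the features to a specified output dimension.
Image: $C\times H \times W$ (channels, height, width)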
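The Convolution Process
Single channel: the kernel is multiplied element-wise with the input patch it covers, and the products are summed to give one output element; the kernel then slides left to right, top to bottom.
Multiple channels: each input channel is convolved with its own kernel slice as in the single-channel case, and the per-channel results are summed element-wise into one output channel. To produce several output channels, use several kernels, one per output channel; this is how the channel count can be increased.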
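The sliding-window description above can be sketched in plain Python. The helper names `conv2d_single_channel` and `conv2d_multi_channel` are made up for this illustration, and the sketch assumes no padding and a stride of 1; it is not how PyTorch implements convolution internally.

```python
def conv2d_single_channel(image, kernel):
    """Slide `kernel` over `image` (lists of rows); no padding, stride 1."""
    kh, kw = len(kernel), len(kernel[0])
    out_h = len(image) - kh + 1
    out_w = len(image[0]) - kw + 1
    output = []
    for top in range(out_h):            # move top to bottom
        row = []
        for left in range(out_w):       # move left to right
            total = 0
            for i in range(kh):
                for j in range(kw):
                    total += image[top + i][left + j] * kernel[i][j]
            row.append(total)
        output.append(row)
    return output


def conv2d_multi_channel(channels, kernels):
    """One output channel: convolve each channel, then sum element-wise."""
    per_channel = [conv2d_single_channel(c, k) for c, k in zip(channels, kernels)]
    out_h, out_w = len(per_channel[0]), len(per_channel[0][0])
    return [[sum(m[r][c] for m in per_channel) for c in range(out_w)]
            for r in range(out_h)]


image = [[1, 2, 3],
         [4, 5, 6],
         [7, 8, 9]]
kernel = [[1, 0],
          [0, 1]]

single = conv2d_single_channel(image, kernel)
multi = conv2d_multi_channel([image, image], [kernel, kernel])
print(single)  # [[6, 8], [12, 14]]
print(multi)   # [[12, 16], [24, 28]]
```

Producing several output channels would mean calling `conv2d_multi_channel` once per set of kernels and stacking the results.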
# -*- coding: UTF-8 -*-
import torch

batch_size = 1
kernel_size = 3
width, height = 100, 100
in_channels, out_channels = 5, 10

# PyTorch expects input laid out as (batch, channels, height, width)
input = torch.randn(batch_size, in_channels, height, width)
conv_layer = torch.nn.Conv2d(in_channels, out_channels, kernel_size=kernel_size)
output = conv_layer(input)

print(input.shape)              # torch.Size([1, 5, 100, 100])
print(output.shape)             # torch.Size([1, 10, 98, 98])
print(conv_layer.weight.shape)  # torch.Size([10, 5, 3, 3])
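The weight shape printed above is (out_channels, in_channels, kernel height, kernel width): one stack of 5 kernel slices for each of the 10 output channels. As a rough check of the multi-channel rule, the sketch below recomputes a single output element by hand and compares it with the layer's result. The smaller 8x8 input and the variable names are choices made for this illustration only.

```python
import torch

torch.manual_seed(0)
in_channels, out_channels, kernel_size = 5, 10, 3
x = torch.randn(1, in_channels, 8, 8)
conv = torch.nn.Conv2d(in_channels, out_channels, kernel_size=kernel_size)

with torch.no_grad():
    y = conv(x)
    # Top-left patch across all input channels: shape (5, 3, 3)
    patch = x[0, :, 0:kernel_size, 0:kernel_size]
    # Output channel 0 = sum over channels of element-wise products, plus bias
    manual = (patch * conv.weight[0]).sum() + conv.bias[0]

print(y.shape)
print(torch.allclose(y[0, 0, 0, 0], manual, atol=1e-4))
```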
Padding
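A $5\times 5$ input convolved with a $3\times 3$ kernel yields a $3\times 3$ output. To keep the output the same size as the input, pad the input with padding = 1.
$Output = \left\lfloor \frac{Input - Kernel + 2\times Padding}{Stride} \right\rfloor + 1$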
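The size formula is easy to check numerically. The function below is a small sketch of it; `conv_output_size` is a name invented here, and integer floor division stands in for the rounding down that happens when the stride does not divide evenly.

```python
def conv_output_size(input_size, kernel_size, padding=0, stride=1):
    """Output length along one spatial dimension of a convolution."""
    return (input_size - kernel_size + 2 * padding) // stride + 1


print(conv_output_size(5, 3))             # 3: no padding shrinks 5 to 3
print(conv_output_size(5, 3, padding=1))  # 5: padding=1 preserves the size
print(conv_output_size(5, 3, stride=2))   # 2: stride 2 skips positions
print(conv_output_size(100, 3))           # 98
```

The same function applies independently to the height and the width.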
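# -*- coding: UTF-8 -*-
import torch

# 5x5 single-channel input, reshaped to (batch, channels, height, width)
input = [3, 4, 5, 6, 7,
         2, 4, 6, 8, 2,
         1, 6, 7, 8, 4,
         9, 7, 4, 6, 2,
         3, 7, 5, 4, 1]
input = torch.Tensor(input).view(1, 1, 5, 5)

# padding=1 adds a one-pixel border of zeros, so the output stays 5x5
conv_layer = torch.nn.Conv2d(1, 1, kernel_size=3, padding=1, bias=False)

# Fix the kernel weights; shape is (out_channels, in_channels, height, width)
kernel = torch.Tensor([1, 2, 3, 4, 5, 6, 7, 8, 9]).view(1, 1, 3, 3)
conv_layer.weight.data = kernel.data

output = conv_layer(input)
print(output)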
Stride
Setting a stride greater than 1 reduces the width and height of the output feature map.
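# -*- coding: UTF-8 -*-
import torch

# 5x5 single-channel input, reshaped to (batch, channels, height, width)
input = [3, 4, 5, 6, 7,
         2, 4, 6, 8, 2,
         1, 6, 7, 8, 4,
         9, 7, 4, 6, 2,
         3, 7, 5, 4, 1]
input = torch.Tensor(input).view(1, 1, 5, 5)

# stride=2 moves the kernel two positions at a time, giving a 2x2 output
conv_layer = torch.nn.Conv2d(1, 1, kernel_size=3, stride=2, bias=False)

# Fix the kernel weights; shape is (out_channels, in_channels, height, width)
kernel = torch.Tensor([1, 2, 3, 4, 5, 6, 7, 8, 9]).view(1, 1, 3, 3)
conv_layer.weight.data = kernel.data

output = conv_layer(input)
print(output)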
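One way to see what the stride does: without padding, a stride-2 convolution yields the same values as a stride-1 convolution with every second row and column kept. The sketch below checks this on the same input and kernel, using `torch.nn.functional.conv2d` in place of a layer object; the variable names are illustrative.

```python
import torch
import torch.nn.functional as F

x = torch.tensor([[3., 4., 5., 6., 7.],
                  [2., 4., 6., 8., 2.],
                  [1., 6., 7., 8., 4.],
                  [9., 7., 4., 6., 2.],
                  [3., 7., 5., 4., 1.]]).view(1, 1, 5, 5)
kernel = torch.arange(1., 10.).view(1, 1, 3, 3)  # values 1..9

dense = F.conv2d(x, kernel, stride=1)    # every kernel position: 3x3
strided = F.conv2d(x, kernel, stride=2)  # every second position: 2x2

print(dense.shape, strided.shape)
print(torch.allclose(strided, dense[:, :, ::2, ::2]))
```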
MaxPooling
Max pooling downsamples the feature map without changing the number of input channels.
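# -*- coding: UTF-8 -*-
import torch

# 4x4 single-channel input, reshaped to (batch, channels, height, width)
input = [3, 4, 5, 6,
         2, 4, 6, 8,
         1, 6, 7, 8,
         9, 7, 4, 6]
input = torch.Tensor(input).view(1, 1, 4, 4)

# kernel_size=2 also sets the default stride to 2, halving height and width
maxpooling_layer = torch.nn.MaxPool2d(kernel_size=2)
output = maxpooling_layer(input)
print(output)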
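To make the pooling step concrete, here is a plain-Python sketch of non-overlapping max pooling on the same 4x4 values. The function name `max_pool2d` is chosen for this illustration, and the stride is assumed equal to the kernel size, which is also the default for `torch.nn.MaxPool2d`.

```python
def max_pool2d(feature_map, kernel_size=2):
    """Non-overlapping max pooling (stride == kernel_size) on a list of rows."""
    pooled = []
    for top in range(0, len(feature_map) - kernel_size + 1, kernel_size):
        row = []
        for left in range(0, len(feature_map[0]) - kernel_size + 1, kernel_size):
            window = [feature_map[top + i][left + j]
                      for i in range(kernel_size)
                      for j in range(kernel_size)]
            row.append(max(window))  # keep only the largest value per window
        pooled.append(row)
    return pooled


feature_map = [[3, 4, 5, 6],
               [2, 4, 6, 8],
               [1, 6, 7, 8],
               [9, 7, 4, 6]]
pooled = max_pool2d(feature_map)
print(pooled)  # [[4, 8], [9, 8]]
```

Each channel would be pooled independently in this way, which is why the channel count does not change.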
MNIST
import torch.nn as nn
import sys
Training the Model on a GPU
import sys
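The original training code is not preserved here, so the following is only a sketch of the usual pattern for GPU training in PyTorch: pick a device, move the model to it, and move each batch to it before the forward pass. The tiny network, the random stand-in batch, and the hyperparameters are all assumptions made for illustration; they are not the model or data from this post.

```python
import torch
import torch.nn as nn

# Fall back to the CPU when no CUDA device is available
device = torch.device("cuda" if torch.cuda.is_available() else "cpu")

# Hypothetical small CNN for 1x28x28 inputs (MNIST-sized images)
model = nn.Sequential(
    nn.Conv2d(1, 10, kernel_size=5),  # 28x28 -> 24x24
    nn.ReLU(),
    nn.MaxPool2d(2),                  # 24x24 -> 12x12
    nn.Flatten(),
    nn.Linear(10 * 12 * 12, 10),      # 10 digit classes
).to(device)                          # parameters now live on `device`

criterion = nn.CrossEntropyLoss()
optimizer = torch.optim.SGD(model.parameters(), lr=0.01)

# Random stand-in batch; a real run would iterate over a DataLoader
images = torch.randn(4, 1, 28, 28).to(device)
labels = torch.randint(0, 10, (4,)).to(device)

# One training step: inputs and model must be on the same device
optimizer.zero_grad()
loss = criterion(model(images), labels)
loss.backward()
optimizer.step()
print(device, loss.item())
```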