Linear_Model
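The supervised learning workflow in machine learning and deep learning
- DataSet: the dataset
  - Training set: used to train the model
  - Test set: used to evaluate the model
- Model: designing and choosing the model
- Training: fitting the model (its weights)
- Inferring: running the trained model on new inputs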
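Basic concepts
A model is trained on a known dataset, and the trained model is then used to make inferences on unseen test data.
Datasets:
- Training set
- Test set
- Validation set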
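One way to carve a dataset into these three subsets is sketched below. This is a minimal illustration, not a method prescribed by this post: the helper `split_dataset`, its 60/20/20 proportions, the fixed seed, and the toy data are all assumptions made for the example.

```python
import numpy as np

def split_dataset(x, y, val_frac=0.2, test_frac=0.2, seed=0):
    """Shuffle the samples, then split them into train / validation / test."""
    x = np.asarray(x)
    y = np.asarray(y)
    rng = np.random.default_rng(seed)
    idx = rng.permutation(len(x))            # random order of sample indices
    n_test = int(len(x) * test_frac)
    n_val = int(len(x) * val_frac)
    test_idx = idx[:n_test]
    val_idx = idx[n_test:n_test + n_val]
    train_idx = idx[n_test + n_val:]         # everything left over
    return ((x[train_idx], y[train_idx]),
            (x[val_idx], y[val_idx]),
            (x[test_idx], y[test_idx]))

# Toy data following y = 2x (hypothetical, for illustration only)
x_all = np.arange(1.0, 11.0)
y_all = 2.0 * x_all

train, val, test = split_dataset(x_all, y_all)
print(len(train[0]), len(val[0]), len(test[0]))
```

Shuffling before splitting keeps each subset representative of the whole dataset; the test subset should stay untouched until the final evaluation.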
Overfitting: the model performs very well on the training set but only moderately on the test set.
What we want is a model that generalizes well.
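Linear model
Example: $\widehat{y} = x \omega + b$
Simplified linear model (no bias term): $\widehat{y} = x \omega$
Given three data points (1, 2), (2, 4), (3, 6), how do we find the optimal weight $\omega$?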
Loss function
Loss for a single sample: $loss = (\widehat{y} - y)^2 = (x \omega - y)^2$
MSE (Mean Squared Error), the average loss over all $N$ samples:
$MSE = \frac{1}{N}\sum_{n=1}^{N}(\widehat{y}_n - y_n)^2$
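As a quick worked example of the two formulas, the sketch below evaluates the MSE on the three data points for two candidate weights. The function `mse` and the choice to pass `w` as an explicit argument are assumptions made for this illustration.

```python
x_data = [1.0, 2.0, 3.0]
y_data = [2.0, 4.0, 6.0]

def forward(x, w):
    """Simplified linear model: y_hat = x * w."""
    return x * w

def mse(w):
    """Mean of the per-sample squared errors for a given weight w."""
    squared_errors = [(forward(x, w) - y) ** 2 for x, y in zip(x_data, y_data)]
    return sum(squared_errors) / len(squared_errors)

# w = 3: predictions 3, 6, 9; errors 1, 2, 3; MSE = (1 + 4 + 9) / 3, about 4.67
print(mse(3.0))
# w = 2: every prediction matches its label, so the MSE is 0
print(mse(2.0))
```

The lower the MSE, the better the weight fits the data, which is why the search for the optimal weight becomes a search for the minimum of this function.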
Code implementation
    # -*- coding: UTF-8 -*-
    import numpy as np
    import matplotlib.pyplot as plt

    # Training data: every point satisfies y = 2x
    x_data = [1.0, 2.0, 3.0]
    y_data = [2.0, 4.0, 6.0]

    def forward(x):
        # Simplified linear model; w is the global set by the loop below
        return x * w

    def loss(x, y):
        # Squared error for a single sample
        return (forward(x) - y) ** 2

    w_list = []
    mse_list = []

    # Sweep w over [0.0, 4.0] in steps of 0.1 and record the MSE for each value
    for w in np.arange(0.0, 4.1, 0.1):
        print('w = ', w)
        l_sum = 0
        for x_val, y_val in zip(x_data, y_data):
            y_pred_val = forward(x_val)
            loss_val = loss(x_val, y_val)
            l_sum += loss_val
            print('\t', x_val, y_val, y_pred_val, loss_val)
        print('MSE = ', l_sum / len(x_data))
        w_list.append(w)
        mse_list.append(l_sum / len(x_data))

    # Plot the MSE as a function of w
    plt.plot(w_list, mse_list)
    plt.ylabel('Loss')
    plt.xlabel('w')
    plt.show()
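The plot shows where the loss curve bottoms out, but the best weight can also be read off programmatically and then used for the Inferring step. The following is a minimal sketch under assumptions: the vectorized broadcasting form, the names `w_grid`, `mse_grid`, `best_w`, and the new input `x_new = 4.0` are all introduced here for illustration.

```python
import numpy as np

x = np.array([1.0, 2.0, 3.0])
y = np.array([2.0, 4.0, 6.0])

# Same candidate weights as the sweep above
w_grid = np.arange(0.0, 4.1, 0.1)

# Broadcasting: one row per candidate w, one column per sample
mse_grid = ((w_grid[:, None] * x[None, :] - y[None, :]) ** 2).mean(axis=1)

# The candidate with the smallest MSE is the best weight on this grid
best_w = w_grid[np.argmin(mse_grid)]
print('best w =', best_w)

# Inference: apply the chosen weight to an input that was not in the data
x_new = 4.0
y_new = x_new * best_w
print('prediction for x = 4:', y_new)
```

An exhaustive sweep like this only works when there are very few parameters; the grid grows exponentially with the number of weights.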
Linear model with a bias term: $\widehat{y} = x \omega + b$
    import numpy as np
    import matplotlib.pyplot as plt

    # Prepare the training dataset
    x_data = [1.0, 2.0, 3.0]
    y_data = [2.0, 4.0, 6.0]

    # Define the linear model; w and b are the global grids created below
    def forward(x):
        return w * x + b

    # Define the loss function (squared error for one sample)
    def loss(x, y):
        y_pred = forward(x)
        return (y_pred - y) ** 2

    # Candidate weights and biases, expanded into 2-D grids
    W = np.arange(0.0, 4.0, 0.1)
    B = np.arange(-2.0, 2.0, 0.1)
    w, b = np.meshgrid(W, B)
    print([w, b])

    # Accumulate the loss over all samples; each term is a 2-D array over (b, w)
    l_sum = 0
    for x_val, y_val in zip(x_data, y_data):
        y_pred_val = forward(x_val)
        loss_val = loss(x_val, y_val)
        print('x_val: {}\t y_val: {}\t y_pred_val: {}\t loss_val: {}'.format(x_val, y_val, y_pred_val, loss_val))
        l_sum += loss_val

    # Plot the MSE surface over w and b
    fig = plt.figure()
    ax = fig.add_subplot(projection='3d')
    ax.plot_surface(w, b, l_sum / len(x_data))
    ax.set_zlabel('Loss')
    ax.set_ylabel('b')
    ax.set_xlabel('w')
    plt.show()
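To locate the lowest point of this surface numerically rather than by eye, the flat index returned by `np.argmin` can be mapped back to grid coordinates. The sketch below is an illustration under assumptions: it uses `np.linspace` grids (41 points per axis, so both endpoints are included) instead of the `np.arange` grids above, and the names `mse`, `best_w`, and `best_b` are introduced here.

```python
import numpy as np

x = np.array([1.0, 2.0, 3.0])
y = np.array([2.0, 4.0, 6.0])

# Assumed grids: 41 evenly spaced candidates per parameter
W = np.linspace(0.0, 4.0, 41)
B = np.linspace(-2.0, 2.0, 41)
w, b = np.meshgrid(W, B)          # both arrays have shape (len(B), len(W))

# Accumulate the squared error of every (w, b) pair over all samples
mse = np.zeros_like(w)
for x_val, y_val in zip(x, y):
    mse += (w * x_val + b - y_val) ** 2
mse /= len(x)

# Convert the flat argmin index back into a (row, column) position
row, col = np.unravel_index(np.argmin(mse), mse.shape)
best_w, best_b = w[row, col], b[row, col]
print('best w =', best_w, 'best b =', best_b)
```

Because the data lie exactly on the line y = 2x, the minimum sits at a weight of about 2 and a bias of about 0.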