A single neuron with multiple inputs, where b is a scalar bias term and f is the transfer function.
For a whole layer of neurons with R inputs and S outputs, the network can be drawn as in the figure below: the input is an R-dimensional vector and the weight matrix is S×R.
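As a minimal sketch of this layer (sizes follow the text: R inputs, S neurons; `logsig` is one of the toolbox transfer functions, chosen here just as an example):

```matlab
R = 3; S = 2;
W = randn(S, R);        % weight matrix, S x R
b = randn(S, 1);        % one scalar bias per neuron
p = randn(R, 1);        % input vector, R x 1
a = logsig(W * p + b);  % layer output, S x 1
```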
Thoughts: neural networks are not a cure-all, but they fit well when, after screening out the relevant influencing variables, you use training data to find how strongly each variable matters, i.e., the weights. The bias can be seen as a constant offset in the variables' influence.
2, Training
(1) Incremental Training (of Adaptive and Other Networks)
Recall from the earlier discussion that for a static network the simulation of the network produces the same outputs whether the inputs are presented as a matrix of concurrent vectors or as a cell array of sequential vectors. This is not true when training the network, however. When using the adapt function, if the inputs are presented as a cell array of sequential vectors, then the weights are updated as each input is presented (incremental mode). As we see in the next section, if the inputs are presented as a matrix of concurrent vectors, then the weights are updated only after all inputs are presented (batch mode).
For incremental training we want to present the inputs and targets as
sequences:
P = {[1;2] [2;1] [2;3] [3;1]};
T = {4 5 7 7};
For batch training of a static network with adapt, the input vectors must be
placed in one matrix of concurrent vectors.
P = [1 2 2 3; 2 1 3 1];
T = [4 5 7 7];
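A minimal sketch contrasting the two modes with `adapt` (`linearlayer` is a toolbox constructor for a static linear network; the learning rate 0.1 is an arbitrary choice):

```matlab
net = linearlayer(0, 0.1);        % no delays, Widrow-Hoff learning rate 0.1
net = configure(net, [1; 2], 4);  % set input/output sizes from sample data

% Incremental mode: cell array of sequential vectors,
% weights are updated after each input is presented
P = {[1;2] [2;1] [2;3] [3;1]};  T = {4 5 7 7};
[net, a, e] = adapt(net, P, T);

% Batch mode: matrix of concurrent vectors,
% weights are updated once, after all inputs are presented
P = [1 2 2 3; 2 1 3 1];  T = [4 5 7 7];
[net, a, e] = adapt(net, P, T);
```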
(2) net.IW{1,1} is used to read or assign the network's input weights. The first index is the destination layer and the second is the source input (not an individual neuron of that layer). IW holds weights from the network inputs to layers, and LW holds weights between layers; in a three-layer (input–hidden–output) network these are net.IW{1,1} (input to hidden) and net.LW{2,1} (hidden to output).
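A quick sketch of inspecting and setting these cells (`feedforwardnet` is a toolbox constructor; the sizes here are arbitrary):

```matlab
net = feedforwardnet(4);                         % one hidden layer, 4 neurons
net = configure(net, rand(3, 10), rand(1, 10));  % 3 inputs, 1 output
size(net.IW{1,1})        % 4 x 3: input-to-hidden weights
size(net.LW{2,1})        % 1 x 4: hidden-to-output weights
net.IW{1,1} = zeros(4, 3);  % assign the input weights directly
net.b{1}    = zeros(4, 1);  % hidden-layer biases
```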
(3) Customize transfer function
(4) MATLAB — the neural network training function train
- Progress
Under Progress, the current training status is displayed.
--Epoch: number of training epochs
The value on the right is the maximum number of epochs, which can be set (300 in the example above); the progress bar shows how many epochs have actually run (146 in the example).
Normally training continues until the maximum number of epochs is reached (unless the Stop Training button is clicked). However, if validation samples are supplied in the arguments to train, training may stop earlier.
--Time: training time, i.e., the time spent on this training run.
--Performance: performance index
In this example it is the mean squared error (mse). The progress bar shows the current mse; the value to its right is the target mse (training stops once the current mse drops below it). This target is set with the net.trainParam.goal parameter.
--Gradient: gradient
The progress bar shows the current gradient; the value to its right is the minimum gradient. Training stops if the current gradient falls to that minimum.
-- Validation Checks: validation checks (training stops when the validation error has increased for max_fail checks in a row)
- Plots
Under Plots there are three buttons that plot the current network's performance, training state, and regression analysis, as shown in the figures below.
- Training parameters
net.trainParam.goal = 0.1;      % training goal (minimum error), set to 0.1 here
net.trainParam.epochs = 300;    % number of training epochs, set to 300 here
net.trainParam.show = 20;       % display frequency: show progress every 20 epochs
net.trainParam.mc = 0.95;       % momentum factor
net.trainParam.lr = 0.05;       % learning rate, set to 0.05 here
net.trainParam.min_grad = 1e-6; % minimum performance gradient
net.trainParam.max_fail = 5;    % maximum number of validation failures
net.trainFcn = 'trainrp';       % training function
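Put together, a minimal training script using settings like these might look as follows (`feedforwardnet` and the toy data are placeholders, not the original experiment):

```matlab
P = rand(2, 50);  T = rand(1, 50);   % toy inputs (2 x 50) and targets (1 x 50)
net = feedforwardnet(10);            % 10 hidden neurons
net.trainFcn = 'trainrp';            % resilient backpropagation
net.trainParam.goal = 0.1;           % stop when mse drops below 0.1
net.trainParam.epochs = 300;         % or after 300 epochs
net.trainParam.show = 20;            % report every 20 epochs
[net, tr] = train(net, P, T);        % tr records the training progress
Y = net(P);                          % simulate the trained network
```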
- Weights/biases
net.iw % cell array of weights: net.iw{1} — when the network has a single layer, net.iw is a 1x1 cell; net.iw{1,1} — when the network has multiple layers, net.iw is a cell matrix.
net.b % biases (thresholds), also stored as a cell array
(6) Gradient
Gradient can be viewed as the slope of the result with respect to a change in one parameter.
So if the parameter changes a lot but the result stays almost the same, the absolute value of the gradient is small.
Also, the gradient is positive if the parameter and the result are positively correlated, and negative if they are negatively correlated.
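The slope view above can be checked numerically with a central difference (the function and numbers are made up purely for illustration):

```matlab
f  = @(w) (w - 3).^2;                  % "result" as a function of one parameter
h  = 1e-6;  w0 = 1;
g  = (f(w0 + h) - f(w0 - h)) / (2*h);  % close to the true slope f'(1) = -4
% g < 0: around w0 = 1 the result falls as the parameter rises
```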
(7) Why do different initial values give such different trained weights?
For example, with all-zero initialization the trained weights come out fairly uniform, while with randn the weights spread widely, with some very small and very large values; look into why tomorrow. (A likely reason: identical initial weights make hidden neurons symmetric, so they receive identical updates and stay equal, while random initialization breaks this symmetry.)
Make sure your test set is large enough compared to the training set (e.g. 10% of the overall data) and check its diversity. If your test set only covers very specific cases, this could be a reason. Also make sure you always use the same test set. Alternatively, you should google the term cross-validation.
Furthermore, observing good training set accuracy while observing bad test set accuracy is a sign for overfitting. Try to apply regularization like a simple L2 weight decay (simply multiply your weight matrices with e.g. 0.999 after each weight update). Depending on your data, Dropout or L1 regularization could also help (especially if you have a lot of redundancies in your input data). Also try to choose a smaller network topology (fewer layers and/or fewer neurons per layer).
To speed up training, you could also try alternative learning algorithms like RPROP+, RPROP- or RMSProp instead of plain backpropagation.
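The "multiply the weights by 0.999 after each update" trick suggested above can be sketched as follows (the gradient here is a random stand-in, not computed from a real loss):

```matlab
lr = 0.01;
W  = randn(4, 2);
dW = randn(4, 2);   % stand-in for a backpropagated gradient
W  = W - lr * dW;   % ordinary weight update
W  = 0.999 * W;     % simple L2-style weight decay after the update
```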
(8) http://www.mathworks.nl/help/nnet/ug/train-and-apply-multilayer-neural-networks.html
(9) Maximum variable size allowed by the program is exceeded
http://www.mathworks.com/matlabcentral/newsreader/view_thread/112952