Recently I have been reading a lot about Deep Learning, all kinds of blogs and papers.
To be honest, though, reading alone leaves the implementation side neglected: for one thing my computer is not very good, and for another I am not yet able to write a toolbox of my own.
I have only followed Andrew Ng's UFLDL tutorial and written some code on top of existing frameworks (that code is on github).
Later I found a MATLAB Deep Learning toolbox whose code is quite simple, which makes it well suited for studying the algorithms.
Another advantage is that a MATLAB implementation can omit a lot of data-structure code, which keeps the algorithmic ideas very clear.
So I want to walk through the code of this toolbox to consolidate what I have learned, and at the same time lay the groundwork for the next step of hands-on practice.
(This article only reads the algorithms from the code's point of view; for the actual theory you still need to read the papers.
I will give the names of some relevant papers in the text; the goal here is to lay out the flow of the algorithms, not to dig into their principles and formulas.)
==========================================================================================
Code used: DeepLearnToolbox. Thanks to the author of the toolbox.
==========================================================================================
Chapter 1 starts with an analysis of the NN (neural network) code, since this is the overall framework on which all of deep learning is built; see UFLDL.
==========================================================================================
First, look at tests\test_example_NN.m. Skipping the part that normalizes the data, the most important lines are:
nn = nnsetup([784 100 10]);
opts.numepochs = 1;    %  Number of full sweeps through data
opts.batchsize = 100;  %  Take a mean gradient step over this many samples
[nn, L] = nntrain(nn, train_x, train_y, opts);
[er, bad] = nntest(nn, test_x, test_y);
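For reference (my assumption based on the [784 100 10] architecture, not stated in the snippet): the example uses MNIST-style data, so train_x is a samples-by-784 matrix of pixel values and train_y is a one-hot samples-by-10 label matrix. A minimal sketch of the expected shapes:

% Illustrative shapes only; the real data in test_example_NN comes from the toolbox's MNIST example.
train_x = rand(100, 784);                              % 100 samples, 784 = 28*28 pixels per image
labels  = randi(10, 100, 1);                           % class labels 1..10
train_y = full(sparse((1:100)', labels, 1, 100, 10));  % one-hot encoding, 100 x 10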
Just these few steps train a NN; clearly the most important functions are nnsetup, nntrain and nntest.
Let us analyze them one by one.
nnsetup
Code file: NN\nnsetup.m
function nn = nnsetup(architecture)
% First, the overall structure of the network is taken from the architecture argument;
% nn.n is the number of layers. Compare with the example call nnsetup([784 100 10]) above.

    nn.size   = architecture;
    nn.n      = numel(nn.size);
    % Next comes a long list of parameters; each is explained where it is actually used.
    nn.activation_function              = 'tanh_opt';   %  Activation functions of hidden layers: 'sigm' (sigmoid) or 'tanh_opt' (optimal tanh).
    nn.learningRate                     = 2;            %  learning rate Note: typically needs to be lower when using 'sigm' activation function and non-normalized inputs.
    nn.momentum                         = 0.5;          %  Momentum
    nn.scaling_learningRate             = 1;            %  Scaling factor for the learning rate (each epoch)
    nn.weightPenaltyL2                  = 0;            %  L2 regularization
    nn.nonSparsityPenalty               = 0;            %  Non sparsity penalty
    nn.sparsityTarget                   = 0.05;         %  Sparsity target
    nn.inputZeroMaskedFraction          = 0;            %  Used for Denoising AutoEncoders
    nn.dropoutFraction                  = 0;            %  Dropout level (http://www.cs.toronto.edu/~hinton/absps/dropout.pdf)
    nn.testing                          = 0;            %  Internal variable. nntest sets this to one.
    nn.output                           = 'sigm';       %  output unit 'sigm' (=logistic), 'softmax' and 'linear'
    % Initialize each layer. There are three per-layer variables: W, vW and p. W holds the weights,
    % vW is the temporary variable used when updating the weights (momentum buffer),
    % and p is the sparsity statistic (explained when we reach that code).
    for i = 2 : nn.n
        % weights and weight momentum
        nn.W{i - 1} = (rand(nn.size(i), nn.size(i - 1)+1) - 0.5) * 2 * 4 * sqrt(6 / (nn.size(i) + nn.size(i - 1)));
        nn.vW{i - 1} = zeros(size(nn.W{i - 1}));

        % average activations (for use with sparsity)
        nn.p{i}     = zeros(1, nn.size(i));
    end
end
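A note on the weight initialization in the loop above: each W{i-1} gets one extra column for the bias unit, and its entries are drawn uniformly from [-r, r] with r = 4*sqrt(6/(fan_in + fan_out)). A quick sanity-check sketch of my own (assumes nnsetup.m is on the MATLAB path, not part of the toolbox):

% Quick check of the initialization above (illustrative only).
nn = nnsetup([784 100 10]);
disp(size(nn.W{1}))                          % 100 x 785: 784 inputs plus one bias column
r = 4 * sqrt(6 / (100 + 784));               % the bound used in nnsetup for layer 1
fprintf('max |W{1}| = %.4f, bound r = %.4f\n', max(abs(nn.W{1}(:))), r);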
nntrain
That is roughly what setup does; next comes training. Open NN\nntrain.m.
We skip the code that validates the input arguments and go straight to the key part.
For the denoising part, see the paper: Extracting and Composing Robust Features with Denoising Autoencoders.
m = size(train_x, 1);
% m is the number of training samples.
% Recall that opts was set by the caller; batchsize is the mini-batch size used for each gradient step.
batchsize = opts.batchsize; numepochs = opts.numepochs;
numbatches = m / batchsize;  % number of batches
assert(rem(numbatches, 1) == 0, 'numbatches must be a integer');
L = zeros(numepochs*numbatches,1);
n = 1;
% numepochs is the number of full sweeps through the data.
for i = 1 : numepochs
    tic;
    kk = randperm(m);
    % Shuffle the samples before forming batches; randperm(m) returns a random permutation of 1..m.
    for l = 1 : numbatches
        batch_x = train_x(kk((l - 1) * batchsize + 1 : l * batchsize), :);
        % Add noise to input (for use in denoising autoencoder)
        % This is the part needed by the denoising autoencoder; see
        % "Extracting and Composing Robust Features with Denoising Autoencoders".
        % Concretely, a fraction of the input entries is set to 0; inputZeroMaskedFraction is that fraction.
        if(nn.inputZeroMaskedFraction ~= 0)
            batch_x = batch_x.*(rand(size(batch_x))>nn.inputZeroMaskedFraction);
        end
        batch_y = train_y(kk((l - 1) * batchsize + 1 : l * batchsize), :);
        % The three key functions:
        % nnff does the feedforward pass, nnbp does back propagation, nnapplygrads applies the gradient step.
        % They are analyzed below.
        nn = nnff(nn, batch_x, batch_y);
        nn = nnbp(nn);
        nn = nnapplygrads(nn);
        L(n) = nn.L;
        n = n + 1;
    end

    t = toc;
    if ishandle(fhandle)
        if opts.validation == 1
            loss = nneval(nn, loss, train_x, train_y, val_x, val_y);
        else
            loss = nneval(nn, loss, train_x, train_y);
        end
        nnupdatefigures(nn, fhandle, loss, opts, i);
    end

    disp(['epoch ' num2str(i) '/' num2str(opts.numepochs) '. Took ' num2str(t) ' seconds' '. Mean squared error on training set is ' num2str(mean(L((n-numbatches):(n-1))))]);
    nn.learningRate = nn.learningRate * nn.scaling_learningRate;
end
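To see the input corruption for the denoising autoencoder in isolation, here is a tiny standalone sketch (values chosen arbitrarily, not toolbox code): with inputZeroMaskedFraction = 0.3, each entry of the batch survives with probability about 0.7 and is set to 0 otherwise.

% Standalone illustration of the masking step applied when inputZeroMaskedFraction ~= 0.
inputZeroMaskedFraction = 0.3;                            % illustrative value
batch_x = rand(5, 4);                                     % toy batch: 5 samples, 4 features
mask    = rand(size(batch_x)) > inputZeroMaskedFraction;  % roughly 70% of entries survive
batch_x = batch_x .* mask;                                % the rest are zeroed, as in nntrain
disp(batch_x);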
Next we analyze the three functions nnff, nnbp and nnapplygrads.
nnff
nnff performs the feedforward pass, which is actually very simple: the whole network is just run forward once.
It also contains the dropout and sparsity computations.
For details see the paper "Improving Neural Networks with Dropout" and the UFLDL section Autoencoders and Sparsity.
function nn = nnff(nn, x, y)
%NNFF performs a feedforward pass
% nn = nnff(nn, x, y) returns a neural network structure with updated
% layer activations, error and loss (nn.a, nn.e and nn.L)

    n = nn.n;
    m = size(x, 1);

    x = [ones(m,1) x];
    nn.a{1} = x;

    %feedforward pass
    for i = 2 : n-1
        % Forward propagation depends on the chosen activation function;
        % see the activation_function parameter in nnsetup.
        % sigm is the sigmoid; tanh_opt is a slightly modified tanh used by this toolbox:
        % tanh_opt(A) = 1.7159*tanh(2/3.*A)
        switch nn.activation_function
            case 'sigm'
                % Calculate the unit's outputs (including the bias term)
                nn.a{i} = sigm(nn.a{i - 1} * nn.W{i - 1}');
            case 'tanh_opt'
                nn.a{i} = tanh_opt(nn.a{i - 1} * nn.W{i - 1}');
        end

        % Dropout part; dropoutFraction is one of the parameters set in nnsetup.
        if(nn.dropoutFraction > 0)
            if(nn.testing)
                nn.a{i} = nn.a{i}.*(1 - nn.dropoutFraction);
            else
                nn.dropOutMask{i} = (rand(size(nn.a{i}))>nn.dropoutFraction);
                nn.a{i} = nn.a{i}.*nn.dropOutMask{i};
            end
        end
        % Sparsity bookkeeping; nonSparsityPenalty is the penalty coefficient for units
        % whose average activation deviates from sparsityTarget.
        % calculate running exponential activations for use with sparsity
        if(nn.nonSparsityPenalty>0)
            nn.p{i} = 0.99 * nn.p{i} + 0.01 * mean(nn.a{i}, 1);
        end

        %Add the bias term
        nn.a{i} = [ones(m,1) nn.a{i}];
    end
    switch nn.output
        case 'sigm'
            nn.a{n} = sigm(nn.a{n - 1} * nn.W{n - 1}');
        case 'linear'
            nn.a{n} = nn.a{n - 1} * nn.W{n - 1}';
        case 'softmax'
            nn.a{n} = nn.a{n - 1} * nn.W{n - 1}';
            nn.a{n} = exp(bsxfun(@minus, nn.a{n}, max(nn.a{n},[],2)));
            nn.a{n} = bsxfun(@rdivide, nn.a{n}, sum(nn.a{n}, 2));
    end
    %error and loss
    % compute the error
    nn.e = y - nn.a{n};

    switch nn.output
        case {'sigm', 'linear'}
            nn.L = 1/2 * sum(sum(nn.e .^ 2)) / m;
        case 'softmax'
            nn.L = -sum(sum(y .* log(nn.a{n}))) / m;
    end
end
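One side note on the softmax branch: subtracting the row-wise maximum before exponentiating is purely a numerical-stability trick and leaves the resulting probabilities unchanged. A tiny standalone check (not toolbox code):

% The shift-by-max trick used in the softmax output layer above.
z = [1000 1001 1002];                 % exp(z) alone would overflow to Inf
p = exp(z - max(z));                  % shifting by the maximum changes nothing mathematically
p = p / sum(p);                       % approximately [0.0900 0.2447 0.6652]
disp(p);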
nnbp
Code file: NN\nnbp.m
nnbp performs the back propagation step. The procedure is fairly standard and essentially matches the Neural Networks material in UFLDL.
The parts worth special attention are again dropout and sparsity.
if(nn.nonSparsityPenalty>0)
    pi = repmat(nn.p{i}, size(nn.a{i}, 1), 1);
    sparsityError = [zeros(size(nn.a{i},1),1) nn.nonSparsityPenalty * (-nn.sparsityTarget ./ pi + (1 - nn.sparsityTarget) ./ (1 - pi))];
end

% Backpropagate first derivatives
if i+1==n % in this case in d{n} there is not the bias term to be removed
    d{i} = (d{i + 1} * nn.W{i} + sparsityError) .* d_act; % Bishop (5.56)
else % in this case in d{i} the bias term has to be removed
    d{i} = (d{i + 1}(:,2:end) * nn.W{i} + sparsityError) .* d_act;
end

if(nn.dropoutFraction>0)
    d{i} = d{i} .* [ones(size(d{i},1),1) nn.dropOutMask{i}];
end
This is only the core of the implementation; d{i} in the code is the delta of layer i, as explained in UFLDL.
dW{i} is essentially the gradient; a few extra terms are added and some adjustments are made afterwards.
For the underlying theory, see the paper "Improving Neural Networks with Dropout" and the UFLDL section Autoencoders and Sparsity.
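For reference, the sparsityError term above is the derivative of the KL sparsity penalty from the UFLDL notes with respect to the running average activation: nonSparsityPenalty * (-sparsityTarget./p + (1-sparsityTarget)./(1-p)). A small numerical check of my own (not toolbox code):

% Check that the sparsityError expression matches d/d(rho_hat) of the KL sparsity penalty.
rho = 0.05; beta = 1; rho_hat = 0.2; h = 1e-6;
KL = @(p) beta * (rho*log(rho./p) + (1-rho)*log((1-rho)./(1-p)));
numeric  = (KL(rho_hat + h) - KL(rho_hat - h)) / (2*h);     % finite-difference derivative
analytic = beta * (-rho./rho_hat + (1-rho)./(1-rho_hat));   % the form used in nnbp
fprintf('numeric %.6f vs analytic %.6f\n', numeric, analytic);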
nnapplygrads
Code file: NN\nnapplygrads.m
for i = 1 : (nn.n - 1)
    if(nn.weightPenaltyL2>0)
        dW = nn.dW{i} + nn.weightPenaltyL2 * nn.W{i};
    else
        dW = nn.dW{i};
    end

    dW = nn.learningRate * dW;

    if(nn.momentum>0)
        nn.vW{i} = nn.momentum*nn.vW{i} + dW;
        dW = nn.vW{i};
    end

    nn.W{i} = nn.W{i} - dW;
end
This part is simple. nn.weightPenaltyL2 is the weight decay term, another parameter that can be set in nnsetup.
If it is non-zero, the weight penalty is added to fight overfitting; the step is then blended with the momentum buffer, and finally nn.W{i} is updated.
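Putting the pieces together, with both weight decay and momentum switched on, one update for layer i amounts to (restating the loop above, not new code):

vW{i} <- momentum * vW{i} + learningRate * (dW{i} + weightPenaltyL2 * W{i})
W{i}  <- W{i} - vW{i}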
nntest
nntest could not be simpler: it just calls nnpredict and compares the result against the test labels.
function [er, bad] = nntest(nn, x, y)
    labels = nnpredict(nn, x);
    [~, expected] = max(y,[],2);
    bad = find(labels ~= expected);
    er = numel(bad) / size(x, 1);
end
nnpredict
Code file: NN\nnpredict.m
function labels = nnpredict(nn, x)
    nn.testing = 1;
    nn = nnff(nn, x, zeros(size(x,1), nn.size(end)));
    nn.testing = 0;

    [~, i] = max(nn.a{end},[],2);
    labels = i;
end
Again very simple: prediction is just one nnff pass to obtain the final output.
max(nn.a{end},[],2) returns each row's maximum together with the column where it occurs, so labels ends up holding the predicted class indices.
(This test function is apparently meant specifically for classification problems; for other tasks it is enough to know that nnff yields the final output.)
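The second output of max along dimension 2 is the column index of each row's maximum, which is exactly the predicted class. A tiny standalone example (not toolbox code):

% How [~, i] = max(A, [], 2) turns output activations into class labels.
A = [0.1 0.7 0.2;
     0.8 0.1 0.1];               % two samples, three output units
[~, labels] = max(A, [], 2);     % labels = [2; 1]
disp(labels);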
Summary
Overall, the neural network code is fairly conventional and easy to follow, and it does not differ much from the UFLDL material.
The main additions are the dropout part and the denoising part.
This article does not pretend to explain all of this thoroughly; it only lays out a route so that you can learn by following the code, deepening your understanding of the algorithms and your ability to apply them.