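Convolutional Neural Network (CNN): Principles and Implementation

This post walks through some basic uses of the convolutional neural network (CNN), one application of deep learning. It follows Lecun's Document 0.1, extends it in places, and shows the results (in Python).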

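The post has four parts:

1. Convolution

2. Pooling (downsampling)

3. CNN structure

4. Running the experiment

Each part is covered in turn below.

PS: This post is reference material for the ese machine learning short course (class of 20140516). It only sketches the most naive, simplest ideas and focuses on practice; the theory is covered in detail in class.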

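1. Convolution

Much like Gaussian filtering, we convolve every image in an image batch. For a single image, one filter convolves all of its input feature maps into a single output feature map. The code below processes a batch of two images: each image starts with 3 feature maps (R, G, B) and is convolved with two 9*9 filters, so each image ends up with two feature maps.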
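
To make the shapes concrete, here is a minimal NumPy sketch of the same operation, independent of Theano. The function name `conv2d_valid` and the small random inputs are assumptions made for illustration, not the author's code. It computes a "valid" convolution (no padding), flips each kernel as a true convolution does, and sums over the input channels so that each filter yields exactly one output map.

```python
import numpy as np

def conv2d_valid(images, filters):
    """Naive 'valid' convolution (hypothetical helper, for illustration only).

    images:  (batch, in_channels, height, width)
    filters: (n_filters, in_channels, fh, fw)
    returns: (batch, n_filters, height - fh + 1, width - fw + 1)
    """
    n, c, h, w = images.shape
    k, c2, fh, fw = filters.shape
    assert c == c2, "filter depth must match the number of input channels"
    flipped = filters[:, :, ::-1, ::-1]  # flip kernels: convolution, not correlation
    out = np.zeros((n, k, h - fh + 1, w - fw + 1))
    for i in range(h - fh + 1):
        for j in range(w - fw + 1):
            patch = images[:, :, i:i + fh, j:j + fw]  # (n, c, fh, fw)
            # sum over channels and the filter window for every (image, filter) pair
            out[:, :, i, j] = np.tensordot(patch, flipped, axes=([1, 2, 3], [1, 2, 3]))
    return out

rng = np.random.RandomState(0)
batch = rng.rand(2, 3, 20, 30)    # two RGB images, 20 x 30 pixels (made-up sizes)
filters = rng.rand(2, 3, 9, 9)    # two 9*9 filters, each spanning all 3 channels
feature_maps = conv2d_valid(batch, filters)
print(feature_maps.shape)         # (2, 2, 12, 22)
```

Each side shrinks by 9 - 1 = 8 pixels, and the channel dimension goes from 3 input maps to 2 output maps, one per filter.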

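The convolution is performed by Theano's conv.conv2d; here W and b are random parameters. The result looks a bit like an edge detector, doesn't it?

Code (see the comments for details):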

# -*- coding: utf-8 -*-
"""
Created on Sat May 10 18:55:26 2014
@author: rachel
Function: convolution operation on two pictures of the same size (width, height)
input: 3 feature maps (3 channels <RGB> of a picture)
convolution: two 9*9 convolutional filters
"""
from theano.tensor.nnet import conv
import theano.tensor as T
import numpy, theano
rng = numpy.random.RandomState(23455)
# symbolic variable
input = T.tensor4(name = 'input')
# initial weights
w_shape = (2,3,9,9) #2 convolutional filters, 3 channels, filter shape: 9*9
w_bound = numpy.sqrt(3*9*9)
W = theano.shared(numpy.asarray(rng.uniform(low = -1.0/w_bound, high = 1.0/w_bound,size = w_shape),
dtype = input.dtype),name = 'W')
b_shape = (2,)
b = theano.shared(numpy.asarray(rng.uniform(low = -.5, high = .5, size = b_shape),
dtype = input.dtype),name = 'b')
conv_out = conv.conv2d(input,W)
#T.TensorVariable.dimshuffle() can reshape or broadcast (add dimension)
#dimshuffle(self,*pattern)
# >>>b1 = b.dimshuffle('x',0,'x','x')
# >>>b1.shape.eval()
# array([1,2,1,1])
output = T.nnet.sigmoid(conv_out + b.dimshuffle('x',0,'x','x'))
f = theano.function([input],output)
# demo
import pylab
from PIL import Image
#minibatch_img = T.tensor4(name = 'minibatch_img')
#-------------img1---------------
img1 = Image.open('//home//rachel//Documents//ZJU_Projects//DL//Dataset//rachel.jpg') # pass the path directly; PIL opens the file itself
width1,height1 = img1.size
img1 = numpy.asarray(img1, dtype = 'float32')/256. # (height, width, 3)
# put image in 4D tensor of shape (1,3,height,width)
img1_rgb = img1.swapaxes(0,2).swapaxes(1,2).reshape(1,3,height1,width1) #(1,3,height,width)
#-------------img2---------------
img2 = Image.open('//home//rachel//Documents//ZJU_Projects//DL//Dataset//rachel1.jpg')
width2,height2 = img2.size
img2 = numpy.asarray(img2,dtype = 'float32')/256. # (height, width, 3)
img2_rgb = img2.swapaxes(0,2).swapaxes(1,2).reshape(1,3,height2,width2) #(1,3,height,width)
#minibatch_img = T.join(0,img1_rgb,img2_rgb)
minibatch_img = numpy.concatenate((img1_rgb,img2_rgb),axis = 0)
filtered_img = f(minibatch_img)
# plot the original images and their two convolved results
pylab.subplot(2,3,1);pylab.axis('off');
pylab.imshow(img1)
pylab.subplot(2,3,4);pylab.axis('off');
pylab.imshow(img2)
pylab.gray()
pylab.subplot(2,3,2); pylab.axis("off")
pylab.imshow(filtered_img[0,0,:,:]) #0:minibatch_index; 0:1st filter
pylab.subplot(2,3,3); pylab.axis("off")
pylab.imshow(filtered_img[0,1,:,:]) #0:minibatch_index; 1:2nd filter
pylab.subplot(2,3,5); pylab.axis("off")
pylab.imshow(filtered_img[1,0,:,:]) #1:minibatch_index; 0:1st filter
pylab.subplot(2,3,6); pylab.axis("off")
pylab.imshow(filtered_img[1,1,:,:]) #1:minibatch_index; 1:2nd filter
pylab.show()
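
The bias step in the code above relies on broadcasting: `b.dimshuffle('x',0,'x','x')` turns the length-2 bias vector into a (1,2,1,1) tensor so that one bias is added to every pixel of the matching feature map. A rough NumPy analogue, with made-up array sizes chosen only for illustration, looks like this:

```python
import numpy as np

conv_out = np.zeros((2, 2, 4, 5))   # (batch, n_filters, height, width), made-up sizes
b = np.array([0.1, -0.2])           # one bias per filter, shape (2,)

b4d = b.reshape(1, 2, 1, 1)         # NumPy analogue of b.dimshuffle('x', 0, 'x', 'x')
biased = conv_out + b4d             # broadcasts over batch, height and width

print(b4d.shape)                    # (1, 2, 1, 1)
print(biased.shape)                 # (2, 2, 4, 5)
```

Every entry of `biased[:, 0]` receives the first bias and every entry of `biased[:, 1]` the second, regardless of image or pixel position.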

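2. Pooling (downsampling)

Max pooling is the most common choice. It addresses two problems:

1. It reduces the amount of computation.

2. It gives some invariance to small rotations (work out why for yourself).

PS: On rotation invariance, recall SIFT and LBP, which use a dominant orientation, and HOG, which uses templates at different orientations.

With the 2*2 window used here, max pooling halves both the height and the width of each feature map. (The result figure below does not show this, because Python automatically stretches the plots to the same size, but each side really does have half as many pixels.)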
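
The halving is easy to check on a tiny array. The following is a minimal NumPy sketch, not the Theano implementation; the helper name `max_pool_2x2` is hypothetical and it assumes a single feature map whose sides are both even.

```python
import numpy as np

def max_pool_2x2(fmap):
    """Non-overlapping 2x2 max pooling of a (height, width) map with even sides."""
    h, w = fmap.shape
    blocks = fmap.reshape(h // 2, 2, w // 2, 2)   # split the map into 2x2 blocks
    return blocks.max(axis=(1, 3))                # keep the largest value per block

fmap = np.arange(16, dtype=float).reshape(4, 4)   # values 0..15, row by row
pooled = max_pool_2x2(fmap)
print(pooled.shape)   # (2, 2)
print(pooled)         # maxima of the four blocks: 5, 7, 13, 15
```

A 4*4 map becomes 2*2: each side is halved, so the map keeps a quarter of its pixels.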

Code (see the comments for details):

# -*- coding: utf-8 -*-
"""
Created on Sat May 10 18:55:26 2014
@author: rachel
Function: convolution followed by max pooling
input: 3 feature maps (3 channels <RGB> of a picture)
convolution: two 9*9 convolutional filters
"""
from theano.tensor.nnet import conv
import theano.tensor as T
import numpy, theano
rng = numpy.random.RandomState(23455)
# symbolic variable
input = T.tensor4(name = 'input')
# initial weights
w_shape = (2,3,9,9) #2 convolutional filters, 3 channels, filter shape: 9*9
w_bound = numpy.sqrt(3*9*9)
W = theano.shared(numpy.asarray(rng.uniform(low = -1.0/w_bound, high = 1.0/w_bound,size = w_shape),
dtype = input.dtype),name = 'W')
b_shape = (2,)
b = theano.shared(numpy.asarray(rng.uniform(low = -.5, high = .5, size = b_shape),
dtype = input.dtype),name = 'b')
conv_out = conv.conv2d(input,W)
#T.TensorVariable.dimshuffle() can reshape or broadcast (add dimension)
#dimshuffle(self,*pattern)
# >>>b1 = b.dimshuffle('x',0,'x','x')
# >>>b1.shape.eval()
# array([1,2,1,1])
output = T.nnet.sigmoid(conv_out + b.dimshuffle('x',0,'x','x'))
f = theano.function([input],output)
# demo
import pylab
from PIL import Image
from matplotlib.pyplot import *
# open the input image (pass the path directly; PIL opens the file itself)
img = Image.open('//home//rachel//Documents//ZJU_Projects//DL//Dataset//rachel.jpg')
width,height = img.size
img = numpy.asarray(img, dtype = 'float32')/256. # (height, width, 3)
# put image in 4D tensor of shape (1,3,height,width)
img_rgb = img.swapaxes(0,2).swapaxes(1,2) #(3,height,width)
minibatch_img = img_rgb.reshape(1,3,height,width)
filtered_img = f(minibatch_img)
# plot the original image and its two convolved results
pylab.figure(1)
pylab.subplot(1,3,1);pylab.axis('off');
pylab.imshow(img)
title('original image')
pylab.gray()
pylab.subplot(2,3,2); pylab.axis("off")
pylab.imshow(filtered_img[0,0,:,:]) #0:minibatch_index; 0:1st filter
title('convolution 1')
pylab.subplot(2,3,3); pylab.axis("off")
pylab.imshow(filtered_img[0,1,:,:]) #0:minibatch_index; 1:2nd filter
title('convolution 2')
#pylab.show()
# maxpooling
from theano.tensor.signal import downsample
input = T.tensor4('input')
maxpool_shape = (2,2)
pooled_img = downsample.max_pool_2d(input,maxpool_shape,ignore_border = False)
maxpool = theano.function(inputs = [input],
outputs = [pooled_img])
pooled_res = numpy.squeeze(maxpool(filtered_img))
#pylab.figure(2)
pylab.subplot(235);pylab.axis('off');
pylab.imshow(pooled_res[0,:,:])
title('down sampled 1')
pylab.subplot(236);pylab.axis('off');
pylab.imshow(pooled_res[1,:,:])
title('down sampled 2')
pylab.show()

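3. CNN structure

CNN diagrams are everywhere if you google for them, so here I dig out a figure I drew back when I was learning CNNs; I think it is fairly easy to follow when paired with an explanation.

Without further ado, here is the LeNet structure diagram (read it from bottom to top, following the arrows; the bottom layer is the original input):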

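4. CNN code

The full code (in Python) is uploaded to my resources; download it from there.

Only a small part is pasted here, just the part that builds the network: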

rng = numpy.random.RandomState(23455)
# transform x from (batch_size, 28*28) to (batch_size, feature, 28, 28)
# I_shape = (28,28), F_shape = (5,5)
N_filters_0 = 20
D_features_0 = 1
layer0_input = x.reshape((batch_size,D_features_0,28,28))
layer0 = LeNetConvPoolLayer(rng, input = layer0_input, filter_shape = (N_filters_0,D_features_0,5,5),
                            image_shape = (batch_size,D_features_0,28,28))
# layer0.output: (batch_size, N_filters_0, (28-5+1)/2, (28-5+1)/2), i.e. (batch_size,20,12,12)
N_filters_1 = 50
D_features_1 = N_filters_0
layer1 = LeNetConvPoolLayer(rng,input = layer0.output, filter_shape = (N_filters_1,D_features_1,5,5),
                            image_shape = (batch_size,N_filters_0,12,12))
# layer1.output: (batch_size, N_filters_1, (12-5+1)/2, (12-5+1)/2), i.e. (batch_size,50,4,4)
layer2_input = layer1.output.flatten(2) # (batch_size,50,4,4) -> (batch_size,50*4*4)
layer2 = HiddenLayer(rng,layer2_input,n_in = 50*4*4,n_out = 500, activation = T.tanh)
layer3 = LogisticRegression(input = layer2.output, n_in = 500, n_out = 10)

layer0 and layer1: each is a convolution followed by downsampling.

layer2 + layer3: together they form an MLP (ANN).
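
The layer sizes in the shape comments can be checked with a few lines of arithmetic. This is only a sketch: the helper name `conv_pool_output_size` is hypothetical, and it assumes a "valid" convolution followed by non-overlapping 2*2 pooling, which is what the `(28-5+1)/2` comments imply.

```python
def conv_pool_output_size(size, filter_size, pool_size=2):
    """Side length after a 'valid' convolution followed by non-overlapping pooling."""
    return (size - filter_size + 1) // pool_size

side0 = conv_pool_output_size(28, 5)      # layer0: 28 -> 24 -> 12
side1 = conv_pool_output_size(side0, 5)   # layer1: 12 -> 8 -> 4
n_in_hidden = 50 * side1 * side1          # flattened input size of layer2

print(side0, side1, n_in_hidden)          # 12 4 800
```

The final value, 800, matches `n_in = 50*4*4` passed to the hidden layer.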

Training the model:

cost = layer3.negative_log_likelihood(y)
params = layer3.params + layer2.params + layer1.params + layer0.params
gparams = T.grad(cost,params)
updates = []
for par,gpar in zip(params,gparams):
    updates.append((par, par - learning_rate * gpar))
train_model = theano.function(inputs = [minibatch_index],
                              outputs = [cost],
                              updates = updates,
                              givens = {x: train_set_x[minibatch_index * batch_size : (minibatch_index+1) * batch_size],
                                        y: train_set_y[minibatch_index * batch_size : (minibatch_index+1) * batch_size]})

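The parameters of all layers are trained against the cost, which is the NLL (negative log-likelihood) output by the topmost layer of the MLP.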
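
To see what the cost and the update rule do numerically, here is a small NumPy sketch for a single softmax output layer. It is not the author's Theano code: the gradients are written out by hand instead of using `T.grad`, and the data, layer sizes and function names (`softmax`, `nll`, `nll_grads`) are made up for illustration.

```python
import numpy as np

def softmax(z):
    z = z - z.max(axis=1, keepdims=True)   # subtract the row max for numerical stability
    e = np.exp(z)
    return e / e.sum(axis=1, keepdims=True)

def nll(W, b, x, y):
    """Mean negative log-likelihood of labels y under a softmax classifier."""
    p = softmax(x @ W + b)
    return -np.mean(np.log(p[np.arange(len(y)), y]))

def nll_grads(W, b, x, y):
    """Gradients of the mean NLL with respect to W and b."""
    p = softmax(x @ W + b)
    p[np.arange(len(y)), y] -= 1.0         # d(cost)/d(logits) = p - onehot(y)
    p /= len(y)
    return x.T @ p, p.sum(axis=0)

rng = np.random.RandomState(0)
x = rng.rand(20, 8)                        # toy minibatch: 20 samples, 8 features
y = rng.randint(0, 3, size=20)             # toy labels in {0, 1, 2}
params = [np.zeros((8, 3)), np.zeros(3)]   # W and b of the output layer
learning_rate = 0.1

cost_before = nll(params[0], params[1], x, y)
gparams = nll_grads(params[0], params[1], x, y)
# same update rule as in the training code: par <- par - learning_rate * gpar
params = [par - learning_rate * gpar for par, gpar in zip(params, gparams)]
cost_after = nll(params[0], params[1], x, y)
print(cost_before > cost_after)            # the step lowers the cost on this batch
```

In the real network the same update is applied to the parameters of every layer, with Theano deriving the gradients through the convolution and pooling layers automatically.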

For the rest, see the code and its comments.

PS: The data is the full MNIST dataset.

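Final result (this log format matches the Theano deep learning tutorial, where the reported scores are error rates in percent, so lower is better):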
Optimization complete. Best validation score of 0.990000 % obtained at iteration 122500, with test performance 0.950000 %