Theano: reconstructing convolution with stride (subsampling) in an autoencoder

Date: 2022-06-29 07:17:28

I want to train a simple convolutional auto-encoder using Theano, which has been working great. However, I don't see how one can reverse the conv2d command when subsampling (stride) is used. Is there an efficient way to "invert" the convolution command when stride is used, like in the image below?


[image: illustration of a convolution with stride (subsampling)]

For example, I want to change the following ...


from theano.tensor.nnet.conv import conv2d
# W and Wprime are filter tensors defined elsewhere
x = T.tensor4('x')
y = T.tanh(conv2d(x, W, border_mode='valid', subsample=(1, 1)))
z = conv2d(y, Wprime, border_mode='full', subsample=(1, 1))

... into the situation where subsample = (2, 2). The first layer will work just as expected. However, the second layer will effectively "do a convolution with stride 1, then throw away half of the outputs". This is clearly a different operation from what I'm looking for - z won't even have the same number of neurons as x. What should the second conv2d command be to "reconstruct" the original x?

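To make the mismatch concrete, a hypothetical shape check with a 10x10 input and 3x3 filters:

# valid conv with stride 2: (10 - 3) // 2 + 1 = 4, so the code layer is 4x4
encoded = (10 - 3) // 2 + 1
# full conv of 4x4 with 3x3 at stride 1: 4 + 3 - 1 = 6, so z is 6x6, not 10x10
decoded = encoded + 3 - 1
print(encoded, decoded)  # 4 6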

1 Answer

#1



I deduce from this that you intend to have tied weights, i.e. if the first operation were a matrix multiplication with W, then the output would be generated with W.T, the adjoint matrix. In your case you would thus be looking for the adjoint of the convolution operator followed by subsampling.


(EDIT: I deduced wrongly, you can use any filter whatsoever to 'deconvolve', as long as you get the shapes right. Talking about the adjoint is still informative, though. You will be able to relax the assumption afterwards.)


Since the convolution and subsampling operators are linear, let's denote them by C and S respectively, and observe that convolving and then subsampling an image x would be


S C x

and that the adjoint operation on y (which lives in the same space as S C x) would be


C.T S.T y

Now, S.T is nothing other than upsampling to the original image size: the entries of y are scattered onto every second position, and zeros are inserted everywhere else until the right size is obtained.

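A minimal sketch of S and S.T in plain NumPy (assuming stride 2 in both dimensions; the shapes are illustrative):

import numpy as np

x = np.random.randn(6, 6).astype('float32')
y = np.random.randn(3, 3).astype('float32')

Sx = x[::2, ::2]           # S: keep every second row and column
STy = np.zeros_like(x)     # S.T: scatter y onto the kept grid, zeros elsewhere
STy[::2, ::2] = y

# adjoint check: <S x, y> == <x, S.T y>
print(np.dot(Sx.ravel(), y.ravel()) - np.dot(x.ravel(), STy.ravel()))  # ~0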

From your post, you seem to be aware of the adjoint of the stride-(1, 1) convolution operator - it is the convolution with reversed filters and reversed border_mode, i.e. with filters.dimshuffle(1, 0, 2, 3)[:, :, ::-1, ::-1] and a switch from border_mode='valid' to border_mode='full'.

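As a minimal numerical check of this stride-(1, 1) adjoint (the shapes and names here are illustrative, not part of the original answer):

import theano
import theano.tensor as T
import numpy as np

W = theano.shared(np.random.randn(4, 3, 6, 5).astype('float32'))
a = T.tensor4(dtype='float32')
b = T.tensor4(dtype='float32')

# C: valid convolution at stride 1
fwd = theano.function([a], T.nnet.conv2d(a, W, border_mode='valid'))
# C.T: full convolution with channel-swapped, spatially flipped filters
adj = theano.function([b], T.nnet.conv2d(
    b, W.dimshuffle(1, 0, 2, 3)[:, :, ::-1, ::-1], border_mode='full'))

x = np.random.randn(1, 3, 10, 10).astype('float32')
y = np.random.randn(*fwd(x).shape).astype('float32')

# <C x, y> == <x, C.T y> up to float precision
print(np.dot(fwd(x).ravel(), y.ravel()) - np.dot(x.ravel(), adj(y).ravel()))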

Compose the upsampling with this reversed-filter convolution and you obtain the adjoint you seek.


Note: There may be ways of exploiting the gradient T.grad or T.jacobian to obtain this automatically, but I am never sure how this is done exactly.


EDIT: There, I wrote it down :)


import theano
import theano.tensor as T
import numpy as np

filters = theano.shared(np.random.randn(4, 3, 6, 5).astype('float32'))

inp1 = T.tensor4(dtype='float32')

# S C: valid convolution with stride (2, 2)
subsampled_convolution = T.nnet.conv2d(inp1, filters, border_mode='valid', subsample=(2, 2))

inp2 = T.tensor4(dtype='float32')
shp = inp2.shape
# S.T: scatter the entries onto every second position, zeros elsewhere
upsample = T.zeros((shp[0], shp[1], shp[2] * 2, shp[3] * 2), dtype=inp2.dtype)
upsample = T.set_subtensor(upsample[:, :, ::2, ::2], inp2)
# C.T: full convolution with channel-swapped, spatially flipped filters
upsampled_convolution = T.nnet.conv2d(upsample,
     filters.dimshuffle(1, 0, 2, 3)[:, :, ::-1, ::-1], border_mode='full')

f1 = theano.function([inp1], subsampled_convolution)
f2 = theano.function([inp2], upsampled_convolution)

x = np.random.randn(1, 3, 10, 10).astype(np.float32)
f1x = f1(x)
y = np.random.randn(*f1x.shape).astype(np.float32)
f2y = f2(y)

p1 = np.dot(f1x.ravel(), y.ravel())
# the zero-upsampled height (6) overshoots the pre-subsampling height (5) by one,
# so f2y carries one extra row that is cropped before comparing against x
p2 = np.dot(x.ravel(), f2y[:, :, :-1].ravel())

print(p1 - p2)

p1 being equal to p2 (up to floating point precision) corroborates that f2 is the adjoint of f1.

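Following up on the note about T.grad above: since the encoder is linear in its input, the adjoint applied to y can also be obtained automatically as the gradient of the inner product <f1(x), y> with respect to x. A hedged sketch reusing the variables from the snippet above (not part of the original answer):

y_var = T.tensor4(dtype='float32')
inner = T.sum(subsampled_convolution * y_var)
# for a linear map, d<f1(x), y>/dx is the adjoint applied to y
f2_auto = theano.function([inp1, y_var], T.grad(inner, inp1),
                          on_unused_input='ignore')

f2y_auto = f2_auto(x, y)
print(np.dot(x.ravel(), f2y_auto.ravel()) - p1)  # ~0, same adjoint, no cropping needed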
