TensorFlow - Understanding tensor shapes for convolution

Problem description:

I'm currently trying to work my way through the TensorFlow MNIST tutorial for convolutional networks, and I could use some help with understanding the dimensions of the darn tensors.

So we have images of 28x28 pixels in size.

The convolution will compute 32 features for each 5x5 patch.

Let's just accept this, for now, and ask ourselves later why 32 features and why 5x5 patches.

Its weight tensor will have a shape of [5, 5, 1, 32]. The first two dimensions are the patch size, the next is the number of input channels, and the last is the number of output channels.

W_conv1 = weight_variable([5, 5, 1, 32])  # [patch_height, patch_width, in_channels, out_channels]

b_conv1 = bias_variable([32])  # one bias per output channel (feature)

If you say so...

To apply the layer, we first reshape x to a 4d tensor, with the second and third dimensions corresponding to image width and height, and the final dimension corresponding to the number of color channels.

x_image = tf.reshape(x, [-1, 28, 28, 1])
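
To make that concrete, here is a minimal sketch (my own, not from the tutorial, assuming TensorFlow 1.x and a hypothetical batch of 50 flattened images) that checks the shapes before and after the reshape:

import numpy as np
import tensorflow as tf

# hypothetical batch: 50 flattened 28x28 greyscale images, one row per image
batch = np.zeros((50, 784), dtype=np.float32)

x = tf.placeholder(tf.float32, [None, 784])
x_image = tf.reshape(x, [-1, 28, 28, 1])  # [batch, height, width, channels]

with tf.Session() as sess:
    print(sess.run(x_image, feed_dict={x: batch}).shape)  # (50, 28, 28, 1)

The -1 tells tf.reshape to infer that dimension, so it becomes whatever the batch size happens to be.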

Okay, now I'm lost.

Judging by this last reshape, we have "howevermany" 28x28x1 "blocks" of pixels that are our images.

I guess this makes sense because the images are in greyscale.

However, if that is the ordering, then our weight tensor is essentially a collection of five 5x1x32 "blocks" of values.

The x32 makes sense, I guess, if we want to infer 32 features per patch.

The rest, though, I'm not terribly convinced by.

Why does the weight tensor look the way it apparently does?

(For completeness: we use them

h_conv1 = tf.nn.relu(conv2d(x_image, W_conv1) + b_conv1)
h_pool1 = max_pool_2x2(h_conv1)

where

def conv2d(x, W):
    '''
    2D convolution, expects 4D input x and filter matrix W
    '''
    return tf.nn.conv2d(x, W, strides=[1, 1, 1, 1], padding='SAME')

def max_pool_2x2(x):
    '''
    max-pooling, using 2x2 patches
    '''
    return tf.nn.max_pool(x, ksize=[1, 2, 2, 1], strides=[1, 2, 2, 1], padding='SAME')

)
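
As far as I can tell, the shapes flow through these two ops like this (my own trace, assuming a batch of N images):

# x_image:                            [N, 28, 28, 1]
# h_conv1 = conv2d(x_image, W_conv1): [N, 28, 28, 32]  (SAME padding, stride 1 keeps 28x28)
# h_pool1 = max_pool_2x2(h_conv1):    [N, 14, 14, 32]  (2x2 pool with stride 2 halves it)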

Your input tensor has the shape [-1, 28, 28, 1]. As you mention, the last dimension is 1 because the images are in greyscale. The first index is the batch size. The convolution processes every image in the batch independently, so the batch size has no influence on the convolution weight tensor's dimensions, or, in fact, on any weight tensor's dimensions in the network. That is why the batch size can be arbitrary (-1 signifies an arbitrary size in TensorFlow).
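
Here is a minimal sketch (my own illustration, assuming TensorFlow 1.x) showing that the very same weight tensor works for any batch size:

import numpy as np
import tensorflow as tf

W = tf.constant(np.zeros((5, 5, 1, 32), dtype=np.float32))  # weights have no batch dimension

x = tf.placeholder(tf.float32, [None, 28, 28, 1])  # None = arbitrary batch size
h = tf.nn.conv2d(x, W, strides=[1, 1, 1, 1], padding='SAME')

with tf.Session() as sess:
    for n in (1, 50):  # the same graph and weights handle both batch sizes
        print(sess.run(h, feed_dict={x: np.zeros((n, 28, 28, 1), np.float32)}).shape)
# prints (1, 28, 28, 32) and (50, 28, 28, 32)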

Now to the weight tensor: you don't have five 5x1x32 blocks; rather, you have 32 blocks of 5x5x1 values. Each one represents one feature. The 1 is the depth of the patch, and it is 1 because of the greyscale (the shape would be 5x5x3x32 for colour images). The 5x5 is the size of the patch.
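
To see that indexing at work, here is a small sketch (my own, using plain NumPy for clarity) that pulls the individual 5x5x1 filters out of an array with the same layout as W_conv1:

import numpy as np

W = np.zeros((5, 5, 1, 32), dtype=np.float32)  # same layout as W_conv1

# the k-th feature is the 5x5x1 filter W[:, :, :, k]
for k in range(W.shape[3]):
    assert W[:, :, :, k].shape == (5, 5, 1)

print(W.shape[3], 'filters of shape', W[:, :, :, 0].shape)  # 32 filters of shape (5, 5, 1)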

The ordering of dimensions in the data tensors is different from the ordering of dimensions in the convolution weight tensors.
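
Spelled out, the two default layouts that tf.nn.conv2d expects are:

# data tensor:   [batch, height, width, in_channels]          e.g. [-1, 28, 28, 1]
# weight tensor: [height, width, in_channels, out_channels]   e.g. [5, 5, 1, 32]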