TensorFlow strides argument

Question:

I am trying to understand the strides argument in tf.nn.avg_pool, tf.nn.max_pool, and tf.nn.conv2d.

The documentation repeatedly says:

strides: A list of ints that has length >= 4. The stride of the sliding window for each dimension of the input tensor.

My questions are:

  1. What does each of the 4+ integers represent?
  2. Why must strides[0] = strides[3] = 1 for convnets?
  3. In this example we see tf.reshape(_X, shape=[-1, 28, 28, 1]). Why -1?

Sadly, the examples in the docs for reshape using -1 don't translate too well to this scenario.

The pooling and convolutional ops slide a "window" across the input tensor. Using tf.nn.conv2d as an example: if the input tensor has 4 dimensions, [batch, height, width, channels], then the convolution operates on a 2D window over the height and width dimensions.

strides determines how far the window shifts in each of the dimensions, one entry per input dimension. The typical use sets the first (batch) and last (depth) stride to 1.

Let's use a very concrete example: running a 2-d convolution over a 32x32 greyscale input image. I say greyscale because then the input image has depth=1, which helps keep it simple. Let that image look like this:

00 01 02 03 04 ...
10 11 12 13 14 ...
20 21 22 23 24 ...
30 31 32 33 34 ...
...

Let's run a 2x2 convolution window over a single example (batch size = 1). We'll give the convolution an output channel depth of 8.

The input to the convolution has shape=[1, 32, 32, 1].

If you specify strides=[1,1,1,1] with padding='SAME', then the output of the filter will be [1, 32, 32, 8].
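To check shapes like this by hand, here is a rough sketch in plain Python (not TensorFlow itself). It applies the output-size rules from the TensorFlow padding notes: ceil(in / stride) for SAME and ceil((in - filter + 1) / stride) for VALID. The function name conv2d_output_shape is made up for this illustration, and it assumes the batch and depth strides are 1.

```python
import math


def conv2d_output_shape(input_shape, filter_hw, out_channels, strides, padding):
    """Return the [batch, height, width, channels] shape of a 2-D conv output.

    Hypothetical helper for illustration only; assumes NHWC layout and that
    strides[0] == strides[3] == 1.
    """
    batch, in_h, in_w, _ = input_shape
    f_h, f_w = filter_hw
    _, s_h, s_w, _ = strides

    if padding == "SAME":
        # The input is padded so that every stride step yields an output.
        out_h = math.ceil(in_h / s_h)
        out_w = math.ceil(in_w / s_w)
    elif padding == "VALID":
        # No padding: the window must fit entirely inside the input.
        out_h = math.ceil((in_h - f_h + 1) / s_h)
        out_w = math.ceil((in_w - f_w + 1) / s_w)
    else:
        raise ValueError("padding must be 'SAME' or 'VALID'")

    return [batch, out_h, out_w, out_channels]


# The 32x32 greyscale example: 2x2 window, 8 output channels.
print(conv2d_output_shape([1, 32, 32, 1], (2, 2), 8, [1, 1, 1, 1], "SAME"))
print(conv2d_output_shape([1, 32, 32, 1], (2, 2), 8, [1, 2, 2, 1], "SAME"))
```

The first call reproduces the [1, 32, 32, 8] shape; with a stride of 2 in height and width, each spatial dimension is halved.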

The filter will first create an output for:

F(00 01
  10 11)

And then for:

F(01 02
  11 12)

and so on. Then it will move to the second row, calculating:

F(10, 11
  20, 21)

then

F(11, 12
  21, 22)

If you specify a stride of [1, 2, 2, 1], the 2x2 windows won't overlap. It will compute:

F(00, 01
  10, 11)

and then

F(02, 03
  12, 13)
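The window positions above can be listed with a small plain-Python sketch. The helper window_patches is hypothetical and only enumerates the patches that F would see; it uses no padding, so it shows the stride pattern rather than the exact edge handling of padding='SAME'.

```python
def window_patches(grid, size, stride):
    """Yield size x size patches, moving `stride` cells per step, no padding."""
    rows, cols = len(grid), len(grid[0])
    for r in range(0, rows - size + 1, stride):
        for c in range(0, cols - size + 1, stride):
            yield [row[c:c + size] for row in grid[r:r + size]]


# A 4x4 corner of the image, labelled "<row><col>" as in the text.
grid = [[f"{r}{c}" for c in range(4)] for r in range(4)]

stride1 = list(window_patches(grid, size=2, stride=1))  # overlapping windows
stride2 = list(window_patches(grid, size=2, stride=2))  # non-overlapping windows

for patch in stride2:
    print(patch)
```

On this 4x4 corner, stride 1 visits nine overlapping patches, while stride 2 visits four disjoint ones, starting with (00 01 / 10 11) and then (02 03 / 12 13).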

The stride operates similarly for the pooling operators.
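As a minimal illustration of that, here is a plain-Python max pool over a single 2-D image. It is a sketch of the idea, not how tf.nn.max_pool is implemented; max_pool_2d is a made-up name, and it assumes no padding and one channel.

```python
def max_pool_2d(image, size, stride):
    """Max-pool a 2-D list of numbers with a size x size window, no padding."""
    rows, cols = len(image), len(image[0])
    return [
        [
            # Largest value inside the window whose top-left corner is (r, c).
            max(image[r + dr][c + dc] for dr in range(size) for dc in range(size))
            for c in range(0, cols - size + 1, stride)
        ]
        for r in range(0, rows - size + 1, stride)
    ]


image = [
    [1, 2, 3, 4],
    [5, 6, 7, 8],
    [9, 10, 11, 12],
    [13, 14, 15, 16],
]

print(max_pool_2d(image, size=2, stride=2))  # non-overlapping 2x2 windows
print(max_pool_2d(image, size=2, stride=1))  # overlapping 2x2 windows
```

With stride 2 the 4x4 image shrinks to 2x2; with stride 1 the windows overlap and the result is 3x3.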

Question 2: Why strides [1, x, y, 1] for convnets

The first 1 is the batch: You don't usually want to skip over examples in your batch, or you shouldn't have included them in the first place. :)

The last 1 is the depth of the convolution: You don't usually want to skip inputs, for the same reason.

The conv2d operator is more general, so you could create convolutions that slide the window along other dimensions, but that's not a typical use in convnets. The typical use is to use them spatially.

Why reshape to -1

-1 is a placeholder that says "adjust as necessary to match the size needed for the full tensor": reshape infers that one dimension from the total number of elements. It's a way of making the code independent of the input batch size, so that you can change your pipeline and not have to adjust the batch size everywhere in the code.
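The arithmetic behind -1 can be sketched in plain Python. resolve_shape is a hypothetical helper, not a TensorFlow function: it divides the total element count by the product of the known dimensions. The batch of 100 flattened 28x28 images is an assumed input for illustration.

```python
def resolve_shape(total_elements, shape):
    """Replace a single -1 in `shape` with the size implied by total_elements."""
    if shape.count(-1) > 1:
        raise ValueError("at most one dimension may be -1")

    # Product of all explicitly given dimensions.
    known = 1
    for dim in shape:
        if dim != -1:
            known *= dim

    if -1 not in shape:
        if known != total_elements:
            raise ValueError("shape does not match the number of elements")
        return list(shape)

    if known == 0 or total_elements % known != 0:
        raise ValueError("cannot infer the -1 dimension")

    return [total_elements // known if dim == -1 else dim for dim in shape]


# Assumed input: a batch of 100 flattened 28x28 greyscale images.
print(resolve_shape(100 * 28 * 28, [-1, 28, 28, 1]))
```

Changing the batch from 100 to any other size changes only the inferred first dimension; the [-1, 28, 28, 1] in the code stays the same.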