python如何用零填充numpy数组

I want to know how I can pad a 2D numpy array with zeros using python 2.6.6 with numpy version 1.5.0. Sorry! But these are my limitations. Therefore I cannot use np.pad. For example, I want to pad a with zeros such that its shape matches b. The reason why I want to do this is so I can do:

我想知道如何使用python 2.6.6和numpy版本1.5.0将零numpy数组填充零。抱歉!但这些是我的局限。因此我不能使用np.pad。例如，我想用零填充a，使其形状与b匹配。我想这样做的原因是我能做到：

b-a

such that

这样的

>>> a
array([[ 1.,  1.,  1.,  1.,  1.],
       [ 1.,  1.,  1.,  1.,  1.],
       [ 1.,  1.,  1.,  1.,  1.]])
>>> b
array([[ 3.,  3.,  3.,  3.,  3.,  3.],
       [ 3.,  3.,  3.,  3.,  3.,  3.],
       [ 3.,  3.,  3.,  3.,  3.,  3.],
       [ 3.,  3.,  3.,  3.,  3.,  3.]])
>>> c
array([[1, 1, 1, 1, 1, 0],
       [1, 1, 1, 1, 1, 0],
       [1, 1, 1, 1, 1, 0],
       [0, 0, 0, 0, 0, 0]])

The only way I can think of doing this is appending, however this seems pretty ugly. is there a cleaner solution possibly using b.shape?

我能想到的唯一方法就是追加，但这看起来很难看。是否有一个更清洁的解决方案可能使用b.shape？

Edit, Thank you to MSeiferts answer. I had to clean it up a bit, and this is what I got:

编辑，谢谢MSeiferts的回答。我不得不把它清理一下，这就是我得到的：

def pad(array, reference_shape, offsets):
    """
    array: Array to be padded
    reference_shape: tuple of size of ndarray to create
    offsets: list of offsets (number of elements must be equal to the dimension of the array)
    will throw a ValueError if offsets is too big and the reference_shape cannot handle the offsets
    """

    # Create an array of zeros with the reference shape
    result = np.zeros(reference_shape)
    # Create a list of slices from offset to offset + shape in each dimension
    insertHere = [slice(offsets[dim], offsets[dim] + array.shape[dim]) for dim in range(array.ndim)]
    # Insert the array in the result at the specified offsets
    result[insertHere] = array
    return result

4 个解决方案

#1

Very simple, you create an array containing zeros using the reference shape:

很简单，使用参考形状创建一个包含零的数组：

result = np.zeros(b.shape)
# actually you can also use result = np.zeros_like(b) 
# but that also copies the dtype not only the shape

and then insert the array where you need it:

然后将数组插入所需的位置：

result[:a.shape[0],:a.shape[1]] = a

and voila you have padded it:

瞧，你填补了它：

print(result)
array([[ 1.,  1.,  1.,  1.,  1.,  0.],
       [ 1.,  1.,  1.,  1.,  1.,  0.],
       [ 1.,  1.,  1.,  1.,  1.,  0.],
       [ 0.,  0.,  0.,  0.,  0.,  0.]])

You can also make it a bit more general if you define where your upper left element should be inserted

如果定义应该插入左上角元素的位置，也可以使它更通用一些

result = np.zeros_like(b)
x_offset = 1  # 0 would be what you wanted
y_offset = 1  # 0 in your case
result[x_offset:a.shape[0]+x_offset,y_offset:a.shape[1]+y_offset] = a
result

array([[ 0.,  0.,  0.,  0.,  0.,  0.],
       [ 0.,  1.,  1.,  1.,  1.,  1.],
       [ 0.,  1.,  1.,  1.,  1.,  1.],
       [ 0.,  1.,  1.,  1.,  1.,  1.]])

but then be careful that you don't have offsets bigger than allowed. For x_offset = 2 for example this will fail.

但是要小心你没有超过允许的偏移量。例如，对于x_offset = 2，这将失败。

If you have an arbitary number of dimensions you can define a list of slices to insert the original array. I've found it interesting to play around a bit and created a padding function that can pad (with offset) an arbitary shaped array as long as the array and reference have the same number of dimensions and the offsets are not too big.

如果您有任意数量的维度，则可以定义切片列表以插入原始数组。我发现有趣的是玩一下并创建一个填充函数，可以填充（带偏移）一个任意形状的数组，只要数组和引用具有相同的维数并且偏移量不是太大。

def pad(array, reference, offsets):
    """
    array: Array to be padded
    reference: Reference array with the desired shape
    offsets: list of offsets (number of elements must be equal to the dimension of the array)
    """
    # Create an array of zeros with the reference shape
    result = np.zeros(reference.shape)
    # Create a list of slices from offset to offset + shape in each dimension
    insertHere = [slice(offset[dim], offset[dim] + array.shape[dim]) for dim in range(a.ndim)]
    # Insert the array in the result at the specified offsets
    result[insertHere] = a
    return result

And some test cases:

还有一些测试用例：

import numpy as np

# 1 Dimension
a = np.ones(2)
b = np.ones(5)
offset = [3]
pad(a, b, offset)

# 3 Dimensions

a = np.ones((3,3,3))
b = np.ones((5,4,3))
offset = [1,0,0]
pad(a, b, offset)

#2

NumPy 1.7 (when np.pad was added) is pretty old now (it was released in 2013) so even though the question asked for a way without that function I thought it could be useful to know how that could be achieved using np.pad.

NumPy 1.7（当添加np.pad时）现在已经很老了（它于2013年发布）所以即使问题要求没有该功能的方法，我认为知道如何使用np.pad实现这一目标会很有用。

It's actually pretty simple:

它实际上非常简单：

>>> import numpy as np
>>> a = np.array([[ 1.,  1.,  1.,  1.,  1.],
...               [ 1.,  1.,  1.,  1.,  1.],
...               [ 1.,  1.,  1.,  1.,  1.]])
>>> np.pad(a, [(0, 1), (0, 1)], mode='constant')
array([[ 1.,  1.,  1.,  1.,  1.,  0.],
       [ 1.,  1.,  1.,  1.,  1.,  0.],
       [ 1.,  1.,  1.,  1.,  1.,  0.],
       [ 0.,  0.,  0.,  0.,  0.,  0.]])

In this case I used that 0 is the default value for mode='constant'. But it could also be specified by passing it in explicitly:

在这种情况下，我使用0表示mode ='constant'的默认值。但它也可以通过显式传递来指定：

>>> np.pad(a, [(0, 1), (0, 1)], mode='constant', constant_values=0)
array([[ 1.,  1.,  1.,  1.,  1.,  0.],
       [ 1.,  1.,  1.,  1.,  1.,  0.],
       [ 1.,  1.,  1.,  1.,  1.,  0.],
       [ 0.,  0.,  0.,  0.,  0.,  0.]])

Just in case the second argument ([(0, 1), (0, 1)]) seems confusing: Each list item (in this case tuple) corresponds to a dimension and item therein represents the padding before (first element) and after (second element). So:

以防第二个参数（[（0,1），（0,1）]）似乎令人困惑：每个列表项（在本例中为元组）对应于一个维度，其中的项目表示之前的填充（第一个元素）和之后（第二个要素）。所以：

[(0, 1), (0, 1)]
         ^^^^^^------ padding for second dimension
 ^^^^^^-------------- padding for first dimension

  ^------------------ no padding at the beginning of the first axis
     ^--------------- pad with one "value" at the end of the first axis.

In this case the padding for the first and second axis are identical, so one could also just pass in the 2-tuple:

在这种情况下，第一和第二轴的填充是相同的，所以也可以只传入2元组：

>>> np.pad(a, (0, 1), mode='constant')
array([[ 1.,  1.,  1.,  1.,  1.,  0.],
       [ 1.,  1.,  1.,  1.,  1.,  0.],
       [ 1.,  1.,  1.,  1.,  1.,  0.],
       [ 0.,  0.,  0.,  0.,  0.,  0.]])

In case the padding before and after is identical one could even omit the tuple (not applicable in this case though):

如果之前和之后的填充相同，甚至可以省略元组（虽然在这种情况下不适用）：

>>> np.pad(a, 1, mode='constant')
array([[ 0.,  0.,  0.,  0.,  0.,  0.,  0.],
       [ 0.,  1.,  1.,  1.,  1.,  1.,  0.],
       [ 0.,  1.,  1.,  1.,  1.,  1.,  0.],
       [ 0.,  1.,  1.,  1.,  1.,  1.,  0.],
       [ 0.,  0.,  0.,  0.,  0.,  0.,  0.]])

Or if the padding before and after is identical but different for the axis, you could also omit the second argument in the inner tuples:

或者，如果前后填充相同但轴不同，则还可以省略内部元组中的第二个参数：

>>> np.pad(a, [(1, ), (2, )], mode='constant')
array([[ 0.,  0.,  0.,  0.,  0.,  0.,  0.,  0.,  0.],
       [ 0.,  0.,  1.,  1.,  1.,  1.,  1.,  0.,  0.],
       [ 0.,  0.,  1.,  1.,  1.,  1.,  1.,  0.,  0.],
       [ 0.,  0.,  1.,  1.,  1.,  1.,  1.,  0.,  0.],
       [ 0.,  0.,  0.,  0.,  0.,  0.,  0.,  0.,  0.]])

However I tend to prefer to always use the explicit one, because it's just to easy to make mistakes (when NumPys expectations differ from your intentions):

但是我倾向于总是使用明确的一个，因为它很容易出错（当NumPys的期望与你的意图不同时）：

>>> np.pad(a, [1, 2], mode='constant')
array([[ 0.,  0.,  0.,  0.,  0.,  0.,  0.,  0.],
       [ 0.,  1.,  1.,  1.,  1.,  1.,  0.,  0.],
       [ 0.,  1.,  1.,  1.,  1.,  1.,  0.,  0.],
       [ 0.,  1.,  1.,  1.,  1.,  1.,  0.,  0.],
       [ 0.,  0.,  0.,  0.,  0.,  0.,  0.,  0.],
       [ 0.,  0.,  0.,  0.,  0.,  0.,  0.,  0.]])

Here NumPy thinks you wanted to pad all axis with 1 element before and 2 elements after each axis! Even if you intended it to pad with 1 element in axis 1 and 2 elements for axis 2.

在这里，NumPy认为你想要在所有轴之前填充1个元素，在每个轴之后填充2个元素！即使您打算在轴1中填充1个元素，也为轴2填充2个元素。

I used lists of tuples for the padding, note that this is just "my convention", you could also use lists of lists or tuples of tuples, or even tuples of arrays. NumPy just checks the length of the argument (or if it doesn't have a length) and the length of each item (or if it has a length)!

我使用了元组列表作为填充，注意这只是“我的约定”，你也可以使用列表或元组元组，甚至是元组元组。 NumPy只检查参数的长度（或者如果没有长度）和每个项目的长度（或者如果它有长度）！

#3

I understand that your main problem is that you need to calculate d=b-a but your arrays have different sizes. There is no need for an intermediate padded c

我知道你的主要问题是你需要计算d = b-a，但你的数组有不同的大小。不需要中间衬垫c

You can solve this without padding:

你可以不用填充来解决这个问题：

import numpy as np

a = np.array([[ 1.,  1.,  1.,  1.,  1.],
              [ 1.,  1.,  1.,  1.,  1.],
              [ 1.,  1.,  1.,  1.,  1.]])

b = np.array([[ 3.,  3.,  3.,  3.,  3.,  3.],
              [ 3.,  3.,  3.,  3.,  3.,  3.],
              [ 3.,  3.,  3.,  3.,  3.,  3.],
              [ 3.,  3.,  3.,  3.,  3.,  3.]])

d = b.copy()
d[:a.shape[0],:a.shape[1]] -=  a

print d

Output:

输出：

[[ 2.  2.  2.  2.  2.  3.]
 [ 2.  2.  2.  2.  2.  3.]
 [ 2.  2.  2.  2.  2.  3.]
 [ 3.  3.  3.  3.  3.  3.]]

#4

In case you need to add a fence of 1s to an array:

如果您需要向数组添加1s的栅栏：

>>> mat = np.zeros((4,4), np.int32)
>>> mat
array([[0, 0, 0, 0],
       [0, 0, 0, 0],
       [0, 0, 0, 0],
       [0, 0, 0, 0]])
>>> mat[0,:] = mat[:,0] = mat[:,-1] =  mat[-1,:] = 1
>>> mat
array([[1, 1, 1, 1],
       [1, 0, 0, 1],
       [1, 0, 0, 1],
       [1, 1, 1, 1]])

#1