
Time: 2023-01-28 12:32:13

I have an array of values, and I want to replace each one with the value from an array of choices that is linearly closest to it.


The catch is that the size of choices is only known at runtime.


import numpy as np
a = np.array([[0, 0, 0], [4, 4, 4], [9, 9, 9]])
choices = np.array([1, 5, 10])

If choices were static in size, I would simply use np.where:


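# If choices[1] is closer than choices[0], compare choices[1] with choices[2];
# otherwise compare choices[0] with choices[2].
d = np.where(np.abs(a - choices[0]) > np.abs(a - choices[1]),
             np.where(np.abs(a - choices[1]) > np.abs(a - choices[2]), choices[2], choices[1]),
             np.where(np.abs(a - choices[0]) > np.abs(a - choices[2]), choices[2], choices[0]))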

To get the output:


>>[[1, 1, 1], [5, 5, 5], [10, 10, 10]]

Is there a way to do this more dynamically while still preserving the vectorization?


3 solutions



Subtract choices from a, take the absolute value, find the index of the minimum of the result, and substitute.


import numpy as np
a = np.array([[0, 0, 0], [4, 4, 4], [9, 9, 9]])
choices = np.array([1, 5, 10])
b = np.abs(a[:,:,None] - choices)  # absolute distance to every choice
i = np.argmin(b, axis = -1)        # index of the closest choice
a = choices[i]
print(a)

[[ 1  1  1]
 [ 5  5  5]
 [10 10 10]]

a = np.array([[0, 3, 0], [4, 8, 4], [9, 1, 9]])
choices = np.array([1, 5, 10])
b = np.abs(a[:,:,None] - choices)
i = np.argmin(b, axis = -1)
a = choices[i]
print(a)

[[ 1  1  1]
 [ 5 10  5]
 [10  1 10]]

The extra dimension was added to a so that each element of choices is subtracted from each element of a. choices is broadcast against a along the third dimension, so b.shape is (3, 3, 3). EricsBroadcastingDoc is a pretty good explanation of broadcasting and has a graphic 3-d example at the end.
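
To make the shapes concrete, here is a minimal sketch using the arrays from the first example. The names expanded and diff are only for illustration and do not appear in the answer above.

```python
import numpy as np

a = np.array([[0, 0, 0], [4, 4, 4], [9, 9, 9]])
choices = np.array([1, 5, 10])

expanded = a[:, :, None]            # shape (3, 3, 1): a trailing axis of length 1
diff = np.abs(expanded - choices)   # (3, 3, 1) against (3,) broadcasts to (3, 3, 3)

print(expanded.shape)  # (3, 3, 1)
print(diff.shape)      # (3, 3, 3)
print(diff[1, 0])      # distances from a[1, 0] == 4 to each choice: [3 1 6]
```

The length-1 axis is stretched to match len(choices), which is why the approach works for any number of choices.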

For the second example:


>>> print(b)
[[[ 1  5 10]
  [ 2  2  7]
  [ 1  5 10]]

 [[ 3  1  6]
  [ 7  3  2]
  [ 3  1  6]]

 [[ 8  4  1]
  [ 0  4  9]
  [ 8  4  1]]]
>>> print(i)
[[0 0 0]
 [1 2 1]
 [2 0 2]]

The final assignment uses an index array, also known as integer array indexing.


In the second example, notice that there was a tie for element a[0,1]: either one or five could have been substituted.
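
As a small sketch of how that tie is resolved in practice: np.argmin returns the index of the first minimum it finds, so the earlier entry in choices wins. The names distances and winner are illustrative only.

```python
import numpy as np

choices = np.array([1, 5, 10])
distances = np.abs(3 - choices)   # [2 2 7]: 1 and 5 are equally close to 3

# np.argmin returns the first index of the minimum, so the tie goes to choices[0]
winner = choices[np.argmin(distances)]
print(winner)  # 1
```

If a different tie-breaking rule is needed, reordering choices is one way to change which value is preferred.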



To explain wwii's excellent answer in a little more detail:


The idea is to create a new dimension which does the job of comparing each element of a to each element in choices using numpy broadcasting. This is easily done for an arbitrary number of dimensions in a using the ellipsis syntax:


>>> b = np.abs(a[..., np.newaxis] - choices)
>>> b
array([[[ 1,  5, 10],
        [ 1,  5, 10],
        [ 1,  5, 10]],
       [[ 3,  1,  6],
        [ 3,  1,  6],
        [ 3,  1,  6]],
       [[ 8,  4,  1],
        [ 8,  4,  1],
        [ 8,  4,  1]]])

Taking argmin along the axis you just created (the last axis, index -1) gives you the index in choices of the value you want to substitute:


>>> np.argmin(b, axis=-1)
array([[0, 0, 0],
       [1, 1, 1],
       [2, 2, 2]])

Which finally allows you to choose those elements from choices:


>>> d = choices[np.argmin(b, axis=-1)]
>>> d
array([[ 1,  1,  1],
       [ 5,  5,  5],
       [10, 10, 10]])
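
Putting the three steps together, a minimal sketch of a reusable helper might look like the following. The function name nearest_choice is hypothetical; the body is just the ellipsis-based broadcasting shown above, so it should accept an input of any dimensionality.

```python
import numpy as np

def nearest_choice(a, choices):
    """Replace each element of `a` with the closest value in `choices`."""
    a = np.asarray(a)
    choices = np.asarray(choices)
    # New trailing axis of length len(choices) holding every |a - choice| distance
    distances = np.abs(a[..., np.newaxis] - choices)
    return choices[np.argmin(distances, axis=-1)]

print(nearest_choice([0, 4, 9], [1, 5, 10]))                  # 1-D input
print(nearest_choice(np.arange(8).reshape(2, 2, 2), [1, 5]))  # 3-D input
```

The result always has the same shape as a, because argmin removes exactly the axis that was added.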

For a non-symmetric shape:


Let's say a had shape (2, 5):


>>> a = np.arange(10).reshape((2, 5))
>>> a
array([[0, 1, 2, 3, 4],
       [5, 6, 7, 8, 9]])

Then you'd get:


>>> b = np.abs(a[..., np.newaxis] - choices)
>>> b
array([[[ 1,  5, 10],
        [ 0,  4,  9],
        [ 1,  3,  8],
        [ 2,  2,  7],
        [ 3,  1,  6]],

       [[ 4,  0,  5],
        [ 5,  1,  4],
        [ 6,  2,  3],
        [ 7,  3,  2],
        [ 8,  4,  1]]])

This is hard to read, but the key point is that b has shape:


>>> b.shape
(2, 5, 3)

The first two dimensions came from the shape of a, which is also (2, 5). The last dimension is the one you just created. To get a better idea:


>>> b[:, :, 0]  # = abs(a - 1)
array([[1, 0, 1, 2, 3],
       [4, 5, 6, 7, 8]])
>>> b[:, :, 1]  # = abs(a - 5)
array([[5, 4, 3, 2, 1],
       [0, 1, 2, 3, 4]])
>>> b[:, :, 2]  # = abs(a - 10)
array([[10,  9,  8,  7,  6],
       [ 5,  4,  3,  2,  1]])

Note how b[:, :, i] is the absolute difference between a and choices[i], for each i = 0, 1, 2.
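
This relationship can be checked directly. The sketch below rebuilds b for the (2, 5) example and compares every slice along the last axis with the corresponding plain difference.

```python
import numpy as np

a = np.arange(10).reshape((2, 5))
choices = np.array([1, 5, 10])
b = np.abs(a[..., np.newaxis] - choices)

# Each slice along the last axis matches the distance to one choice
for i, c in enumerate(choices):
    assert np.array_equal(b[..., i], np.abs(a - c))

print(b.shape)  # (2, 5, 3)
```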

Hope that helps explain this a little more clearly.




I love broadcasting and would have gone that way myself too. But for large arrays, I would suggest another approach based on np.searchsorted that is more memory efficient and therefore faster, like so:

def searchsorted_app(a, choices):
    lidx = np.searchsorted(choices, a, 'left').clip(max=choices.size-1)
    ridx = (np.searchsorted(choices, a, 'right')-1).clip(min=0)
    cl = np.take(choices,lidx) # Or choices[lidx]
    cr = np.take(choices,ridx) # Or choices[ridx]
    mask = np.abs(a - cl) > np.abs(a - cr)
    cl[mask] = cr[mask]
    return cl

Please note that if the elements in choices are not sorted, we need to pass the additional sorter argument to np.searchsorted.
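
A possible way to handle unsorted choices is sketched below. The helper name searchsorted_unsorted is hypothetical, and it assumes the sort order is obtained with np.argsort. With sorter supplied, np.searchsorted returns positions in the sorted order, so they have to be mapped back through the same index array.

```python
import numpy as np

def searchsorted_unsorted(a, choices):
    """Nearest-choice lookup when `choices` is not sorted (hypothetical helper)."""
    order = np.argsort(choices)           # indices that would sort choices
    n = choices.size
    # With sorter=order, the returned positions refer to the sorted order
    lpos = np.searchsorted(choices, a, 'left', sorter=order).clip(max=n - 1)
    rpos = (np.searchsorted(choices, a, 'right', sorter=order) - 1).clip(min=0)
    cl = choices[order[lpos]]             # map sorted positions back to choices
    cr = choices[order[rpos]]
    # Keep the right-hand neighbour unless the left-hand one is strictly closer
    return np.where(np.abs(a - cl) > np.abs(a - cr), cr, cl)

a = np.array([[0, 0, 0], [4, 4, 4], [9, 9, 9]])
choices = np.array([10, 1, 5])            # deliberately unsorted
print(searchsorted_unsorted(a, choices))
```

Simply calling np.sort(choices) once up front and reusing searchsorted_app is an equally reasonable option when the original order of choices does not matter.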


Runtime test:

In [160]: # Setup inputs
     ...: a = np.random.rand(100,100)
     ...: choices = np.sort(np.random.rand(100))

In [161]: def broadcasting_app(a, choices): # @wwii's solution
     ...:     return choices[np.argmin(np.abs(a[:,:,None] - choices),-1)]

In [162]: np.allclose(broadcasting_app(a,choices),searchsorted_app(a,choices))
Out[162]: True

In [163]: %timeit broadcasting_app(a, choices)
100 loops, best of 3: 9.3 ms per loop

In [164]: %timeit searchsorted_app(a, choices)
1000 loops, best of 3: 1.78 ms per loop
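
The memory side of this comparison can be illustrated with a rough sketch, assuming the same input sizes as the timing setup above. The broadcasting approach materializes one distance per (element, choice) pair, while the searchsorted approach only ever holds arrays the size of a.

```python
import numpy as np

a = np.random.rand(100, 100)
choices = np.sort(np.random.rand(100))

# Broadcasting materializes one distance per (element, choice) pair
intermediate = np.abs(a[:, :, None] - choices)
print(intermediate.shape)                 # (100, 100, 100)
print(intermediate.nbytes // a.nbytes)    # 100, i.e. len(choices) times the size of a
```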

Related post: Find elements of array one nearest to elements of array two




