Finding the index of a numpy array in a list

Time: 2023-02-08 15:42:06
import numpy as np
foo = [1, "hello", np.array([[1,2,3]]) ]

I would expect

foo.index( np.array([[1,2,3]]) ) 

to return

2

but instead I get

ValueError: The truth value of an array with more than one element is ambiguous. Use a.any() or a.all()

Is there anything better than my current solution? It seems inefficient.

def find_index_of_array(list, array):
    for i in range(len(list)):
        if np.all(list[i]==array):
            return i

find_index_of_array(foo, np.array([[1,2,3]]) )
# 2

5 solutions

#1


11  

The error occurs because numpy's ndarray overrides == to return an array rather than a boolean.

AFAIK, there is no simple solution here. The following will work so long as the np.all(val == array) bit works.

next((i for i, val in enumerate(lst) if np.all(val == array)), -1)

Whether that bit works depends critically on what the other elements in the list are and whether they can be compared with numpy arrays.
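
One case where the np.all(val == array) bit can quietly go wrong is broadcasting: a plain scalar in the list may "match" an array whose entries all equal that scalar. A small sketch of this, using made-up data (items and target are only for illustration), with np.array_equal as a stricter comparison that also checks the shape:

```python
import numpy as np

# Hypothetical data for illustration: the target holds only ones,
# and the list also contains the plain integer 1.
target = np.array([[1, 1, 1]])
items = [1, np.array([[1, 1, 1]])]

# np.all(val == target) broadcasts the scalar 1 against the array,
# so the integer at position 0 is reported as a match.
broadcast_hit = next(
    (i for i, val in enumerate(items) if np.all(val == target)), -1
)

# np.array_equal requires identical shapes, so only the real array matches.
strict_hit = next(
    (i for i, val in enumerate(items) if np.array_equal(val, target)), -1
)

print(broadcast_hit, strict_hit)
```

Which of the two behaviours you want depends on whether broadcasting counts as "equal" for your data.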

#2


2  

For performance, you might want to process only the NumPy arrays in the input list. So, we could type-check before going into the loop and index into the elements that are arrays.

Thus, an implementation would be:

def find_index_of_array_v2(list1, array1):
    idx = np.nonzero([type(i).__module__ == np.__name__ for i in list1])[0]
    for i in idx:
        if np.all(list1[i]==array1):
            return i
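
A possible variant of the same idea, in case it helps: instead of comparing module names, check isinstance(item, np.ndarray) inside the loop and compare with np.array_equal. The function name find_index_of_array_v3 and the -1 return value for "not found" are my own choices here, not part of the answer above.

```python
import numpy as np

def find_index_of_array_v3(items, target):
    """Return the index of the first ndarray in items equal to target, or -1."""
    for i, item in enumerate(items):
        # Non-array elements are skipped without being compared to the target.
        if isinstance(item, np.ndarray) and np.array_equal(item, target):
            return i
    return -1

foo = [1, "hello", np.array([[1, 2, 3]])]
found = find_index_of_array_v3(foo, np.array([[1, 2, 3]]))
missing = find_index_of_array_v3(foo, np.array([[4, 5, 6]]))
print(found, missing)
```

Note that isinstance only accepts real arrays (and subclasses), whereas the module-name check also lets numpy scalars such as np.float64 through.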

#3


2  

How about this one?

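arr = np.array([[1,2,3]])
# np.object was removed in NumPy 1.24; the builtin object works as the dtype
foo = np.array([1, 'hello', arr], dtype=object)

# if foo array is of heterogeneous elements (str, int, array):
# positions of every element that is a numpy array (type check only)
[idx for idx, el in enumerate(foo) if type(el) == type(arr)]

# if foo array has only numpy arrays in it:
# positions of the elements equal to arr in both shape and values
[idx for idx, el in enumerate(foo) if np.array_equal(el, arr)]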

Output:

[2]

Note: This will also work even if foo is a list. I just put it as a numpy array here.

#4


2  

The issue here (you probably know already but just to repeat it) is that list.index works along the lines of:

for idx, item in enumerate(your_list):
    if item == wanted_item:
        return idx

The line if item == wanted_item is the problem, because it implicitly converts the result of item == wanted_item to a boolean. For a numpy.ndarray with more than one element that conversion raises this ValueError:

ValueError: The truth value of an array with more than one element is ambiguous. Use a.any() or a.all()
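
To see the mechanism in isolation, here is a small sketch (the list arrays is just example data): the == comparison itself succeeds and returns an element-wise array, and it is only list.index asking for the truth value of that array that fails.

```python
import numpy as np

arrays = [np.zeros(3), np.ones(3)]

# The comparison works and gives one boolean per element.
elementwise = arrays[0] == np.ones(3)

# list.index needs a single True/False per item, which the array cannot give.
try:
    arrays.index(np.ones(3))
    index_error = None
except ValueError as exc:
    index_error = str(exc)

print(elementwise)
print(index_error)
```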

Solution 1: adapter (thin wrapper) class

I generally use a thin wrapper (adapter) around numpy.ndarray whenever I need to use python functions like list.index:

class ArrayWrapper(object):

    __slots__ = ["_array"]  # minimizes the memory footprint of the class.

    def __init__(self, array):
        self._array = array

    def __eq__(self, other_array):
        # array_equal also makes sure the shape is identical!
        # If you don't mind broadcasting you can also use
        # np.all(self._array == other_array)
        return np.array_equal(self._array, other_array)

    def __array__(self):
        # This makes sure that `np.asarray` works and quite fast.
        return self._array

    def __repr__(self):
        return repr(self._array)

These thin wrappers are more expensive than manually using some enumerate loop or comprehension but you don't have to re-implement the python functions. Assuming the list contains only numpy-arrays (otherwise you need to do some if ... else ... checking):

list_of_wrapped_arrays = [ArrayWrapper(arr) for arr in list_of_arrays]

After this step you can use all your python functions on this list:

>>> list_of_arrays = [np.ones((3, 3)), np.ones((3)), np.ones((3, 3)) * 2, np.ones((3))]
>>> list_of_wrapped_arrays = [ArrayWrapper(arr) for arr in list_of_arrays]
>>> list_of_wrapped_arrays.index(np.ones((3,3)))
0
>>> list_of_wrapped_arrays.index(np.ones((3)))
1

These wrappers are not numpy arrays anymore, but because they are thin the extra list is quite small. Depending on your needs you could keep both the wrapped list and the original list and choose which one to operate on. For example, you can now also use list.count to count identical arrays:

>>> list_of_wrapped_arrays.count(np.ones((3)))
2

or list.remove:

>>> list_of_wrapped_arrays.remove(np.ones((3)))
>>> list_of_wrapped_arrays
[array([[ 1.,  1.,  1.],
        [ 1.,  1.,  1.],
        [ 1.,  1.,  1.]]), 
 array([[ 2.,  2.,  2.],
        [ 2.,  2.,  2.],
        [ 2.,  2.,  2.]]), 
 array([ 1.,  1.,  1.])]
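
For a mixed list like the foo from the question, the if ... else ... check mentioned above could look roughly like the sketch below. This is only one way to do it and makes a few assumptions of its own: the helper wrap is a made-up name, the wrapper is reduced to the parts needed here, and the search target is wrapped as well so that the int and str elements are compared through the wrapper's __eq__ rather than through ndarray's.

```python
import numpy as np

class ArrayWrapper:
    """Reduced version of the adapter above, kept minimal for this sketch."""

    __slots__ = ["_array"]

    def __init__(self, array):
        self._array = array

    def __eq__(self, other):
        # Unwrap other wrappers so two wrapped arrays can be compared directly.
        if isinstance(other, ArrayWrapper):
            other = other._array
        return np.array_equal(self._array, other)

def wrap(item):
    # Hypothetical helper: wrap arrays, leave every other element untouched.
    return ArrayWrapper(item) if isinstance(item, np.ndarray) else item

foo = [1, "hello", np.array([[1, 2, 3]])]
wrapped = [wrap(item) for item in foo]

# The target is wrapped too, so 1 == target and "hello" == target
# fall back to ArrayWrapper.__eq__ and simply return False.
position = wrapped.index(ArrayWrapper(np.array([[1, 2, 3]])))
print(position)
```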

Solution 2: subclass and ndarray.view

This approach uses an explicit subclass of numpy.ndarray. It has the advantage that you get all builtin array functionality and only modify the requested operation (which would be __eq__):

class ArrayWrapper(np.ndarray):
    def __eq__(self, other_array):
        return np.array_equal(self, other_array)

>>> your_list = [np.ones(3), np.ones(3)*2, np.ones(3)*3, np.ones(3)*4]

>>> view_list = [arr.view(ArrayWrapper) for arr in your_list]

>>> view_list.index(np.array([2,2,2]))
1

Again you get most list methods this way: list.remove, list.count besides list.index.

However, this approach may yield subtle behaviour if some operation implicitly uses __eq__. You can always re-interpret it as a plain numpy array by using np.asarray or .view(np.ndarray):

>>> view_list[1]
ArrayWrapper([ 2.,  2.,  2.])

>>> view_list[1].view(np.ndarray)
array([ 2.,  2.,  2.])

>>> np.asarray(view_list[1])
array([ 2.,  2.,  2.])
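
One thing worth keeping in mind with the view approach is that a view shares its data with the original array, so the view list costs no extra copies but also sees later changes to the originals. A self-contained sketch of that (the class name EqArray is just a placeholder to keep it separate from the ArrayWrapper above):

```python
import numpy as np

class EqArray(np.ndarray):
    # Placeholder name for this sketch; same idea as the subclass above.
    def __eq__(self, other):
        return np.array_equal(self, other)

arrays = [np.ones(3), np.ones(3) * 2]
views = [a.view(EqArray) for a in arrays]

# list.index works because __eq__ now returns a single boolean.
found = views.index(np.array([2, 2, 2]))

# The view and the original array use the same memory.
shares = np.shares_memory(views[1], arrays[1])

# Writing through the original is therefore visible in the view.
arrays[1][0] = 5.0
changed = bool(views[1][0] == 5.0)

print(found, shares, changed)
```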

Alternative: Overriding __bool__ (or __nonzero__ for python 2)

Instead of fixing the problem in the __eq__ method you could also override __bool__ or __nonzero__:

class ArrayWrapper(np.ndarray):
    # This could also be done in the adapter solution.
    def __bool__(self):
        return bool(np.all(self))

    __nonzero__ = __bool__

Again this makes list.index work as intended:

>>> your_list = [np.ones(3), np.ones(3)*2, np.ones(3)*3, np.ones(3)*4]
>>> view_list = [arr.view(ArrayWrapper) for arr in your_list]
>>> view_list.index(np.array([2,2,2]))
1

But this will definitely modify more behaviour! For example:

>>> if np.array([1,2,3]).view(ArrayWrapper):
...     print('that was previously impossible!')
that was previously impossible!

#5


0  

This should do the job:

[i for i,j in enumerate(foo) if j.__class__.__name__=='ndarray']
[2]
