m smallest values from an upper triangular matrix, with their indices, as a list of tuples

Time: 2022-09-28 18:06:36

I have an np.ndarray as follows:

[[ inf   1.   3.   2.   1.]
 [ inf  inf   2.   3.   2.]
 [ inf  inf  inf   5.   4.]
 [ inf  inf  inf  inf   1.]
 [ inf  inf  inf  inf  inf]]

Is there a way to get the indices and values of the m smallest items in that array? For example, if I wanted the 4 smallest, the result would be

[(0,1,1),(0,4,1),(3,4,1),(0,3,2)] 

where each tuple above is (row, col, val).

If several entries tie, any one of them may be chosen at random. For instance, there are three 1s and the next smallest value is 2, so (0,3,2), (1,2,2) and (1,4,2) are all valid choices for the fourth item.

Essentially, can I efficiently extract the k smallest values in that format from the upper triangular matrix? (The matrix is much larger than the example above.) I tried flattening it and using squareform and nsmallest, but I am having trouble getting the indices and values to align. Thanks!

3 answers

#1


2  

For an Inf-filled array:

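# indices of the 4 smallest entries of the flattened array, mapped back to (row, col)
r,c = np.unravel_index(a.ravel().argsort()[:4], a.shape)
out = list(zip(r,c,a[r,c]))  # list() is needed on Python 3, where zip is lazy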

For performance, consider using np.argpartition, which avoids a full sort. To do so, replace a.ravel().argsort()[:4] with np.argpartition(a.ravel(), range(4))[:4].

Sample run:

In [285]: a
Out[285]: 
array([[ inf,   1.,   3.,   2.,   1.],
       [ inf,  inf,   2.,   3.,   2.],
       [ inf,  inf,  inf,   5.,   4.],
       [ inf,  inf,  inf,  inf,   1.],
       [ inf,  inf,  inf,  inf,  inf]])

In [286]: out
Out[286]: [(0, 1, 1.0), (0, 4, 1.0), (3, 4, 1.0), (0, 3, 2.0)]
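
The np.argpartition variant mentioned above can be sketched as a small self-contained function. This is only an illustration, assuming the array is padded with inf as in the question; the name m_smallest_inf_filled is hypothetical and not part of NumPy, and m must not exceed the number of elements.

```python
import numpy as np

def m_smallest_inf_filled(a, m):
    """Return [(row, col, val), ...] for the m smallest entries of a."""
    flat = a.ravel()
    # kth=np.arange(m) puts the m smallest entries, in sorted order, at the front
    idx = np.argpartition(flat, np.arange(m))[:m]
    r, c = np.unravel_index(idx, a.shape)
    # convert NumPy scalars to plain Python numbers for readable tuples
    return [(int(i), int(j), float(v)) for i, j, v in zip(r, c, a[r, c])]

inf = np.inf
a = np.array([[inf, 1, 3, 2, 1],
              [inf, inf, 2, 3, 2],
              [inf, inf, inf, 5, 4],
              [inf, inf, inf, inf, 1],
              [inf, inf, inf, inf, inf]])
result = m_smallest_inf_filled(a, 4)
print(result)
```

Which of the tied entries comes back (for example, which of the three 2s) is not specified by np.argpartition, so only the values are guaranteed to match the sample run.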

For a generic case (an array that is not Inf-filled), select the elements above the diagonal explicitly:

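# row/column indices of the strict upper triangle (offset 1 skips the diagonal)
R,C = np.triu_indices(a.shape[1],1)
idx = a[R,C].argsort()[:4]
r,c = R[idx], C[idx]
out = list(zip(r,c,a[r,c]))  # list() is needed on Python 3, where zip is lazy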

Sample run:

In [351]: a
Out[351]: 
array([[ 68.,  67.,  81.,  23.,  16.],
       [ 84.,  83.,  20.,  66.,  48.],
       [ 58.,  72.,  98.,  63.,  30.],
       [ 61.,  40.,   1.,  86.,  22.],
       [ 29.,  95.,  38.,  22.,  95.]])
In [352]: out
Out[352]: [(0, 4, 16.0), (1, 2, 20.0), (3, 4, 22.0), (0, 3, 23.0)]

For performance, consider using np.argpartition here as well. To do so, replace a[R,C].argsort()[:4] with np.argpartition(a[R,C], range(4))[:4].
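
Putting the generic case and the np.argpartition suggestion together, a minimal sketch might look like the following. It assumes a square matrix and m no larger than the number of elements above the diagonal; the function name m_smallest_upper is made up for this example.

```python
import numpy as np

def m_smallest_upper(a, m):
    """Return [(row, col, val), ...] for the m smallest entries strictly above the diagonal."""
    # indices of the strict upper triangle of a square matrix
    R, C = np.triu_indices(a.shape[0], k=1)
    vals = a[R, C]
    # the first m positions hold the m smallest values in sorted order
    idx = np.argpartition(vals, np.arange(m))[:m]
    return [(int(R[i]), int(C[i]), float(vals[i])) for i in idx]

a = np.array([[68, 67, 81, 23, 16],
              [84, 83, 20, 66, 48],
              [58, 72, 98, 63, 30],
              [61, 40, 1, 86, 22],
              [29, 95, 38, 22, 95]], dtype=float)
result = m_smallest_upper(a, 4)
print(result)  # [(0, 4, 16.0), (1, 2, 20.0), (3, 4, 22.0), (0, 3, 23.0)]
```

Only the elements above the diagonal are copied into vals, so for a large matrix the partition runs on roughly half of the entries.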

#2


0  

Something like this works:

import numpy as np
a = np.random.rand(4,4)
tuples = [(ix,iy, a[ix,iy]) for ix, row in enumerate(a) for iy, i in enumerate(row)]
sorted(tuples,key=lambda x: x[2])[:10]

Here the [:10] slice plays the role of k from your question (k = 10).

If you only want the upper triangular elements, you can add a condition to the list comprehension (ix<=iy includes the diagonal; use ix<iy to exclude it):

a = np.random.rand(4,4)
tuples = [(ix,iy, a[ix,iy]) for ix, row in enumerate(a) for iy, i in enumerate(row) if ix<=iy]
sorted(tuples,key=lambda x: x[2])

#3


0  

If my np.array() is n, I could get the smallest values from it by flattening it (with np.ndenumerate()) and using the heapq module's heapify() and nsmallest() functions, like so:

import heapq
import numpy as np

flattened = [(y,x) for x,y in np.ndenumerate(n)]
# tuples reversed for natural sorting on values rather than co-ords
heapq.heapify(flattened)
results = heapq.nsmallest(4, flattened)

But this will use plenty of extra memory, because it extracts the data and co-ordinates out of NumPy's efficient arrays into Python's native lists. So there are probably much better ways to do it natively in NumPy.
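
One way to reduce the memory cost described above, while staying with heapq, is to feed heapq.nsmallest a generator restricted to the upper triangle instead of a full list. The following is a rough sketch under that assumption; the function name m_smallest_heapq is hypothetical, and the Python-level loop will still be much slower than a NumPy-based approach on large matrices.

```python
import heapq
import numpy as np

def m_smallest_heapq(a, m):
    """Return [(row, col, val), ...] using heapq over the strict upper triangle."""
    n_rows, n_cols = a.shape
    # generator of (value, row, col): value first so tuples compare by value,
    # and nsmallest only ever keeps about m items in memory
    candidates = ((float(a[i, j]), i, j)
                  for i in range(n_rows)
                  for j in range(i + 1, n_cols))
    return [(i, j, v) for v, i, j in heapq.nsmallest(m, candidates)]

inf = np.inf
a = np.array([[inf, 1, 3, 2, 1],
              [inf, inf, 2, 3, 2],
              [inf, inf, inf, 5, 4],
              [inf, inf, inf, inf, 1],
              [inf, inf, inf, inf, inf]])
result = m_smallest_heapq(a, 4)
print(result)  # [(0, 1, 1.0), (0, 4, 1.0), (3, 4, 1.0), (0, 3, 2.0)]
```

Because the tuples compare by value, then row, then column, ties are broken deterministically rather than at random.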
