Finding the index of the item closest to a value in a list that is not entirely sorted

As an example, my list is:

[25.75443, 26.7803, 25.79099, 24.17642, 24.3526, 22.79056, 20.84866, 19.49222, 18.38086, 18.0358, 16.57819, 15.71255, 14.79059, 13.64154, 13.09409, 12.18347, 11.33447, 10.32184, 9.544922, 8.813385, 8.181152, 6.983734, 6.048035, 5.505096, 4.65799]

and I'm looking for the index of the value closest to 11.5. I've tried other methods such as binary search and bisect_left but they don't work.

I cannot sort this array, because the index of the value will be used on a similar array to fetch the value at that index.

7 Solutions

#1

104

Try the following:

min(range(len(a)), key=lambda i: abs(a[i]-11.5))

For example:

>>> a = [25.75443, 26.7803, 25.79099, 24.17642, 24.3526, 22.79056, 20.84866, 19.49222, 18.38086, 18.0358, 16.57819, 15.71255, 14.79059, 13.64154, 13.09409, 12.18347, 11.33447, 10.32184, 9.544922, 8.813385, 8.181152, 6.983734, 6.048035, 5.505096, 4.65799]
>>> min(range(len(a)), key=lambda i: abs(a[i]-11.5))
16

Or to get the index and the value:

>>> min(enumerate(a), key=lambda x: abs(x[1]-11.5))
(16, 11.33447)

#2

How about: you zip the two lists, then sort the result?

#3

If you can't sort the array, then there is no quick way to find the closest item - you have to iterate over all entries.

There is a workaround but it's quite a bit of work: Write a sort algorithm which sorts the array and (at the same time) updates a second array which tells you where this entry was before the array was sorted.

That way, you can use binary search to look up the index of the closest entry in the sorted array, and then use that index to look up the original index in the "index array".

[EDIT] Using zip(), this is pretty simple to achieve:

 array_to_sort = list(zip(original_array, range(len(original_array))))
 array_to_sort.sort(key=lambda i: i[0])

Now you can binary search for the value (using item[0]); item[1] then gives you the original index.
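
The lookup step described above might be sketched roughly as follows. This is only an illustration, not part of the original answer: the helper name closest_index is made up here, and it assumes the pairs were built as (value, original_index) tuples and sorted by value.

```python
import bisect

def closest_index(pairs, target):
    """Return the original index of the value closest to target.

    pairs must be a list of (value, original_index) tuples sorted by value.
    (closest_index is a hypothetical helper name used for this sketch.)
    """
    if not pairs:
        raise ValueError("pairs is empty")
    # (target,) compares lower than any (target, index) tuple, so this finds
    # the first position whose value is >= target.
    pos = bisect.bisect_left(pairs, (target,))
    # Only the entries just before and at pos can be the closest value.
    candidates = pairs[max(pos - 1, 0):pos + 1]
    value, index = min(candidates, key=lambda p: abs(p[0] - target))
    return index

original_array = [25.75443, 26.7803, 25.79099, 24.17642, 24.3526, 22.79056,
                  20.84866, 19.49222, 18.38086, 18.0358, 16.57819, 15.71255,
                  14.79059, 13.64154, 13.09409, 12.18347, 11.33447, 10.32184,
                  9.544922, 8.813385, 8.181152, 6.983734, 6.048035, 5.505096,
                  4.65799]
array_to_sort = sorted(zip(original_array, range(len(original_array))))
print(closest_index(array_to_sort, 11.5))  # 16
```

Sorting costs O(n log n) once, so this only pays off when the same list is searched many times.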

#4

Going through all the items is only linear, O(n). Sorting the array first would be worse, O(n log n).

I don't see a problem with keeping an additional deltax (the minimum difference so far) and idx (the index of that element) and just looping once through the list.
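
A minimal sketch of that single loop, assuming the names deltax and idx from the description above; the function name closest_index_linear is invented for this example, and returning None for an empty list is a choice made here, not something the answer specifies.

```python
def closest_index_linear(values, target):
    """Scan the list once, tracking the smallest difference seen so far."""
    idx = None               # index of the best element so far
    deltax = float("inf")    # smallest absolute difference so far
    for i, value in enumerate(values):
        diff = abs(value - target)
        if diff < deltax:    # strict comparison: ties keep the earliest index
            deltax = diff
            idx = i
    return idx               # None if values is empty

a = [25.75443, 26.7803, 25.79099, 24.17642, 24.3526, 22.79056, 20.84866,
     19.49222, 18.38086, 18.0358, 16.57819, 15.71255, 14.79059, 13.64154,
     13.09409, 12.18347, 11.33447, 10.32184, 9.544922, 8.813385, 8.181152,
     6.983734, 6.048035, 5.505096, 4.65799]
print(closest_index_linear(a, 11.5))  # 16
```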

#5

Keep in mind that if space isn't important you can sort any list without moving the contents by creating a secondary list of the sorted indices.

Also bear in mind that if you are doing this lookup just once, you simply have to traverse every element in the list, which is O(n). (If you do it multiple times, you will probably want to sort first to make later lookups more efficient.)
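
One way to picture the secondary list of sorted indices is the sketch below. The names order, sorted_values, closest_original_index and the second list labels are all assumptions made for illustration; the original list a is never reordered.

```python
import bisect

a = [25.75443, 26.7803, 25.79099, 24.17642, 24.3526, 22.79056, 20.84866,
     19.49222, 18.38086, 18.0358, 16.57819, 15.71255, 14.79059, 13.64154,
     13.09409, 12.18347, 11.33447, 10.32184, 9.544922, 8.813385, 8.181152,
     6.983734, 6.048035, 5.505096, 4.65799]
# Hypothetical "similar array" that shares its indexing with a.
labels = ["item%d" % i for i in range(len(a))]

# Indices of a in ascending order of value; built once, costs extra space.
order = sorted(range(len(a)), key=a.__getitem__)
sorted_values = [a[i] for i in order]

def closest_original_index(target):
    """Binary search the sorted view, then map back to the original index."""
    pos = bisect.bisect_left(sorted_values, target)
    # Check the neighbours on both sides of the insertion point.
    candidates = range(max(pos - 1, 0), min(pos + 1, len(order)))
    best = min(candidates, key=lambda p: abs(sorted_values[p] - target))
    return order[best]

for target in (11.5, 20.0, 3.0):
    i = closest_original_index(target)
    print(target, i, a[i], labels[i])
```

Each lookup after the one-time sort is O(log n), which is where the extra list starts to pay for itself.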

#6

import numpy as np

a = [25.75443, 26.7803, 25.79099, 24.17642, 24.3526, 22.79056, 20.84866, 19.49222, 18.38086, 18.0358, 16.57819, 15.71255, 14.79059, 13.64154, 13.09409, 12.18347, 11.33447, 10.32184, 9.544922, 8.813385, 8.181152, 6.983734, 6.048035, 5.505096, 4.65799]

index = np.argmin(np.abs(np.array(a)-11.5))
a[index] # here is your result

If a is already a NumPy array, the np.array(a) conversion can be omitted.

#7

If you search a long list many times, then min scales very badly (O(n) per search, or even O(n^2) if you append some of your searches to the search list, I think). Bisect is your friend. Here's my solution. It scales as O(log(n)) in the worst case:

import bisect

class Closest:
    """Assumes *no* redundant entries - all inputs must be unique"""
    def __init__(self, numlist=(), firstdistance=0):
        self.numindexes = dict((val, n) for n, val in enumerate(numlist))
        self.nums = sorted(self.numindexes)
        self.firstdistance = firstdistance

    def append(self, num):
        if num in self.numindexes:
            raise ValueError("Cannot append '%s' it is already used" % num)
        self.numindexes[num] = len(self.nums)
        bisect.insort(self.nums, num)

    def rank(self, target):
        rank = bisect.bisect(self.nums, target)
        if rank == 0:
            pass
        elif len(self.nums) == rank:
            rank -= 1
        else:
            dist1 = target - self.nums[rank - 1]
            dist2 = self.nums[rank] - target
            if dist1 < dist2:
                rank -= 1
        return rank

    def closest(self, target):
        try:
            return self.numindexes[self.nums[self.rank(target)]]
        except IndexError:
            return 0

    def distance(self, target):
        rank = self.rank(target)
        try:
            dist = abs(self.nums[rank] - target)
        except IndexError:
            dist = self.firstdistance
        return dist

Use it like this:

a = [25.75443, 26.7803, 25.79099, 24.17642, 24.3526, 22.79056, 20.84866, 19.49222, 18.38086, 18.0358, 16.57819, 15.71255, 14.79059, 13.64154, 13.09409, 12.18347, 11.33447, 10.32184, 9.544922, 8.813385, 8.181152, 6.983734, 6.048035, 5.505096, 4.65799]
cl = Closest(a)
targets = [11.5]  # example target values to look up
for x in targets:
    rank = cl.rank(x)
    print("Closest number:", cl.nums[rank])
    print("Closest index:",  cl.numindexes[cl.nums[rank]])
