LINQ可以在订购集合时使用二进制搜索吗?

时间:2022-12-20 15:26:18

Can I somehow "instruct" LINQ to use binary search when the collection that I'm trying to search is ordered. I'm using an ObservableCollection<T>, populated with ordered data, and I'm trying to use Enumerable.First(<Predicate>). In my predicate, I'm filtering by the value of the field my collection's sorted by.

我可以以某种方式“指示”LINQ在我尝试搜索的集合被订购时使用二进制搜索。我正在使用ObservableCollection ,填充有序数据,我正在尝试使用Enumerable.First( )。在我的谓词中,我按照我的集合排序的字段的值进行过滤。

5 个解决方案

#1


27  

As far as I know, it's not possible with the built-in methods. However it would be relatively easy to write an extension method that would allow you to write something like that :

据我所知,使用内置方法是不可能的。但是编写一个允许你编写类似的扩展方法会相对容易:

var item = myCollection.BinarySearch(i => i.Id, 42);

(assuming, of course, that you collection implements IList ; there's no way to perform a binary search if you can't access the items randomly)

(当然,假设您的集合实现了IList;如果您无法随机访问项目,则无法执行二进制搜索)

Here's a sample implementation :

这是一个示例实现:

public static T BinarySearch<T, TKey>(this IList<T> list, Func<T, TKey> keySelector, TKey key)
        where TKey : IComparable<TKey>
{
    if (list.Count == 0)
        throw new InvalidOperationException("Item not found");

    int min = 0;
    int max = list.Count;
    while (min < max)
    {
        int mid = min + ((max - min) / 2);
        T midItem = list[mid];
        TKey midKey = keySelector(midItem);
        int comp = midKey.CompareTo(key);
        if (comp < 0)
        {
            min = mid + 1;
        }
        else if (comp > 0)
        {
            max = mid - 1;
        }
        else
        {
            return midItem;
        }
    }
    if (min == max &&
        min < list.Count &&
        keySelector(list[min]).CompareTo(key) == 0)
    {
        return list[min];
    }
    throw new InvalidOperationException("Item not found");
}

(not tested... a few adjustments might be necessary) Now tested and fixed ;)

(未经测试......可能需要进行一些调整)现已测试并修复;)

The fact that it throws an InvalidOperationException may seem strange, but that's what Enumerable.First does when there's no matching item.

抛出InvalidOperationException这一事实可能看起来很奇怪,但这就是Enumerable.First在没有匹配项时所做的事情。

#2


3  

The accepted answer is very good.

接受的答案非常好。

However, I need that the BinarySearch returns the index of the first item that is larger, as the List<T>.BinarySearch() does.

但是,我需要BinarySearch返回较大的第一个项的索引,如List .BinarySearch()所做的那样。

So I watched its implementation by using ILSpy, then I modified it to have a selector parameter. I hope it will be as useful to someone as it is for me:

所以我通过使用ILSpy观察了它的实现,然后我将其修改为具有选择器参数。我希望它对某人有用,对我来说也是如此:

public static class ListExtensions
{
    public static int BinarySearch<T, U>(this IList<T> tf, U target, Func<T, U> selector)
    {
        var lo = 0;
        var hi = (int)tf.Count - 1;
        var comp = Comparer<U>.Default;

        while (lo <= hi)
        {
            var median = lo + (hi - lo >> 1);
            var num = comp.Compare(selector(tf[median]), target);
            if (num == 0)
                return median;
            if (num < 0)
                lo = median + 1;
            else
                hi = median - 1;
        }

        return ~lo;
    }
}

#3


2  

Well, you can write your own extension method over ObservableCollection<T> - but then that will be used for any ObservableCollection<T> where your extension method is available, without knowing whether it's sorted or not.

好吧,你可以在ObservableCollection 上编写自己的扩展方法 - 但是这将用于你的扩展方法可用的任何ObservableCollection ,而不知道它是否已经排序。

You'd also have to indicate in the predicate what you wanted to find - which would be better done with an expression tree... but that would be a pain to parse. Basically, the signature of First isn't really suitable for a binary search.

你还必须在谓词中指出你想要找到的东西 - 用表达式树做得更好......但是解析起来会很麻烦。基本上,First的签名并不适合二进制搜索。

I suggest you don't try to overload the existing signatures, but write a new one, e.g.

我建议你不要试图重写现有的签名,而是写一个新签名,例如

public static TElement BinarySearch<TElement, TKey>
    (this IList<TElement> collection, Func<TElement, TItem> keySelector,
     TKey key)

(I'm not going to implement it right now, but I can do so later if you want.)

(我现在不打算实现它,但如果你愿意,我可以稍后再这样做。)

By providing a function, you can search by the property the collection is sorted by, rather than by the items themselves.

通过提供函数,您可以按属性搜索集合,而不是按项目本身进行搜索。

#4


1  

Enumerable.First(predicate) works on an IEnumarable<T> which only supports enumeration, therefore it does not have random access to the items within.

Enumerable.First(谓词)在IEnumarable 上工作,它只支持枚举,因此它没有随机访问其中的项目。

Also, your predicate contains arbitrary code that eventually results in true or false, and so cannot indicate whether the tested item was too low or too high. This information would be needed in order to do a binary search.

此外,您的谓词包含最终导致true或false的任意代码,因此无法指示测试项目是否太低或太高。为了进行二分查找,需要此信息。

Enumerable.First(predicate) can only test each item in order as it walks through the enumeration.

Enumerable.First(谓词)只能在遍历枚举时按顺序测试每个项目。

#5


1  

Keep in mind that all(? at least most) of the extension methods used by LINQ are implemented on IQueryable<T>orIEnumerable<T> or IOrderedEnumerable<T> or IOrderedQueryable<T>.

请记住,LINQ使用的所有(?至少大多数)扩展方法都是在IQueryable 或IEnumerable 或IOrderedEnumerable 或IOrderedQueryable 上实现的。

None of these interfaces supports random access, and therefore none of them can be used for a binary search. One of the benefits of something like LINQ is that you can work with large datasets without having to return the entire dataset from the database. Obviously you can't binary search something if you don't even have all of the data yet.

这些接口都不支持随机访问,因此它们都不能用于二进制搜索。 LINQ之类的好处之一是您可以使用大型数据集而无需从数据库返回整个数据集。如果你还没有所有的数据,显然你不能二进制搜索。

But as others have said, there is no reason at all you can't write this extension method for IList<T> or other collection types that support random access.

但正如其他人所说,没有理由你不能为IList 或其他支持随机访问的集合类型编写这种扩展方法。

#1


27  

As far as I know, it's not possible with the built-in methods. However it would be relatively easy to write an extension method that would allow you to write something like that :

据我所知,使用内置方法是不可能的。但是编写一个允许你编写类似的扩展方法会相对容易:

var item = myCollection.BinarySearch(i => i.Id, 42);

(assuming, of course, that you collection implements IList ; there's no way to perform a binary search if you can't access the items randomly)

(当然,假设您的集合实现了IList;如果您无法随机访问项目,则无法执行二进制搜索)

Here's a sample implementation :

这是一个示例实现:

public static T BinarySearch<T, TKey>(this IList<T> list, Func<T, TKey> keySelector, TKey key)
        where TKey : IComparable<TKey>
{
    if (list.Count == 0)
        throw new InvalidOperationException("Item not found");

    int min = 0;
    int max = list.Count;
    while (min < max)
    {
        int mid = min + ((max - min) / 2);
        T midItem = list[mid];
        TKey midKey = keySelector(midItem);
        int comp = midKey.CompareTo(key);
        if (comp < 0)
        {
            min = mid + 1;
        }
        else if (comp > 0)
        {
            max = mid - 1;
        }
        else
        {
            return midItem;
        }
    }
    if (min == max &&
        min < list.Count &&
        keySelector(list[min]).CompareTo(key) == 0)
    {
        return list[min];
    }
    throw new InvalidOperationException("Item not found");
}

(not tested... a few adjustments might be necessary) Now tested and fixed ;)

(未经测试......可能需要进行一些调整)现已测试并修复;)

The fact that it throws an InvalidOperationException may seem strange, but that's what Enumerable.First does when there's no matching item.

抛出InvalidOperationException这一事实可能看起来很奇怪,但这就是Enumerable.First在没有匹配项时所做的事情。

#2


3  

The accepted answer is very good.

接受的答案非常好。

However, I need that the BinarySearch returns the index of the first item that is larger, as the List<T>.BinarySearch() does.

但是,我需要BinarySearch返回较大的第一个项的索引,如List .BinarySearch()所做的那样。

So I watched its implementation by using ILSpy, then I modified it to have a selector parameter. I hope it will be as useful to someone as it is for me:

所以我通过使用ILSpy观察了它的实现,然后我将其修改为具有选择器参数。我希望它对某人有用,对我来说也是如此:

public static class ListExtensions
{
    public static int BinarySearch<T, U>(this IList<T> tf, U target, Func<T, U> selector)
    {
        var lo = 0;
        var hi = (int)tf.Count - 1;
        var comp = Comparer<U>.Default;

        while (lo <= hi)
        {
            var median = lo + (hi - lo >> 1);
            var num = comp.Compare(selector(tf[median]), target);
            if (num == 0)
                return median;
            if (num < 0)
                lo = median + 1;
            else
                hi = median - 1;
        }

        return ~lo;
    }
}

#3


2  

Well, you can write your own extension method over ObservableCollection<T> - but then that will be used for any ObservableCollection<T> where your extension method is available, without knowing whether it's sorted or not.

好吧,你可以在ObservableCollection 上编写自己的扩展方法 - 但是这将用于你的扩展方法可用的任何ObservableCollection ,而不知道它是否已经排序。

You'd also have to indicate in the predicate what you wanted to find - which would be better done with an expression tree... but that would be a pain to parse. Basically, the signature of First isn't really suitable for a binary search.

你还必须在谓词中指出你想要找到的东西 - 用表达式树做得更好......但是解析起来会很麻烦。基本上,First的签名并不适合二进制搜索。

I suggest you don't try to overload the existing signatures, but write a new one, e.g.

我建议你不要试图重写现有的签名,而是写一个新签名,例如

public static TElement BinarySearch<TElement, TKey>
    (this IList<TElement> collection, Func<TElement, TItem> keySelector,
     TKey key)

(I'm not going to implement it right now, but I can do so later if you want.)

(我现在不打算实现它,但如果你愿意,我可以稍后再这样做。)

By providing a function, you can search by the property the collection is sorted by, rather than by the items themselves.

通过提供函数,您可以按属性搜索集合,而不是按项目本身进行搜索。

#4


1  

Enumerable.First(predicate) works on an IEnumarable<T> which only supports enumeration, therefore it does not have random access to the items within.

Enumerable.First(谓词)在IEnumarable 上工作,它只支持枚举,因此它没有随机访问其中的项目。

Also, your predicate contains arbitrary code that eventually results in true or false, and so cannot indicate whether the tested item was too low or too high. This information would be needed in order to do a binary search.

此外,您的谓词包含最终导致true或false的任意代码,因此无法指示测试项目是否太低或太高。为了进行二分查找,需要此信息。

Enumerable.First(predicate) can only test each item in order as it walks through the enumeration.

Enumerable.First(谓词)只能在遍历枚举时按顺序测试每个项目。

#5


1  

Keep in mind that all(? at least most) of the extension methods used by LINQ are implemented on IQueryable<T>orIEnumerable<T> or IOrderedEnumerable<T> or IOrderedQueryable<T>.

请记住,LINQ使用的所有(?至少大多数)扩展方法都是在IQueryable 或IEnumerable 或IOrderedEnumerable 或IOrderedQueryable 上实现的。

None of these interfaces supports random access, and therefore none of them can be used for a binary search. One of the benefits of something like LINQ is that you can work with large datasets without having to return the entire dataset from the database. Obviously you can't binary search something if you don't even have all of the data yet.

这些接口都不支持随机访问,因此它们都不能用于二进制搜索。 LINQ之类的好处之一是您可以使用大型数据集而无需从数据库返回整个数据集。如果你还没有所有的数据,显然你不能二进制搜索。

But as others have said, there is no reason at all you can't write this extension method for IList<T> or other collection types that support random access.

但正如其他人所说,没有理由你不能为IList 或其他支持随机访问的集合类型编写这种扩展方法。