DataView.RowFilter vs. DataTable.Select() vs. DataTable.Rows.Find()

Date: 2022-04-02 04:06:48

Consider the code below:

DataView someView = new DataView(sometable);
someView.RowFilter = someFilter;

if (someView.Count > 0) { …. }

Quite a number of articles say that DataTable.Select() is better than using DataViews, but they all predate VS2008.

Solved: The Mystery of DataView's Poor Performance with Large Recordsets
Array of DataRecord vs. DataView: A Dramatic Difference in Performance

Googling this topic, I found some articles and forum threads that say DataTable.Select() itself is quite buggy (I am not sure about this) and underperforms in various scenarios.

This MSDN topic (Best Practices ADO.NET) suggests that if a primary key is defined on a DataTable, the FindRows() or Find() methods should be used instead of DataTable.Select().

This article (.NET 1.1) benchmarks all three approaches plus a couple more. But it targets version 1.1, so I am not sure whether its results are still valid. According to it, DataRowCollection.Find() outperforms all other approaches, and DataTable.Select() outperforms DataView.RowFilter.

So I am quite confused about what the best approach for finding rows in a DataTable might be. Or is there no single best way, with different solutions suiting different scenarios?

2 Answers

#1


46  

You are looking for the "best approach on finding rows in a datatable", so I first have to ask: "best" for what? I think every technique has scenarios where it fits better than the others.

First, let's look at DataView.RowFilter: A DataView has some advantages in data binding. It's very view-oriented, so it has powerful sorting, filtering, and searching features, but it creates some overhead and is not optimized for performance. I would choose DataView.RowFilter for smaller recordsets and/or where you take advantage of the other features (such as binding data directly to the view).
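As a rough illustration of the view-oriented features mentioned above, here is a minimal sketch that filters and sorts through a DataView. The table, its column names (Id, Name, Age), and the helper names are hypothetical and not taken from the question.

```csharp
using System;
using System.Data;

public static class RowFilterSketch
{
    // Builds a small in-memory table; the schema is illustrative only.
    public static DataTable BuildTable()
    {
        var table = new DataTable("People");
        table.Columns.Add("Id", typeof(int));
        table.Columns.Add("Name", typeof(string));
        table.Columns.Add("Age", typeof(int));
        table.Rows.Add(1, "Cid", 45);
        table.Rows.Add(2, "Bob", 17);
        table.Rows.Add(3, "Ann", 31);
        return table;
    }

    // Returns a filtered, sorted view over the table.
    // The same view object could be handed to a data-bound control.
    public static DataView AdultsByName(DataTable table)
    {
        var view = new DataView(table);
        view.RowFilter = "Age >= 18";
        view.Sort = "Name ASC";
        return view;
    }

    public static void Main()
    {
        DataView view = AdultsByName(BuildTable());
        Console.WriteLine("Matching rows: " + view.Count);
        foreach (DataRowView row in view)
        {
            Console.WriteLine(row["Name"]);
        }
    }
}
```

The view stays attached to the table, so it tracks later changes to the rows, which is part of the overhead the answer refers to.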

Most of what you can read about the DataView in older posts still applies.

Second, you should prefer DataTable.Rows.Find over DataTable.Select if you want just a single hit. Why? DataTable.Rows.Find returns only a single row, looked up by its primary key. Essentially, when you specify the primary key, a binary tree is created as an index. This has some overhead associated with it, but it tremendously speeds up retrieval.
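A minimal sketch of a primary-key lookup with DataTable.Rows.Find, assuming a hypothetical table keyed on an integer Id column (the schema and helper names are made up for illustration):

```csharp
using System;
using System.Data;

public static class FindSketch
{
    public static DataTable BuildTable()
    {
        var table = new DataTable("People");
        DataColumn id = table.Columns.Add("Id", typeof(int));
        table.Columns.Add("Name", typeof(string));
        // Rows.Find needs a primary key; without one it throws
        // MissingPrimaryKeyException.
        table.PrimaryKey = new[] { id };
        table.Rows.Add(1, "Ann");
        table.Rows.Add(2, "Bob");
        table.Rows.Add(3, "Cid");
        return table;
    }

    // Returns the name stored under the given key, or null if no row has that key.
    public static string FindName(DataTable table, int id)
    {
        DataRow row = table.Rows.Find(id);
        return row == null ? null : (string)row["Name"];
    }

    public static void Main()
    {
        DataTable table = BuildTable();
        Console.WriteLine(FindName(table, 2));
        Console.WriteLine(FindName(table, 99) ?? "(not found)");
    }
}
```

Find returns null rather than throwing when the key is absent, so the caller has to check the result before using it.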

DataTable.Select is slower, but it can come in very handy if you have multiple criteria and don't care whether the rows are indexed: it can find basically everything but is not optimized for performance. Essentially, DataTable.Select has to walk the entire table and compare every record to the criteria that you passed in.
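For comparison, a minimal sketch of DataTable.Select with several criteria and a sort order. The table, its columns (Name, Age, City), and the helper names are hypothetical.

```csharp
using System;
using System.Data;

public static class SelectSketch
{
    public static DataTable BuildTable()
    {
        var table = new DataTable("People");
        table.Columns.Add("Name", typeof(string));
        table.Columns.Add("Age", typeof(int));
        table.Columns.Add("City", typeof(string));
        table.Rows.Add("Ann", 31, "Oslo");
        table.Rows.Add("Bob", 17, "Oslo");
        table.Rows.Add("Cid", 45, "Oslo");
        table.Rows.Add("Dee", 52, "Rome");
        return table;
    }

    // Returns adults living in the given city, ordered by name descending.
    public static DataRow[] AdultsInCity(DataTable table, string city)
    {
        // Single quotes inside a string literal are escaped by doubling them.
        string filter = "Age >= 18 AND City = '" + city.Replace("'", "''") + "'";
        // First argument: filter expression. Second argument: sort order.
        return table.Select(filter, "Name DESC");
    }

    public static void Main()
    {
        foreach (DataRow row in AdultsInCity(BuildTable(), "Oslo"))
        {
            Console.WriteLine(row["Name"] + " (" + row["Age"] + ")");
        }
    }
}
```

Unlike Find, Select returns an array of rows (possibly empty) and does not require a primary key.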

I hope you find this little overview helpful.

I'd suggest taking a look at this article; it helped me with performance questions. This post contains some quotes from it.

A little UPDATE: By the way, this might seem a little out of scope for your question, but doing the filtering and searching on the backend is nearly always the fastest solution. If you want simplicity and have SQL Server as the backend and .NET 3.5+ on the client, go for LINQ-to-SQL. Searching LINQ objects is very comfortable and creates queries that are executed on the server side. LINQ-to-Objects is also very comfortable, but it is a slower technique. In case you didn't know already....

#2


20  

Thomashaid's post sums it up nicely:

  • DataView.RowFilter is for binding.
  • DataTable.Rows.Find is for searching by primary key only.
  • DataTable.Select is for searching by multiple columns and also for specifying an order.

Avoid creating many DataViews in a loop and using their RowFilters to search for records. This will drastically reduce performance.

I wanted to add that DataTable.Select can take advantage of indexes. You can create an index on a DataTable by creating a DataView and specifying a sort order:

DataView dv = new DataView(dt);
dv.Sort = "Col1, Col2";

Then, when you call DataTable.Select(), it can use this index when running the query. We have used this technique to seriously improve performance in places where we use the same query many, many times. (Note that this was before Linq existed.)

The trick is to define the sort order correctly for the Select statement. So if your query is "Col1 = 1 and Col2 = 4", then you'll want "Col1, Col2" like in the example above.

Note that the index creation may depend on the actual calls to create the DataView. We had to use the new DataView(DataTable dt) constructor, and then specify the Sort property in a separate step. The behavior may change slightly with different .NET versions.

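Putting the pieces of this answer together, a minimal sketch might look like the following. It assumes a hypothetical table with integer columns Col1 and Col2, and the helper names are made up. Whether Select actually picks up the view's index is internal to DataTable and may differ between .NET versions, so any speed-up is something to measure rather than assume.

```csharp
using System;
using System.Data;

public static class SelectIndexSketch
{
    // Builds a table with repeating values so the query below has several matches.
    public static DataTable BuildTable(int rowCount)
    {
        var table = new DataTable("Sample");
        table.Columns.Add("Col1", typeof(int));
        table.Columns.Add("Col2", typeof(int));
        for (int i = 0; i < rowCount; i++)
        {
            table.Rows.Add(i % 10, i % 7);
        }
        return table;
    }

    // Creates the view in two steps, as described in the answer:
    // constructor first, then the Sort property.
    public static DataView CreateIndex(DataTable table)
    {
        var view = new DataView(table);
        view.Sort = "Col1, Col2";
        return view;
    }

    // The filter columns appear in the same order as in the view's Sort.
    public static int CountMatches(DataTable table, int col1, int col2)
    {
        return table.Select("Col1 = " + col1 + " AND Col2 = " + col2).Length;
    }

    public static void Main()
    {
        DataTable table = BuildTable(1000);
        // Assumption: the view should stay referenced while the lookups run.
        DataView indexView = CreateIndex(table);
        Console.WriteLine(CountMatches(table, 1, 4));
        GC.KeepAlive(indexView);
    }
}
```

The result of Select is the same with or without the view; only the lookup cost may change, so a quick Stopwatch comparison on real data is the way to confirm the benefit.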
