Using LINQ to SQL, how do I find the min and max of a column in a table?

Time: 2022-09-28 14:10:45

I want to find the fastest way to get the min and max of a column in a table in a single LINQ to SQL round trip. I know this works, but it takes two round trips:

int min = MyTable.Min(row => row.FavoriteNumber);
int max = MyTable.Max(row => row.FavoriteNumber);

I know I can use group, but I don't have a group by clause; I want to aggregate over the whole table! And I can't use .Min without grouping first. I did try this:

from row in MyTable 
group row by true into r 
select new { 
    min = r.Min(z => z.FavoriteNumber), 
    max = r.Max(z => z.FavoriteNumber) 
}

But that crazy group clause seems silly, and the SQL it makes is more complex than it needs to be.

So, is there any way to just get the correct SQL out?

EDIT: These guys failed too: Linq to SQL: how to aggregate without a group by? ... lame oversight by LINQ designers if there's really no answer.

EDIT 2: I looked at my own solution (with the nonsensical constant group by clause) in the SQL Server Management Studio execution plan analysis, and it looks to me like it is identical to the plan generated by:

SELECT MIN(FavoriteNumber), MAX(FavoriteNumber)
FROM MyTable

so unless someone can come up with a simpler-or-equally-as-good answer, I think I have to mark it as answered-by-myself. Thoughts?

5 answers

#1


27  

As stated in the question, this method does seem to generate optimal SQL, so while it looks a bit squirrelly in LINQ, it should be optimal performance-wise.

from row in MyTable  
group row by true into r  
select new {  
    min = r.Min(z => z.FavoriteNumber),  
    max = r.Max(z => z.FavoriteNumber)  
} 
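
To see what the constant-key grouping actually returns, here is a minimal sketch that runs the same query against in-memory data. Row, GroupByConstantSketch, and MinAndMax are hypothetical names, and AsQueryable() only stands in for a real LINQ to SQL table, so this shows the shape of the result, not the generated SQL. Note that the query yields a sequence, so something like SingleOrDefault() is still needed to pull out the one row, and an empty source produces no group at all.

```csharp
using System;
using System.Linq;

public static class GroupByConstantSketch
{
    // Hypothetical stand-in for a row of MyTable.
    public sealed class Row
    {
        public int FavoriteNumber { get; set; }
    }

    // Groups every row under the constant key `true`, so the query yields
    // at most one group, then reads both aggregates from that group.
    public static (int Min, int Max)? MinAndMax(IQueryable<Row> table)
    {
        var result =
            (from row in table
             group row by true into r
             select new
             {
                 min = r.Min(z => z.FavoriteNumber),
                 max = r.Max(z => z.FavoriteNumber)
             }).SingleOrDefault();

        // An empty source produces no groups, so there is no result row.
        if (result == null)
        {
            return null;
        }
        return (result.min, result.max);
    }

    public static void Main()
    {
        // In-memory data exposed as IQueryable (assumption: a real
        // DataContext table would be passed here instead).
        var table = new[] { 7, 3, 9, 1 }
            .Select(n => new Row { FavoriteNumber = n })
            .AsQueryable();

        var minMax = MinAndMax(table);
        if (minMax.HasValue)
        {
            Console.WriteLine($"min={minMax.Value.Min}, max={minMax.Value.Max}");
        }
    }
}
```

Returning a nullable tuple is just one way to surface the empty-table case; throwing or returning defaults would work too, depending on what the caller expects.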

#2


6  

I could find only this one, which produces somewhat clean SQL but is still not as efficient as select min(val), max(val) from table:

var r =
  (from min in items.OrderBy(i => i.Value)
   from max in items.OrderByDescending(i => i.Value)
   select new {min, max}).First();

The SQL is:

SELECT TOP (1)
    [t0].[Value],
    [t1].[Value] AS [Value2]
FROM
    [TestTable] AS [t0],
    [TestTable] AS [t1]
ORDER BY
    [t0].[Value],
    [t1].[Value] DESC

Another option is to run both the min and max queries over a single connection (see Multiple Active Result Sets (MARS)), or to use a stored procedure.

#3


2  

I'm not sure how to translate it into C# yet (I'm working on it).

This is the Haskell version:

minAndMax :: Ord a => [a] -> (a,a)
minAndMax [x]    = (x,x)
minAndMax (x:xs) = (min a x, max b x)
                   where (a,b) = minAndMax xs

The C# version should involve Aggregate somehow (I think).
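
One way the Aggregate idea could look in C# is sketched below. AggregateSketch and MinAndMax are hypothetical names, and the sketch assumes int values already in memory (LINQ to Objects). As far as I know, LINQ to SQL does not translate Aggregate to SQL, so this would not by itself produce a single MIN/MAX query; the rows would have to be fetched first.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class AggregateSketch
{
    // Folds the sequence once, carrying the running (min, max) pair.
    // Like the Haskell version, it is only meaningful for non-empty input:
    // an empty sequence returns the seed (int.MaxValue, int.MinValue).
    public static (int Min, int Max) MinAndMax(IEnumerable<int> source) =>
        source.Aggregate(
            (Min: int.MaxValue, Max: int.MinValue),
            (acc, x) => (Math.Min(acc.Min, x), Math.Max(acc.Max, x)));

    public static void Main()
    {
        var (min, max) = MinAndMax(new[] { 7, 3, 9, 1 });
        Console.WriteLine($"min={min}, max={max}"); // prints "min=1, max=9"
    }
}
```

The seed plays the role of the Haskell base case: instead of starting from the single-element list, the fold starts from the widest possible pair and narrows it with every element.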

#4


1  

A LINQ to SQL query is a single expression. Thus, if you can't express your query in a single expression (or don't like it once you do) then you have to look at other options.

Stored procedures, since they can have statements, enable you to accomplish this in a single round-trip. You will either have two output parameters or select a result set with two rows. Either way, you will need custom code to read the stored procedure's result.

(I don't personally see the need to avoid two round-trips here. It seems like a premature optimization, especially since you will probably have to jump through hoops to get it working. Not to mention the time you will spend justifying this decision and explaining the solution to other developers.)

Put another way: you've already answered your own question. "I can't use the .Min without grouping first", followed by "that crazy group clause seems silly, and the SQL it makes is more complex than it needs to be", are clues that the simple and easily-understood two-round-trip solution is the best expression of your intent (unless you write custom SQL).

#5


0  

You could select the whole table, and do your min and max operations in memory:

var cache = MyTable.ToList(); // select *: loads every row into memory

var min = cache.Min(row => row.FavoriteNumber);
var max = cache.Max(row => row.FavoriteNumber);

Depending on how large your dataset is, this might be the way to go about not hitting your database more than once.
