如何使用t-sql选择表中一列和一个重复的最高值

时间:2021-03-22 09:11:31

I have a table with some of the data a test performs on different samples that have different desirability (P1 is better than S1 which is better than S3 which is better than S2) AND sometimes any of these tests could be repeated. If a test fails, someone has to do it over.

我有一个表,其中包含一些测试对不同样本进行测试的数据(P1优于S1,优于S3,优于S2),有时可以重复这些测试中的任何一个。如果测试失败,有人必须完成。

I want my query to display only the best sample (P1>S1>S3>S2) AND only the repeated data, (not the original data).

我希望我的查询只显示最佳样本(P1> S1> S3> S2)和仅显示重复数据(不是原始数据)。

The following query works but as you can see it is rather long and complicated. I am still a junior SQL person so how can I accomplish the same thing with a shorter/better query ?

以下查询有效,但您可以看到它相当漫长而复杂。我仍然是一个初级SQL人员,那么如何用更短/更好的查询完成同样的事情呢?

I am trying to learn better SQL so I do not always have to ask these types of questions so explanations of why your query works better would be very helpful !

我正在努力学习更好的SQL,所以我不必总是问这些类型的问题,所以解释为什么你的查询更好地工作将是非常有帮助的!

   DECLARE @TempTable TABLE (Sample_ID varchar(10), TestRepeat int, TestResult varchar(1))
   --  In the end, ONLY the samples with Y should be displayed
   INSERT INTO @TempTable VALUES('61-0001-P1', 0, 'R') -- 1  Y
   INSERT INTO @TempTable VALUES('61-0002-P1', 0, 'R') -- 2  Y
   INSERT INTO @TempTable VALUES('61-0003-S1', 0, 'S') -- 3  Y
   INSERT INTO @TempTable VALUES('61-0004-S1', 0, 'R') -- 4  Y
   INSERT INTO @TempTable VALUES('61-0005-P1', 0, 'I') -- 5 
   INSERT INTO @TempTable VALUES('61-0005-P1', 1, 'S') -- 6  Y
   INSERT INTO @TempTable VALUES('61-0006-P1', 0, 'S') -- 7  Y
   INSERT INTO @TempTable VALUES('61-0006-S3', 0, 'R') -- 8
   INSERT INTO @TempTable VALUES('61-0007-P1', 0, 'S') -- 9  Y
   INSERT INTO @TempTable VALUES('61-0008-S3', 0, 'I') -- 10
   INSERT INTO @TempTable VALUES('61-0008-S3', 1, 'R') -- 11 Y
   INSERT INTO @TempTable VALUES('61-0009-P1', 0, 'R') -- 12 Y
   INSERT INTO @TempTable VALUES('61-0009-S1', 0, 'S') -- 13
   INSERT INTO @TempTable VALUES('61-0010-P1', 0, 'S') -- 14 Y
   INSERT INTO @TempTable VALUES('61-0011-S3', 0, 'S') -- 15 Y

   DECLARE @TempTable1 TABLE (Subject_ID varchar(7), Sample_ID varchar(10), SampleOrder int, TestRepeat int, TestResult varchar(1))

   INSERT @TempTable1
   SELECT  LEFT(Sample_ID,7) AS Subject_ID, 
     Sample_ID,
     SampleOrder = 
       CASE 
         WHEN RIGHT(Sample_ID,2) = 'P1' THEN 4
         WHEN RIGHT(Sample_ID,2) = 'S1' THEN 3
         WHEN RIGHT(Sample_ID,2) = 'S3' THEN 2
         WHEN RIGHT(Sample_ID,2) = 'S2' THEN 1
       END, 
     TestRepeat, 
     TestResult
   FROM @TempTable
   ORDER BY Subject_ID, SampleOrder;

   --SELECT * FROM @TempTable1;

   DECLARE @TempTable2 TABLE (Sample_ID varchar(10), TestRepeat int, TestResult varchar(1))

   INSERT @TempTable2 SELECT 
     tt1.Sample_ID,
     tt1.TestRepeat,
     tt1.TestResult
    FROM @TempTable1 tt1
    INNER JOIN (
        SELECT Subject_ID, MAX(SampleOrder) AS Max_SampleOrder
        FROM @TempTable1
        GROUP BY Subject_ID) subQ1 
        ON (tt1.Subject_ID=subQ1.Subject_ID AND tt1.SampleOrder=subQ1.Max_SampleOrder)
    ORDER BY tt1.Sample_ID;

   SELECT tt2.Sample_ID,
          tt2.TestRepeat, 
          tt2.TestResult
   FROM @TempTable2 tt2
   INNER JOIN (
        SELECT Sample_ID, MAX(TestRepeat) AS Max_TestRepeat
        FROM @TempTable2
        GROUP BY Sample_ID) subQ 
        ON (tt2.Sample_ID = subQ.Sample_ID AND tt2.TestRepeat=subq.Max_TestRepeat)
   ORDER BY tt2.Sample_ID, tt2.TestResult;

2 个解决方案

#1


3  

You can use row_number() in a sub query for this.

您可以在子查询中使用row_number()。

select Sample_ID, TestRepeat, TestResult
from
  (
    select Sample_ID, TestRepeat, TestResult,
           row_number() over(partition by left(Sample_ID, 7)
                             order by case right(Sample_ID,2)
                                        when 'P1' then 1
                                        when 'S1' then 2
                                        when 'S3' then 3
                                        when 'S2' then 4
                                       end, TestRepeat desc) as rn
    from @TempTable
  ) as T
where rn  = 1
order by Sample_ID

You can test the query on SE-Data

您可以在SE-Data上测试查询

Explanation:
row_number will enumerate your rows from 1. The partition by clause controls when numbering start from 1 again and the order by clause specifies the order of the numbering. The over() clause used above will give you a row_number() of 1 for the rows you are interested in. It is not possible to use row_number() in the where clause of a query so you have to use a derived table to be able to filter your rows on the result of row_number()

说明:row_number将枚举您的行1. partition by子句控制何时编号从1开始,order by子句指定编号的顺序。上面使用的over()子句将为您感兴趣的行提供1的row_number()。不可能在查询的where子句中使用row_number(),因此您必须使用派生表能够在row_number()的结果上过滤你的行

#2


0  

Try using common table expressions. You don't have to create all the table variables, it might be a little cleaner. Here is an example, I kept the original table variable from your example for the source data.

尝试使用公用表表达式。您不必创建所有表变量,它可能会更清洁一些。这是一个例子,我保留了源数据示例中的原始表变量。

DECLARE @TempTable TABLE (Sample_ID varchar(10), TestRepeat int, TestResult varchar(1))
--  In the end, ONLY the samples with Y should be displayed
INSERT INTO @TempTable VALUES('61-0001-P1', 0, 'R') -- 1  Y
INSERT INTO @TempTable VALUES('61-0002-P1', 0, 'R') -- 2  Y
INSERT INTO @TempTable VALUES('61-0003-S1', 0, 'S') -- 3  Y
INSERT INTO @TempTable VALUES('61-0004-S1', 0, 'R') -- 4  Y
INSERT INTO @TempTable VALUES('61-0005-P1', 0, 'I') -- 5 
INSERT INTO @TempTable VALUES('61-0005-P1', 1, 'S') -- 6  Y
INSERT INTO @TempTable VALUES('61-0006-P1', 0, 'S') -- 7  Y
INSERT INTO @TempTable VALUES('61-0006-S3', 0, 'R') -- 8
INSERT INTO @TempTable VALUES('61-0007-P1', 0, 'S') -- 9  Y
INSERT INTO @TempTable VALUES('61-0008-S3', 0, 'I') -- 10
INSERT INTO @TempTable VALUES('61-0008-S3', 1, 'R') -- 11 Y
INSERT INTO @TempTable VALUES('61-0009-P1', 0, 'R') -- 12 Y
INSERT INTO @TempTable VALUES('61-0009-S1', 0, 'S') -- 13
INSERT INTO @TempTable VALUES('61-0010-P1', 0, 'S') -- 14 Y
INSERT INTO @TempTable VALUES('61-0011-S3', 0, 'S') -- 15 Y


;with CTE1 as 
(
    SELECT  LEFT(Sample_ID,7) AS Subject_ID, 
        Sample_ID,
        SampleOrder = 
        CASE 
            WHEN RIGHT(Sample_ID,2) = 'P1' THEN 4
            WHEN RIGHT(Sample_ID,2) = 'S1' THEN 3
            WHEN RIGHT(Sample_ID,2) = 'S3' THEN 2
            WHEN RIGHT(Sample_ID,2) = 'S2' THEN 1
        END, 
        TestRepeat, 
        TestResult
    FROM @TempTable
),
CTE2 as
(
SELECT 
        tt1.Sample_ID,
        tt1.TestRepeat,
        tt1.TestResult
    FROM CTE1 tt1
    INNER JOIN (
        SELECT Subject_ID, MAX(SampleOrder) AS Max_SampleOrder
        FROM CTE1
        GROUP BY Subject_ID) subQ1 
        ON (tt1.Subject_ID=subQ1.Subject_ID AND tt1.SampleOrder=subQ1.Max_SampleOrder)
    ),
CTE3 as 
(
    SELECT tt2.Sample_ID,
            tt2.TestRepeat, 
            tt2.TestResult
    FROM CTE2 tt2
    INNER JOIN (
        SELECT Sample_ID, MAX(TestRepeat) AS Max_TestRepeat
        FROM CTE2
        GROUP BY Sample_ID) subQ 
        ON (tt2.Sample_ID = subQ.Sample_ID AND tt2.TestRepeat=subq.Max_TestRepeat)
)
select * 
from CTE3;

It produces the same results as your example, without so many table variables. It should also be much more efficient that the table variables.

它产生与您的示例相同的结果,没有这么多的表变量。它还应该比表变量更有效。

Here is a link to my blog with several examples of the Common Table Expressions. http://stevestedman.com/category/classes/cte/

这是我的博客的链接,其中包含几个公用表格表达的示例。 http://stevestedman.com/category/classes/cte/

Hope this helps. Let me know if you have any questions.

希望这可以帮助。如果您有任何疑问,请告诉我。

#1


3  

You can use row_number() in a sub query for this.

您可以在子查询中使用row_number()。

select Sample_ID, TestRepeat, TestResult
from
  (
    select Sample_ID, TestRepeat, TestResult,
           row_number() over(partition by left(Sample_ID, 7)
                             order by case right(Sample_ID,2)
                                        when 'P1' then 1
                                        when 'S1' then 2
                                        when 'S3' then 3
                                        when 'S2' then 4
                                       end, TestRepeat desc) as rn
    from @TempTable
  ) as T
where rn  = 1
order by Sample_ID

You can test the query on SE-Data

您可以在SE-Data上测试查询

Explanation:
row_number will enumerate your rows from 1. The partition by clause controls when numbering start from 1 again and the order by clause specifies the order of the numbering. The over() clause used above will give you a row_number() of 1 for the rows you are interested in. It is not possible to use row_number() in the where clause of a query so you have to use a derived table to be able to filter your rows on the result of row_number()

说明:row_number将枚举您的行1. partition by子句控制何时编号从1开始,order by子句指定编号的顺序。上面使用的over()子句将为您感兴趣的行提供1的row_number()。不可能在查询的where子句中使用row_number(),因此您必须使用派生表能够在row_number()的结果上过滤你的行

#2


0  

Try using common table expressions. You don't have to create all the table variables, it might be a little cleaner. Here is an example, I kept the original table variable from your example for the source data.

尝试使用公用表表达式。您不必创建所有表变量,它可能会更清洁一些。这是一个例子,我保留了源数据示例中的原始表变量。

DECLARE @TempTable TABLE (Sample_ID varchar(10), TestRepeat int, TestResult varchar(1))
--  In the end, ONLY the samples with Y should be displayed
INSERT INTO @TempTable VALUES('61-0001-P1', 0, 'R') -- 1  Y
INSERT INTO @TempTable VALUES('61-0002-P1', 0, 'R') -- 2  Y
INSERT INTO @TempTable VALUES('61-0003-S1', 0, 'S') -- 3  Y
INSERT INTO @TempTable VALUES('61-0004-S1', 0, 'R') -- 4  Y
INSERT INTO @TempTable VALUES('61-0005-P1', 0, 'I') -- 5 
INSERT INTO @TempTable VALUES('61-0005-P1', 1, 'S') -- 6  Y
INSERT INTO @TempTable VALUES('61-0006-P1', 0, 'S') -- 7  Y
INSERT INTO @TempTable VALUES('61-0006-S3', 0, 'R') -- 8
INSERT INTO @TempTable VALUES('61-0007-P1', 0, 'S') -- 9  Y
INSERT INTO @TempTable VALUES('61-0008-S3', 0, 'I') -- 10
INSERT INTO @TempTable VALUES('61-0008-S3', 1, 'R') -- 11 Y
INSERT INTO @TempTable VALUES('61-0009-P1', 0, 'R') -- 12 Y
INSERT INTO @TempTable VALUES('61-0009-S1', 0, 'S') -- 13
INSERT INTO @TempTable VALUES('61-0010-P1', 0, 'S') -- 14 Y
INSERT INTO @TempTable VALUES('61-0011-S3', 0, 'S') -- 15 Y


;with CTE1 as 
(
    SELECT  LEFT(Sample_ID,7) AS Subject_ID, 
        Sample_ID,
        SampleOrder = 
        CASE 
            WHEN RIGHT(Sample_ID,2) = 'P1' THEN 4
            WHEN RIGHT(Sample_ID,2) = 'S1' THEN 3
            WHEN RIGHT(Sample_ID,2) = 'S3' THEN 2
            WHEN RIGHT(Sample_ID,2) = 'S2' THEN 1
        END, 
        TestRepeat, 
        TestResult
    FROM @TempTable
),
CTE2 as
(
SELECT 
        tt1.Sample_ID,
        tt1.TestRepeat,
        tt1.TestResult
    FROM CTE1 tt1
    INNER JOIN (
        SELECT Subject_ID, MAX(SampleOrder) AS Max_SampleOrder
        FROM CTE1
        GROUP BY Subject_ID) subQ1 
        ON (tt1.Subject_ID=subQ1.Subject_ID AND tt1.SampleOrder=subQ1.Max_SampleOrder)
    ),
CTE3 as 
(
    SELECT tt2.Sample_ID,
            tt2.TestRepeat, 
            tt2.TestResult
    FROM CTE2 tt2
    INNER JOIN (
        SELECT Sample_ID, MAX(TestRepeat) AS Max_TestRepeat
        FROM CTE2
        GROUP BY Sample_ID) subQ 
        ON (tt2.Sample_ID = subQ.Sample_ID AND tt2.TestRepeat=subq.Max_TestRepeat)
)
select * 
from CTE3;

It produces the same results as your example, without so many table variables. It should also be much more efficient that the table variables.

它产生与您的示例相同的结果,没有这么多的表变量。它还应该比表变量更有效。

Here is a link to my blog with several examples of the Common Table Expressions. http://stevestedman.com/category/classes/cte/

这是我的博客的链接,其中包含几个公用表格表达的示例。 http://stevestedman.com/category/classes/cte/

Hope this helps. Let me know if you have any questions.

希望这可以帮助。如果您有任何疑问,请告诉我。