SQL Server:与TOP 1进行左外连接,最多选择一行

时间:2022-06-23 15:34:09

I basically need to do a left outer join on 2 tables (CarePlan and Referrals) problem is i need the newest Referral If it exists, its ok if it doesnt.

我基本上需要在两个表上做一个左外连接(看图和推荐)问题是我需要一个最新的推荐如果它存在,如果它不存在就可以。

I have these 2 queries 1. joins the CarePlan/Referral tables - creates duplicate careplans if there are multiple referrals for a careplan, or has no referral information at all (left outer join) 2. select the top 1 Referral based on Date, given a CarePlanId

我有这两个查询1。加入看护计划/推荐表——如果一个看护计划有多个推荐,或者根本没有推荐信息(左外连接)2,创建重复的看护计划。根据日期选择排名前1的推荐人

I'd like to combine these 2 so i grab all the careplans and its referrals if it exists, if it does - take only the newest referral

我想把这两个结合起来,这样我就能得到所有的护理计划和它的推荐,如果有的话,如果有的话,我只接受最新的推荐

select * from CarePlan c //query 1
left outer join Referral r on 
r.CarePlanId = c.CarePlanId


select top 1 * from Referral r //query 2
where r.CarePlanId = '1'
order by ReferralDate desc

EDIT:

编辑:

The first query gives me something like this:

第一个查询给出如下内容:

CarePlanID    ReferralId     ReferralDate
----------    ----------     ------------
1             1              05/15/12
2             NULL           NULL
1             2              05/10/12  //Old date, dont want this careplan

The second query will give me the referral with the newest date

第二个查询将给出最新日期的查询

ReferralId    ReferralDate
----------    ------------
1             05/15/12

The Referral data, may have 0 or more referrals belonging to a Careplan

转诊数据,可能有0或更多的转诊属于护理计划

ReferralID  CarePlanId    Date
----------  ----------    ----
1           1             05/15/12
2           1             05/10/12

Ultimately I want a query that gives me careplans with referrals that have the newest date, or null for referrals if it doesnt have it

最后,我想要一个查询,它给我提供了一个最新的日期的推荐,或者如果没有的话,它将会是无效的。

like this:

是这样的:

CarePlanId   ReferralId    ReferralDate
----------   ----------    ------------
1            1             05/15/12
2            NULL          NULL

Thanks - i hope this makes sense

谢谢-我希望这是有道理的

3 个解决方案

#1


38  

select *
from CarePlan c
outer apply (
    select top 1 * --top N rows
    from Referral r
    where r.CarePlanId = c.CarePlanId --join condition
    order by /*fill this in!*/
) x

Be aware that this forces a loop join due to optimizer weaknesses up to and including version 2014.

请注意,这将强制执行循环连接,因为到2014年版本为止,包括2014版之前,优化器都存在缺陷。

#2


6  

I know this question is older, but there is another approach which I feel is under-utilized:

我知道这个问题由来已久,但我认为还有另一种方法没有得到充分利用:

You can join tables back to themselves and use an operator to find the "most recent" record.

您可以将表连接回自己,并使用操作符查找“最近的”记录。

Answer

回答

SELECT CP.CarePlanId, R.ReferralId, R.ReferralDate
FROM CarePlan CP
LEFT OUTER JOIN Referral R ON R.CarePlanId = CP.CarePlanId
LEFT OUTER JOIN Referral R_NEWER ON R.CarePlanId = R_NEWER.CarePlanId AND R.ReferralDate < R_NEWER.ReferralDate
WHERE R_NEWER.ReferralId IS NULL

results:

结果:

CP.CarePlanId   R.ReferralId    R.ReferralDate
----------      ----------      ------------
1               1               05/15/12
2               NULL            NULL

Explanation

解释

Let's break this down. You are basically saying, for each referral record, (left outer) join to every other referral record that is associated with the same CarePlanId but ONLY where there is a newer ReferralDate.

让我们分解。你基本上是说,对于每个推荐记录,(左外)连接到与同一个CarePlanId相关联的所有其他推荐记录,但只在有更新的推荐日期的地方。

Here is the query without the where clause (along with some addition info from the R_NEWER table):

这里是没有where子句的查询(以及来自r_new表的一些附加信息):

SELECT CP.CarePlanId, R.ReferralId, R.ReferralDate, R_NEWER.ReferralId, R.NEWER.ReferralDate
FROM CarePlan CP
LEFT OUTER JOIN Referral R ON R.CarePlanId = CP.CarePlanId
LEFT OUTER JOIN Referral R_NEWER ON R.CarePlanId = R_NEWER.CarePlanId AND R.ReferralDate < R_NEWER.ReferralDate

Here is the result of that query:

以下是该查询的结果:

CP.CarePlanId   R.ReferralId    R.ReferralDate  R_NEWER.ReferralId  R_NEWER.ReferralDate
----------      ----------      ------------    ------------        ------------    
1               1               05/15/12        NULL                NULL
2               NULL            NULL            NULL                NULL
1               2               05/10/12        1                   05/15/12

As you can see, only referral Id 2 (3rd record above) found a "newer" record to join to in the referral table (i.e. referral Id 1). Referral Id 1 (1st record above) did NOT find a "newer" referral (for the same CarePlanId).

如您所见,只有转诊Id 2(上面的第三条记录)在转诊表中(即转诊Id 1)中发现了一个“较新的”记录。转介Id 1(上面的第1条)没有找到“较新的”转诊(对于相同的监护)。

So, with this in mind, now we just add the where clause back:

记住这个,现在我们把where子句加回来

SELECT CP.CarePlanId, R.ReferralId, R.ReferralDate, R_NEWER.ReferralId, R.NEWER.ReferralDate
FROM CarePlan CP
LEFT OUTER JOIN Referral R ON R.CarePlanId = CP.CarePlanId
LEFT OUTER JOIN Referral R_NEWER ON R.CarePlanId = R_NEWER.CarePlanId AND R.ReferralDate < R_NEWER.ReferralDate
WHERE R_NEWER.ReferralId IS NULL

and get:

并获得:

CP.CarePlanId   R.ReferralId    R.ReferralDate  R_NEWER.ReferralId  R_NEWER.ReferralDate
----------      ----------      ------------    ------------        ------------    
1               1               05/15/12        NULL                NULL
2               NULL            NULL            NULL                NULL

At this point, just remove your R_NEWER columns from your SELECT as they are not needed any more and you have your answer.

此时,只需从SELECT中删除r_updates,因为它们不再需要,您就得到了答案。

It is important to remember that the "where" applies AFTER the joins have been occurred, but the ON statement occurs at the time of the join. To make this more understandable for me, I always try to write the SELECTs and JOINs and return columns from each table I am joining and then add my WHERE clauses once I have a clear picture of what is returned.

重要的是要记住,在连接发生之后应用“where”,但是ON语句发生在连接的时候。为了让我更容易理解,我总是尝试从每个表中编写select和join并返回列,然后在我对返回的内容有一个清晰的图片后添加WHERE子句。

Caveat This approach works great in most cases, but it is possible to have duplicate rows if you had 2 referrals (for the same CarePlanId) with the date of 05/15/12 and that date was the "most recent." To work around that, you could extend your join(s) to limit based on the "highest" ReferralId if that scenario arises.

注意,这种方法在大多数情况下都很有效,但是如果您有两个引用(对于相同的管理员),日期为05/15/12,并且该日期是“最近的”,那么可能会有重复的行。为了解决这个问题,如果出现这种情况,您可以将您的连接扩展到基于“最高”ReferralId的限制。

#3


1  

Just a guess. I'm not sure if EF is going to have issues with CTE syntax - can you force EF to call a stored procedure so you're not handcuffed by the subset of functionality supported by EF?

只是一个猜测。我不确定EF是否会遇到CTE语法的问题——你能强迫EF调用一个存储过程,这样你就不会被EF支持的功能子集束缚住吗?

;WITH r AS 
(
  SELECT CarePlanId, MAX(ReferralDate)
    FROM dbo.Referrals GROUP BY CarePlanId
)
SELECT * FROM dbo.CarePlan AS c
LEFT OUTER JOIN r 
  ON r.CarePlanId = c.CarePlanId;

#1


38  

select *
from CarePlan c
outer apply (
    select top 1 * --top N rows
    from Referral r
    where r.CarePlanId = c.CarePlanId --join condition
    order by /*fill this in!*/
) x

Be aware that this forces a loop join due to optimizer weaknesses up to and including version 2014.

请注意,这将强制执行循环连接,因为到2014年版本为止,包括2014版之前,优化器都存在缺陷。

#2


6  

I know this question is older, but there is another approach which I feel is under-utilized:

我知道这个问题由来已久,但我认为还有另一种方法没有得到充分利用:

You can join tables back to themselves and use an operator to find the "most recent" record.

您可以将表连接回自己,并使用操作符查找“最近的”记录。

Answer

回答

SELECT CP.CarePlanId, R.ReferralId, R.ReferralDate
FROM CarePlan CP
LEFT OUTER JOIN Referral R ON R.CarePlanId = CP.CarePlanId
LEFT OUTER JOIN Referral R_NEWER ON R.CarePlanId = R_NEWER.CarePlanId AND R.ReferralDate < R_NEWER.ReferralDate
WHERE R_NEWER.ReferralId IS NULL

results:

结果:

CP.CarePlanId   R.ReferralId    R.ReferralDate
----------      ----------      ------------
1               1               05/15/12
2               NULL            NULL

Explanation

解释

Let's break this down. You are basically saying, for each referral record, (left outer) join to every other referral record that is associated with the same CarePlanId but ONLY where there is a newer ReferralDate.

让我们分解。你基本上是说,对于每个推荐记录,(左外)连接到与同一个CarePlanId相关联的所有其他推荐记录,但只在有更新的推荐日期的地方。

Here is the query without the where clause (along with some addition info from the R_NEWER table):

这里是没有where子句的查询(以及来自r_new表的一些附加信息):

SELECT CP.CarePlanId, R.ReferralId, R.ReferralDate, R_NEWER.ReferralId, R.NEWER.ReferralDate
FROM CarePlan CP
LEFT OUTER JOIN Referral R ON R.CarePlanId = CP.CarePlanId
LEFT OUTER JOIN Referral R_NEWER ON R.CarePlanId = R_NEWER.CarePlanId AND R.ReferralDate < R_NEWER.ReferralDate

Here is the result of that query:

以下是该查询的结果:

CP.CarePlanId   R.ReferralId    R.ReferralDate  R_NEWER.ReferralId  R_NEWER.ReferralDate
----------      ----------      ------------    ------------        ------------    
1               1               05/15/12        NULL                NULL
2               NULL            NULL            NULL                NULL
1               2               05/10/12        1                   05/15/12

As you can see, only referral Id 2 (3rd record above) found a "newer" record to join to in the referral table (i.e. referral Id 1). Referral Id 1 (1st record above) did NOT find a "newer" referral (for the same CarePlanId).

如您所见,只有转诊Id 2(上面的第三条记录)在转诊表中(即转诊Id 1)中发现了一个“较新的”记录。转介Id 1(上面的第1条)没有找到“较新的”转诊(对于相同的监护)。

So, with this in mind, now we just add the where clause back:

记住这个,现在我们把where子句加回来

SELECT CP.CarePlanId, R.ReferralId, R.ReferralDate, R_NEWER.ReferralId, R.NEWER.ReferralDate
FROM CarePlan CP
LEFT OUTER JOIN Referral R ON R.CarePlanId = CP.CarePlanId
LEFT OUTER JOIN Referral R_NEWER ON R.CarePlanId = R_NEWER.CarePlanId AND R.ReferralDate < R_NEWER.ReferralDate
WHERE R_NEWER.ReferralId IS NULL

and get:

并获得:

CP.CarePlanId   R.ReferralId    R.ReferralDate  R_NEWER.ReferralId  R_NEWER.ReferralDate
----------      ----------      ------------    ------------        ------------    
1               1               05/15/12        NULL                NULL
2               NULL            NULL            NULL                NULL

At this point, just remove your R_NEWER columns from your SELECT as they are not needed any more and you have your answer.

此时,只需从SELECT中删除r_updates,因为它们不再需要,您就得到了答案。

It is important to remember that the "where" applies AFTER the joins have been occurred, but the ON statement occurs at the time of the join. To make this more understandable for me, I always try to write the SELECTs and JOINs and return columns from each table I am joining and then add my WHERE clauses once I have a clear picture of what is returned.

重要的是要记住,在连接发生之后应用“where”,但是ON语句发生在连接的时候。为了让我更容易理解,我总是尝试从每个表中编写select和join并返回列,然后在我对返回的内容有一个清晰的图片后添加WHERE子句。

Caveat This approach works great in most cases, but it is possible to have duplicate rows if you had 2 referrals (for the same CarePlanId) with the date of 05/15/12 and that date was the "most recent." To work around that, you could extend your join(s) to limit based on the "highest" ReferralId if that scenario arises.

注意,这种方法在大多数情况下都很有效,但是如果您有两个引用(对于相同的管理员),日期为05/15/12,并且该日期是“最近的”,那么可能会有重复的行。为了解决这个问题,如果出现这种情况,您可以将您的连接扩展到基于“最高”ReferralId的限制。

#3


1  

Just a guess. I'm not sure if EF is going to have issues with CTE syntax - can you force EF to call a stored procedure so you're not handcuffed by the subset of functionality supported by EF?

只是一个猜测。我不确定EF是否会遇到CTE语法的问题——你能强迫EF调用一个存储过程,这样你就不会被EF支持的功能子集束缚住吗?

;WITH r AS 
(
  SELECT CarePlanId, MAX(ReferralDate)
    FROM dbo.Referrals GROUP BY CarePlanId
)
SELECT * FROM dbo.CarePlan AS c
LEFT OUTER JOIN r 
  ON r.CarePlanId = c.CarePlanId;