SQL query - Join that returns the first two records of the joined table

Time: 2022-12-29 01:55:34

I have two tables:

Patient

  • pkPatientId
  • FirstName
  • Surname

PatientStatus

  • pkPatientStatusId
  • fkPatientId
  • StatusCode
  • StartDate
  • EndDate

Patient -> PatientStatus is a one-to-many relationship.

I am wondering whether it's possible in SQL to do a join that returns only the first two PatientStatus records for each Patient. If a Patient has only one PatientStatus record, that Patient should not appear in the results.

The normal join of my query is:

SELECT * FROM Patient p INNER JOIN PatientStatus ps ON p.pkPatientId = ps.fkPatientId
ORDER BY ps.fkPatientId, ps.StartDate
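
To try out the answers below, a small data set helps. The following is only a sketch: the question gives column names but no types, so the column types, the constraints, and every sample row here are assumptions made up for illustration. The DATE type needs SQL Server 2008 or later; on older versions DATETIME would be the substitute.

```sql
-- Column types and constraints are assumed; the question lists only column names.
CREATE TABLE Patient (
    pkPatientId INT PRIMARY KEY,
    FirstName   VARCHAR(50),
    Surname     VARCHAR(50)
);

CREATE TABLE PatientStatus (
    pkPatientStatusId INT PRIMARY KEY,
    fkPatientId       INT NOT NULL REFERENCES Patient (pkPatientId),
    StatusCode        VARCHAR(10),
    StartDate         DATE NOT NULL,
    EndDate           DATE
);

-- Hypothetical patients.
INSERT INTO Patient (pkPatientId, FirstName, Surname) VALUES (1, 'Ann', 'Lee');
INSERT INTO Patient (pkPatientId, FirstName, Surname) VALUES (2, 'Bob', 'Ray');
INSERT INTO Patient (pkPatientId, FirstName, Surname) VALUES (3, 'Cy',  'Fox');

-- Patient 1 has three statuses, patient 2 has one, patient 3 has two.
INSERT INTO PatientStatus VALUES (1, 1, 'A', '2020-01-01', '2020-02-01');
INSERT INTO PatientStatus VALUES (2, 1, 'B', '2020-02-01', '2020-03-01');
INSERT INTO PatientStatus VALUES (3, 1, 'C', '2020-03-01', NULL);
INSERT INTO PatientStatus VALUES (4, 2, 'A', '2020-01-15', NULL);
INSERT INTO PatientStatus VALUES (5, 3, 'A', '2020-05-01', '2020-06-01');
INSERT INTO PatientStatus VALUES (6, 3, 'B', '2020-06-01', NULL);
```

With these rows, the result the question asks for is the two earliest statuses of patient 1 and both statuses of patient 3. Patient 2 has a single status and drops out.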

8 solutions

#1


6  

A CTE is probably your best bet if you're on SQL Server 2005 or later, but if you want something a little more compatible with other platforms, this should work:

SELECT
     P.pkPatientID,
     P.FirstName,
     P.Surname,
     PS1.StatusCode AS FirstStatusCode,
     PS1.StartDate AS FirstStatusStartDate,
     PS1.EndDate AS FirstStatusEndDate,
     PS2.StatusCode AS SecondStatusCode,
     PS2.StartDate AS SecondStatusStartDate,
     PS2.EndDate AS SecondStatusEndDate
FROM
     Patient P
INNER JOIN PatientStatus PS1 ON
     PS1.fkPatientID = P.pkPatientID
INNER JOIN PatientStatus PS2 ON
     PS2.fkPatientID = P.pkPatientID AND
     PS2.StartDate > PS1.StartDate
LEFT OUTER JOIN PatientStatus PS3 ON
     PS3.fkPatientID = P.pkPatientID AND
     PS3.StartDate < PS1.StartDate
LEFT OUTER JOIN PatientStatus PS4 ON
     PS4.fkPatientID = P.pkPatientID AND
     PS4.StartDate > PS1.StartDate AND
     PS4.StartDate < PS2.StartDate
WHERE
     PS3.pkPatientStatusID IS NULL AND
     PS4.pkPatientStatusID IS NULL

It does seem a little odd to me that you would want the first two statuses instead of the last two, but I'll assume that you know what you want.

You can also use WHERE NOT EXISTS instead of the PS3 and PS4 joins if you get better performance with that.
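
The NOT EXISTS form mentioned here could look roughly like the sketch below. It runs against the Patient and PatientStatus tables from the question, and, like the join version, it assumes StartDate is unique within each patient (ties would produce extra rows). Whether it performs better is something to measure on your own data.

```sql
SELECT
     P.pkPatientId,
     P.FirstName,
     P.Surname,
     PS1.StatusCode AS FirstStatusCode,
     PS1.StartDate  AS FirstStatusStartDate,
     PS2.StatusCode AS SecondStatusCode,
     PS2.StartDate  AS SecondStatusStartDate
FROM Patient P
INNER JOIN PatientStatus PS1 ON PS1.fkPatientId = P.pkPatientId
INNER JOIN PatientStatus PS2 ON PS2.fkPatientId = P.pkPatientId
                            AND PS2.StartDate > PS1.StartDate
WHERE NOT EXISTS (   -- no status earlier than PS1, so PS1 is the first one
          SELECT 1
          FROM PatientStatus PS3
          WHERE PS3.fkPatientId = P.pkPatientId
            AND PS3.StartDate < PS1.StartDate)
  AND NOT EXISTS (   -- no status strictly between PS1 and PS2, so PS2 is the second one
          SELECT 1
          FROM PatientStatus PS4
          WHERE PS4.fkPatientId = P.pkPatientId
            AND PS4.StartDate > PS1.StartDate
            AND PS4.StartDate < PS2.StartDate);
```

Because both PS1 and PS2 are inner joins, a patient with only one status finds no PS2 row and is left out, which matches the requirement in the question.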

#2


4  

Here is my attempt. It uses a common table expression, so it should work on SQL Server 2005 and SQL Server 2008 (tested on SQL Server 2008):

WITH CTE AS
(
    SELECT  fkPatientId
          , StatusCode
          -- add more columns here
          , ROW_NUMBER() OVER
    (
    PARTITION BY fkPatientId ORDER BY StartDate) AS [Row_Number] 
    from PatientStatus
    where fkPatientId in
    (
        select fkPatientId
        from PatientStatus
        group by fkPatientId
        having COUNT(*) >= 2
    )
)
SELECT p.pkPatientId,
    p.FirstName,
    CTE.StatusCode  
FROM [Patient] as p
    INNER JOIN CTE
        ON p.[pkPatientId] = CTE.fkPatientId
WHERE CTE.[Row_Number] = 1 
or CTE.[Row_Number] = 2
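
A variant of the CTE approach can drop the IN subquery by counting the rows per patient with a windowed COUNT(*). This is a sketch rather than the answerer's query: the names Ranked, RowNum and StatusCount are made up for illustration, and the tie-breaker on pkPatientStatusId is an added assumption so that rows with equal StartDate values still get a stable order.

```sql
WITH Ranked AS
(
    SELECT  fkPatientId
          , StatusCode
          , StartDate
            -- position of each status within its patient, earliest first
          , ROW_NUMBER() OVER (PARTITION BY fkPatientId
                               ORDER BY StartDate, pkPatientStatusId) AS RowNum
            -- total number of statuses the patient has
          , COUNT(*)     OVER (PARTITION BY fkPatientId)              AS StatusCount
    FROM PatientStatus
)
SELECT p.pkPatientId,
       p.FirstName,
       r.StatusCode,
       r.StartDate
FROM Patient AS p
    INNER JOIN Ranked AS r
        ON p.pkPatientId = r.fkPatientId
WHERE r.RowNum <= 2          -- keep only the first two statuses
  AND r.StatusCount >= 2     -- skip patients with a single status
ORDER BY p.pkPatientId, r.StartDate;
```

Both window functions share the same partition, so the table is read once instead of once for the ranking and once for the GROUP BY subquery.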

#3


2  

EDIT: Both of the following solutions require that PatientStatus.StartDate is unique within each patient.

The traditional way (SQL Server 2000 compatible):

SELECT 
  p.pkPatientId,
  p.FirstName,
  p.Surname,
  ps.StatusCode,
  ps.StartDate,
  ps.EndDate
FROM 
  Patient p 
  INNER JOIN PatientStatus ps ON 
    p.pkPatientId = ps.fkPatientId
    AND ps.StartDate IN (
      SELECT TOP 2 StartDate 
      FROM     PatientStatus 
      WHERE    fkPatientId = ps.fkPatientId
      ORDER BY StartDate  /* DESC (to switch between first/last records) */
    )
WHERE 
  EXISTS (
    SELECT   1 
    FROM     PatientStatus
    WHERE    fkPatientId = p.pkPatientId
    GROUP BY fkPatientId
    HAVING   COUNT(*) >= 2
  )
ORDER BY 
  ps.fkPatientId, 
  ps.StartDate

A more interesting alternative (you'd have to try how well it performs in comparison):

SELECT 
  p.pkPatientId,
  p.FirstName,
  p.Surname,
  ps.StatusCode,
  ps.StartDate,
  ps.EndDate
FROM 
  Patient p 
  INNER JOIN PatientStatus ps ON p.pkPatientId = ps.fkPatientId
WHERE
  /* the "2" is the maximum number of rows returned */
  2 > (
    SELECT 
      COUNT(*)
    FROM 
      Patient p_i 
      INNER JOIN PatientStatus ps_i ON p_i.pkPatientId = ps_i.fkPatientId
    WHERE
      ps_i.fkPatientId = ps.fkPatientId
      AND ps_i.StartDate < ps.StartDate
      /* switch between "<" and ">" to get the first/last rows */
  )
  AND EXISTS (
    SELECT   1 
    FROM     PatientStatus
    WHERE    fkPatientId = p.pkPatientId
    GROUP BY fkPatientId
    HAVING   COUNT(*) >= 2
  )
ORDER BY 
  ps.fkPatientId, 
  ps.StartDate

Side note: For MySQL the latter query might be the only alternative - until LIMIT is supported in sub-queries.

EDIT: I added a condition that excludes patients with only one PatientStatus record. (Thanks for the tip, Ryan!)
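
If StartDate cannot be guaranteed unique within a patient, the counting query can be given a tie-breaker. The sketch below is an assumption-laden variation, not part of the original answer: it treats pkPatientStatusId as the secondary sort key and assumes StartDate is never NULL.

```sql
SELECT
  p.pkPatientId,
  p.FirstName,
  p.Surname,
  ps.StatusCode,
  ps.StartDate,
  ps.EndDate
FROM
  Patient p
  INNER JOIN PatientStatus ps ON p.pkPatientId = ps.fkPatientId
WHERE
  /* keep a row only if fewer than 2 rows of the same patient sort before it */
  2 > (
    SELECT COUNT(*)
    FROM   PatientStatus ps_i
    WHERE  ps_i.fkPatientId = ps.fkPatientId
      AND (   ps_i.StartDate < ps.StartDate
           /* equal dates: fall back to the primary key as tie-breaker */
           OR (    ps_i.StartDate = ps.StartDate
               AND ps_i.pkPatientStatusId < ps.pkPatientStatusId))
  )
  /* exclude patients that have only one status record */
  AND 2 <= (
    SELECT COUNT(*)
    FROM   PatientStatus
    WHERE  fkPatientId = p.pkPatientId
  )
ORDER BY
  ps.fkPatientId,
  ps.StartDate,
  ps.pkPatientStatusId;
```

Since (StartDate, pkPatientStatusId) is a total order within a patient, exactly two rows per qualifying patient pass the first condition even when dates repeat.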

#4


1  

I have not tested this, but it could work:

SELECT * FROM (
  SELECT p.pkPatientId /*, your other select columns here */, row_number() over(PARTITION BY ps.fkPatientId ORDER BY ps.StartDate) as rownumber
  FROM Patient p INNER JOIN PatientStatus ps ON p.pkPatientId = ps.fkPatientId) t
where rownumber between 1 and 2

If this does not work, see this link.

#5


1  

Adding this WHERE clause to the outer query of Tomalak's first solution will prevent Patients with fewer than two status records from being returned. You can also AND it into the WHERE clause of the second query for the same result.

WHERE pkPatientId IN (
    SELECT pkPatientID 
    FROM Patient JOIN PatientStatus ON pkPatientId = fkPatientId
    GROUP BY pkPatientID HAVING Count(*) >= 2
)

#6


1  

Check whether your server supports window functions and the QUALIFY clause (Teradata does; SQL Server does not):

SELECT * 
FROM Patient p
LEFT JOIN PatientStatus ps ON p.pkPatientId = ps.fkPatientId
QUALIFY ROW_NUMBER() OVER (PARTITION BY ps.fkPatientId ORDER BY ps.StartDate) < 3

Another possibility, which should work with SQL Server 2005:

SELECT * FROM Patient p
LEFT JOIN ( 
    SELECT *, ROW_NUMBER() OVER (PARTITION BY fkPatientId ORDER BY StartDate) rn
    FROM PatientStatus) ps
ON p.pkPatientId = ps.fkPatientID 
and ps.rn < 3

#7


0  

Here is how I would approach this:

-- Patients with at least 2 status records
with PatientsWithEnoughRecords as (
    select fkPatientId
        from PatientStatus as ps
        group by 
            fkPatientId
        having
            count(*) >= 2
)
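select ps.*
    from PatientsWithEnoughRecords as er 
        cross apply (
            -- top 2 is evaluated once per patient here, not once for the whole result
            select top 2 *
                from PatientStatus as s
                where s.fkPatientId = er.fkPatientId
                order by s.StartDate asc
        ) as ps
    order by ps.fkPatientId, ps.StartDate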

I am not sure what determines the "first" two status records in your case, so I assumed you want the two earliest StartDates. Modify the order by on StartDate to get the records that you are interested in.

Edit: SQL Server 2000 doesn't support CTEs, so this solution will indeed only work directly on 2005 and later.

#8


0  

Ugly, but this one does not rely on uniqueness of StartDate and works on SQL Server 2000:

select * 
from Patient p 
join PatientStatus ps on p.pkPatientId=ps.fkPatientId
where pkPatientStatusId in (
 select top 2 pkPatientStatusId 
 from PatientStatus 
 where fkPatientId=ps.fkPatientId 
 order by StartDate
) and pkPatientId in (
 select fkPatientId
 from PatientStatus
 group by fkPatientId
 having count(*)>=2
)
