如何在每个列中选择下一个非空值。t - sql

时间:2023-01-02 09:31:39

I am trying to select the last non-null value per column based on a date value.

我正在尝试根据日期值为每列选择最后一个非空值。

I have a table that looks like this -

我有一张像这样的桌子

Email           Name1   Name2   Job     Date
Test1@test.com  Ron     NULL    NULL    2015-01-01 00:00:00.000
Test1@test.com  Dave    Smith   NULL    2014-01-01 00:00:00.000
Test1@test.com  NULL    NULL    NULL    2013-01-01 00:00:00.000
Test2@test.com  NULL    Smith   NULL    2014-01-01 00:00:00.000
Test2@test.com  NULL    Ford    Plumber 2015-01-01 00:00:00.000`

I want to display the newest non-null value for each column per email address.

我想要显示每个邮件地址每个列的最新非空值。

The output should be -

输出应该是-

Email           Name1   Name2   Job
Test1@test.com  Ron     Smith   NULL
Test2@test.com  NULL    Ford    Plumber

I have already written some pretty ugly SQL to solve this problem however I want to apply this logic to another table with a lot more columns.

我已经编写了一些非常难看的SQL来解决这个问题,但是我想将这个逻辑应用到另一个包含更多列的表。

My Question is - Is there an easier way to do this without having to join for each column?

我的问题是——是否有一种更简单的方法可以做到这一点,而不必为每个列都加入?

Current solution is as follows -

目前的解决方案如下:

select distinct  a.[Email],b.[Name1],c.[Name2],d.[job] from 
(
select [Email] from #test
)
A
left join 
(
SELECT [Email],
 FIRST_VALUE([Name1]) over(partition by [Email] order by [Date] desc) as [Name1]
from #test
where [Name1] is not null
) b
on a.[Email] = b.[Email]
left join 
(
SELECT [Email],
FIRST_VALUE([Name2]) over(partition by [Email] order by [Date] desc) as [Name2]
from  #test
where [Name2] is not null
) c
on a.[Email] = c.[Email]
left join 
(
select [Email],
FIRST_VALUE([Job]) over(partition by [Email] order by [Date] desc) as [Job]
from #test
where  [Job] is not null
) d
on a.[Email] = d.[Email]

Here's the DDL/DML for the example table if this helps -

下面是示例表的DDL/DML,如果这有帮助的话

create table #test
([Email] nvarchar(50),
[Name1]  nvarchar(50),
[Name2] nvarchar(50),
[Job] nvarchar(50),
[Date] datetime)

insert into #test
values
('Test1@test.com', 'Ron', null,null,'20150101'),
('Test1@test.com', 'Dave' ,'Smith',null, '20140101'),
('Test1@test.com', null, null, null ,'20130101'),
('Test2@test.com', null, 'Smith', null, '20140101'),
('Test2@test.com', null, 'Ford', 'Plumber','20150101')

2 个解决方案

#1


2  

There are ways that don't require so many joins. None are simple because SQL Server doesn't support the ignore nulls option on lag().

有一些方法不需要这么多连接。它们都不简单,因为SQL Server不支持lag()上的ignore nulls选项。

Basically, you'll need to do logic on each column. One method without a subquery is:

基本上,您需要在每个列上执行逻辑。没有子查询的一种方法是:

select distinct email,
       first_value(name1) over (partition by email
                                order by (case when name1 is not null then date else '2000-01-01' end) desc
                               ) as name1,
       . . .
from #test;

An alternative uses outer apply:

另一种选择是使用外部应用:

select t.email, name1, . . .
from (select distinct email from #test t) t outer apply
     (select top 1 name1
      from #test t2
      where t2.email = t.email and name1 is not null
      order by date desc
     ) name1 . . .

#2


2  

You can use FIRST_VALUE with DISTINCT:

您可以使用FIRST_VALUE与不同的:

SELECT DISTINCT Email, 
       FIRST_VALUE(Name1) OVER (PARTITION BY Email 
                                ORDER BY CASE 
                                           WHEN Name1 IS NULL THEN '19000101' 
                                           ELSE [Date] 
                                         END DESC) AS Name1,
       FIRST_VALUE(Name2) OVER (PARTITION BY Email 
                                ORDER BY CASE 
                                           WHEN Name2 IS NULL THEN '19000101' 
                                           ELSE [Date] 
                                         END DESC) AS Name2,
       FIRST_VALUE(Job) OVER (PARTITION BY Email 
                                ORDER BY CASE 
                                           WHEN Job IS NULL THEN '19000101' 
                                           ELSE [Date] 
                                         END DESC) AS Job
FROM test 

#1


2  

There are ways that don't require so many joins. None are simple because SQL Server doesn't support the ignore nulls option on lag().

有一些方法不需要这么多连接。它们都不简单,因为SQL Server不支持lag()上的ignore nulls选项。

Basically, you'll need to do logic on each column. One method without a subquery is:

基本上,您需要在每个列上执行逻辑。没有子查询的一种方法是:

select distinct email,
       first_value(name1) over (partition by email
                                order by (case when name1 is not null then date else '2000-01-01' end) desc
                               ) as name1,
       . . .
from #test;

An alternative uses outer apply:

另一种选择是使用外部应用:

select t.email, name1, . . .
from (select distinct email from #test t) t outer apply
     (select top 1 name1
      from #test t2
      where t2.email = t.email and name1 is not null
      order by date desc
     ) name1 . . .

#2


2  

You can use FIRST_VALUE with DISTINCT:

您可以使用FIRST_VALUE与不同的:

SELECT DISTINCT Email, 
       FIRST_VALUE(Name1) OVER (PARTITION BY Email 
                                ORDER BY CASE 
                                           WHEN Name1 IS NULL THEN '19000101' 
                                           ELSE [Date] 
                                         END DESC) AS Name1,
       FIRST_VALUE(Name2) OVER (PARTITION BY Email 
                                ORDER BY CASE 
                                           WHEN Name2 IS NULL THEN '19000101' 
                                           ELSE [Date] 
                                         END DESC) AS Name2,
       FIRST_VALUE(Job) OVER (PARTITION BY Email 
                                ORDER BY CASE 
                                           WHEN Job IS NULL THEN '19000101' 
                                           ELSE [Date] 
                                         END DESC) AS Job
FROM test