Add missing rows using a production date table

Time: 2022-08-24 20:17:33

This question is related to this SO post

Rather than using a recursive CTE, how do I add in missing data (deemed missing via date) using a DimDate table?

I have the following two tables:

create table the_table 
(
  [Date] datetime,
  Category2 varchar(10),
  Amount INT
)
insert into the_table
values
( '01 jan 2012', 'xx', 10),
( '03 jan 2012', 'yy', 50)


create table DimDate 
(
  [Date] datetime
)
insert into DimDate
values
( '01 jan 2012'),
( '02 jan 2012'),
( '03 jan 2012'),
( '04 jan 2012')

These are the results I'm trying to get to. I hadn't bothered with a recursive CTE because I wrongly assumed it would be much easier using our warehouse DimDate table.

OK, I might have stumbled on a possible solution. Please poke holes in the following if it's wrong:

select
    coalesce(x.[Date], y.[Date]) as [Date],
    coalesce(x.Category2, y.Category2) as Category2,
    isnull(x.Amount, 0) as Amount
from the_table x
full outer join
(
    select
        d.[Date],
        t.Category2
    from the_table t
    cross join DimDate d
) y
    on  x.Category2 = y.Category2
    and x.[Date] = y.[Date]
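
One hole worth poking: the derived table cross joins every row of the_table with DimDate, so a category that appears on more than one date is repeated and the result contains duplicate rows. A minimal read-only sketch that avoids this, assuming at most one row per date and category in the_table, cross joins the distinct categories instead and then left joins back to the data:

```sql
-- Read-only sketch: one row per (date, category); the_table is not modified.
SELECT
    d.[Date],
    c.Category2,
    COALESCE(t.Amount, 0) AS Amount      -- 0 where no source row exists
FROM DimDate AS d
CROSS JOIN (SELECT DISTINCT Category2 FROM the_table) AS c   -- each category once
LEFT JOIN the_table AS t
    ON  t.[Date]    = d.[Date]
    AND t.Category2 = c.Category2
ORDER BY c.Category2, d.[Date];
```

With the sample data above this should return eight rows (four dates for each of the two categories), with the original amounts kept and 0 everywhere else.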

This is what I've ended up with: a combination of the marked answer and the CTE from Aaron's post:

;WITH 
    Dates_cte ([Date]) AS
            (
            SELECT [Date] = DayMarker 
            FROM WHData.dbo.vw_DimDate x
            WHERE
                    x.DayMarker >= (SELECT MIN([Date]) FROM #Data1 WHERE Period = 'Daily') AND
                    x.DayMarker <= GETDATE()
            )   
    ,Categories ([Operator], [Market], [Product], [Measure]) AS 
                ( 
                SELECT DISTINCT 
                        [Operator]
                        , [Market]
                        , [Product]
                        , [Measure] 
                FROM #Data1 
                WHERE [Period] = 'Daily'
                ) 
INSERT INTO #Data1 
    SELECT 
         c.[Operator]
        , c.[Market]
        , c.[Product]
        , [Period] = CONVERT(VARCHAR(100), 'Daily')
        , d.[Date]  
        , c.[Measure]   
        , 0 
    FROM Dates_cte d CROSS JOIN Categories c
    WHERE NOT EXISTS 
            ( 
            SELECT * 
            FROM #Data1 AS T 
            WHERE 
                    t.[Period] = 'Daily' AND
                    t.[Operator] = c.[Operator] AND 
                    t.[Market] = c.[Market] AND 
                    t.[Product] = c.[Product] AND 
                    t.[Measure] = c.[Measure] AND 
                    t.[Date] = d.[Date] 
            ) 

4 Solutions

#1


3  

Use INSERT INTO ... SELECT FROM DimDate CROSS JOIN categories WHERE NOT EXISTS ....

Try this:

INSERT INTO the_table
([Date], Category2, Amount)
SELECT [Date], category2, 0
FROM DimDate
CROSS JOIN
(
    SELECT DISTINCT category2 FROM the_table
) AS categories
WHERE NOT EXISTS
(
    SELECT *
    FROM the_table AS T
    WHERE T.category2 = categories.Category2
    AND T.[Date] = DimDate.[Date]
)

See it working online: ideone

If you're creating a data warehouse, I'd advise you to put the categories into a dimension table.
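
As a rough sketch of that suggestion: the table name DimCategory below is hypothetical (it does not appear in the question), and the column simply mirrors Category2 from the_table. Once the dimension exists, the gap-filling insert can take its categories from it rather than from a DISTINCT subquery:

```sql
-- Hypothetical dimension table; the name DimCategory is an assumption.
CREATE TABLE DimCategory
(
  Category2 varchar(10) NOT NULL PRIMARY KEY
);

-- Seed it once from the existing data.
INSERT INTO DimCategory (Category2)
SELECT DISTINCT Category2
FROM the_table
WHERE Category2 IS NOT NULL;

-- Fill the gaps, reading the category list from the dimension table.
INSERT INTO the_table ([Date], Category2, Amount)
SELECT d.[Date], c.Category2, 0
FROM DimDate AS d
CROSS JOIN DimCategory AS c
WHERE NOT EXISTS
(
    SELECT *
    FROM the_table AS t
    WHERE t.Category2 = c.Category2
      AND t.[Date] = d.[Date]
);
```

One consequence of this design is that a category can get zero-filled rows even before it has any real data, as long as it is listed in the dimension.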

如果您正在创建一个数据仓库,我建议您将类别放到维度表中。

#2


1  

Obviously bad pseudocode that shows a possible solution

insert into table1
    select * from table2
        where not exists (select * from table1 where table1.date = table2.date)

That assumes you are trying to add the data into table1.

If you just want it in memory,

select * from table1
union
select * from table2 where not exists (select * from table1 where table1.date = table2.date)

Or just use an outer join.

#3


1  

;WITH cat AS (SELECT Category2 FROM the_table GROUP BY Category2)
INSERT the_table([Date], Category2, Amount)
SELECT d.[Date], cat.Category2, 0
FROM DimDate AS d CROSS JOIN cat
LEFT OUTER JOIN the_table AS t
ON d.[Date] = t.[Date]
AND cat.Category2 = t.Category2
WHERE t.[Date] IS NULL;
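
To check the result after running an insert like the one above, a quick sanity query can help. This is only a sketch, and it assumes the_table holds at most one row per date and category and that every date in the_table also exists in DimDate:

```sql
-- Lists any category whose row count differs from the number of dates in DimDate.
SELECT t.Category2, COUNT(*) AS row_count
FROM the_table AS t
GROUP BY t.Category2
HAVING COUNT(*) <> (SELECT COUNT(*) FROM DimDate);
```

Under those assumptions, an empty result means every category now has a row for every date.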

#4


0  

Step 1, insert the missing dates:

insert into the_table ([Date], Category2, Amount)
select [Date], '', 0 from DimDate
where [Date] not in (select [Date] from the_table)

Step 2, update the Category2 column:

-- Carry forward the category of the most recent earlier row that has one
update t
set Category2 =
     (select top 1 prev.Category2
      from the_table prev
      where prev.Category2 <> '' and prev.[Date] < t.[Date]
      order by prev.[Date] desc)
from the_table t
where t.Category2 = ''
