OLAP 存储引擎中存在错误: 处理时找到重复的属性键

时间:2021-12-10 18:03:41

该问题出现是因为在构建属性层次关系时,属性值出现重复导致。构建层次关系时,如果选取了一个跟主键有一一对应关系确不是主键的属性构建层次关系时,往往会因为没有设置正确引发该问题。解决这个问题的办法见:

Period Dimension (Time Dimension) Creation with Year, Month, Day Hierarchy

url:http://programmingsolution.net/post/Period-Dimension-Time-Dimension-Creation-with-Year-Month-Day-Hierarchy.aspx


----

Suppose we have a Period table which contains some of the typical period columns, like Period_Key, Day, Month, Year.

In the table, we have typical Period rows, like 

[IMAGE]

Now we want to create a Period Dimension from it.

[IMAGE]

We can typically create a datasource and data source view in the Analysis service project. I am not going in detail of it. Now we can create a dimension from this Period table by using the dimension wizard.

[IMAGE]

This will start the dimension wizard. We can select to create the dimension from an existing table, which is our Period table.

[IMAGE]

As the table for the Dimension, we will choose the Period table. For the Key column of this dimension, we choose PeriodKey. For display name also (Name Column), we choose PeriodKey.

[IMAGE]

For the Dimension attributes we choose all the four available attributes. We will surely configure them later. For now, we keep them as regular type

[IMAGE]

Then we are about to finish creating our Period Dimension with the following step.

[IMAGE]

When we are done creating our dimension, we will see this window.

[IMAGE]

Now we want to go into our own customized that is now provided by the wizard. We will create hierarchies and additionally resolve some of the issues we get creating the hierarchies.

At this stage, we can process the dimension and it will be successfully processed. But we want to create the hierarchies. We want a natural period hierarchy which will be like

Year --> Month --> Day

We can drag the Year attribute in the Hierarchies panel (middle panel). Then we can drag the month attribute and keep it below year and similarly drag Day attribute and put it below Month in the hierarchy. The final hierarchy should look like this.

[IMAGE]

Here comes one interesting thing. We see a warning in the Hierarchy.

[IMAGE]

Warning:  Attribute relationships do not exist between one or more levels of this hierarchy.  This may result in decreased query performance.

Now what is attribute relationship. At the right side of the Dimension structure, we see a tab named Attribute relationship.

[IMAGE]

If we click on that we will see the default relationship between the attributes that is set by the wizard.

[IMAGE]

This means how the data will be gathered for all the attributes. PeriodKey is the key of this dimension. So the cube will get the data for PeriodKey attribute from the underlying table. Then if anyone wants to see the data at year level, it will aggregate the data for the PeriodKey and will give the value. Same thing will happen for Month and Day. But we know, to get the Year level value, we done need to aggregate all the  data at PeriodKey level. We can easily aggregate the data at Month level and this should be about 30 times less that the Day level data. So this will increase the query performance. If we need to aggregate the Day level data for Year, this will not be good for the performance. That’s the thing the warning message is mentioning.


So we can set the relation like below. Click on Month and drag it to Year. Again Click on Day, drag it to Month.

Now if we go to the Dimension structure tab where we first were, we will see the warning gone.

[IMAGE]

Remember: If you don’t set the attribute relationship, then also you will get the correct value, but it may be slow based on your data volume. Sometimes in other dimension, we create the Hierarchy, but we really don’t want to set the attribute relation according to the levels. We want all the levels getting data from attribute key. But that is a different scenario.

 [IMAGE]

Now we can process the Dimension. But here comes another big surprise. The processing fails.

[IMAGE]

It gives a duplicate key error. Why?


Now we should look into another thing. The properties of each attribute of the dimension.

Click on the Year attribute and look at the properties section.

[IMAGE]

See the KeyColumns and NameColumn properties.

KeyColumn is used to get the members of the attribute from the underlying table. It issues a distinct query on the table for the key columns. The returned rows will be the member of this attribute. The name column is used for these members for the display value.

For YEAR, we have Period.Year as key columns. So it will give us the year members, that are typically … ,2008, 2009,2010 , …

This is ok for us.

For MONTH we can will see the following:

[IMAGE]

Period.Month is used as the Key Columns. This looks ok, but it is not. If we take distinct Month from Period table, we will get 1,2,3,4,……..,29,30,31. But see that Month is below Year in the Hierarchy. As soon as we create a hierarchy, it will relate the Year members with the Month members as a tree structure, where year members will be parent of month members. One important thing is: one parent member can have multiple children. But one child member cannot have multiple parents. But in our case, Month “1” has more than one parents which is all the years. Same for all other months. So we need to have multiple “1” member at month level, each one for each year. Essentially we want to have a structure like

[IMAGE]

So if issue a distinct query on Period table like this:


Select distinct Year, Month


From Period


Then we will get like:


2008
 1
 
2008
 2
 
2008
 3
 
2009
 1
 
2009
 2
 
2009
 3
 
2010
 1
 
2010
 2
 
2010
 3
 
Thus we will get all the months members replicated for each year. In this way we can eliminate the “one child multiple parent ” problem.


Now if we change the KeyColumns for the Month attribute as below, then we can get this implemented.


Click on the button to open the KeyColumns setting window.

[IMAGE]

We will see that only Month is in the right panel.

[IMAGE]

We will add the year column also.

[IMAGE]

After clicking  OK, we will see an error message in the Month attribute.

The NameColumn should be defined because there are multiple KeyColumns.

[IMAGE]

Quite Natural, isn’t it?

Now below the KeyColumns,  we can set the NameColumn. We will set the NameColumn as Month. So, we want multiple members to be generated for the same month, but that should be displayed as same, isn’t it?

[IMAGE]

Now we have our KeyColumns as a Collection of Year and Month, and NameColumn as Month.

[IMAGE]

This same thing we should do for the Day attribute. Use Year, Month and Day for the KeyColumns, and Day for the Name Column.

 [IMAGE] 

Now we have already removed the duplicate key error, which in fact occurs mostly for single child, multiple parent case.


Process the dimension and you can see a nice Period Hierarchy in the Dimension browser.