What is the best way to implement Polymorphic Association in SQL Server?

Time: 2022-09-15 21:13:05

I have tons of instances where I need to implement some sort of Polymorphic Association in my database. I always waste tons of time thinking through all the options all over again. Here are the 3 I can think of. I'm hoping there is a best practice for SQL Server.

Here is the multiple column approach

[Image: diagram of the multiple column approach]

Here is the no foreign key approach

[Image: diagram of the no foreign key approach]

And here is the base table approach

[Image: diagram of the base table approach]

8 Answers

#1


1  

The two most common approaches are Table Per Class (i.e. a table for the base class and another table for each subclass that contains the additional columns necessary to describe the subclass) and Table Per Hierarchy (i.e. all columns in one table, with one or more columns to allow for the discrimination of subclasses). Which is the better approach really depends on the particulars of your application and data access strategy.

You would have Table Per Class in your first example by reversing the direction of the FK and removing the extra ids from the parent. The other two are essentially variants of table per class.
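As a rough illustration of the reversed-FK layout described above, here is a minimal sketch. It uses Python's built-in sqlite3 module purely so that it runs anywhere; SQLite stands in for SQL Server, and the column names (Name, Colour, Weight) are made up for the example. Each subclass table reuses the parent's key as its own primary key, so the parent carries no child ids.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite only enforces FKs when asked to

conn.executescript("""
CREATE TABLE Something (
    SomethingId INTEGER PRIMARY KEY,
    Name        TEXT NOT NULL              -- hypothetical shared column
);
-- The FK points from child to parent, and the child's PK is that same key,
-- which also makes the parent/child association one-to-one.
CREATE TABLE Object1 (
    SomethingId INTEGER PRIMARY KEY REFERENCES Something(SomethingId),
    Colour      TEXT NOT NULL              -- hypothetical subclass column
);
CREATE TABLE Object2 (
    SomethingId INTEGER PRIMARY KEY REFERENCES Something(SomethingId),
    Weight      REAL NOT NULL              -- hypothetical subclass column
);
""")

conn.execute("INSERT INTO Something VALUES (1, 'first')")
conn.execute("INSERT INTO Object1 VALUES (1, 'red')")


def try_insert_orphan():
    """Attempt to add a subclass row whose parent row does not exist."""
    try:
        conn.execute("INSERT INTO Object2 VALUES (99, 1.5)")
        return "accepted"
    except sqlite3.IntegrityError:
        return "rejected"


orphan_result = try_insert_orphan()
rows = conn.execute(
    "SELECT s.Name, o.Colour FROM Something s "
    "JOIN Object1 o ON o.SomethingId = s.SomethingId"
).fetchall()
```

Note that nothing in this sketch stops the same Something row from appearing in both Object1 and Object2; that would need an extra discriminator column or application logic.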

#2


1  

Another common name for this model is the Supertype Model, where one has a base set of attributes that can be expanded by joining to another entity. In Oracle books, it is taught both as a logical model and as a physical implementation. The model without the relations would allow data to grow into an invalid state and leave orphan records, so I would strongly validate the needs before selecting that model. The top model, with the relation stored in the base object, would cause nulls, and in a case where the fields were mutually exclusive you would always have a null. The bottom diagram, where the key is enforced in the child object, would eliminate the nulls but also make the dependency a soft dependency and allow orphans if cascading was not enforced. I think assessing those traits will help you select the model that fits best. I have used all three in the past.
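To make the "nulls in the base object" trade-off concrete, here is a small sketch of the multiple column layout with a CHECK constraint that allows exactly one of the nullable foreign keys to be set. It is an assumption-laden example: SQLite (via Python's sqlite3) stands in for SQL Server, and the table, column and constraint names are invented. The CASE-based check is plain SQL, so the same idea should carry over to T-SQL.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite only enforces FKs when asked to

conn.executescript("""
CREATE TABLE Object1 (Object1Id INTEGER PRIMARY KEY);
CREATE TABLE Object2 (Object2Id INTEGER PRIMARY KEY);
CREATE TABLE Object3 (Object3Id INTEGER PRIMARY KEY);

-- One nullable FK column per possible target; two of the three are always NULL.
CREATE TABLE Something (
    SomethingId INTEGER PRIMARY KEY,
    Object1Id   INTEGER REFERENCES Object1(Object1Id),
    Object2Id   INTEGER REFERENCES Object2(Object2Id),
    Object3Id   INTEGER REFERENCES Object3(Object3Id),
    CONSTRAINT CK_Something_ExactlyOneTarget CHECK (
        (CASE WHEN Object1Id IS NULL THEN 0 ELSE 1 END) +
        (CASE WHEN Object2Id IS NULL THEN 0 ELSE 1 END) +
        (CASE WHEN Object3Id IS NULL THEN 0 ELSE 1 END) = 1
    )
);

INSERT INTO Object1 VALUES (1);
INSERT INTO Object2 VALUES (1);
""")


def accepts(sql):
    """Return True if the statement succeeds, False if a constraint rejects it."""
    try:
        conn.execute(sql)
        return True
    except sqlite3.IntegrityError:
        return False


one_target = accepts(
    "INSERT INTO Something (SomethingId, Object1Id) VALUES (1, 1)")
two_targets = accepts(
    "INSERT INTO Something (SomethingId, Object1Id, Object2Id) VALUES (2, 1, 1)")
no_target = accepts(
    "INSERT INTO Something (SomethingId) VALUES (3)")
```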

#3


0  

In my opinion, your first approach is the best way to define both the data and your classes, but all of your primary data should be available to the child.

So check your requirements and define the database accordingly.

#4


0  

I have used what I guess you would call the base table approach. For example, I had tables for names, addresses, and phone numbers, each with an identity as PK. Then I had a main entity table, entity(entityID), and a linking table, attribute(entityKey, attributeType, attributeKey), wherein the attributeKey could point to any of the first three tables, depending on the attributeType.

Some advantages: allows as many names, addresses, and phone numbers per entity as we like; easy to add new attribute types; extreme normalization; easy to mine common attributes (e.g. identify duplicate people); some other business-specific security advantages.

Disadvantages: the quite complex queries needed to build simple result sets made it difficult to manage (e.g. I had trouble hiring people with good enough T-SQL chops); performance is optimal for VERY specific use cases rather than general ones; query optimization can be tricky.

Having lived with this structure for several years out of a much longer career, I would hesitate to use it again unless I had the same weird business logic constraints and access patterns. For general usage, I strongly recommend that your typed tables directly reference your entities. That is, Entity(entityID), Name(NameID, EntityID, Name), Phone(PhoneID, EntityID, Phone), Email(EmailID, EntityID, Email). You will have some data repetition and some common columns, but it will be much easier to program to and optimize.
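The recommended layout, with each typed table referencing the entity directly, could look roughly like the sketch below. The table and column names come from the answer; the data types and sample rows are assumptions, and SQLite (through Python's sqlite3) is used only so the example is runnable without a SQL Server instance.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite only enforces FKs when asked to

conn.executescript("""
CREATE TABLE Entity (EntityID INTEGER PRIMARY KEY);

-- Every typed table carries its own EntityID FK, so no linking table is needed.
CREATE TABLE Name (
    NameID   INTEGER PRIMARY KEY,
    EntityID INTEGER NOT NULL REFERENCES Entity(EntityID),
    Name     TEXT NOT NULL
);
CREATE TABLE Phone (
    PhoneID  INTEGER PRIMARY KEY,
    EntityID INTEGER NOT NULL REFERENCES Entity(EntityID),
    Phone    TEXT NOT NULL
);
CREATE TABLE Email (
    EmailID  INTEGER PRIMARY KEY,
    EntityID INTEGER NOT NULL REFERENCES Entity(EntityID),
    Email    TEXT NOT NULL
);

-- Made-up sample data: one entity with one name and two phone numbers.
INSERT INTO Entity VALUES (1);
INSERT INTO Name  VALUES (1, 1, 'Ada');
INSERT INTO Phone VALUES (1, 1, '555-0100');
INSERT INTO Phone VALUES (2, 1, '555-0101');
""")

# A simple result set now needs only one plain join per attribute type.
phones = [
    row[0]
    for row in conn.execute(
        "SELECT p.Phone FROM Entity e "
        "JOIN Phone p ON p.EntityID = e.EntityID "
        "WHERE e.EntityID = ? ORDER BY p.PhoneID",
        (1,),
    )
]
```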

#5


0  

Approach 1 is best, but the association between Something and object1, object2 and object3 should be one-to-one.

I mean the FK in each child table (object1, object2, object3) should be a non-null unique key, or the primary key of the child table.

object1, object2 and object3 can then hold the polymorphic object values.

#6


0  

There is no single or universal best practice to achieve this. It all depends on the type of access the applications will need.

My advice would be to draw up an overview of the expected type of access to these tables:

  1. Will you use an OR layer, stored procedures or dynamic SQL?
  2. What numbers of records do you expect?
  3. What is the level of difference between the different subclasses? How many columns?
  4. Will you be doing aggregations or other complex reporting?
  5. Will you have a data warehouse for reporting or not?
  6. Will you often need to process records of different subclasses in one batch? ...

Based on the answers to these questions, we could work out an appropriate solution.

One additional possibility to store properties specific to subclasses is to use a table with Name/value pairs. This approach can be particularly useful if there is a large number of different subclasses or when the specific fields in the subclasses are used infrequently.
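A name/value pair table might be sketched as follows. Everything here is illustrative: the SomethingProperty table, its columns and the sample properties are invented, and SQLite via Python's sqlite3 stands in for SQL Server. Values are stored as text, which is the usual price of this design: typing and validation move into the application.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite only enforces FKs when asked to

conn.executescript("""
CREATE TABLE Something (
    SomethingId INTEGER PRIMARY KEY,
    ObjectType  TEXT NOT NULL              -- hypothetical discriminator
);
-- One row per (object, property); new subclass fields need no schema change.
CREATE TABLE SomethingProperty (
    SomethingId INTEGER NOT NULL REFERENCES Something(SomethingId),
    Name        TEXT NOT NULL,
    Value       TEXT,
    PRIMARY KEY (SomethingId, Name)
);

INSERT INTO Something VALUES (1, 'object1');
INSERT INTO SomethingProperty VALUES (1, 'colour', 'red');
INSERT INTO SomethingProperty VALUES (1, 'weight', '2.5');
""")


def load_properties(something_id):
    """Collect the subclass-specific properties of one row into a dict."""
    cursor = conn.execute(
        "SELECT Name, Value FROM SomethingProperty WHERE SomethingId = ?",
        (something_id,),
    )
    return dict(cursor.fetchall())


props = load_properties(1)
```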

#7


0  

I have used the first approach. Under extreme loads the "Something" table becomes a bottleneck.

I took the approach of having template DDL for my different objects with the attribute specializations being appended to the end of the table definition.

At the DB level, if I genuinely needed to represent my different classes as a "Something" recordset, then I put a view over the top of them:

SELECT "Something" fields FROM object1
UNION ALL
SELECT "Something" fields FROM object2
UNION ALL
SELECT "Something" fields FROM object3
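A runnable version of that view, under some assumptions, might look like this. The shared columns (Id, Name), the subclass columns and the ObjectType discriminator are invented for the example, and SQLite via Python's sqlite3 is used in place of SQL Server; the CREATE VIEW ... UNION ALL statement itself is standard SQL.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

conn.executescript("""
-- Three independent tables that repeat the common "Something" columns
-- and then append their own specialised attributes.
CREATE TABLE object1 (Id INTEGER PRIMARY KEY, Name TEXT NOT NULL, Colour TEXT);
CREATE TABLE object2 (Id INTEGER PRIMARY KEY, Name TEXT NOT NULL, Weight REAL);
CREATE TABLE object3 (Id INTEGER PRIMARY KEY, Name TEXT NOT NULL, Notes  TEXT);

-- The view exposes only the shared columns, plus a literal that records
-- which table each row came from.
CREATE VIEW Something AS
    SELECT Id, Name, 'object1' AS ObjectType FROM object1
    UNION ALL
    SELECT Id, Name, 'object2' AS ObjectType FROM object2
    UNION ALL
    SELECT Id, Name, 'object3' AS ObjectType FROM object3;

INSERT INTO object1 VALUES (10, 'a', 'red');
INSERT INTO object2 VALUES (20, 'b', 2.5);
INSERT INTO object3 VALUES (30, 'c', NULL);
""")

rows = conn.execute(
    "SELECT Id, Name, ObjectType FROM Something ORDER BY Id"
).fetchall()
```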

The challenge is how to assign a non-clashing primary key, given that you have three independent objects. Typically people use a UUID/GUID; however, in my case the key was a 64-bit integer generated in an application based on a time and machine in order to avoid clashes.
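The answer does not say how the 64-bit key was laid out, so the following is only one possible scheme, not the author's: a millisecond timestamp in the high bits, then a machine id, then a per-millisecond sequence counter. The bit widths, the custom epoch and the IdGenerator class are all assumptions made for illustration.

```python
import threading
import time


class IdGenerator:
    """Hypothetical time-plus-machine key generator (layout is an assumption)."""

    EPOCH_MS = 1_600_000_000_000   # arbitrary custom epoch, in milliseconds
    MACHINE_BITS = 10              # up to 1024 machines
    SEQUENCE_BITS = 12             # up to 4096 ids per millisecond per machine

    def __init__(self, machine_id):
        if not 0 <= machine_id < (1 << self.MACHINE_BITS):
            raise ValueError("machine_id out of range")
        self._machine_id = machine_id
        self._lock = threading.Lock()
        self._last_ms = -1
        self._sequence = 0

    def next_id(self):
        with self._lock:
            # Never move backwards, even if the system clock does.
            now_ms = max(int(time.time() * 1000), self._last_ms)
            if now_ms == self._last_ms:
                self._sequence = (self._sequence + 1) & ((1 << self.SEQUENCE_BITS) - 1)
                if self._sequence == 0:
                    # Sequence exhausted for this millisecond: borrow the next one.
                    now_ms += 1
            else:
                self._sequence = 0
            self._last_ms = now_ms
            return (
                ((now_ms - self.EPOCH_MS) << (self.MACHINE_BITS + self.SEQUENCE_BITS))
                | (self._machine_id << self.SEQUENCE_BITS)
                | self._sequence
            )


generator = IdGenerator(machine_id=3)
ids = [generator.next_id() for _ in range(10_000)]
```

Because the machine id is part of every key, two application servers with different machine ids cannot produce the same value, and the keys from one generator are strictly increasing.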

If you take this approach then you avoid the problem of the "Something" object causing locking/blocking.

If you want to alter the "Something" object then this can be awkward, because you now have three independent objects, all of which will require their structure to be altered.

So, to summarise: Option One will work fine in most cases; however, under seriously heavy load you may observe locking/blocking that necessitates splitting out the design.

#8


0  

Approach 1, with multiple foreign key columns, is the best one, because that way you can have pre-defined connections with other tables, and that makes it much easier for scripts to select, insert and update data.
