Database design: table design for modeling a hierarchy

Date: 2022-12-16 16:54:05

I am designing a laboratory information system (LIS) and am unsure how to design the tables for the different laboratory tests. How should I deal with a table that has an attribute with multiple values, where each of those values can itself have multiple values?

Here's some of the data in my LIS design...

    HEMATOLOGY  <-------- Lab group
    **************************************************************
     CBC        <-------- Sub group 1
       RBC      <-------- Component
       WBC
       Hemoglobin
       Hematocrit
       MCV
       MCH
       MCHC
       Platelet count
     Hemoglobin
     Hematocrit
     WBC differential
       Neutrophils
       Lymphocytes
       Monocytes
       Eosinophils
       Basophils
     Platelet count
     Reticulocyte count
     ESR
     Bleeding time
     Clotting time
     Pro-time
     Peripheral smear
     Malarial smear
     ABO
     RH typing

    CLINICAL MICROSCOPY       <-------- Lab Group
    **************************************************************
     Routine urinalysis       <-------- Sub group 1
       Visual Examination     <-------- Sub group 2
         Color                <-------- Component
         Turbidity
         Specific Gravity       
       Chemical Examination
         pH
         protein
         glucose
         ketones
         RBC
         Hbg
         bilirubin
         specific gravity
         nitrite for bacteria
         urobilinogen
         leukocyte esterase 
       Microscopic Examination
         Red Blood Cells (RBCs)
         White Blood Cells (WBCs)
         Epithelial Cells 
         Microorganisms (bacteria, trichomonads, yeast) 
         Trichomonads 
         Casts 
         Crystals
     Occult Blood
     Pregnancy Test 

...This hierarchy of data also gets repeated in other lab groupings in my design (e.g. Blood chemistry, Serology, etc)...

Another question is, how am I gonna deal with a component (for example, RBC) which can be a member of one or more lab groups?

I already implemented a solution by making separate tables: one for lab group, one for sub group 1, one for sub group 2, and one for component. I then created another table to consolidate them by placing a foreign key to each in that table. The only trade-off is that some of the rows in this table may have null values. I'm not satisfied with my design, so I'm hoping someone can give me advice on how to make it right; any help would be greatly appreciated.

2 answers

#1


2  

Here are a couple options:

If it is just the hierarchy above you are modeling, and there is no other data involved, then you can do it in two tables:

[Diagram: two-table schema (image not preserved)]

One problem with this is that nothing enforces that, for example, a sub_group must be a child of a lab_group, or that a component must be a child of either a sub_group_1 or a sub_group_2, but you could enforce these requirements in your application tier instead.
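
Since the diagram itself is missing, here is a minimal sketch of what a two-table version might look like, using Python's built-in sqlite3 module. The table and column names (`node_type`, `node`, `type_id`, `parent_id`) and the `descendants` helper are my own assumptions, not the layout from the original diagram. It also shows the weakness described above: nothing in the schema stops a component from being attached directly to a lab group.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
-- Hypothetical names: one table for the kind of node, one for the tree itself.
CREATE TABLE node_type (
    id   INTEGER PRIMARY KEY,
    name TEXT NOT NULL UNIQUE   -- lab_group, sub_group_1, sub_group_2, component
);
CREATE TABLE node (
    id        INTEGER PRIMARY KEY,
    name      TEXT NOT NULL,
    type_id   INTEGER NOT NULL REFERENCES node_type(id),
    parent_id INTEGER REFERENCES node(id)   -- NULL for a top-level lab group
);
""")
conn.executemany(
    "INSERT INTO node_type (id, name) VALUES (?, ?)",
    [(1, "lab_group"), (2, "sub_group_1"), (3, "sub_group_2"), (4, "component")],
)
conn.executemany(
    "INSERT INTO node (id, name, type_id, parent_id) VALUES (?, ?, ?, ?)",
    [
        (1, "CLINICAL MICROSCOPY", 1, None),
        (2, "Routine urinalysis", 2, 1),
        (3, "Visual Examination", 3, 2),
        (4, "Color", 4, 3),
        (5, "Turbidity", 4, 3),
    ],
)

def descendants(conn, root_id):
    """Return (depth, name) for every node below root_id, using a recursive CTE."""
    return conn.execute(
        """
        WITH RECURSIVE tree(id, name, depth) AS (
            SELECT id, name, 0 FROM node WHERE id = ?
            UNION ALL
            SELECT n.id, n.name, t.depth + 1
            FROM node n JOIN tree t ON n.parent_id = t.id
        )
        SELECT depth, name FROM tree WHERE depth > 0 ORDER BY depth, name
        """,
        (root_id,),
    ).fetchall()

rows = descendants(conn, 1)

# Nothing at the data level rejects a component placed directly under a lab group:
conn.execute("INSERT INTO node (name, type_id, parent_id) VALUES ('pH', 4, 1)")
```

Note that with a single `parent_id` column, a component such as RBC that belongs to more than one group would need one row per parent in this sketch.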

The plus side of this approach is that the schema is nice and simple. Even if the entities have more data associated with them, it might still be worth modeling the hierarchy like this and have some separate tables for the entities themselves.

If you want to enforce the correct relationships at the data level, then you are going to have to split it out into separate tables. Maybe something like this:

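[Diagram: schema with separate tables for lab_group, sub_group_1, sub_group_2 and component, plus a link table (image not preserved)]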

This assumes that each sub_group_1 is only related to a single lab_group. If this is not the case then add a link table between lab_group and sub_group_1. Likewise for the sub_group_1 -> sub_group_2 relationship.
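
As a rough illustration of what "enforcing at the data level" buys you, here is a sketch of the group tables with foreign keys, again using Python's sqlite3. The column names (`lab_group_id`, `sub_group_1_id`) and the `try_insert_orphan` helper are assumptions on my part, since the diagram is not available. Each sub_group_1 points at exactly one lab_group, matching the assumption stated above.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite does not check foreign keys by default
conn.executescript("""
CREATE TABLE lab_group (
    id   INTEGER PRIMARY KEY,
    name TEXT NOT NULL UNIQUE
);
CREATE TABLE sub_group_1 (
    id           INTEGER PRIMARY KEY,
    name         TEXT NOT NULL,
    lab_group_id INTEGER NOT NULL REFERENCES lab_group(id)
);
CREATE TABLE sub_group_2 (
    id             INTEGER PRIMARY KEY,
    name           TEXT NOT NULL,
    sub_group_1_id INTEGER NOT NULL REFERENCES sub_group_1(id)
);
""")
conn.execute("INSERT INTO lab_group (id, name) VALUES (1, 'HEMATOLOGY')")
conn.execute("INSERT INTO sub_group_1 (id, name, lab_group_id) VALUES (1, 'CBC', 1)")

def try_insert_orphan(conn):
    """Try to add a sub_group_1 under a lab_group that does not exist."""
    try:
        conn.execute(
            "INSERT INTO sub_group_1 (name, lab_group_id) VALUES ('Orphan', 999)"
        )
        return True
    except sqlite3.IntegrityError:
        # The database itself rejects the row; no application-tier check needed.
        return False

orphan_accepted = try_insert_orphan(conn)
```

If a sub_group_1 could belong to several lab groups, the `lab_group_id` column would be replaced by a link table, as described above.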

There is a single link table between component and sub_group_1 and sub_group_2. This allows a single component to be related to several sub_group_1 and sub_group_2 entities. The fact that it is a single table means that a lot of the sub_group_1_id and sub_group_2_id values will be null (like you mentioned in your question). You could prevent the nulls by having two separate link tables:

  • sub_group_1_component with a foreign key to sub_group_1 and a foreign key to component
  • sub_group_2_component with a foreign key to sub_group_2 and a foreign key to component

The reason I didn't put this in the diagram is that for me, having to query two tables rather than one to get all the component -> sub_group relationships is too much of a pain. For the sake of a little denormalisation (allowing a few nulls) it is much easier to query a single table. If you find yourself allowing a lot of nulls (like a single link table for the relationships between all the entities here) then that is probably denormalising too much.
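
To make the single-link-table option concrete, here is a small sketch in Python with sqlite3. The link table name `component_sub_group`, the CHECK constraint (each row points at exactly one of the two parents), and the `parents_of` helper are my own assumptions rather than part of the diagram. RBC is linked both to CBC (a sub_group_1) and to Chemical Examination (a sub_group_2), as in the question's data, and one query over the single link table returns both parents.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")
conn.executescript("""
CREATE TABLE sub_group_1 (id INTEGER PRIMARY KEY, name TEXT NOT NULL);
CREATE TABLE sub_group_2 (id INTEGER PRIMARY KEY, name TEXT NOT NULL);
CREATE TABLE component   (id INTEGER PRIMARY KEY, name TEXT NOT NULL UNIQUE);

-- Hypothetical single link table: one of the two parent columns is NULL per row.
CREATE TABLE component_sub_group (
    component_id   INTEGER NOT NULL REFERENCES component(id),
    sub_group_1_id INTEGER REFERENCES sub_group_1(id),
    sub_group_2_id INTEGER REFERENCES sub_group_2(id),
    CHECK ((sub_group_1_id IS NULL) <> (sub_group_2_id IS NULL))
);

INSERT INTO sub_group_1 VALUES (1, 'CBC');
INSERT INTO sub_group_2 VALUES (1, 'Chemical Examination');
INSERT INTO component VALUES (1, 'RBC'), (2, 'pH');
INSERT INTO component_sub_group VALUES (1, 1, NULL), (1, NULL, 1), (2, NULL, 1);
""")

def parents_of(conn, component_name):
    """List every sub-group (level 1 or 2) that a component belongs to."""
    return conn.execute(
        """
        SELECT COALESCE(s1.name, s2.name) AS parent
        FROM component c
        JOIN component_sub_group l ON l.component_id = c.id
        LEFT JOIN sub_group_1 s1 ON s1.id = l.sub_group_1_id
        LEFT JOIN sub_group_2 s2 ON s2.id = l.sub_group_2_id
        WHERE c.name = ?
        ORDER BY parent
        """,
        (component_name,),
    ).fetchall()

rbc_parents = parents_of(conn, "RBC")
```

With two separate link tables the same lookup would need a UNION over both, which is the extra query work mentioned above.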

#2


1  

Personally, I would create 3 tables using relationships for the values. It gives you the ability to create limitless arrays of values. Just try to make sure you give great column names, or your head will spin for days. :)

Also, null values aren't a problem; look into the different types of joins.
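
A quick sketch of what that means in practice, using Python's sqlite3. The `test_item` table and its columns are hypothetical, made up only to show a nullable foreign key: an inner join silently drops rows whose foreign key is NULL, while a LEFT JOIN keeps them and fills the missing side with NULL.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE sub_group_2 (id INTEGER PRIMARY KEY, name TEXT NOT NULL);
-- Hypothetical table: sub_group_2_id is NULL when a test has no second-level group.
CREATE TABLE test_item (
    id             INTEGER PRIMARY KEY,
    name           TEXT NOT NULL,
    sub_group_2_id INTEGER REFERENCES sub_group_2(id)
);
INSERT INTO sub_group_2 VALUES (1, 'Visual Examination');
INSERT INTO test_item VALUES (1, 'Color', 1), (2, 'Occult Blood', NULL);
""")

# Inner join: only rows with a matching sub_group_2 survive.
inner_rows = conn.execute("""
    SELECT t.name, g.name
    FROM test_item t
    JOIN sub_group_2 g ON g.id = t.sub_group_2_id
    ORDER BY t.id
""").fetchall()

# Left join: every test_item row is kept; the group name is NULL where absent.
left_rows = conn.execute("""
    SELECT t.name, g.name
    FROM test_item t
    LEFT JOIN sub_group_2 g ON g.id = t.sub_group_2_id
    ORDER BY t.id
""").fetchall()
```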
