Alternatives to the EAV Model and Hybrid Strategies, Compared with Simplifying and Improving as You Build

Time: 2021-06-05 12:58:44

I've been doing a ton of research on database design for an upcoming project.

This is the classic inner-platform problem: our client basically wants infinite customization, with the ability to create forms and attributes on an entity, collect their values from end users, and display the collected information on graphs.

It will be used by clinicians to help monitor patients. The reason EAV is even under consideration is that we'll need to collect different information for different trial runs. Sometimes it might be what the patient ate that day. Other times it might be blood sugar or blood pressure (which is really two numbers), or it might be multiple questions ("How is your pain today, from 1-10?"). The underlying issue is that we'll never really know in advance exactly what the end client will ask for, or what the accepted values will be.

We'll also be graphing this data consistently throughout the program, and generating larger reports on a less regular basis.

Ideally I'd like to be able to hard-code as much of this as possible: we are using SQL, and sticking to relational database best practices will simplify both the database design and the application design (both of which I'm writing).

We're doing a few trial runs, and my first inclination is to get as much information as possible from the clients, hard-code the tables in the database, and build from there. If we discover that we NEED an attribute table and an attribute_value table to collect those attributes (along with the fun-to-implement form-builder features such as dropdowns, and thus dropdown menu options and validation/required flags), we could add them for later launches.
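
For readers unfamiliar with the attribute / attribute_value pairing mentioned above, here is a minimal sketch of what it might look like. It uses Python's built-in sqlite3 module purely so it runs anywhere; the column names, the sample patient, and the `series` helper are assumptions made for illustration, not part of the original question.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE attribute (
    id   INTEGER PRIMARY KEY,
    name TEXT NOT NULL UNIQUE
);
CREATE TABLE attribute_value (
    patient_id   INTEGER NOT NULL,
    attribute_id INTEGER NOT NULL REFERENCES attribute(id),
    recorded_on  TEXT NOT NULL,  -- ISO date string (hypothetical column)
    value        TEXT NOT NULL   -- plain EAV stores every value as text
);
""")

# Blood pressure is "really two numbers", so it becomes two attributes.
conn.executemany(
    "INSERT INTO attribute (id, name) VALUES (?, ?)",
    [(1, "systolic_bp"), (2, "diastolic_bp"), (3, "pain_score")],
)
conn.executemany(
    "INSERT INTO attribute_value VALUES (?, ?, ?, ?)",
    [
        (42, 1, "2021-06-01", "120"),
        (42, 2, "2021-06-01", "80"),
        (42, 3, "2021-06-01", "4"),
        (42, 1, "2021-06-02", "118"),
        (42, 2, "2021-06-02", "79"),
    ],
)


def series(conn, patient_id, attribute_name):
    """Return (date, numeric value) pairs for one attribute, ready to graph."""
    rows = conn.execute(
        """
        SELECT v.recorded_on, CAST(v.value AS REAL)
        FROM attribute_value AS v
        JOIN attribute AS a ON a.id = v.attribute_id
        WHERE v.patient_id = ? AND a.name = ?
        ORDER BY v.recorded_on
        """,
        (patient_id, attribute_name),
    )
    return rows.fetchall()


systolic = series(conn, 42, "systolic_bp")
```

Note the cast in the query: because every value is stored as text, each graph or report has to know how to interpret it, which is one of the costs usually cited against plain EAV.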

I've gone through basically every relevant Stack Overflow post. Most say to avoid EAV, get a better understanding of the application's requirements, and, if at some point the customer TRULY needs an EAV implementation, go ahead and build it then.

  • Has anyone ever used a hybrid model? Can you discuss it?

  • Has anyone ever successfully implemented the EAV model, and can you discuss it?

  • Have you faced a similar decision and decided not to implement EAV for a project that seemed like it might have been a candidate? How did that turn out?

Here are some interesting reads I've found along the way:

  • http://decipherinfosys.wordpress.com/2007/01/29/name-value-pair-design/
  • Storing time-series data, relational or non?
  • Database EAV Pros/Cons and Alternatives
  • Alternatives to Entity-Attribute-Value (EAV)?

And the link that really gave me a ton of insight.

1 Solution

#1


After some thought, and considering the client's needs and requests, using an EAV model was the correct answer here.

After doing some more research I decided to use PostgreSQL and make full use of its hstore data type, which allows storing, searching, and indexing key-value pairs in a single field.
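
As a rough illustration of the hstore approach, the statements below sketch one way it could be set up. This assumes a PostgreSQL server where the hstore extension can be installed; the `task_entry` table, its columns, and the index name are hypothetical and not taken from the answer. The helper only needs an object with an `execute` method, such as a DB-API cursor.

```python
# Assumption: PostgreSQL with the hstore extension available.
# Table, column, and index names are made up for illustration.
HSTORE_STATEMENTS = [
    "CREATE EXTENSION IF NOT EXISTS hstore",
    """
    CREATE TABLE task_entry (
        id         serial PRIMARY KEY,
        patient_id integer NOT NULL,
        attrs      hstore NOT NULL
    )
    """,
    # A GIN index lets key-existence and containment lookups use the index.
    "CREATE INDEX task_entry_attrs_idx ON task_entry USING GIN (attrs)",
    "INSERT INTO task_entry (patient_id, attrs) "
    "VALUES (42, 'systolic=>120, diastolic=>80'::hstore)",
]

# `->` fetches the value for a key (as text); `?` tests whether a key exists.
HSTORE_QUERY = (
    "SELECT id, attrs -> 'systolic' AS systolic "
    "FROM task_entry WHERE attrs ? 'systolic'"
)


def apply_statements(cursor, statements=HSTORE_STATEMENTS):
    """Run each statement in order on any object that has .execute(sql)."""
    for statement in statements:
        cursor.execute(statement)
    return len(statements)
```

With this layout all attributes for one entry live in a single row, so the per-attribute join that an EAV table requires goes away. Values are still stored as text, so numeric comparisons need a cast.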

Here is a paper benchmarking hstore vs EAV: http://wiki.hsr.ch/Datenbanken/files/Benchmark_of_KVP_vs.hstore-_doc.pdf

In that benchmark, hstore came out way ahead of the EAV table.

Another option we considered was having a task table that covered all the bases:

id, name, value_1, value_2... note_1, notes_2

Obviously the thought of that killed me inside a bit, so the other candidate was a task_type attribute table:

A task is prescribed by an administrator to a user and has a task_type. The task_type_attributes apply to all tasks of that type (i.e., for an exercise task, we want to be able to store information about the intensity of the exercise, the time the exercise took, etc.).

Once the user brings up the task, they see the task_attributes as fields to fill out. They fill in these fields, and the attribute values they enter are then associated with the patient's task_entry (which also records whether they completed it, skipped it, etc.).

task_attributes

  • id
  • task_type_id
  • attribute
  • attribute_value_type (for generating the desired fields on the app side, i.e., knowing whether to show a dropdown or a text input)
  • min_value
  • max_value
  • required

task_entry_values

  • task_entry_id
  • task_type_attribute_id
  • value
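
The two tables listed above could be sketched roughly as follows. This is an assumption-laden illustration, not the author's actual schema: it uses SQLite through Python's sqlite3 module so it runs without a server, the column types are guesses, the `validate` and `save_entry` helpers are hypothetical, and the task_type and task_entry tables are left out for brevity.

```python
import sqlite3

SCHEMA = """
CREATE TABLE task_attributes (
    id                   INTEGER PRIMARY KEY,
    task_type_id         INTEGER NOT NULL,
    attribute            TEXT NOT NULL,
    attribute_value_type TEXT NOT NULL,  -- e.g. 'number', 'text', 'dropdown'
    min_value            REAL,
    max_value            REAL,
    required             INTEGER NOT NULL DEFAULT 0
);
CREATE TABLE task_entry_values (
    task_entry_id          INTEGER NOT NULL,
    task_type_attribute_id INTEGER NOT NULL REFERENCES task_attributes(id),
    value                  TEXT,
    PRIMARY KEY (task_entry_id, task_type_attribute_id)
);
"""


def validate(conn, task_type_id, submitted):
    """Check a submitted form (attribute name -> value) against its definitions."""
    errors = []
    rows = conn.execute(
        "SELECT attribute, attribute_value_type, min_value, max_value, required "
        "FROM task_attributes WHERE task_type_id = ? ORDER BY id",
        (task_type_id,),
    ).fetchall()
    for name, value_type, lo, hi, required in rows:
        raw = submitted.get(name)
        if raw is None or raw == "":
            if required:
                errors.append(f"{name}: required")
            continue
        if value_type == "number":
            try:
                number = float(raw)
            except (TypeError, ValueError):
                errors.append(f"{name}: not a number")
                continue
            if (lo is not None and number < lo) or (hi is not None and number > hi):
                errors.append(f"{name}: out of range")
    return errors


def save_entry(conn, task_entry_id, task_type_id, submitted):
    """Validate, then store one row per filled-in attribute."""
    errors = validate(conn, task_type_id, submitted)
    if errors:
        raise ValueError("; ".join(errors))
    attrs = conn.execute(
        "SELECT id, attribute FROM task_attributes WHERE task_type_id = ?",
        (task_type_id,),
    ).fetchall()
    conn.executemany(
        "INSERT INTO task_entry_values VALUES (?, ?, ?)",
        [
            (task_entry_id, attr_id, str(submitted[name]))
            for attr_id, name in attrs
            if submitted.get(name) not in (None, "")
        ],
    )


conn = sqlite3.connect(":memory:")
conn.executescript(SCHEMA)
conn.executemany(
    "INSERT INTO task_attributes (task_type_id, attribute, attribute_value_type, "
    "min_value, max_value, required) VALUES (?, ?, ?, ?, ?, ?)",
    [
        (1, "intensity", "number", 1, 10, 1),
        (1, "duration_minutes", "number", 0, None, 1),
        (1, "notes", "text", None, None, 0),
    ],
)

ok = validate(conn, 1, {"intensity": "7", "duration_minutes": "30"})
bad = validate(conn, 1, {"intensity": "11"})
save_entry(conn, 100, 1, {"intensity": "7", "duration_minutes": "30"})
```

Keeping min_value, max_value, and required on the attribute definition means the same rows can drive both the form rendering on the app side and the validation on the server side.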

Hope this might be of use to someone. I'd also be interested in any and all criticism/feedback for this design.
