Is a database intermediary a good system design?

Time: 2021-03-30 13:02:43

Background: we've got a number of server processes and client apps that are used entirely internally, in a fairly controlled environment. We capture a significant amount of data every day that goes into a couple of database machines. Almost everything is C#, with a few C++ apps.

Just about every app has some basic (if not extensive) dependence on database data, whether it's for historical data, daily-calculated values, or assorted parameters. As the whole environment has gotten a bit more sprawling, I've been wondering whether it makes sense to stick an intermediary between all client and server apps and the database, a sort of "database data broker". Any app that needs values from the DB would make a request to the data broker, instead of calling a DLL wrapper function that calls a stored proc.

One immediate downside is that the data would make two trips across the network: from DB to broker, and from broker to calling app. That seems like poor form, but the amount of data in each request would be small enough that I'm OK with it as far as performance goes.

One (seeming) upside is that it would be trivial to set up a test environment, as it would entail just setting up a test data broker, and there would be no DB connection strings to maintain locally anywhere else. Also, I've been pondering creating a mini request language so you wouldn't have to enumerate functions for each dataset you might request: instead of GetX() and GetY(), there would be Get("name = X").
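To make the idea concrete, here is a minimal sketch of what such a broker and its mini request language might look like. It is written in Python purely for brevity (the same shape would carry over to C#), and every name in it (`DataBroker`, `register`, `make_test_broker`, the canned values) is hypothetical. Only the `Get("name = X")` request form comes from the question itself.

```python
from typing import Any, Callable, Dict


class DataBroker:
    """Hypothetical in-process stand-in for the proposed data broker."""

    def __init__(self) -> None:
        # Maps a dataset name to a loader. A real broker would presumably
        # call a stored proc here; that part is an assumption.
        self._loaders: Dict[str, Callable[[], Any]] = {}

    def register(self, name: str, loader: Callable[[], Any]) -> None:
        self._loaders[name] = loader

    def get(self, request: str) -> Any:
        """Handle a request of the form 'name = X' (the only form supported here)."""
        key, sep, value = request.partition("=")
        if sep != "=" or key.strip() != "name" or not value.strip():
            raise ValueError(f"unsupported request: {request!r}")
        dataset = value.strip()
        if dataset not in self._loaders:
            raise KeyError(dataset)
        return self._loaders[dataset]()


def make_test_broker() -> DataBroker:
    """A test broker serving canned values, with no connection string anywhere."""
    broker = DataBroker()
    broker.register("X", lambda: [1.0, 2.5])
    broker.register("Y", lambda: {"threshold": 10})
    return broker
```

Calling `make_test_broker().get("name = X")` returns the canned list, which is roughly the "trivial test environment" described above. Note that even this tiny parser needs its own error handling, which hints at the maintenance cost of inventing a request language.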

Am I over-engineering this, or is it possibly a worthy architecture?

Edit: thanks for all the great comments so far, great food for thought.

7 answers

#1


It depends on what you're trying to accomplish with it. According to Rocky Lhotka, you should only add a tier if you are forced to, kicking and screaming all the way.

I agree with him: don't tier unless you need to. I think there are valid reasons to add additional tiers, usually for purposes of security, scalability and maintainability. The question becomes: is yours a valid reason?

It looks like the major reason is maintainability. Does it outweigh the benefits you get by not having the tier?

#2


Only you can answer these:

  • What are the benefits of doing this?
  • What are the problems/risks of doing this?
  • Do you need this to make testing easier, or even possible?
  • If you make this change and it crashes when it goes live, will you be fired?
  • If you make the changes and it goes live, will you get a promotion?
  • etc.

#3


As the former architect of a system that also used a database heavily as a "hub," I can say that there are several drawbacks that you should be aware of. Our system used databases:

  • As a transaction store (typical OLTP stuff)
  • As a staging queue (submitted but unprocessed transactions)
  • As a historical data store (results of processed transactions)
  • As an interoperation layer (untranslated commands or transactions issued from other systems)

One of the major drawbacks is ownership costs. When your databases become the single point of failure for so many types of operations, it becomes necessary to ensure that they are all hosted in high-availability environments. This is not only expensive from a hardware perspective, but it is also expensive to support deployments to HA environments, since developers typically have very limited visibility into the internals.

A second drawback is that you have to seriously design integrity into all of your tables. In a typical SOA environment, you have complete control over how data is modified. When you expose it through database tables, you must consider that any application with the right credentials will have the ability to modify data. Because of this, you must carefully consider utilitarian implementations of constraints. If you had a single service managing persistence, you could be much looser with constraints on the database and enforce them in code.

Third, if you ever want to expose any functionality that the database tables currently allow you to provide to outside parties, you must write service code anyway, so you might be better served doing it strategically as opposed to reacting to requests.

Fourth, UI interaction directly with the data layer creates security risks, especially if the client is a thick client.

Finally, writing code that responds to events (service calls) is much easier than writing polling code. Typically, organizations that rely heavily on database polling end up reinventing the wheel every time a new project requires a new "monitoring service." It can be avoided by creating a "framework," but those have their own pitfalls (primarily around prescription versus adoption).
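As a rough illustration of the polling-versus-events contrast, the sketch below puts the two styles side by side. It is in Python for brevity, and everything in it (`poll_new_rows`, `TransactionService`, the row layout with an `id` column) is a hypothetical stand-in rather than anything from the system described in this answer.

```python
from typing import Callable, Dict, List, Tuple

Row = Dict[str, int]


def poll_new_rows(table: List[Row], last_seen_id: int) -> Tuple[List[Row], int]:
    """Polling style: rescan a (hypothetical) staging table above a watermark.

    Every consumer has to keep its own watermark and call this on a timer.
    """
    new_rows = [row for row in table if row["id"] > last_seen_id]
    watermark = max((row["id"] for row in new_rows), default=last_seen_id)
    return new_rows, watermark


class TransactionService:
    """Event style: the service pushes each submission to subscribers as it arrives."""

    def __init__(self) -> None:
        self._handlers: List[Callable[[Row], None]] = []

    def subscribe(self, handler: Callable[[Row], None]) -> None:
        self._handlers.append(handler)

    def submit(self, row: Row) -> None:
        # No watermark and no timer: handlers simply react to the call.
        for handler in self._handlers:
            handler(row)
```

In the polling version, the watermark bookkeeping is the part that tends to get rewritten for every new "monitoring service"; in the event version it disappears.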

This is just a laundry list of problems I have encountered. It's not necessarily meant to dissuade you from using databases for these functions, but it helps to know the dangers ahead of time so you can at least plan for them if they ever do become issues.

EDIT

Just thought of another scenario that caused us pain. Versioning your changes can be difficult. For example, if you need to change the shape of a table (normalize/denormalize), it has a cascading effect if multiple applications rely on it. In a SOA scenario, it is much easier, because you can keep your old API, change the internal interaction so that it works with the changed tables, and allow consumers to migrate to the new version on their own schedule.
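A small sketch of that versioning point, in Python for brevity: the storage below has already been normalized into two tables, yet the old API keeps returning the old flat shape. The table contents and the function names `get_customer_v1` and `get_customer_v2` are invented for illustration only.

```python
from typing import Dict, Tuple

# Hypothetical storage after normalization: one "customer" table split in two.
CUSTOMERS: Dict[int, str] = {1: "Ada"}
ADDRESSES: Dict[int, Tuple[str, str]] = {1: ("12 Elm St", "Leeds")}


def get_customer_v1(customer_id: int) -> Dict[str, str]:
    """Old API shape: a flat record, as when everything lived in one table."""
    street, city = ADDRESSES[customer_id]
    return {"name": CUSTOMERS[customer_id], "address": f"{street}, {city}"}


def get_customer_v2(customer_id: int) -> Dict[str, object]:
    """New API shape that exposes the normalized structure."""
    street, city = ADDRESSES[customer_id]
    return {
        "name": CUSTOMERS[customer_id],
        "address": {"street": street, "city": city},
    }
```

Existing consumers keep calling `get_customer_v1` unchanged while new ones adopt `get_customer_v2`. With direct table access, every consumer would have had to change at the moment the table was split.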

#4


A data broker sounds like a really good way to abstract out the multiple data sources for your apps. It would be easy to consolidate, change repositories, or otherwise move data around if needed in the future.

#5


I may be misunderstanding something, but it seems to me like you should consider some entity framework. That is a framework you can use to "map" your interaction with the DB to some domain objects. That way you work locally on domain objects that get filled from your DB, and when it is time to persist the state of your objects to the database, the framework handles all the connections back and forth. In this way you can also easily mock up these domain objects for unit testing without needing a DB connection.
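The testing benefit can be sketched without any particular ORM. In the Python sketch below, the domain object `DailyValue`, the `DailyValueRepository` interface, and the in-memory fake are all hypothetical names; a real mapping framework would supply the database-backed implementation of the same interface.

```python
from dataclasses import dataclass
from typing import Dict, Protocol


@dataclass
class DailyValue:
    """A domain object the rest of the code works with."""
    name: str
    value: float


class DailyValueRepository(Protocol):
    """What business logic depends on; the DB-backed version is not shown."""
    def load(self, name: str) -> DailyValue: ...


class InMemoryRepository:
    """Test double: serves domain objects with no DB connection."""

    def __init__(self, rows: Dict[str, float]) -> None:
        self._rows = rows

    def load(self, name: str) -> DailyValue:
        return DailyValue(name, self._rows[name])


def doubled(repo: DailyValueRepository, name: str) -> float:
    """Example business logic that only ever sees the domain object."""
    return repo.load(name).value * 2
```

A unit test can then call `doubled(InMemoryRepository({"vol": 1.5}), "vol")` and never touch a connection string.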

Check out NHibernate for a good entity framework alternative.

#6


If you already have the database-related know-how, I think it's not a bad decision.

Good things that I can think of:

  • If the data model is consistent, you can plug in new tools easily without making any changes in the other apps.
  • Maybe you can run the database more reliably than your apps, so if one of the apps fails, the others can keep working.
  • You can make backups and rollbacks using the database tools.
  • You can do emergency fixes by manipulating the data directly with SQL or some visual tool.

But if you have to learn new frameworks along the way, maybe the benefits are not worth the extra initial effort.

#7


"any app that needs values from the db makes a request to the data broker"

When database technology was being invented over 40 years ago, the people doing that inventing had ideas along the lines of "any app that needs values from the db makes a request to the dbms".

Have you ever pondered the possibility that YOU ALREADY HAVE a "data broker", and that there might be very little added value in creating a second one of your own?
