
Time: 2021-03-30 13:02:43

background: we've got a number of server processes and client apps that are used entirely internally, in a fairly controlled environment. we capture a significant amount of data every day that goes into a couple of database machines. most everything is C#, with a few C++ apps.


just about every app has some basic (if not extensive) dependence on database data, whether it's for historical data, daily-calculated values, or assorted parameters. as the whole environment has gotten a bit more sprawling, I've been wondering about the sense in sticking an intermediary in between all client and server apps and the database, a sort of "database data broker". any app that needs values from the db makes a request to the data broker, instead of calling a DLL wrapper function that calls a stored proc.


one immediate downside is that the data would make two trips across the network: from db to broker, and from broker to calling app. seems like poor form, but the amount of data would be small enough in each request that I'm ok with it as far as performance goes.


one (seeming) upside is that it would be trivial to set up a test environment, as it would entail just setting up a test data broker, and there's no maintaining of db connection strings locally anywhere else. also, I've been pondering creating a mini request language so you wouldn't have to enumerate functions for each dataset you might request (instead of GetX() and GetY(), there would be Get("name = X")).
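To make the mini request language idea concrete, here is a rough sketch in Python (the apps in question are C#, so treat this as pseudocode that happens to run). Everything in it is an assumption for illustration: the `DataBroker` class, the `parse_request` helper, and the `key = value; key = value` syntax are made up, and the loaders return canned values where a real broker would call a stored proc.

```python
from typing import Any, Callable, Dict


def parse_request(request: str) -> Dict[str, str]:
    """Parse 'key = value; key2 = value2' into a dict (hypothetical syntax)."""
    params: Dict[str, str] = {}
    for part in request.split(";"):
        if not part.strip():
            continue
        key, sep, value = part.partition("=")
        if not sep or not key.strip():
            raise ValueError(f"malformed request fragment: {part!r}")
        params[key.strip()] = value.strip()
    return params


class DataBroker:
    """Routes parsed requests to registered dataset loaders."""

    def __init__(self) -> None:
        self._loaders: Dict[str, Callable[[Dict[str, str]], Any]] = {}

    def register(self, name: str, loader: Callable[[Dict[str, str]], Any]) -> None:
        self._loaders[name] = loader

    def get(self, request: str) -> Any:
        params = parse_request(request)
        name = params.pop("name", None)
        if name is None:
            raise ValueError("request must include 'name = ...'")
        if name not in self._loaders:
            raise KeyError(f"unknown dataset: {name}")
        # Remaining parameters are passed through to the loader.
        return self._loaders[name](params)


# Canned loaders so the sketch runs without a database.
broker = DataBroker()
broker.register("X", lambda params: [1.0, 2.0, 3.0])
broker.register("Y", lambda params: {"date": params.get("date"), "value": 42})

x_values = broker.get("name = X")
y_values = broker.get("name = Y; date = 2021-03-30")
```

The trade-off is visible even at this size: one `get` entry point replaces GetX() and GetY(), but typos in a dataset name only show up at runtime instead of at compile time.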


am I over-engineering this, or is it possibly a worthy architecture?


edit: thanks for all the great comments so far, great food for thought.


7 Solutions


It depends on what you're trying to accomplish with it. According to Rocky Lhotka, you should only add a tier if you are forced to, kicking and screaming all the way.


I agree with him: don't tier unless you need to. I think there are valid reasons to add additional tiers, usually for purposes of security, scalability and maintainability. The question becomes: is yours a valid reason?


It looks like the major reason is maintainability. Does it outweigh the benefits you get by not having the tier?



only you can answer these:


  • what are the benefits of doing this?

  • what are the problems/risks of doing this?

  • do you need this to make testing easier or even possible?

  • if you make this change and it crashes when it goes live, will you be fired?

  • if you make the changes and it goes live will you get a promotion?

  • etc...


As the former architect of a system that also used a database heavily as a "hub," I can say that there are several drawbacks that you should be aware of. Our system used databases:


  • As a transaction store (typical OLTP stuff)

  • As a staging queue (submitted but unprocessed transactions)

  • As a historical data store (results of processed transactions)

  • As an interoperation layer (untranslated commands or transactions issued from other systems)

One of the major drawbacks is ownership costs. When your databases become the single point of failure for so many types of operations, it becomes necessary to ensure that they are all hosted in high-availability environments. This is not only expensive from a hardware perspective, but it is also expensive to support deployments to HA environments, since developers typically have very limited visibility into the internals.


A second drawback is that you have to seriously design integrity into all of your tables. In a typical SOA environment, you have complete control over how data is modified. When you expose it through database tables, you must consider that any application with the right credentials will have the ability to modify data. Because of this, you must carefully consider utilitarian implementations of constraints. If you had a single service managing persistence, you could be much looser in constraints on the database and enforce them in code.


Third, if you ever want to expose to outside parties any functionality that the database tables currently provide, you must write service code anyway, so you might be better served doing it strategically as opposed to reacting to requests.


Fourth, UI interaction directly with the data layer creates security risks, especially if the client is a thick client.


Finally, writing code that responds to events (service calls) is much easier than polling code. Typically, organizations that rely heavily on database polling end up reinventing the wheel every time a new project requires a new "monitoring service." It can be avoided by creating a "framework," but those have their own pitfalls (primarily around prescription versus adoption).
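A small sketch of the difference, in Python for brevity. The names here (`poll_new_rows`, `EventBus`, the `transaction.submitted` topic) are invented for illustration and do not come from any particular framework. In the polling style every consumer rescans the table and keeps its own cursor; in the event style the consumer just registers a handler.

```python
from collections import defaultdict
from typing import Callable, DefaultDict, Dict, List, Tuple

Row = Dict[str, object]
Handler = Callable[[Row], None]


def poll_new_rows(rows: List[Row], last_seen_id: int) -> Tuple[List[Row], int]:
    """Polling style: rescan the table and advance a per-consumer cursor."""
    new_rows = [row for row in rows if row["id"] > last_seen_id]
    new_cursor = max((row["id"] for row in new_rows), default=last_seen_id)
    return new_rows, new_cursor


class EventBus:
    """Event style: the service pushes each change to subscribed handlers."""

    def __init__(self) -> None:
        self._handlers: DefaultDict[str, List[Handler]] = defaultdict(list)

    def subscribe(self, topic: str, handler: Handler) -> None:
        self._handlers[topic].append(handler)

    def publish(self, topic: str, payload: Row) -> int:
        handlers = self._handlers.get(topic, [])
        for handler in handlers:
            handler(payload)
        return len(handlers)


# Polling: the caller owns the loop, the cursor, and the "nothing new" case.
staging_table: List[Row] = [
    {"id": 1, "status": "submitted"},
    {"id": 2, "status": "submitted"},
]
first_batch, cursor = poll_new_rows(staging_table, last_seen_id=0)
second_batch, cursor = poll_new_rows(staging_table, last_seen_id=cursor)

# Events: the caller only supplies what to do when something arrives.
processed: List[Row] = []
bus = EventBus()
bus.subscribe("transaction.submitted", processed.append)
delivered = bus.publish("transaction.submitted", {"id": 3, "status": "submitted"})
```

The polling function is short, but each new "monitoring service" needs its own copy of the cursor bookkeeping, scheduling, and error handling around it, which is where the wheel gets reinvented.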


This is just a laundry list of problems I have encountered. It's not necessarily meant to dissuade you from using databases for these functions, but it helps to know the dangers ahead of time so you can at least plan for them if they ever do become issues.



Just thought of another scenario that caused us pain. Versioning your changes can be difficult. For example, if you need to change the shape of a table (normalize/denormalize), it has a cascading effect if multiple applications rely on it. In a SOA scenario, it is much easier, because you can keep your old API, change the internal interaction so that it works with the changed tables, and allow consumers to migrate to the new version on their own schedule.
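As a rough illustration of keeping the old API alive, here is a Python sketch. The table, the column names, and the `get_customer_v1` / `get_customer_v2` functions are all hypothetical, and the row is made-up sample data. Suppose a single `full_name` column has been split into two columns: the old call keeps returning the old shape, rebuilt from the new storage.

```python
from typing import Dict

# New, normalized storage shape (hypothetical): the name is split in two.
CUSTOMERS_V2: Dict[int, Dict[str, str]] = {
    1: {"first_name": "Jane", "last_name": "Doe"},
}


def get_customer_v2(customer_id: int) -> Dict[str, str]:
    """New API: exposes the new table shape directly."""
    return dict(CUSTOMERS_V2[customer_id])


def get_customer_v1(customer_id: int) -> Dict[str, str]:
    """Old API, kept stable: rebuilds the old 'full_name' field."""
    row = CUSTOMERS_V2[customer_id]
    return {"full_name": f"{row['first_name']} {row['last_name']}"}


old_shape = get_customer_v1(1)
new_shape = get_customer_v2(1)
```

Consumers still calling the v1 function see no change, and each one can switch to v2 whenever it is ready. With apps reading the table directly, all of them would have to change at the same time as the schema.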



A data broker sounds like a really good way to abstract out the multiple data sources for your apps. It would be easy to consolidate, change repositories, or otherwise move data around if needed in the future.



I may be misunderstanding something, but it seems to me like you should consider some entity framework. That is a framework you can use to "map" your interaction with the db to some domain objects. That way you work locally on domain objects that get filled from your db, and when it is time to persist the state of your objects to the database, the framework handles all the connections back and forth. In this way you can also easily mock up these domain objects for unit testing without needing a db connection.


Check out NHibernate for a good entity framework alternative.
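The testing benefit can be sketched without any real ORM. The Python below is a hand-rolled stand-in, not NHibernate's or any other framework's API, and every name in it (`DailyValue`, `DailyValueRepository`, `average_value`) is made up for illustration. The point is only that code written against domain objects and a repository interface can be tested with an in-memory fake.

```python
from dataclasses import dataclass
from typing import Dict, List, Protocol


@dataclass
class DailyValue:
    """Domain object that application code works with."""
    name: str
    value: float


class DailyValueRepository(Protocol):
    """What application code depends on, instead of a db connection."""

    def load(self, name: str) -> List[DailyValue]: ...


class InMemoryDailyValueRepository:
    """Test double: serves domain objects from a dict, no database needed."""

    def __init__(self, rows: Dict[str, List[float]]) -> None:
        self._rows = rows

    def load(self, name: str) -> List[DailyValue]:
        return [DailyValue(name, v) for v in self._rows.get(name, [])]


def average_value(repo: DailyValueRepository, name: str) -> float:
    """Business logic under test; it never touches the database directly."""
    values = repo.load(name)
    if not values:
        raise ValueError(f"no values for {name!r}")
    return sum(v.value for v in values) / len(values)


fake_repo = InMemoryDailyValueRepository({"X": [1.0, 2.0, 3.0]})
average_x = average_value(fake_repo, "X")
```

In production the same `average_value` function would be handed a repository backed by the ORM or by stored procs; the unit test only needs the fake.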



If you already have the database-related know-how I think it's not a bad decision.


Good things that I can think of:


  • if the data model is consistent you can plug in new tools easily without making any changes in the other apps.

  • maybe you can keep the database running more reliably than your apps, so if one of them fails, the others can keep working.

  • you can make backups and rollbacks using the database tools.

  • you can do emergency fixes by manipulating the data directly with SQL or some visual tool.

But if you have to learn new frameworks along the way, maybe the benefits are not worth the extra initial effort.



"any app that needs values from the db makes a request to the data broker"


When database technology was being invented over 40 years ago, the people doing that inventing had ideas along the lines of "any app that needs values from the db makes a request to the dbms".


Have you ever pondered the possibility that YOU ALREADY HAVE a "data broker", and that there might be very little added value in creating a second one of your own?


