
Time: 2022-10-04 10:16:10

Non-relational databases are attracting more attention every day. Their main limitation is that today's complicated data is, in fact, connected. Wouldn't it be convenient to connect databases the way we connect tables in an RDBMS? Of course, I only mean simple cases. Imagine three tables: Articles, Tags, and Relationships. In an RDBMS like MySQL, we could run three queries to:


1. Find ID of a given tag
2. Find Articles connected with the captured Tag ID
3. Fetch the contents of Articles tagged with the term

Instead of three queries, we normally perform a single query with a JOIN. I suspect that three queries against a key/value database like BerkeleyDB would be faster than one JOIN query in MySQL.
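To make the comparison concrete, here is a minimal sketch of both approaches on the relational side. It uses Python's built-in `sqlite3` module rather than MySQL, and the table names, column names, and sample rows are assumptions made up for illustration.

```python
import sqlite3

# In-memory database with a hypothetical Articles/Tags/Relationships schema.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE articles (id INTEGER PRIMARY KEY, content TEXT);
    CREATE TABLE tags (id INTEGER PRIMARY KEY, name TEXT UNIQUE);
    CREATE TABLE relationships (article_id INTEGER, tag_id INTEGER);
    INSERT INTO articles VALUES
        (1, 'Intro to B-trees'), (2, 'Cooking rice'), (3, 'Hash indexes');
    INSERT INTO tags VALUES (10, 'databases'), (20, 'food');
    INSERT INTO relationships VALUES (1, 10), (3, 10), (2, 20);
""")


def articles_by_tag_three_queries(conn, tag_name):
    """Follow the three steps from the question, one query per step."""
    # 1. Find the ID of the given tag.
    row = conn.execute(
        "SELECT id FROM tags WHERE name = ?", (tag_name,)
    ).fetchone()
    if row is None:
        return []
    tag_id = row[0]

    # 2. Find the articles connected with that tag ID.
    article_ids = [
        r[0]
        for r in conn.execute(
            "SELECT article_id FROM relationships WHERE tag_id = ?", (tag_id,)
        )
    ]
    if not article_ids:
        return []

    # 3. Fetch the contents of those articles.
    placeholders = ",".join("?" * len(article_ids))
    rows = conn.execute(
        f"SELECT content FROM articles WHERE id IN ({placeholders}) ORDER BY id",
        article_ids,
    )
    return [r[0] for r in rows]


def articles_by_tag_join(conn, tag_name):
    """The same lookup expressed as a single JOIN query."""
    rows = conn.execute(
        """
        SELECT a.content
        FROM tags t
        JOIN relationships r ON r.tag_id = t.id
        JOIN articles a ON a.id = r.article_id
        WHERE t.name = ?
        ORDER BY a.id
        """,
        (tag_name,),
    )
    return [r[0] for r in rows]
```

Both functions return the same rows; the difference is how many round trips the application makes and who does the matching, the application or the database engine.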


Is this idea practical? Or are there other issues that would rule this approach out?


3 Answers



NoSQL databases can support relational data models just fine. You're just left to implement the relational mapping yourself in your application, and that effort is typically not insignificant.
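As a rough illustration of what "implementing the relational mapping yourself" can look like, here is a minimal sketch in which plain Python dicts stand in for three key/value stores. This is not the BerkeleyDB API; the store layout, function names, and sample data are all assumptions.

```python
# Plain dicts stand in for three key/value stores (hypothetical layout).
tag_ids = {"databases": 10, "food": 20}          # tag name -> tag ID
articles_by_tag_id = {10: [1, 3], 20: [2]}       # tag ID -> list of article IDs
article_contents = {                             # article ID -> content
    1: "Intro to B-trees",
    2: "Cooking rice",
    3: "Hash indexes",
}


def articles_for_tag(tag_name):
    """Hand-written 'join': three key lookups chained by the application."""
    tag_id = tag_ids.get(tag_name)
    if tag_id is None:
        return []
    ids = articles_by_tag_id.get(tag_id, [])
    # Skip IDs whose article is missing; nothing enforces consistency for us.
    return [article_contents[i] for i in ids if i in article_contents]


def add_article(article_id, content, tag_names):
    """Writes must keep the tag index in sync by hand."""
    article_contents[article_id] = content
    for name in tag_names:
        tag_id = tag_ids[name]  # raises KeyError for an unknown tag
        articles_by_tag_id.setdefault(tag_id, []).append(article_id)
```

The read path is short, but every write path (adding, retagging, deleting) has to update the index entries itself, which is where most of the extra effort tends to go.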


In some applications this extra effort will be worthwhile. Perhaps you only have a small number of tables and the joins you need are very simple. Or perhaps you've done some performance evaluation between a traditional relational DBMS and a NoSQL alternative and found that the NoSQL option is more appropriate for your needs for any number of reasons (performance, scalability, flexibility, whatever).


You should keep one thing in mind, however. A typical SQL DBMS is basically a NoSQL DB with an optimized, well-built relational engine in front of it. Some databases even let you bypass the relational layer and treat their system like a pure NoSQL DB.


Therefore, the moment you start to build your own relational mappings and joins on top of a NoSQL DB you should ask yourself, "Didn't someone build this for me already?" The answer may well be "yes", and the solution might be to go with a traditional SQL DBMS.


To answer the "3 query" part of your question specifically, the answer is "maybe". You certainly might be able to make such a query run faster in a NoSQL DB than in an RDBMS, but you need to keep in mind that there are more things to consider here than just the raw speed of your query:


  1. The technical debt you will incur as you build join-like functionality that you wouldn't otherwise have had to build
  2. The time it will take you to build, test, and optimize your query code, which will likely be more significant than the time needed to write a simple SQL query
  3. Any difference in transactional guarantees or other typical product features (replication, management tools, etc.) which you may lose or gain depending on the NoSQL option you choose
  4. The ability to hire DBAs who know how to run your database from an operational perspective

You might review that list and say to yourself, "No big deal, I'm running a simple app with only a few thousand DB entries and I'll maintain it myself". If so, knock yourself out - Berkeley (and other NoSQL options) would work fine. I've used Berkeley many times for those kinds of applications. But you may have a different answer if you are building the back-end for a significantly-sized SaaS product which might soon have millions of users and very complex queries.


We can't give a one-size-fits-all answer, unfortunately. You'll have to make the judgement call yourself based on the needs of your application and organization.




Sure, a single record join is pretty speedy in either solution, but that's not the big advantage of joins. Joins are useful when you're joining many, many rows with many, many other rows. Imagine if, in your example, you wanted to do that for 100 different tags. Without joins, you're talking 300 queries to SQL's one.
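The 100-tag case can be sketched as follows. This is an illustration only: it uses Python's built-in `sqlite3` module, a made-up schema with one article per tag, and a small hypothetical `CountingDb` wrapper whose only job is to count how many queries the application issues.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE articles (id INTEGER PRIMARY KEY, content TEXT);
    CREATE TABLE tags (id INTEGER PRIMARY KEY, name TEXT UNIQUE);
    CREATE TABLE relationships (article_id INTEGER, tag_id INTEGER);
""")
# Sample data: 100 tags, each attached to exactly one article.
for i in range(100):
    conn.execute("INSERT INTO tags VALUES (?, ?)", (i, f"tag{i}"))
    conn.execute("INSERT INTO articles VALUES (?, ?)", (i, f"article about tag{i}"))
    conn.execute("INSERT INTO relationships VALUES (?, ?)", (i, i))


class CountingDb:
    """Hypothetical helper: runs queries and counts how many were issued."""

    def __init__(self, conn):
        self.conn = conn
        self.queries = 0

    def run(self, sql, params=()):
        self.queries += 1
        return self.conn.execute(sql, params).fetchall()


def fetch_without_join(db, tag_names):
    """Three queries per tag: tag ID, article IDs, article contents."""
    contents = []
    for name in tag_names:
        tag_rows = db.run("SELECT id FROM tags WHERE name = ?", (name,))
        if not tag_rows:
            continue
        tag_id = tag_rows[0][0]
        id_rows = db.run(
            "SELECT article_id FROM relationships WHERE tag_id = ?", (tag_id,)
        )
        ids = [r[0] for r in id_rows]
        marks = ",".join("?" * len(ids))
        rows = db.run(f"SELECT content FROM articles WHERE id IN ({marks})", ids)
        contents.extend(r[0] for r in rows)
    return contents


def fetch_with_join(db, tag_names):
    """One query for all tags, letting the engine do the matching."""
    marks = ",".join("?" * len(tag_names))
    rows = db.run(
        f"""
        SELECT a.content
        FROM tags t
        JOIN relationships r ON r.tag_id = t.id
        JOIN articles a ON a.id = r.article_id
        WHERE t.name IN ({marks})
        """,
        tag_names,
    )
    return [r[0] for r in rows]
```

With 100 tag names, `fetch_without_join` issues 300 queries and `fetch_with_join` issues one, while both return the same set of articles.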




Another option on NoSQL systems is playOrm. It supports joins, but only within partitions, so a table can grow without bound as long as each partition stays on par with the size of an RDBMS table. It also does the fancy Hibernate-style work for you, with all the related annotations, though it has some differences, and it will be adding Embedded for use when you denormalize. That makes things much easier. Dealing with NoSQL is typically a pain because of all the translation logic you have to write and all the manual indexing, including updating and removing index entries. playOrm does all of this for you instead.



