What aspects of relational databases make them hard to scale sufficiently on a service like Google App Engine?

Date: 2022-10-03 22:13:42

Apparently the reason for the BigTable architecture has to do with the difficulty of scaling relational databases across the massive number of servers that Google has to deal with.

But technically speaking what exactly makes it difficult for relational databases to scale?

In the enterprise data centers of large corporations they seem to be able to do this successfully, so I'm wondering why it's not possible to simply do the same thing at a greater order of magnitude so that it scales on Google's servers.

3 answers

#1


3  

In addition to Mitch's answer, there's another facet: web apps are generally poorly suited to relational databases. Relational databases put emphasis on normalization - essentially, making writes easier but reads harder (in terms of work done by the database, not necessarily by you). This works very well for OLAP and ad-hoc query situations, but not so well for web apps, which are generally weighted massively in favor of reads over writes.

The strategy taken by non-relational databases such as Bigtable is the reverse: denormalize, to make reads much easier, at the cost of making writes more expensive.
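
To make the trade-off concrete, here is a minimal sketch using plain Python dictionaries as stand-ins for tables. The `users` and `posts` data, and every function name, are hypothetical and chosen only for illustration; this is not how Bigtable or any particular RDBMS stores data. It only shows that the normalized layout needs two lookups per read but one write per rename, while the denormalized layout needs one lookup per read but must rewrite every copy on a rename.

```python
# Hypothetical data: names and layout are illustrative only.

# Normalized: each fact is stored once, so a read must combine two tables.
users = {1: {"name": "alice"}, 2: {"name": "bob"}}
posts = {
    10: {"author_id": 1, "title": "Hello"},
    11: {"author_id": 2, "title": "Scaling"},
}


def read_post_normalized(post_id):
    """Two lookups per read: the post, then its author (a tiny 'join')."""
    post = posts[post_id]
    author = users[post["author_id"]]
    return {"title": post["title"], "author_name": author["name"]}


# Denormalized: the author's name is copied into every post record.
posts_denorm = {
    pid: {
        "title": p["title"],
        "author_id": p["author_id"],
        "author_name": users[p["author_id"]]["name"],
    }
    for pid, p in posts.items()
}


def read_post_denormalized(post_id):
    """One lookup per read: everything needed is already in the record."""
    post = posts_denorm[post_id]
    return {"title": post["title"], "author_name": post["author_name"]}


def rename_user_normalized(user_id, new_name):
    """One record written, no matter how many posts the user has."""
    users[user_id]["name"] = new_name
    return 1  # records written


def rename_user_denormalized(user_id, new_name):
    """Every post by this user must be rewritten as well."""
    users[user_id]["name"] = new_name
    written = 1
    for post in posts_denorm.values():
        if post["author_id"] == user_id:
            post["author_name"] = new_name
            written += 1
    return written  # records written
```

With many posts per user, `rename_user_denormalized` touches one record per post, which is the extra write cost the answer refers to; in exchange, each page view does a single lookup.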

#2


6  

When you perform a query that involves relationships which are physically distributed, you have to pull that data for each relationship into a central place. That obviously won't scale well for large volumes of data.
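
As a rough illustration of that cost, the sketch below simulates two tables living on separate machines and counts how many rows a naive join has to ship to one coordinating node. The table contents, sizes, and function names are all assumptions made up for this example, and real distributed query planners are considerably smarter than this; the point is only that the data moved grows with table size rather than with result size.

```python
# Hypothetical shards: each list stands in for a table on a separate machine.
customers_shard = [
    {"id": i, "region": "EU" if i % 2 else "US"} for i in range(1000)
]
orders_shard = [
    {"id": n, "customer_id": n % 1000, "total": n} for n in range(5000)
]


def join_at_coordinator(customers, orders, region):
    """Naive distributed join: ship both tables to one node, then join there.

    Returns (matching orders, rows sent over the simulated network).
    """
    rows_shipped = len(customers) + len(orders)
    wanted = {c["id"] for c in customers if c["region"] == region}
    result = [o for o in orders if o["customer_id"] in wanted]
    return result, rows_shipped


def group_orders_by_customer(orders):
    """Co-locate each customer's orders under one key (a denormalized layout)."""
    grouped = {}
    for order in orders:
        grouped.setdefault(order["customer_id"], []).append(order)
    return grouped


def lookup_customer_orders(grouped, customer_id):
    """Single-key read: only this customer's rows need to move."""
    rows = grouped.get(customer_id, [])
    return rows, len(rows)
```

Here `join_at_coordinator(customers_shard, orders_shard, "EU")` ships all 6000 rows regardless of how selective the query is, whereas a lookup against the grouped layout moves only the handful of rows stored under one key.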

A well set-up RDBMS server will perform the majority of its queries on hot pages in RAM, with little physical disk or network I/O.

If you are constrained by network I/O, then the benefits of relational data are diminished.

#3


0  

The main reason, as stated, is physical location and network I/O. Additionally, even large corporations deal with only a fraction of the data that search engines deal with.

Think about the index on a standard database: it covers maybe a few fields. Search engines need fast text search on large text fields.
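
The answer does not name a technique, but one common structure for fast text search is an inverted index, which maps each word to the documents that contain it. The sketch below is a toy version under that assumption; the sample documents and the function names `build_inverted_index` and `search` are hypothetical, and real search engines add tokenization, ranking, and distribution on top of this idea.

```python
from collections import defaultdict


def build_inverted_index(docs):
    """Map each lowercase word to the set of document ids containing it."""
    index = defaultdict(set)
    for doc_id, text in docs.items():
        for word in text.lower().split():
            index[word].add(doc_id)
    return index


def search(index, query):
    """Return the ids of documents that contain every word in the query."""
    words = query.lower().split()
    if not words:
        return set()
    # Copy the first posting set so intersections do not mutate the index.
    result = set(index.get(words[0], set()))
    for word in words[1:]:
        result &= index.get(word, set())
    return result


# Hypothetical sample documents.
docs = {
    1: "relational databases scale up",
    2: "bigtable scales out across servers",
    3: "relational joins across servers are slow",
}
index = build_inverted_index(docs)
```

A query such as `search(index, "relational servers")` intersects two small sets of document ids instead of scanning the full text of every row, which is the kind of access pattern a conventional index on a few short columns is not built for.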
