What is the difference between NoSQL and a column-oriented database?

Date: 2022-11-21 15:49:58

The more I read about NoSQL, the more it begins to sound like a column oriented database to me.

What's the difference between NoSQL (e.g. CouchDB, Cassandra, MongoDB) and a column oriented database (e.g. Vertica, MonetDB)?

7 answers

#1


7  

Some NoSQL databases are column-oriented databases, and some SQL databases are column-oriented as well. Whether the database is column or row-oriented is a physical storage implementation detail of the database and can be true of both relational and non-relational (NoSQL) databases.

Vertica, for example, is a column-oriented relational database so it wouldn't actually qualify as a NoSQL datastore.

A "NoSQL movement" datastore is better defined as a non-relational, shared-nothing, horizontally scalable database without (necessarily) ACID guarantees. Some column-oriented databases can be characterized this way. Besides column stores, NoSQL implementations also include document stores, object stores, tuple stores, and graph stores.

#2


6  

NoSQL is a term that stands for "Not Only SQL". It covers four major categories: key-value, document, column-family and graph databases.

Key-value databases are well suited to applications that make frequent small reads and writes and have simple data models. Each record is stored and retrieved using a key that uniquely identifies it, which lets the database find the data quickly.

e.g. Redis, Riak etc.
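
As a rough illustration of the key-value model described above, here is a minimal in-memory sketch. The class and method names (`KeyValueStore`, `put`, `get`) and the sample key are made up for this example; they are not the API of Redis, Riak or any other product.

```python
from typing import Dict, Optional


class KeyValueStore:
    """Toy in-memory key-value store (hypothetical): one opaque value per unique key."""

    def __init__(self) -> None:
        self._data: Dict[str, bytes] = {}

    def put(self, key: str, value: bytes) -> None:
        # A write replaces whatever was stored under the key.
        self._data[key] = value

    def get(self, key: str) -> Optional[bytes]:
        # A read is a single lookup by key; there is no query language.
        return self._data.get(key)


store = KeyValueStore()
store.put("session:42", b'{"user": "alex"}')
session = store.get("session:42")   # the stored bytes
missing = store.get("session:43")   # None: no record under that key
```

The store never looks inside the value, which is why this model suits simple data and small, frequent reads and writes.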

Document databases can store varying attributes along with large amounts of data.

e.g. MongoDB, CouchDB etc.

Column-family databases are designed for large volumes of data, read and write performance, and high availability.

e.g. Cassandra, HBase etc.

A graph database uses graph structures, with nodes, edges and properties, to represent and store data and to answer semantic queries.

e.g. Neo4j, InfiniteGraph etc.

Before understanding NoSQL, you have to understand some key concepts.

Consistency – All the servers in the system will have the same data so anyone using the system will get the same copy regardless of which server answers their request.

Availability – The system will always respond to a request (even if the response is not the latest data, is not consistent across the system, or is just a message saying the system isn't working).

Partition Tolerance – The system continues to operate as a whole even if individual servers fail or can't be reached.

Most of the time, a NoSQL database satisfies only two of the three properties above.

From your question,

CouchDB: AP (Availability & Partition Tolerance), document database

Cassandra: AP (Availability & Partition Tolerance), column-family database

MongoDB: CP (Consistency & Partition Tolerance), document database

Vertica: CA (Consistency & Availability), column-oriented relational database

MonetDB: ACID (Atomicity, Consistency, Isolation, Durability), relational database

From: http://blog.nahurst.com/visual-guide-to-nosql-systems

Have a look at this article1, article2 and ppt, which cover various scenarios for selecting a particular type of database.

#3


5  

A NoSQL database follows a different paradigm from traditional schema-based databases. NoSQL databases are designed to scale and to hold documents such as JSON data. They obviously have a way of querying information, but you should expect syntax like eval("person = * and age > 10") for retrieving data. Even if they support a standard SQL interface, they are intended for something else, so if you like SQL you should stick to traditional databases.

A column-oriented database differs from a traditional row-oriented database in how it stores data. By storing a whole column together instead of a whole row, it minimizes disk access when a query selects a few columns from rows that contain many columns. In a row-oriented database, the disk access is the same whether you select one field or all fields of a row.

You pay for this with a more expensive insert, though. Inserting a new row causes multiple disk operations, depending on the number of columns.
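
The trade-off described above can be sketched in a few lines of Python. This is only an in-memory model under stated assumptions: the "people" table, its columns and the function names are hypothetical, and counting "values touched" stands in for disk access, which a real engine handles in pages and blocks.

```python
# Three rows of a hypothetical "people" table with four columns.
rows = [
    {"id": 1, "name": "jane", "age": 31, "city": "Oslo"},
    {"id": 2, "name": "joe", "age": 9, "city": "Lima"},
    {"id": 3, "name": "li", "age": 47, "city": "Pune"},
]

# Row-oriented layout: the values of one row sit next to each other.
row_store = [tuple(r.values()) for r in rows]

# Column-oriented layout: the values of one column sit next to each other.
column_store = {name: [r[name] for r in rows] for name in rows[0]}


def ages_from_row_store():
    """Return (ages, values_touched) for a scan of the row layout."""
    touched = sum(len(r) for r in row_store)  # every field of every row is read
    return [r[2] for r in row_store], touched


def ages_from_column_store():
    """Return (ages, values_touched) for a scan of the column layout."""
    ages = column_store["age"]  # only the one requested column is read
    return list(ages), len(ages)


def insert_into_column_store(new_row):
    """Append a row; return how many separate column arrays were written."""
    for name, value in new_row.items():
        column_store[name].append(value)
    return len(new_row)


row_ages, row_touched = ages_from_row_store()
col_ages, col_touched = ages_from_column_store()
writes = insert_into_column_store(
    {"id": 4, "name": "meghan", "age": 28, "city": "Kyiv"}
)
print(row_ages == col_ages, row_touched, col_touched, writes)
```

Both scans return the same ages, but the row layout touches all 12 stored values while the column layout touches 3. The single insert, in turn, writes to 4 separate column arrays.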

But a column-oriented database is no different from a traditional database in terms of SQL, ACID, foreign keys and the like.

#4


3  

I would suggest reading the taxonomy section of the NoSQL Wikipedia entry to get a feel for just how different NoSQL databases are from a traditional schema-oriented database. Being column-oriented implies rows and columns, which implies a (two-dimensional) schema, while NoSQL databases tend to be schema-less (key-value stores) or have structured contents but without a formal schema (document stores).

For document stores, the structure and contents of each "document" are independent of other documents in the same "collection". Adding a field is usually a code change rather than a database change: new documents get an entry for the new field, while older documents are considered to have a null value for the non-existent field. Similarly, "removing" a field could mean that you simply stop referring to it in your code rather than going to the trouble of deleting it from each document (unless space is at a premium, and then you have the option of removing only those with the largest contents). Contrast this to how an entire table must be changed to add or remove a column in a traditional row/column database.
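
A small sketch of that idea, modeling a collection as a plain Python list of dicts. This is an assumption made for illustration (a real document store persists the documents), and the second post and the helper `votes_of` are invented for the example.

```python
# A "collection" modeled as a list of dicts (illustrative assumption).
posts = [
    # Written before the "votes" field existed.
    {"author": "alex", "title": "No Free Lunch"},
]

# Adding a field is a code change: new documents simply carry it.
posts.append({"author": "jane", "title": "Second Post", "votes": 5})


def votes_of(post):
    # Older documents have no "votes" key; treat that as a null value.
    return post.get("votes")


all_votes = [votes_of(p) for p in posts]  # [None, 5]

# "Removing" a field can simply mean the code stops reading it.
# Physically deleting it from each document is optional:
for p in posts:
    p.pop("votes", None)
```

No step here touches a table definition; the only thing that changed between the two documents is what the application wrote.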

Documents can also hold lists as well as other nested documents. Here's a sample document from MongoDB (a post from a blog or other forum), represented as JSON:

{
  _id : ObjectId("4e77bb3b8a3e000000004f7a"),
  when : Date("2011-09-19T02:10:11.3Z"),
  author : "alex",
  title : "No Free Lunch",
  text : "This is the text of the post.  It could be very long.",
  tags : [ "business", "ramblings" ],
  votes : 5,
  voters : [ "jane", "joe", "spencer", "phyllis", "li" ],
  comments : [
    { who : "jane", when : Date("2011-09-19T04:00:10.112Z"),
      comment : "I agree." },
    { who : "meghan", when : Date("2011-09-20T14:36:06.958Z"),
      comment : "You must be joking.  etc etc ..." }
  ]
}

Note how "comments" is a list of nested documents with their own independent structure. Queries can "reach into" these documents from the outer document, for example to find posts that have comments by Jane, or posts with comments from a certain date range.
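
To make "reaching into" nested documents concrete, here is a plain-Python sketch of the two example queries over a trimmed copy of the sample post. The function names and the second, empty post are hypothetical. MongoDB itself expresses this kind of filter with dot notation on the nested field (for example a filter document such as {"comments.who": "jane"}), whereas the code below only imitates the semantics in memory.

```python
from datetime import datetime, timezone

posts = [
    {
        "title": "No Free Lunch",
        "comments": [
            {"who": "jane",
             "when": datetime(2011, 9, 19, 4, 0, 10, tzinfo=timezone.utc)},
            {"who": "meghan",
             "when": datetime(2011, 9, 20, 14, 36, 6, tzinfo=timezone.utc)},
        ],
    },
    {"title": "Hypothetical Second Post", "comments": []},
]


def posts_commented_by(who):
    """Titles of posts with at least one comment by `who`."""
    return [p["title"] for p in posts
            if any(c["who"] == who for c in p["comments"])]


def posts_commented_between(start, end):
    """Titles of posts with at least one comment in [start, end)."""
    return [p["title"] for p in posts
            if any(start <= c["when"] < end for c in p["comments"])]


by_jane = posts_commented_by("jane")
on_sept_20 = posts_commented_between(
    datetime(2011, 9, 20, tzinfo=timezone.utc),
    datetime(2011, 9, 21, tzinfo=timezone.utc),
)
```

Each query returns outer documents (posts) while its condition is evaluated against the inner ones (comments).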

So in short, two of the major differences typical of NoSQL databases are the lack of a (formal) schema and contents that go beyond the two dimensional orientation of a traditional row/column database.

#5


1  

Distinguishing between column stores: read this blog. It answers your question.

#6


0  

As @tuinstoel wrote, the article answers your question in point 3:

3. Interface. Group A is distinguished by being part of the NoSQL movement and does not typically have a traditional SQL interface. Group B supports standard SQL interfaces.

#7


0  

Here is how I see it: column-oriented databases are about the way data is physically stored on disk. As the name suggests, each column is stored in its own separate space or file. This allows for two important things:

  1. You achieve a better compression ratio, on the order of 10:1, because each column holds a single data type.
  2. You achieve better read performance because you avoid whole-row scans and read only the columns named in your SELECT query.
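
One way to see why a single-type column compresses well is run-length encoding. The answer does not name a compression scheme, so this choice is an assumption, as are the sample "country" column and the function names. The ratio actually achieved depends entirely on the data.

```python
from itertools import groupby


def rle_encode(values):
    """Collapse runs of equal values into (value, run_length) pairs."""
    return [(value, len(list(run))) for value, run in groupby(values)]


def rle_decode(pairs):
    """Expand (value, run_length) pairs back into the original list."""
    return [value for value, count in pairs for _ in range(count)]


# A hypothetical "country" column from a table sorted by country.
country_column = ["DE"] * 4 + ["FR"] * 3 + ["US"] * 5

encoded = rle_encode(country_column)
```

Here 12 stored values shrink to 3 pairs, and `rle_decode(encoded)` restores the column exactly. In a row layout the same values would be interleaved with other fields, so runs like these would not be adjacent.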

NoSQL databases, on the other hand, are a whole new breed of databases that define "logical" aggregate levels to describe the data. Some treat the data as having a hierarchical relationship (the aggregate being a "node"), while others treat the data as documents (the document being the aggregate level). They do not dictate the physical storage strategy (some may, but it is abstracted away from the end user).

Also, the whole NoSQL movement has more to do with unstructured data, or rather with data sets whose schema cannot be predefined or is unknown beforehand, and which therefore cannot conform to the strict relational model.

Column-oriented databases still deal with relational data, although they eliminate the need for indexes and the like.
