Practical performance comparison of Neo4j and MSSQL for C# developers

Time: 2022-04-25 15:43:20

Assume we have a web site with a small social graph where people (say ~1M users) can "like" stuff, follow each other, comment on each other's posts, and so on (the usual scenario).

In .NET for this we have two options:

  1. Using EF (currently 6.1) and MSSQL (v2012 or above) to implement the social graph (the hard way)
  2. Using Neo4j (currently 2.1.4) and Neo4jClient (which, as far as I know, is the best driver for .NET users)

Given the above scenario, and the fact that Neo4j doesn't have a native driver for .NET and the current version of Neo4jClient (1.0.0.657) uses the REST API to connect to the database engine, which one would be faster for questions like "Who likes stuff like I do?" or "What would a person like (based on the people they follow)?" and other usual questions about social graphs?

1 answer

#1


4  

You haven't specified that much information, and your question is likely to elicit a lot of opinion, but I'll try to give this a fair shake. (Disclaimer: I'm from the Neo4j side of this, but I've worked with most of the other things you mention.)

Your question has three elements I want to split apart:

  1. Graph or Relational? (MSSQL vs. Neo4j)
  2. Driver/Engineering issues (Neo4jClient/REST vs. EF/MSSQL)
  3. Modeling practicalities (implementing the social graph "the hard way" vs. in Neo4j)

Graph or Relational?

You should read another answer I posted about the general performance characteristics of graph databases and graph database queries. I won't recap all of that (since it's already on SO), but here's the executive summary: graph databases are very good and fast at path-associative queries where you need to traverse a bunch of edges. Those operations correspond to things in the relational world where you'd join a whole pile of tables together, or where the join depth is variable. In those situations, graph will be better than relational (performance-wise). If you want to do bulk scans of users or single joins, you're probably better off with relational (again, see the other answer for more detail).

So on this criterion, I am inferring that you only really want to traverse one edge at a time (e.g. "Show me all of the stuff that Bob likes") and that you don't need to do deeper queries like "Show me everyone who is separated by 3-4 degrees from Bob".
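To make the one-edge versus variable-depth distinction concrete, here is a minimal in-memory sketch in Python. It is not Neo4j or SQL code; the `FOLLOWS` data and the `within_degrees` helper are hypothetical names made up for illustration. The one-hop call is the kind of lookup a single join handles, while each extra level of depth would roughly correspond to one more self-join in a relational schema.

```python
from collections import deque

# Hypothetical follow graph: user -> set of users they follow.
FOLLOWS = {
    "bob": {"alice", "carol"},
    "alice": {"dave"},
    "carol": {"dave", "erin"},
    "dave": {"frank"},
    "erin": set(),
    "frank": set(),
}

def within_degrees(graph, start, max_depth):
    """Return {user: hop count} for everyone reachable from start
    in at most max_depth hops (breadth-first traversal)."""
    seen = {start: 0}
    queue = deque([start])
    while queue:
        user = queue.popleft()
        depth = seen[user]
        if depth == max_depth:
            continue  # do not expand past the requested depth
        for neighbor in graph.get(user, ()):
            if neighbor not in seen:
                seen[neighbor] = depth + 1
                queue.append(neighbor)
    del seen[start]  # the start user is not part of the answer
    return seen

one_hop = within_degrees(FOLLOWS, "bob", 1)     # "who does Bob follow"
three_hops = within_degrees(FOLLOWS, "bob", 3)  # "everyone within 3 degrees of Bob"
```

The depth is just a parameter here, which is the property that makes variable-depth questions awkward to express as a fixed number of joins.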

Driver/Engineering Issues

Speed-wise, it's generally known that the Java API is faster than the REST API for Neo4j. Performance for the REST API is variable and depends on a lot of other factors, like whether the DB is hosted on the same machine, or how "network far" away it is. With REST you always have extra overhead from things like HTTP and serializing/deserializing JSON that you wouldn't have if you used the Java API. So all other things being equal (disclaimer: they never are ;) the REST API will generally be slower than something like EF.
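As a rough illustration of where that overhead comes from, the sketch below mimics only the serialize/deserialize part of a REST hop in Python. It is not a benchmark and makes no HTTP call; `rows` and `over_rest` are hypothetical names, and the real cost depends on payload size, network distance, and the driver.

```python
import json

# Hypothetical query result, as an embedded API might hand it over in memory.
rows = [
    {"user": "bob", "likes": ["post-1", "post-7"]},
    {"user": "alice", "likes": ["post-7"]},
]

def over_rest(result):
    """Mimic the extra steps of a JSON-over-HTTP hop:
    the server encodes the result to bytes, the client parses them again."""
    payload = json.dumps(result).encode("utf-8")    # server side: serialize
    decoded = json.loads(payload.decode("utf-8"))   # client side: deserialize
    return decoded, len(payload)

decoded, wire_bytes = over_rest(rows)
```

The decoded result is equal to the original but is a fresh copy built from `wire_bytes` bytes of text, work that an in-process API would skip entirely.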

Modeling Practicalities

Here, Neo4j is going to win by a lot. With MSSQL, you'll have the ever-present object-relational impedance mismatch; Neo4j lessens (but does not eliminate) those impedance mismatch problems. Modeling-wise, Neo4j is schemaless, which comes with lots of pros and cons. You can probably cobble together a working model faster with Neo4j because your domain is fundamentally graph-y.
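To show what "graph-y" means for this domain, here is a small Python sketch of the question "Who likes stuff like I do?" treated as a two-hop walk: from me to the items I like, then from those items back to other users. The `LIKES` data and the `similar_likers` helper are hypothetical; in Neo4j the same shape would be expressed as a pattern over relationships rather than as Python sets.

```python
from collections import Counter

# Hypothetical LIKES relationships: user -> set of liked items.
LIKES = {
    "me": {"post-1", "post-2", "post-3"},
    "alice": {"post-2", "post-3", "post-9"},
    "bob": {"post-3"},
    "carol": {"post-8"},
}

def similar_likers(likes, user):
    """Rank other users by how many liked items they share with `user`."""
    mine = likes.get(user, set())
    overlap = Counter()
    for other, items in likes.items():
        if other == user:
            continue
        shared = len(mine & items)  # items both users like
        if shared:
            overlap[other] = shared
    return overlap.most_common()  # highest overlap first

ranking = similar_likers(LIKES, "me")
```

No schema had to be declared up front to add a new kind of relationship, which is the convenience (and the risk) of the schemaless model mentioned above.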
