
Time: 2022-10-04 09:37:09

I am using MySQL to store reports from a tool, and I am extremely happy with the speed and flexibility with which users can query the data. The tool also produces some data that forms a graph. Is it a good idea to store that graph in MySQL? The graph has millions of nodes and edges, and the queries are usually graph traversals.


3 answers



MySQL was not designed or optimized to be a graph database. You might want to try Neo4j, which is a good graph database.




Plain SQL is usually a poor fit for manipulating a graph data structure. There are techniques for indexing a graph, however.


For instance, if your graph is not frequently updated, a GRIPP index will let you handle graph traversal queries extremely well. Such an index lets you answer parent-child and depth-related queries in more or less constant time, irrespective of the graph's number of nodes or density of links.
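To give a feel for why such an index makes ancestry questions cheap, here is a minimal sketch of plain pre/post-order labeling on a tree, which is the idea GRIPP builds on. It is not GRIPP itself: it assumes every node has exactly one parent, and the names `label_tree` and `is_descendant` are made up for this illustration.

```python
from collections import defaultdict

def label_tree(edges, root):
    """Assign (pre, post) numbers to every node of a tree via iterative DFS."""
    children = defaultdict(list)
    for parent, child in edges:
        children[parent].append(child)

    pre, post = {}, {}
    counter = 0
    stack = [(root, False)]  # (node, finished?) pairs
    while stack:
        node, finished = stack.pop()
        if finished:
            # All descendants have been numbered; close this node's interval.
            post[node] = counter
            counter += 1
            continue
        pre[node] = counter
        counter += 1
        stack.append((node, True))
        for child in reversed(children[node]):
            stack.append((child, False))
    return pre, post

def is_descendant(pre, post, ancestor, node):
    """True if `node` lies strictly below `ancestor`: one interval comparison."""
    return pre[ancestor] < pre[node] and post[node] < post[ancestor]

# Hypothetical example tree: a -> b, c and b -> d, e
edges = [("a", "b"), ("a", "c"), ("b", "d"), ("b", "e")]
pre, post = label_tree(edges, "a")
```

If the two numbers were stored as indexed columns next to each node, the same comparison could be written as a range condition in a WHERE clause, so no join per hop would be needed. The catch is that inserting or moving a node shifts many labels, which is why this kind of index suits graphs that are rarely updated.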




SQL databases don't handle graph data very well in general. The problem is that to do a graph traversal you have two options. Either you pull the entire graph into memory in a single query, manipulate it there, and store the changes, or you perform a huge number of joins to traverse the graph one node at a time, which becomes prohibitively slow. With graphs of the scale you are looking at, it would probably be better to use a graph database, or to use an in-memory database like Redis as a fast caching layer and persist the data in the background.
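The two options can be sketched roughly as follows. This is only an illustration under assumptions: it uses Python's built-in sqlite3 as a stand-in for MySQL, and the `edges(src, dst)` table and both function names are made up. The second function batches one query per traversal level, which is already kinder than a literal query per node.

```python
import sqlite3
from collections import defaultdict, deque

conn = sqlite3.connect(":memory:")  # stand-in for a MySQL connection
conn.execute("CREATE TABLE edges (src INTEGER, dst INTEGER)")
conn.executemany(
    "INSERT INTO edges VALUES (?, ?)",
    [(1, 2), (1, 3), (2, 4), (3, 4), (4, 5)],  # hypothetical sample graph
)
conn.execute("CREATE INDEX idx_edges_src ON edges (src)")

def reachable_in_memory(conn, start):
    """Load every edge once, then run BFS on an in-memory adjacency list."""
    adjacency = defaultdict(list)
    for src, dst in conn.execute("SELECT src, dst FROM edges"):
        adjacency[src].append(dst)
    seen, queue = {start}, deque([start])
    while queue:
        node = queue.popleft()
        for nxt in adjacency[node]:
            if nxt not in seen:
                seen.add(nxt)
                queue.append(nxt)
    return seen

def reachable_per_hop(conn, start):
    """Issue one query per BFS level; return the reached set and query count."""
    seen, frontier, queries = {start}, [start], 0
    while frontier:
        placeholders = ",".join("?" * len(frontier))
        rows = conn.execute(
            f"SELECT DISTINCT dst FROM edges WHERE src IN ({placeholders})",
            frontier,
        ).fetchall()
        queries += 1
        # Only nodes not seen before form the next level.
        frontier = [dst for (dst,) in rows if dst not in seen]
        seen.update(frontier)
    return seen, queries
```

On this five-node sample both functions reach the same set, but the per-hop version needs four round trips to get there. With millions of edges, the first approach pays for transferring and holding the whole edge list, while the second pays a round trip per level and can run into limits on how many parameters a single statement accepts.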




