Do MySQL / PostgreSQL cache the parsing/compilation of queries?

Date: 2022-01-22 00:59:54

Suppose I execute a query in MySQL (or PostgreSQL), let's say:

SELECT * FROM USER WHERE age = 20;

Does the database engine parse and compile the query/statement each time I execute it? Or does it hold some cache of the previous statements/queries?

If it has a cache mechanism, does it treat the following two queries differently?

/* first query */
SELECT * FROM USER WHERE age = 20 AND name = 'foo';

/* second query */
SELECT * FROM USER WHERE name = 'foo' AND age = 20;

I'm asking because I use a tool to generate the SQL queries in my code, and it isn't consistent about the order of the conditions in the queries. I just want to be sure that this behavior doesn't affect my database performance.

Thanks

1 Answer

#1


SQL is a declarative language, not a procedural language. What actually gets executed is basically a DAG -- directed acyclic graph -- of components that you probably would not recognize as SQL constructs (such as "hash join" or "filter" or "sort").

The two queries you mention, with conditions on name and age, are going to compile to essentially the same execution plan. If you have appropriate indexes or partitions, both queries will use them. If you don't, the engine might evaluate the boolean conditions in different orders. However, the big overhead in such a query is the full table scan, not the individual comparisons.
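
One way to check this on your own server is to compare the plans with EXPLAIN, which both MySQL and PostgreSQL support. This is only a minimal sketch: the table name users and the index users_age_name_idx are hypothetical, chosen for illustration (USER, as written in the question, is a reserved word in PostgreSQL).

```sql
-- Hypothetical table and index, for illustration only.
CREATE TABLE users (
    id   INT PRIMARY KEY,
    age  INT,
    name VARCHAR(100)
);
CREATE INDEX users_age_name_idx ON users (age, name);

-- Ask the planner how it would run each variant of the query.
EXPLAIN SELECT * FROM users WHERE age = 20 AND name = 'foo';
EXPLAIN SELECT * FROM users WHERE name = 'foo' AND age = 20;
```

If the two outputs match, the order of the conditions made no difference to the plan for that table and data. The exact output format differs between the two servers.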

Under some rare circumstances, you might want to be sure that conditions are executed in a particular order -- especially if you have an expensive user-defined function. The compiler will often do this for you. If not, you can use a case expression:

where (case when col <> 0 then 0
            when expensive_function( . . . ) then 1
            else 0
       end) = 1

This will execute the expensive function only when col = 0, because case expressions evaluate their expressions in sequential order (in a non-aggregation query).

As for caching, that depends on the server and its options. In general, databases cache query plans so that they don't need to be recompiled, and this is often done at the prepared-statement level rather than by matching the text of the query. Databases don't generally cache results, because the data in the underlying tables might change.
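
To make the prepared-statement level concrete, here is a minimal sketch in PostgreSQL syntax. The statement name find_user and the table users are hypothetical, and whether the server reuses a plan across executions is its own decision.

```sql
-- PostgreSQL syntax; users is a hypothetical table with age and name columns.
-- Parse the statement once and give it a name for this session.
PREPARE find_user (int, text) AS
    SELECT * FROM users WHERE age = $1 AND name = $2;

-- Run it with different values; the statement text is not parsed again.
EXECUTE find_user(20, 'foo');
EXECUTE find_user(35, 'bar');

-- Release the prepared statement when it is no longer needed.
DEALLOCATE find_user;
```

MySQL has an equivalent PREPARE ... FROM / EXECUTE ... USING form with ? placeholders. In either case a prepared statement belongs to the session that created it, so any reuse happens within one connection.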
