Lazy evaluation of Oracle PL/SQL statements in the SELECT clause of a SQL query

Time: 2021-03-20 00:24:40

I have a performance problem with an Oracle select statement that I use in a cursor. In the statement one of the terms in the SELECT clause is expensive to evaluate (it's a PL/SQL procedure call, which accesses the database quite heavily). The WHERE clause and ORDER BY clauses are straightforward, however.

I expected that Oracle would first perform the WHERE clause to identify the set of records that match the query, then perform the ORDER BY clause to order them, and finally evaluate each of the terms in the SELECT clause. As I'm using this statement in a cursor from which I then pull results, I expected that the expensive evaluation of the SELECT term would only be performed as needed, when each result was requested from the cursor.

However, I've found that this is not the sequence that Oracle uses. Instead, it appears to evaluate the terms in the SELECT clause for each record that matches the WHERE clause before performing the sort. As a result, the expensive procedure is called for every row in the result set before any results are returned from the cursor.

I want to be able to get the first results out of the cursor as quickly as possible. Can anyone tell me how to persuade Oracle not to evaluate the procedure call in the SELECT statement until after the sort has been performed?

This is all probably easier to describe in example code:

Given a table example with columns a, b, c and d, I have a statement like:

select a, b, expensive_procedure(c)
  from example
 where <the_where_clause>
 order by d;

On executing this, expensive_procedure() is called for every record that matches the WHERE clause, even if I open the statement as a cursor and only pull one result from it.
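
One way to check this behaviour is to count the calls. The following is only a sketch: the package call_counter and the function counted_procedure are hypothetical stand-ins (not part of the original post), the WHERE clause is left out, and the real expensive_procedure is deliberately not touched.

```sql
-- Hypothetical package holding a session-level call counter.
create or replace package call_counter as
  n pls_integer := 0;
end call_counter;
/

-- Hypothetical stand-in for expensive_procedure: it only counts its calls.
create or replace function counted_procedure(p in example.c%type)
  return example.c%type
is
begin
  call_counter.n := call_counter.n + 1;
  return p;
end counted_procedure;
/

set serveroutput on

declare
  cursor cur is
    select a, b, counted_procedure(c) as e
      from example
     order by d;
  r cur%rowtype;
begin
  call_counter.n := 0;
  open cur;
  fetch cur into r;   -- pull a single row only
  close cur;
  dbms_output.put_line('calls after first fetch: ' || call_counter.n);
end;
/
```

If the printed count is close to the number of matching rows rather than 1, the function is being evaluated before the sort, as described above.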

I've tried restructuring the statement as:

select a, b, expensive_procedure(c)
  from example, (select example2.rowid as rid, ROWNUM as rn
                   from example example2
                  where <the_where_clause>
                  order by d) v
  where example.rowid = v.rid;

Here the presence of ROWNUM in the inner SELECT statement forces Oracle to evaluate the inline view first. This restructuring has the desired performance benefit. Unfortunately, it doesn't always respect the required ordering, because the outer query has no ORDER BY of its own.

Just to be clear, I know that I won't be improving the time it takes to return the entire result set. I'm looking to improve the time taken to return the first few results from the statement. I want the time taken to be progressive as I iterate over the results from the cursor, not all of it to elapse before the first result is returned.

Can any Oracle gurus tell me how I can persuade Oracle to defer executing the PL/SQL until it is necessary?

6 solutions

#1


Why join EXAMPLE to itself in the in-line view? Why not just:

select /*+ no_merge(v) */ a, b, expensive_procedure(c)
from 
( select a, b, c
  from example
  where <the_where_clause>
  order by d
) v;
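
To see whether the hint is honoured, the execution plan can be inspected. This is a sketch that assumes the example table and expensive_procedure from the question exist; the WHERE clause is omitted here and should be replaced by your own.

```sql
explain plan for
select /*+ no_merge(v) */ a, b, expensive_procedure(c)
from
( select a, b, c
  from example
  -- your own WHERE clause goes here
  order by d
) v;

-- Show the plan that was just explained.
select * from table(dbms_xplan.display);
```

If the inline view has not been merged, the plan should contain a VIEW operation sitting above a SORT ORDER BY step. That is a hint that the select list of the outer query is applied to rows coming out of the sort, not before it.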

#2


Does this do what you intend?

WITH 
cheap AS
(
    SELECT A, B, C, D
    FROM EXAMPLE
    WHERE <the_where_clause>
)
SELECT A, B, expensive_procedure(C)
FROM cheap
ORDER BY D

#3


You might want to give this a try

select a, b, expensive_procedure(c)
  from example, (select /*+ NO_MERGE */
                    example2.rowid as rid, 
                    ROWNUM as rn
                    from example example2
                    where <the_where_clause>
                    order by d) v
  where example.rowid = v.rid;

#4


Might some form of this work?

FOR R IN (SELECT a,b,c FROM example WHERE ...) LOOP
  e := expensive_procedure(R.c);
  ...
END LOOP;
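
Spelled out as a complete anonymous block, the idea might look like the sketch below. It assumes the example table and expensive_procedure from the question, omits the WHERE clause, adds the ORDER BY d from the original query, and stops after a hypothetical limit of 10 rows.

```sql
declare
  e example.c%type;       -- assumption: the function returns the same type as column c
  n pls_integer := 0;     -- rows processed so far
begin
  for r in (select a, b, c
              from example
             -- your own WHERE clause goes here
             order by d)
  loop
    -- The expensive call happens in PL/SQL, once per row actually consumed.
    e := expensive_procedure(r.c);
    n := n + 1;
    exit when n >= 10;    -- hypothetical limit: stop after the first 10 rows
  end loop;
end;
/
```

Because the function is no longer part of the SQL statement, the query itself only has to filter and sort; the cost of the function is paid row by row as the loop advances.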

#5


If your WHERE conditions are equalities, i.e.

WHERE   col1 = :value1
        AND col2 = :value2

you can create a composite index on (col1, col2, d):

CREATE INDEX ix_example_col1_col2_d ON example(col1, col2, d)

and hint your query to use it:

SELECT  /*+ INDEX (e ix_example_col1_col2_d) */
        a, b, expensive_procedure(c)
FROM    example e
WHERE   col1 = :value1
        AND col2 = :value2
ORDER BY
        d

In the example below, t_even is a 1,000,000-row table with an index on value.

Fetching the first 100 rows from this query:

SELECT  SYS_GUID()
FROM    t_even
ORDER BY
        value

is instant (0.03 seconds), while this one:

SELECT  SYS_GUID()
FROM    t_even
ORDER BY
        value + 1

takes about 170 seconds to fetch the first 100 rows.

SYS_GUID() is quite expensive in Oracle.

As proposed by others, you can also use this:

SELECT  a, b, expensive_proc(c)
FROM    (
        SELECT  /*+ NO_MERGE */
                *
        FROM    mytable
        ORDER BY
                d
        )

, but using an index will improve your query response time (how soon the first row is returned).

#6


One of the key problems with the solutions that we've tried is how to adjust the application that generates the SQL to structure the query correctly. The built SQL will vary in terms of number of columns retrieved, number and type of conditions in the where clause and number and type of expressions in the order by.

The inline view returning ROWIDs for joining to the outer was an almost completely generic solution that we can utilise, except where the search is returning a significant portion of the data. In this case the optimiser decides [correctly] that a HASH join is cheaper than a NESTED LOOP.

The other issue was that some of the objects involved are VIEWs that can't have ROWIDs.

For information: "D" was not a typo. The expression for the order by is not selected as part of the return value. Not an unusual thing:

select index_name, column_name
from user_ind_columns
where table_name = 'TABLE_OF_INTEREST'
order by index_name, column_position;

Here, you don't need to know the column_position, but sorting by it is critical.

We have reasons (with which we won't bore the reader) for avoiding the need for hints in the solution, but it's not looking like this is possible.

Thanks for the suggestions thus far - we have tried most of them already ...
