MySQL query performance with INNER JOINs

Time: 2022-09-08 04:14:54

I have what may be a basic performance question. I've done a lot of SQL queries, but not much in terms of complex inner joins and such. So, here it is:

I have a database with 4 tables: countries, territories, employees, and transactions.

The transactions table links to the employees and countries tables, and the employees table links to the territories table. To produce a required report, I'm running a PHP script that runs a SQL query against a MySQL database.

SELECT trans.transactionDate, agent.code, agent.type, trans.transactionAmount, agent.territory       
FROM transactionTable as trans 
INNER JOIN 
(
    SELECT agent1.code as code, agent1.type as type, territory.territory as territory FROM agentTable as agent1 
    INNER JOIN territoryTable as territory 
    ON agent1.zip=territory.zip
) AS agent
ON agent.code=trans.agent 
ORDER BY trans.agent

There are about 50,000 records in the agent table, and over 200,000 in the transaction table. The other two are relatively tiny. It's taking about 7 minutes to run this query. And I haven't even inserted the fourth table yet, which needs to relate a field in the transactionTable (country) to a field in the countryTable (country) and return a field in the countryTable (region).

So, two questions:

  1. Where would I logically put the connection between the transactionTable and the countryTable?

  2. Can anyone suggest a way to speed this up?

Thanks.

1 solution

#1


Your query should be equivalent to this:

SELECT tx.transactionDate,
       a.code,
       a.type,
       tx.transactionAmount,
       t.territory
FROM transactionTable tx,
     agentTable a,
     territoryTable t
WHERE tx.agent = a.code
  AND a.zip = t.zip
ORDER BY tx.agent

Or to this, if you prefer explicit JOIN syntax:

SELECT tx.transactionDate,
       a.code,
       a.type,
       tx.transactionAmount,
       t.territory
FROM transactionTable tx
JOIN agentTable a     ON tx.agent = a.code
JOIN territoryTable t ON a.zip = t.zip
ORDER BY tx.agent
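
As for where the fourth table fits: since the question says the country field lives in transactionTable, one plausible place is a further JOIN straight onto the transaction alias. This is only a sketch; the column names country and region are taken from the question's description, and the actual schema may differ.

```sql
-- Sketch: countryTable joined directly to transactionTable.
-- Assumes transactionTable.country matches countryTable.country
-- and that countryTable.region is the field to return.
SELECT tx.transactionDate,
       a.code,
       a.type,
       tx.transactionAmount,
       t.territory,
       c.region
FROM transactionTable tx
JOIN agentTable a     ON tx.agent = a.code
JOIN territoryTable t ON a.zip = t.zip
JOIN countryTable c   ON tx.country = c.country
ORDER BY tx.agent;
```

Note that an inner join drops any transaction whose country has no match in countryTable; a LEFT JOIN on countryTable would keep those rows with a NULL region.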

For this to run fast, you need the following indexes on your tables:

CREATE INDEX transactionTable_agent ON transactionTable(agent);
CREATE INDEX territoryTable_zip     ON territoryTable(zip);
CREATE INDEX agentTable_code        ON agentTable(code);

(As a rule of thumb, any column used in a WHERE or JOIN condition should be indexed.)
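
To check whether the indexes are actually picked up, the query can be prefixed with EXPLAIN. The sketch below assumes the three indexes above have been created; the plan MySQL produces depends on the real data and server version.

```sql
-- List the indexes that currently exist on one of the tables.
SHOW INDEX FROM transactionTable;

-- Ask MySQL for the execution plan instead of running the query.
EXPLAIN
SELECT tx.transactionDate,
       a.code,
       a.type,
       tx.transactionAmount,
       t.territory
FROM transactionTable tx
JOIN agentTable a     ON tx.agent = a.code
JOIN territoryTable t ON a.zip = t.zip
ORDER BY tx.agent;
```

In the EXPLAIN output, a row with type ALL and key NULL means that table is being read with a full scan and no index. If code is already the primary key of agentTable, it is indexed already and the third CREATE INDEX is unnecessary.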

That said, your table structure looks suspicious in that the tables are joined on apparently non-unique fields such as zip code. You really want to join on unique keys, such as an agent id or a transaction id; otherwise, expect your queries to generate a lot of redundant rows and to be really slow.
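
One way to test that suspicion is to count how many rows share each join key. This is a sketch using the table and column names from the queries above; if it returns rows, every agent in such a zip is multiplied by that many territory rows in the join.

```sql
-- Zip codes that appear more than once in territoryTable.
-- Each duplicate multiplies the matching agent rows in the join.
SELECT zip, COUNT(*) AS rows_per_zip
FROM territoryTable
GROUP BY zip
HAVING COUNT(*) > 1
ORDER BY rows_per_zip DESC;

-- The same check for the agent code used to join transactions.
SELECT code, COUNT(*) AS rows_per_code
FROM agentTable
GROUP BY code
HAVING COUNT(*) > 1;
```

If both queries come back empty, the join keys are unique on the lookup side and the result should have at most one row per transaction.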

One more note: INNER JOIN is equivalent to plain JOIN, so there is no need to type the extra keyword.
