
Time: 2021-07-17 00:18:18

Well, I have a video website, and a few of its tables are listed below (the tags table, the videotags join table, and the table of videos, in that order):



id ~ int(11), auto-increment [PRIMARY KEY]
tag_name ~ varchar(255)


tag_id ~ int(11) [PRIMARY KEY]
video_id ~ int(11) [PRIMARY KEY]


id ~ int(11), auto-increment [PRIMARY KEY]
video_name ~ varchar(255)

Now at this point the tags table has >1000 rows and the videotags table has >32000 rows. So when I run a query to display all tags from most common to least common it takes >15 seconds to execute.


I am using PHP and my code (watered down for simplicity) is as follows:


$sql = "SELECT tag_name, COUNT(tag_id) AS tag_count FROM tags LEFT OUTER JOIN videotags ON tags.id = videotags.tag_id GROUP BY tags.id ORDER BY tag_count DESC";
foreach ($database->query($sql) as $tags)
    echo $tags["tag_name"] . ', ';

Keep in mind that 100% accuracy isn't as important to me as speed. Even if the query were executed once a day and its results reused for the remainder of the day, I wouldn't care.


I know absolutely nothing about MySQL/PHP caching so please help!


4 solutions



MarkR mentioned the index. Make sure you:


create index videotags_tag_id on videotags(tag_id);



32,000 rows is still a small table - there's no way your performance should be that bad.


Can you run EXPLAIN on your query? I'd guess your indexes are wrong somewhere.


You say in the question:


tag_id ~ int(11) [PRIMARY KEY]
video_id ~ int(11) [PRIMARY KEY]

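Is the composite primary key definitely defined in that order, (tag_id, video_id)? If video_id comes first, a lookup or grouping on tag_id alone won't be able to use the index.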
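To see why the column order matters, here is a rough sketch. It assumes an in-memory SQLite database through PDO as a stand-in for MySQL (so it runs without a server), and SQLite's EXPLAIN QUERY PLAN plays the role of MySQL's EXPLAIN. The table names vt_tag_first and vt_video_first and the helper plan_for are made up for the illustration. The leftmost-prefix rule it demonstrates applies to MySQL as well, but the plan text MySQL prints looks different.

```php
<?php
// Assumption: SQLite in memory stands in for MySQL so the sketch is runnable.
$db = new PDO('sqlite::memory:');
$db->setAttribute(PDO::ATTR_ERRMODE, PDO::ERRMODE_EXCEPTION);

// Same columns, opposite order inside the composite primary key.
$db->exec('CREATE TABLE vt_tag_first (tag_id INTEGER, video_id INTEGER, PRIMARY KEY (tag_id, video_id))');
$db->exec('CREATE TABLE vt_video_first (tag_id INTEGER, video_id INTEGER, PRIMARY KEY (video_id, tag_id))');

// Hypothetical helper: returns the planner's description of a lookup by tag_id.
function plan_for(PDO $db, string $table): string
{
    $rows = $db->query("EXPLAIN QUERY PLAN SELECT COUNT(*) FROM $table WHERE tag_id = 1")
               ->fetchAll(PDO::FETCH_ASSOC);
    return implode('; ', array_column($rows, 'detail'));
}

$tagFirst   = plan_for($db, 'vt_tag_first');
$videoFirst = plan_for($db, 'vt_video_first');

// Print both plans so they can be compared side by side.
echo $tagFirst, "\n", $videoFirst, "\n";
```

With tag_id as the leading column the planner should be able to seek directly to the matching entries; with video_id leading it should have to walk the whole index or table. In MySQL, the separate index on videotags(tag_id) suggested above sidesteps the problem whichever order the primary key uses.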




I think your best bet is to create some kind of summary table which you maintain when things change.


The query above needs to scan every row in the table to compute the aggregates for the GROUP BY, because there is no WHERE clause. A query with no WHERE clause cannot be optimised to skip rows, as it necessarily has to check every one.


The fix is to create a summary table holding the same data as the result of that query (or something similar), which you will have to refresh whenever the data changes, or at least whenever it changes significantly.
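A minimal sketch of such a summary table follows. The table name tag_counts and the function rebuild_tag_counts are hypothetical, and an in-memory SQLite database through PDO stands in for MySQL so the example runs on its own. The statements are kept plain enough that MySQL should accept them too.

```php
<?php
// Assumption: SQLite in memory stands in for the real MySQL connection.
$db = new PDO('sqlite::memory:');
$db->setAttribute(PDO::ATTR_ERRMODE, PDO::ERRMODE_EXCEPTION);

$db->exec('CREATE TABLE tags (id INTEGER PRIMARY KEY, tag_name VARCHAR(255))');
$db->exec('CREATE TABLE videotags (tag_id INTEGER, video_id INTEGER, PRIMARY KEY (tag_id, video_id))');
// Hypothetical summary table: one precomputed row per tag.
$db->exec('CREATE TABLE tag_counts (tag_id INTEGER PRIMARY KEY, tag_name VARCHAR(255), tag_count INTEGER)');

// Sample data for the illustration only.
$db->exec("INSERT INTO tags (id, tag_name) VALUES (1, 'funny'), (2, 'music'), (3, 'unused')");
$db->exec('INSERT INTO videotags (tag_id, video_id) VALUES (1, 10), (1, 11), (1, 12), (2, 10)');

// Recompute every count inside one transaction.
function rebuild_tag_counts(PDO $db): void
{
    $db->beginTransaction();
    $db->exec('DELETE FROM tag_counts');
    $db->exec(
        'INSERT INTO tag_counts (tag_id, tag_name, tag_count)
         SELECT tags.id, tags.tag_name, COUNT(videotags.tag_id)
         FROM tags
         LEFT OUTER JOIN videotags ON tags.id = videotags.tag_id
         GROUP BY tags.id, tags.tag_name'
    );
    $db->commit();
}

rebuild_tag_counts($db);

// The page reads the small precomputed table instead of aggregating.
$rows = $db->query('SELECT tag_name, tag_count FROM tag_counts ORDER BY tag_count DESC, tag_name')
           ->fetchAll(PDO::FETCH_ASSOC);
$names = array_column($rows, 'tag_name');
```

The expensive join and GROUP BY now run only when rebuild_tag_counts is called. Displaying the tags is a plain read of roughly a thousand rows.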


Only you can decide, based on the nature of your application and your data, whether it's appropriate to update the summary table on a scheduled basis, on each update, or some combination.
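If a scheduled refresh is good enough, as the question suggests, a lighter variant is to keep the query result in a file and reuse it until it is a day old. The sketch below is only one way to do that. The helper cached_result, the cache file path, and the stand-in loader are all assumptions for illustration; in real use the loader would run the slow tag query.

```php
<?php
// Hypothetical helper: return the cached data while the file is fresh,
// otherwise call $loader, store its result, and return it.
function cached_result(string $cacheFile, int $ttlSeconds, callable $loader): array
{
    if (is_file($cacheFile) && (time() - filemtime($cacheFile)) < $ttlSeconds) {
        $cached = json_decode(file_get_contents($cacheFile), true);
        if (is_array($cached)) {
            return $cached;
        }
    }
    $fresh = $loader();
    // Write to a temporary file and rename it, so a concurrent reader
    // does not see a half-written cache.
    $tmp = $cacheFile . '.' . uniqid('', true) . '.tmp';
    file_put_contents($tmp, json_encode($fresh));
    rename($tmp, $cacheFile);
    return $fresh;
}

// Stand-in for the slow query; counts how often it is actually invoked.
$calls = 0;
$loader = function () use (&$calls): array {
    $calls++;
    return [['tag_name' => 'funny', 'tag_count' => 42]];
};

$cacheFile = sys_get_temp_dir() . '/tag_cloud_' . uniqid() . '.json';
$first  = cached_result($cacheFile, 86400, $loader); // cache miss: runs the loader
$second = cached_result($cacheFile, 86400, $loader); // cache hit: reads the file
```

The trade-off is the one the question already accepts: the tag list can be up to a day stale, in exchange for running the expensive query at most once per day.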


As you're doing a join, the right indexes are still beneficial, but you knew that, right, and had already done it?




Are you using InnoDB or MyISAM? In MyISAM, an unfiltered COUNT(*) on a single table is basically free because the row count is stored with the table, but in InnoDB the rows have to be physically counted.



