大表上的MySQL性能问题 - 如何有效地缓存结果?

时间:2021-12-14 04:14:33

I have a MySQL table to store user statistics with >2MM rows and 8 columns and an index on the userID. When the user visits his profile a lot of those informations are retrieved from the database resulting in - worst case - a couple of dozen SELECT queries sometimes joined with other tables. It is similar to the profile on SO which has to pull a lot of data as well.

我有一个MySQL表来存储用户统计信息,其中包含> 2MM行和8列以及userID上的索引。当用户访问他的配置文件时,从数据库中检索到很多这些信息,导致 - 最坏的情况 - 有时会与其他表连接的几十个SELECT查询。它类似于SO上的配置文件,它也必须提取大量数据。

Some queries like to get the user score require a COUNT and other performance eating MySQL functions. So just the queries for the profile page can take up to 10-20 seconds.

一些查询喜欢获得用户分数需要COUNT和其他性能吃MySQL功能。因此,只需查询配置文件页面最多可能需要10-20秒。

Now my questions:

现在我的问题:

  1. How does a website like SO pull so many informations that quickly?
  2. 像SO这样的网站如何快速提取这么多信息?

  3. Do I need a caching layer?
  4. 我需要缓存层吗?

  5. Should I precalculate the count of the score, etc. that consume MySQL performance?
  6. 我应该预先计算消耗MySQL性能的分数等吗?

  7. Should I use one writing optimized table and one reading optimized table? If so how could I retrieve live data like on SO?
  8. 我应该使用一个写入优化表和一个读取优化表吗?如果是这样,我怎样才能检索SO上的实时数据?

  9. Should I move away from MySQL?
  10. 我应该离开MySQL吗?

1 个解决方案

#1


2  

How does a website like SO pull so many informations that quickly?

像SO这样的网站如何快速提取这么多信息?

They give you an illusion of pulling "so much" data, but in reality they get one chunk at a time. Basically, you would need one query to get the COUNT of the results, and another query to get the data for any page within that resultset. The result set is not cached. The parameters to the query may be cached. But it is best to implement in a way the query parameters are supplied with every page request.

他们给你一个拉“太多”数据的错觉,但实际上他们一次得到一个块。基本上,您需要一个查询来获取结果的COUNT,另一个查询来获取该结果集中任何页面的数据。结果集未缓存。可以缓存查询的参数。但最好以每种页面请求提供查询参数的方式实现。

Do I need a caching layer?

我需要缓存层吗?

May be. But not necessary to implement this. Caching may be useful to store the previously fetched pages, and the same pages are used by entire community, rather than one user.

也许。但没有必要实现这一点。缓存对于存储先前获取的页面可能是有用的,并且整个社区而不是一个用户使用相同的页面。

Should I precalculate the count of the score, etc. that consume MySQL performance?

我应该预先计算消耗MySQL性能的分数等吗?

Not relevant for this use case.

与此用例无关。

Should I use one writing optimized table and one reading optimized table? If so how could I retrieve live data like on SO?

我应该使用一个写入优化表和一个读取优化表吗?如果是这样,我怎样才能检索SO上的实时数据?

Not required.

Should I move away from MySQL?

我应该离开MySQL吗?

Not required.

For more details on this concept, lookup paginated Ajax grids

有关此概念的更多详细信息,请查找分页的Ajax网格

#1


2  

How does a website like SO pull so many informations that quickly?

像SO这样的网站如何快速提取这么多信息?

They give you an illusion of pulling "so much" data, but in reality they get one chunk at a time. Basically, you would need one query to get the COUNT of the results, and another query to get the data for any page within that resultset. The result set is not cached. The parameters to the query may be cached. But it is best to implement in a way the query parameters are supplied with every page request.

他们给你一个拉“太多”数据的错觉,但实际上他们一次得到一个块。基本上,您需要一个查询来获取结果的COUNT,另一个查询来获取该结果集中任何页面的数据。结果集未缓存。可以缓存查询的参数。但最好以每种页面请求提供查询参数的方式实现。

Do I need a caching layer?

我需要缓存层吗?

May be. But not necessary to implement this. Caching may be useful to store the previously fetched pages, and the same pages are used by entire community, rather than one user.

也许。但没有必要实现这一点。缓存对于存储先前获取的页面可能是有用的,并且整个社区而不是一个用户使用相同的页面。

Should I precalculate the count of the score, etc. that consume MySQL performance?

我应该预先计算消耗MySQL性能的分数等吗?

Not relevant for this use case.

与此用例无关。

Should I use one writing optimized table and one reading optimized table? If so how could I retrieve live data like on SO?

我应该使用一个写入优化表和一个读取优化表吗?如果是这样,我怎样才能检索SO上的实时数据?

Not required.

Should I move away from MySQL?

我应该离开MySQL吗?

Not required.

For more details on this concept, lookup paginated Ajax grids

有关此概念的更多详细信息,请查找分页的Ajax网格