Optimizing a MySQL table with 137,000 rows

Posted: 2020-12-08 01:13:38

I'm trying to optimize a Redmine database before it becomes too much of a pain; the Changes table (basically a log of all the SVN changes) is at roughly 137,000 rows, and the table is set to the basic default settings. No key packing, etc.


The table is as follows


ID int[11] Auto Inc (PK)
changeset_id int[11]
action varchar[1]
path varchar[255]
from_path varchar[255]
from_revision varchar[255]
revision varchar[255]
branch  varchar[255]

Indices: Primary (ID),
              changeset_id set to INDEX BTREE

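For reference, here is a minimal DDL sketch of the table as described above (the table name changes matches Redmine's schema; the index name and the NULL/NOT NULL choices are assumptions, not a dump of the real definition):

    CREATE TABLE changes (
      id            INT(11) NOT NULL AUTO_INCREMENT,   -- PK, auto-increment
      changeset_id  INT(11) NOT NULL,
      action        VARCHAR(1) NOT NULL,
      path          VARCHAR(255) NOT NULL,
      from_path     VARCHAR(255),
      from_revision VARCHAR(255),
      revision      VARCHAR(255),
      branch        VARCHAR(255),
      PRIMARY KEY (id),
      KEY index_changes_on_changeset_id (changeset_id)  -- BTREE secondary index (name assumed)
    ) ENGINE=InnoDB DEFAULT CHARSET=latin1;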

All on the latin1 charset, based on a bit of info from http://forge.mysql.com/wiki/Top10SQLPerformanceTips


The table engine is InnoDB. Pack Keys is set to Default (only packs CHAR/VARCHAR).

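To confirm what is actually in effect, the engine, row format, charset and index sizes can be checked from the mysql CLI (table name changes assumed; note that PACK_KEYS is a MyISAM option, so InnoDB ignores it anyway):

    SHOW TABLE STATUS LIKE 'changes'\G   -- engine, row format, data/index length
    SHOW CREATE TABLE changes\G          -- full definition, charset, indexes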

All the other options are turned off.


What's the best way to optimize this? (Bar TRUNCATE ;o) )


2 Answers

#1


There are some general optimization techniques for MySQL: the first would be to make sure your datatypes fit the ABCs (see here). Going over them from top to bottom, ID and changeset_id look good; action should probably be a CHAR(1) instead of a VARCHAR (nullable if you can leave it blank, and in general, make sure nullability is set correctly on the other fields). As for the 5 other fields (which, depending on size, would probably dominate the table), are strings the correct datatype? (I'm guessing yes for path, from_path, and branch, but maybe revision should be a number (I'm guessing it isn't so that it supports git or something).)

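As a rough sketch of what that column change could look like (assuming the table is named changes and nothing in the application relies on the VARCHAR type):

    -- action holds a single character (A/M/D/...), so CHAR(1) is a tighter fit
    ALTER TABLE changes MODIFY action CHAR(1) NOT NULL;

    -- only if revision is guaranteed numeric; this breaks for git-style hashes
    -- ALTER TABLE changes MODIFY revision INT UNSIGNED NULL;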

Also, they look like normalization targets, especially since a "paths" and "revisions" table would normalize four of them (here's a basic tutorial, if you need it)

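A hedged sketch of what that normalization could look like (table and column names here are purely illustrative):

    CREATE TABLE paths (
      id   INT UNSIGNED NOT NULL AUTO_INCREMENT PRIMARY KEY,
      path VARCHAR(255) NOT NULL,
      UNIQUE KEY uniq_path (path)
    ) ENGINE=InnoDB DEFAULT CHARSET=latin1;

    -- changes would then store small path_id / from_path_id integers instead of
    -- repeating full VARCHAR(255) strings in every row; likewise for revisions.

Whether the extra join is worth the space saving depends on how the table is actually queried.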

#2


It depends entirely on your read and write characteristics, i.e., the queries you're making, and how often you're writing to it.


The way to optimize for writing is to minimize the number of indexes. Ideally, you use what in MS SQL server would be the "clustered index" with a monotonically incrementing key, ensuring that you write new records to the end of the table, and you write no other separate index. Better yet, even, is to skip the DBMS and write to a plain old log file of some sort, if you don't need any transactional capability.

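In InnoDB the primary key already acts as the clustered index, so with the auto-incrementing id new rows are appended in key order. Checking the existing indexes, and dropping the secondary one if no query actually needs it, would look roughly like this (the index name is an assumption):

    SHOW INDEX FROM changes;                        -- list existing indexes
    -- ALTER TABLE changes DROP INDEX changeset_id; -- only if no read path uses it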

For queries, well, that can get as complex as you like. Do keep in mind, though, that if you need any significant amount of data from the table for a query (i.e., it's more than just looking up a single record based on a key), table scans may not be such a bad thing. Generally, if you're examining more than 3-5% of the contents of a table, a table scan will be very fast. Again, for this, a plain old file will probably be faster than a DBMS.

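Whether a given query ends up as an index lookup or a full scan is easy to check with EXPLAIN; a hypothetical example:

    EXPLAIN SELECT * FROM changes WHERE changeset_id = 12345;
    -- type=ref with key=changeset_id  -> index lookup
    -- type=ALL                        -> full table scan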

If you have to optimize for both, consider optimizing for writing, and then making a copy on a regular basis that you optimize for queries, and doing the queries against the copy.

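A simple way to build such a read-optimized copy on a schedule (names are illustrative, and on a live system this would need to run inside whatever maintenance window or locking you already have):

    DROP TABLE IF EXISTS changes_report;
    CREATE TABLE changes_report LIKE changes;         -- same structure
    INSERT INTO changes_report SELECT * FROM changes;

    -- then add whatever extra indexes the reporting queries need, e.g.
    -- ALTER TABLE changes_report ADD INDEX idx_path (path);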
