How many rows in a database is too many?

Time: 2021-05-25 20:13:48

I've a MySQL InnoDB table with 1,000,000 records. Is this too much? Or can databases handle this and more? I ask because I noticed that some queries (for example, getting the last row from a table) are slower (seconds) in the table with 1 million rows than in one with 100.


10 solutions

#1


101  

I've a MySQL InnoDB table with 1000000 registers. Is this too much?


No, 1,000,000 rows (AKA records) is not too much for a database.


I ask because I noticed that some queries (for example, getting the last register of a table) are slower (seconds) in the table with 1 million registers than in one with 100.


There's a lot to account for in that statement. The usual suspects are:


  1. Poorly written query
  2. Not using a primary key, assuming one even exists on the table
  3. Poorly designed data model (table structure)
  4. Lack of indexes
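To make the "lack of indexes" point concrete, here is a minimal sketch using Python's built-in sqlite3 module. The table and column names are hypothetical, and SQLite's stdlib driver stands in for MySQL here, but the principle is the same; in MySQL you would check the same thing with EXPLAIN.

```python
import sqlite3

# Hypothetical table; SQLite stands in for MySQL, but the indexing
# principle carries over directly.
con = sqlite3.connect(":memory:")
con.execute("CREATE TABLE events (id INTEGER PRIMARY KEY, user_id INTEGER, note TEXT)")
con.executemany("INSERT INTO events (user_id, note) VALUES (?, ?)",
                ((i % 1000, "x") for i in range(100_000)))

# Without an index on user_id, this filter must scan every row.
plan_before = con.execute(
    "EXPLAIN QUERY PLAN SELECT * FROM events WHERE user_id = 42").fetchone()[3]

# With an index, the lookup becomes a B-tree search instead.
con.execute("CREATE INDEX idx_user ON events(user_id)")
plan_after = con.execute(
    "EXPLAIN QUERY PLAN SELECT * FROM events WHERE user_id = 42").fetchone()[3]

print(plan_before)  # a full table scan
print(plan_after)   # a search using idx_user
```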

#2


50  

I have a database with more than 97,000,000 records (a 30 GB datafile), and I have no problems.


Just remember to define and improve your table index.


So it's obvious that 1,000,000 rows is not MANY! (But if you don't index, then yes, it is MANY.)


#3


15  

Use EXPLAIN to examine your query and see if there is anything wrong with the query plan.


#4


9  

I think this is a common misconception: size is only one part of the equation when it comes to database scalability. There are other issues that are as hard (or harder):


  • How large is the working set (i.e. how much data needs to be loaded in memory and actively worked on). If you just insert data and then do nothing with it, it's actually an easy problem to solve.


  • What level of concurrency is required? Is there just one user inserting/reading, or do we have many thousands of clients operating at once?


  • What levels of promise/durability and consistency of performance are required? Do we have to make sure that we can honor each commit? Is it okay if the average transaction is fast, or do we want to make sure that all transactions are reliably fast (six-sigma-style quality control - http://www.mysqlperformanceblog.com/2010/06/07/performance-optimization-and-six-sigma/)?


  • Do you need to perform any operational tasks, such as ALTERing the table schema? In InnoDB this is possible, but incredibly slow, since it often has to create a temporary table in the foreground (blocking all connections).


So I'd say the two limiting issues are going to be:


  • Your own skill at writing queries / having good indexes.
  • How much pain you can tolerate waiting on ALTER TABLE statements.
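The working-set bullet above can be turned into a rough back-of-envelope check. All numbers below are hypothetical; `buffer_pool_bytes` stands for however much memory your server gives InnoDB (`innodb_buffer_pool_size`).

```python
# Back-of-envelope working-set estimate (all numbers hypothetical).
rows = 1_000_000
avg_row_bytes = 200              # row data plus per-row overhead
index_fraction = 0.3             # indexes as a fraction of table size

table_bytes = rows * avg_row_bytes
total_bytes = int(table_bytes * (1 + index_fraction))

# e.g. innodb_buffer_pool_size = 1G
buffer_pool_bytes = 1 * 1024 ** 3
fits_in_memory = total_bytes <= buffer_pool_bytes

print(f"{total_bytes / 1024 ** 2:.0f} MiB needed; "
      + ("fits in the buffer pool" if fits_in_memory else "will hit disk"))
```

On these assumptions a million rows needs roughly a quarter of a gigabyte, which is why 1M rows is rarely a problem on any reasonably provisioned server.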

#5


3  

I've seen non-partitioned tables with several billion (indexed) records that were self-joined for analytical work. We eventually partitioned the thing, but honestly we didn't see that much difference.


That said, that was in Oracle and I have not tested that volume of data in MySQL. Indexes are your friend :)


#6


2  

If you mean 1 million rows, then it depends on how your indexing is done and the configuration of your hardware. A million rows is not a large amount for an enterprise database, or even a dev database on decent equipment.


If you mean 1 million columns (not sure that's even possible in MySQL), then yes, this seems a bit large and will probably cause problems.


#7


2  

Register? Do you mean record?


One million records is not a real big deal for a database these days. If you run into any issue, it's likely not the database system itself, but rather the hardware that you're running it on. You're not going to run into a problem with the DB before you run out of hardware to throw at it, most likely.


Now, obviously some queries are slower than others, but if two very similar queries run in vastly different times, you need to figure out what the database's execution plan is and optimize for it, i.e. use correct indexes, proper normalization, etc.


Incidentally, there is no such thing as a "last" record in a table; from a logical standpoint, rows have no inherent order.

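A quick illustration of that last point, again sketched with Python's stdlib sqlite3 module and a hypothetical table: "last" is only meaningful relative to an explicit ORDER BY, and ordering by the primary key lets the engine walk the index backwards instead of sorting anything.

```python
import sqlite3

con = sqlite3.connect(":memory:")
con.execute("CREATE TABLE events (id INTEGER PRIMARY KEY, note TEXT)")
con.executemany("INSERT INTO events (note) VALUES (?)", [("a",), ("b",), ("c",)])

# There is no inherent "last" row; you must define what "last" means.
# Here it is "highest id", which the primary-key index answers directly.
last = con.execute("SELECT id, note FROM events ORDER BY id DESC LIMIT 1").fetchone()
print(last)  # (3, 'c')
```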

#8


1  

Assuming you mean "records" by "registers": no, it's not too much. MySQL scales really well and can hold as many records as you have disk space for.


Obviously, though, search queries will be slower. There is really no way around that except making sure that the fields are properly indexed.


#9


0  

The larger the table gets (as in more rows in it), the slower queries will typically run if there are no indexes. Once you add the right indexes your query performance should improve or at least not degrade as much as the table grows. However, if the query itself returns more rows as the table gets bigger, then you'll start to see degradation again.


While 1M rows are not that many, it also depends on how much memory you have on the DB server. If the table is too big to be cached in memory by the server, then queries will be slower.


#10


0  

Using the query provided will be exceptionally slow, because it uses a sort-merge method to sort the data.


I would recommend rethinking the design so that you use an index to retrieve the rows, or make sure the data is already stored in that order so that no sorting is needed.

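A sketch of the sort-avoidance idea, once more with the stdlib sqlite3 module and hypothetical names (in MySQL you would check the same thing with EXPLAIN, looking for "Using filesort" in the Extra column):

```python
import sqlite3

con = sqlite3.connect(":memory:")
con.execute("CREATE TABLE logs (id INTEGER PRIMARY KEY, ts INTEGER, msg TEXT)")

# Ordering by an unindexed column forces an explicit sort step
# ("USE TEMP B-TREE FOR ORDER BY" in the plan).
rows = con.execute(
    "EXPLAIN QUERY PLAN SELECT * FROM logs ORDER BY ts DESC LIMIT 1").fetchall()
plan_without_index = " ".join(r[3] for r in rows)

# With an index on ts, rows come back already ordered and no sort is needed.
con.execute("CREATE INDEX idx_ts ON logs(ts)")
rows = con.execute(
    "EXPLAIN QUERY PLAN SELECT * FROM logs ORDER BY ts DESC LIMIT 1").fetchall()
plan_with_index = " ".join(r[3] for r in rows)

print(plan_without_index)
print(plan_with_index)
```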
