MySQL Official Documentation, Database Optimization, Part 1: Optimization Overview

Date: 2022-06-07 05:59:55

         Optimization Overview

       Database performance depends on several factors at the database level, such as tables, queries, and configuration settings. These software constructs result in CPU and I/O operations at the hardware level, which you must minimize and make as efficient as possible. As you work on database performance, you start by learning the high-level rules and guidelines for the software side, and measuring performance using wall-clock time. As you become an expert, you learn more about what happens internally, and start measuring things such as CPU cycles and I/O operations.

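        In other words: database performance depends mainly on a few database-level factors, such as table structure, query statements, and configuration settings. These software constructs turn into CPU and I/O operations at the hardware level, and those operations should be kept to a minimum and made as efficient as possible. When you first start working on database performance, you learn the high-level rules and guidelines and measure performance with wall-clock time. As you become an expert, you learn more about how things work internally and start measuring things such as CPU cycles and I/O operations.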
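        As a minimal sketch of what "measuring performance using wall-clock time" can look like in practice, the Python helper below times a callable several times and keeps the best run. The names `wall_clock_seconds` and `fake_query` are hypothetical; `fake_query` is only a stand-in for a real database call and is not part of MySQL or the manual.

```python
import time


def wall_clock_seconds(run_query, repeats=5):
    """Return the best wall-clock time in seconds over `repeats` runs."""
    best = float("inf")
    for _ in range(repeats):
        start = time.perf_counter()
        run_query()
        elapsed = time.perf_counter() - start
        best = min(best, elapsed)
    return best


def fake_query():
    # Hypothetical stand-in for a real call such as executing a SELECT.
    sum(i * i for i in range(100_000))


best_time = wall_clock_seconds(fake_query)
print(f"best of 5 runs: {best_time:.6f} s")
```

        Taking the best of several runs reduces noise from other activity on the machine. It is still only wall-clock time, so it says nothing about CPU cycles or I/O operations.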

     Typical users aim to get the best database performance out of their existing software and hardware configurations. Advanced users look for opportunities to improve the MySQL software itself, or develop their own storage engines and hardware appliances to expand the MySQL ecosystem.              

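        In other words: typical users try to get the best database performance out of their existing software and hardware, while advanced users look for ways to improve MySQL itself, or develop their own storage engines and hardware appliances to extend the MySQL ecosystem.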

        Optimizing at the Database Level

        The most important factor in making a database application fast is its basic design:
        • Are the tables structured properly? In particular, do the columns have the right data types, and does each table have the appropriate columns for the type of work? For example, applications that perform frequent updates often have many tables with few columns, while applications that analyze large amounts of data often have few tables with many columns.
        • Are the right indexes in place to make queries efficient?
        • Are you using the appropriate storage engine for each table, and taking advantage of the strengths and features of each storage engine you use? In particular, the choice of a transactional storage engine such as InnoDB or a nontransactional one such as MyISAM can be very important for performance and scalability.

Note:  InnoDB is the default storage engine for new tables. In practice, the advanced InnoDB performance features mean that InnoDB tables often outperform the simpler MyISAM tables, especially for a busy database.  

         • Does each table use an appropriate row format? This choice also depends on the storage engine used for the table. In particular, compressed tables use less disk space and so require less disk I/O to read and write the data. Compression is available for all kinds of workloads with InnoDB tables, and for read-only MyISAM tables.

        • Does the application use an appropriate locking strategy? For example, by allowing shared access when possible so that database operations can run concurrently, and requesting exclusive access when appropriate so that critical operations get top priority. Again, the choice of storage engine is significant. The InnoDB storage engine handles most locking issues without involvement from you, allowing for better concurrency in the database and reducing the amount of experimentation and tuning for your code.

          • Are all memory areas used for caching sized correctly? That is, large enough to hold frequently accessed data, but not so large that they overload physical memory and cause paging. The main memory areas to configure are the InnoDB buffer pool and the MyISAM key cache.

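           My reading of the list above: the most important factor in a fast database application is its basic design.

          (1) Is the table structure sound? In particular, do the columns have the right data types, and does each table have the columns the workload needs? For example, an application that updates data frequently usually has many tables with few columns, while an application that analyzes large amounts of data usually has few tables with many columns.

          (2) Are the right indexes in place to make queries efficient? Does each table use an appropriate storage engine, and are you taking advantage of the strengths and features of the engines you use? For example, choosing between a transactional engine such as InnoDB and a nontransactional one such as MyISAM can matter a great deal for performance and scalability. (InnoDB is the default storage engine for new tables. In practice, its advanced performance features mean that InnoDB tables often outperform the simpler MyISAM tables, especially on a busy database.)

          (3) Does each table use an appropriate row format? This also depends on the storage engine used for the table. In particular, compressed tables use less disk space and so need less disk I/O to read and write data. Compression is available for all kinds of workloads with InnoDB tables, and for read-only MyISAM tables.

          (4) Does the application use an appropriate locking strategy? Allow shared access where possible so that operations can run concurrently, and request exclusive access when appropriate so that critical operations get top priority. Once again, the choice of storage engine matters: InnoDB handles most locking issues without your involvement, which allows better concurrency and reduces the amount of experimentation and tuning needed in your code.

          (5) Are all the memory areas used for caching sized correctly? They should be large enough to hold frequently accessed data, but not so large that they overload physical memory and cause paging. The main memory areas to configure are the InnoDB buffer pool and the MyISAM key cache.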
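          To make the "right data types" question concrete, here is a rough Python sketch that picks the smallest unsigned MySQL integer type able to hold a given maximum value and estimates the raw storage for that column. The byte sizes are MySQL's documented storage requirements for integer types; the helper names are hypothetical, and the estimate deliberately ignores row overhead, indexes, and compression.

```python
# Bytes per value for MySQL integer types (documented storage requirements).
INT_TYPE_BYTES = {
    "TINYINT": 1,
    "SMALLINT": 2,
    "MEDIUMINT": 3,
    "INT": 4,
    "BIGINT": 8,
}


def smallest_unsigned_type(max_value):
    """Return the smallest UNSIGNED integer type that can store max_value."""
    if max_value < 0:
        raise ValueError("unsigned types cannot store negative values")
    for name, size in sorted(INT_TYPE_BYTES.items(), key=lambda kv: kv[1]):
        if max_value <= 2 ** (8 * size) - 1:
            return name
    raise ValueError("value does not fit in BIGINT UNSIGNED")


def column_bytes(type_name, row_count):
    """Raw bytes used by one integer column across row_count rows."""
    return INT_TYPE_BYTES[type_name] * row_count


rows = 100_000_000
saved = column_bytes("BIGINT", rows) - column_bytes("INT", rows)
print(smallest_unsigned_type(70_000), saved)
```

          Under these assumptions, declaring a column as INT rather than BIGINT saves 4 bytes per row, which adds up on large tables and also shrinks any index that contains the column.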

             Optimizing at the Hardware Level

             Any database application eventually hits hardware limits as the database becomes more and more busy. A DBA must evaluate whether it is possible to tune the application or reconfigure the server to avoid these bottlenecks, or whether more hardware resources are required. System bottlenecks typically arise from these sources:
              • Disk seeks. It takes time for the disk to find a piece of data. With modern disks, the mean time for this is usually lower than 10ms, so we can in theory do about 100 seeks a second. This time improves slowly with new disks and is very hard to optimize for a single table. The way to optimize seek time is to distribute the data onto more than one disk.
              • Disk reading and writing. When the disk is at the correct position, we need to read or write the data. With modern disks, one disk delivers at least 10–20MB/s throughput. This is easier to optimize than seeks because you can read in parallel from multiple disks.
              • CPU cycles. When the data is in main memory, we must process it to get our result. Having large tables compared to the amount of memory is the most common limiting factor. But with small tables, speed is usually not the problem.
              • Memory bandwidth. When the CPU needs more data than can fit in the CPU cache, main memory bandwidth becomes a bottleneck. This is an uncommon bottleneck for most systems, but one to be aware of.

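             My reading of the list above: as a database gets busier, any database application eventually runs into hardware limits. The DBA has to judge whether tuning the application or reconfiguring the server can avoid these bottlenecks, or whether more hardware is needed. System bottlenecks usually come from these sources:

           (1) Disk seeks. The disk needs time to find a piece of data. For modern disks the mean seek time is usually below 10 ms, so in theory we can do about 100 seeks per second. This time improves only slowly with newer disks and is very hard to optimize for a single table. The way to optimize seek time is to spread the data across more than one disk.

          (2) Disk reading and writing. Once the disk is at the correct position, we need to read or write the data. With modern disks, one disk delivers at least 10–20 MB/s of throughput. This is easier to optimize than seeks, because you can read from multiple disks in parallel.

           (3) CPU cycles. When the data is in main memory, we must process it to get our result. Tables that are large compared to the amount of memory are the most common limiting factor. With small tables, speed is usually not a problem.

            (4) Memory bandwidth. When the CPU needs more data than fits in the CPU cache, main memory bandwidth becomes the bottleneck. This is uncommon on most systems, but worth being aware of.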
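            The seek and throughput figures above can be turned into a quick back-of-the-envelope calculation. The Python sketch below uses the quoted numbers (10 ms per seek, 10–20 MB/s per disk) as defaults and assumes, optimistically, that work is spread perfectly evenly across disks. The function names are hypothetical.

```python
def seeks_per_second(seek_ms=10.0):
    """Theoretical seeks per second for a given mean seek time."""
    return 1000.0 / seek_ms


def random_read_seconds(num_seeks, seek_ms=10.0, disks=1):
    """Time spent seeking, assuming seeks are spread evenly over the disks."""
    return num_seeks * seek_ms / 1000.0 / disks


def sequential_read_seconds(size_mb, mb_per_s=20.0, disks=1):
    """Time to read size_mb, assuming the disks are read fully in parallel."""
    return size_mb / (mb_per_s * disks)


print(seeks_per_second())                    # 10 ms per seek
print(random_read_seconds(1000))             # 1000 seeks on one disk
print(random_read_seconds(1000, disks=4))    # same seeks over four disks
print(sequential_read_seconds(1000))         # 1000 MB from one disk
print(sequential_read_seconds(1000, disks=2))
```

            Real systems will not scale this cleanly, since data is rarely distributed perfectly and caching changes the picture, but the arithmetic shows why adding disks helps both seek-bound and throughput-bound workloads.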

            Balancing Portability and Performance

                To use performance-oriented SQL extensions in a portable MySQL program, you can wrap MySQL-specific keywords in a statement within /*! */ comment delimiters. Other SQL servers ignore the commented keywords.

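              In other words: to use performance-oriented SQL extensions in a portable MySQL program, you can hide the MySQL-specific keywords in a statement inside /*! */ comment delimiters. Other SQL servers treat them as comments and ignore them.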
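              To illustrate the effect of /*! */ comments, the Python sketch below shows how the same statement reads to MySQL and to a server that treats the construct as an ordinary comment. This is a regex-based illustration only, not a SQL parser: it does not handle comment markers inside string literals, and the function names and the default `server_version` value are assumptions. The optional five-digit number after the `!` follows MySQL's version-specific comment form, where the contents run only on that server version or later.

```python
import re

# Matches /*! ... */ with an optional 5-digit version right after the "!".
MYSQL_COMMENT = re.compile(r"/\*!(\d{5})?(.*?)\*/", re.DOTALL)


def normalize(sql):
    """Collapse runs of whitespace so results are easy to compare."""
    return " ".join(sql.split())


def as_seen_by_mysql(sql, server_version=80000):
    """MySQL executes the comment body if its version is high enough."""
    def unwrap(match):
        min_version = match.group(1)
        if min_version is not None and server_version < int(min_version):
            return " "  # too old: treated like a normal comment
        return match.group(2)
    return normalize(MYSQL_COMMENT.sub(unwrap, sql))


def as_seen_by_other_server(sql):
    """Other servers skip the whole /*! ... */ block as a comment."""
    return normalize(MYSQL_COMMENT.sub(" ", sql))


query = ("SELECT /*! STRAIGHT_JOIN */ col1 FROM table1, table2 "
         "WHERE table1.id = table2.id")
print(as_seen_by_mysql(query))
print(as_seen_by_other_server(query))
```

              The statement stays valid on both kinds of server, and only MySQL applies the STRAIGHT_JOIN hint.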


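               There are many places where my understanding falls short. I hope fellow readers will point out my mistakes. Thank you very much.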