Profiling database-intensive code on Linux

Time: 2023-01-18 23:16:49

I have a set of real-time financial trading programs that run on Linux. The code is written in C++ and is very database-intensive (MySQL). We've tried to use MEMORY tables where it matters. While I always care about latency, at certain times of day raw throughput is the bottleneck.

How can I properly profile this system? I'd like to be able to see the percentage of time spent (a) running my application code, i.e. where my app code is CPU-bound, (b) running in MySQL, or (c) running in OS system calls, such as networking-related calls. I'd also like to see, for the database at least, the time spent waiting on disk.
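As a rough first cut at the (a) versus (c) split for a single client process, `getrusage` reports user-mode and kernel-mode CPU time separately, and whatever remains of the wall-clock time was spent off the CPU (for a client, largely waiting on MySQL, the network, or disk). The sketch below is only an illustration under that assumption: `TimeBreakdown` and `BreakdownTimer` are hypothetical names, not part of any library, and it says nothing about what happens inside the MySQL server itself.

```cpp
#include <sys/resource.h>
#include <sys/time.h>

#include <cassert>
#include <chrono>

// Hypothetical result type: where the elapsed time of this process went.
struct TimeBreakdown {
    double wall_s;     // elapsed wall-clock time
    double user_s;     // CPU time spent in the process's own code (user mode)
    double system_s;   // CPU time spent in the kernel on the process's behalf
    double blocked_s;  // wall - user - system, clamped at zero (off-CPU time)
};

inline double to_seconds(const timeval& tv) {
    return static_cast<double>(tv.tv_sec) +
           static_cast<double>(tv.tv_usec) / 1e6;
}

// Hypothetical helper: snapshots the clocks on construction and reports the
// deltas on elapsed(). RUSAGE_SELF sums CPU time over all threads, so in a
// multithreaded process user + system can exceed wall time; blocked_s is
// clamped to zero in that case and should then be ignored.
class BreakdownTimer {
public:
    BreakdownTimer() : start_wall_(std::chrono::steady_clock::now()) {
        const int rc = getrusage(RUSAGE_SELF, &start_usage_);
        assert(rc == 0);
        (void)rc;
    }

    TimeBreakdown elapsed() const {
        rusage now{};
        const int rc = getrusage(RUSAGE_SELF, &now);
        assert(rc == 0);
        (void)rc;

        TimeBreakdown b{};
        b.wall_s = std::chrono::duration<double>(
                       std::chrono::steady_clock::now() - start_wall_)
                       .count();
        b.user_s = to_seconds(now.ru_utime) - to_seconds(start_usage_.ru_utime);
        b.system_s = to_seconds(now.ru_stime) - to_seconds(start_usage_.ru_stime);
        const double off_cpu = b.wall_s - b.user_s - b.system_s;
        b.blocked_s = off_cpu > 0.0 ? off_cpu : 0.0;
        return b;
    }

private:
    std::chrono::steady_clock::time_point start_wall_;
    rusage start_usage_{};
};
```

Constructing a `BreakdownTimer` before a batch of work and calling `elapsed()` afterwards gives the three buckets for that batch. A large `blocked_s` in the client points at the database or network rather than at application code.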

I also realize that profiling and optimizing for latency is very different from profiling and optimizing for throughput. To optimize for throughput, I imagine a traditional profiler that can measure the above would be appropriate. To optimize for latency, I think just logging microsecond-accurate timestamps is sufficient, but that still makes it hard to pinpoint where all the time is spent when I see outliers.
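One way to make timestamp logging more useful for outliers is to record a checkpoint per stage of each request, so that a slow request shows which stage absorbed the time. The following is a minimal sketch of that idea, not a complete tracing facility: `StageTrace` and the stage names are hypothetical, and it uses `std::chrono::steady_clock` because a monotonic clock is the safer choice for measuring intervals.

```cpp
#include <cassert>
#include <chrono>
#include <cstddef>
#include <cstdint>
#include <string>
#include <utility>
#include <vector>

// Hypothetical helper: per-request trace of stage durations in microseconds.
class StageTrace {
public:
    using Clock = std::chrono::steady_clock;
    using Stage = std::pair<std::string, std::int64_t>;

    StageTrace() : last_(Clock::now()) {}

    // Closes the current stage: stores the microseconds elapsed since the
    // previous checkpoint (or since construction) under the given name.
    void mark(std::string stage) {
        assert(!stage.empty());
        const Clock::time_point now = Clock::now();
        const auto us =
            std::chrono::duration_cast<std::chrono::microseconds>(now - last_)
                .count();
        stages_.emplace_back(std::move(stage), static_cast<std::int64_t>(us));
        last_ = now;
    }

    // Sum of all recorded stage durations, in microseconds.
    std::int64_t total_us() const {
        std::int64_t total = 0;
        for (const Stage& s : stages_) total += s.second;
        return total;
    }

    // Name of the longest stage (first one wins on ties), or an empty
    // string if nothing has been recorded yet.
    std::string slowest() const {
        if (stages_.empty()) return std::string();
        std::size_t best = 0;
        for (std::size_t i = 1; i < stages_.size(); ++i) {
            if (stages_[i].second > stages_[best].second) best = i;
        }
        return stages_[best].first;
    }

    const std::vector<Stage>& stages() const { return stages_; }

private:
    Clock::time_point last_;
    std::vector<Stage> stages_;
};
```

In use, a request handler would call `mark("parse")`, `mark("query")`, `mark("send")` and so on at each boundary, then write out the full stage list only when `total_us()` exceeds a chosen threshold, which keeps the logging overhead off the common fast path.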

Am I thinking about this the right way? What tools are out there that might help me out?

1 answer

#1


It may be worthwhile to examine what the database is doing while it processes the queries, for instance how the query optimizer executes each query. Sometimes forcing the use of an index, removing an index, or rearranging the query can have a huge impact.

https://dev.mysql.com/doc/refman/5.1/en/show-profile.html

It may also be advisable to look at how other RDBMSs handle your workload. Without looking at the type of queries or their complexity, it is difficult to say whether MySQL or PostgreSQL will work better for you. In any event, taking a scientific approach to measurement is step 1.

Like Vlad said, you probably want to look at using stap (SystemTap), but it is not for the faint of heart. I would recommend it if you're still seeing performance issues.
