What's more efficient - storing logs in an SQL database or in files?

Time: 2022-10-17 14:32:12

I have a few scripts that are run by cron quite often. Right now I don't store any logs, so if a script fails to run, I won't know until I see the results - and even when I notice that the results are not correct, I can't do anything, since I don't know which script failed.

I've decided to store logs, but I am still not sure how to do it. So, my question is: what's more efficient - storing logs in an SQL database or in files?

I can create a 'logs' table in my MySQL database and store each log in a separate row, or I can just use PHP's file_put_contents or fopen/fwrite to store the logs in separate files.

My scripts would add approximately 5 logs (in total) per minute while working. I've done a few tests to determine what's faster - fopen/fwrite or MySQL's INSERT. I looped an INSERT statement 3000 times to create 3000 rows, and looped fopen/fwrite 3000 times to create 3000 files with sample text. fwrite executed 4-5 times faster than the SQL INSERT. I made a second loop - I looped a SELECT statement and assigned the result to a string 3000 times, and I also opened 3000 files using fopen and assigned their contents to a string. The result was the same - fopen/fwrite finished the task 4-5 times faster.

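For reference, a minimal sketch of that kind of timing test might look like this (the PDO credentials, the 'logs' table and the /tmp path are illustrative assumptions, not part of the original setup):

<?php
// Rough sketch of the timing comparison described above.
$pdo = new PDO('mysql:host=localhost;dbname=test', 'user', 'pass');
$stmt = $pdo->prepare('INSERT INTO logs (message) VALUES (?)');

$start = microtime(true);
for ($i = 0; $i < 3000; $i++) {
    $stmt->execute(array("sample log entry $i"));
}
printf("INSERT: %.3fs\n", microtime(true) - $start);

@mkdir('/tmp/logtest', 0777, true);
$start = microtime(true);
for ($i = 0; $i < 3000; $i++) {
    file_put_contents("/tmp/logtest/$i.log", "sample log entry $i");
}
printf("fwrite: %.3fs\n", microtime(true) - $start);
?>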

So, to all experienced programmers - what's your experience with storing logs? Any advice?

// 04.09.2011 EDIT - Thank you all for your answers, they helped me a lot. Each post was valuable, so it was quite hard to accept only one answer ;-)

9 Answers

#1


5  

You can use a component such as Zend_Log, which natively supports the concept of multiple writers attached to the same log instance. That way you can log the same message to one or more different places with no need to change your logging code. And you can always change your code to replace the log system, or add a new one, in a simple way.

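A minimal sketch of that setup with Zend Framework 1, assuming the framework is on the include path; the file path, database credentials, 'logs' table and column map are illustrative assumptions:

<?php
require_once 'Zend/Log.php';
require_once 'Zend/Log/Writer/Stream.php';
require_once 'Zend/Log/Writer/Db.php';
require_once 'Zend/Db.php';

$logger = new Zend_Log();

// Writer 1: append to a plain log file
$logger->addWriter(new Zend_Log_Writer_Stream('/var/log/myapp/cron.log'));

// Writer 2: also insert into a 'logs' table (db column => log event field)
$db = Zend_Db::factory('Pdo_Mysql', array(
    'host' => 'localhost', 'username' => 'user',
    'password' => 'pass', 'dbname' => 'mydb',
));
$logger->addWriter(new Zend_Log_Writer_Db($db, 'logs',
    array('message' => 'message', 'priority' => 'priority')));

// One call, both destinations
$logger->info('script finished successfully');
?>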

For your question, I think that logging to files is simpler and more appropriate if you (the developer) are the only one who needs to read the log messages.

Log to the db instead if other people need to read the logs in a web interface, or if you need the ability to search through them. As someone else has pointed out, concurrency also matters: if you have a lot of users, logging to the db could scale better.

Finally, a log frequency of 5 messages per minute requires almost no CPU for your application, so you don't need to worry about performance. In your case I'd start with logfiles and then change (or add more writers) if your requirements change.

#2


11  

Logging to files is more efficient; however, logs stored in the database are easier to read, even remotely (for example, you can write a web frontend if required).

Note, however, that connecting to the database and inserting rows is error-prone (database server down, wrong password, out of resources), so where would you log those errors if you decided to use the database?

#3


6  

Commenting on your findings.

Regarding writing to the file, you are probably right.
Regarding reading, you are dead wrong.

Writing to a database:

  1. MyISAM locks the whole table on inserts, causing lock contention. Use InnoDB, which has row-level locking.
  2. Contrary to 1: if you want to do fulltext searches on the log, use MyISAM, which supports fulltext indexes.
  3. If you want to be really fast you can use the MEMORY engine, which keeps the table in RAM; transfer the data to a disk-based table when CPU load is low (see the sketch after this list).
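
For point 3, a hedged sketch of what that staging setup could look like in MySQL (table and column names are illustrative, and in practice you would guard the transfer against rows arriving in between, e.g. with LOCK TABLES):

-- RAM-backed staging table (MEMORY tables cannot hold TEXT/BLOB columns)
CREATE TABLE log_buffer (
    logdate  DATETIME,
    script   VARCHAR(64),
    message  VARCHAR(255)
) ENGINE = MEMORY;

-- Disk-based archive table
CREATE TABLE log_archive (
    logdate  DATETIME,
    script   VARCHAR(64),
    message  VARCHAR(255)
) ENGINE = InnoDB;

-- Run from cron when CPU load is low
INSERT INTO log_archive SELECT * FROM log_buffer;
TRUNCATE TABLE log_buffer;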

Reading from the database

This is where the database truly shines.
You can combine all sorts of information from different entries, much much faster and easier than you can ever do from a flat file.

SELECT logdate, username, action FROM log WHERE userid = '1' /*root*/ AND error = 10;

If you have indexes on the fields used in the WHERE clause, the result will return almost instantly; try doing that with a flat file.

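For example, a composite index covering the two columns in that WHERE clause would be enough (the index name here is just illustrative):

CREATE INDEX idx_log_user_error ON log (userid, error);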

SELECT username, COUNT(*) AS error_count
FROM log
WHERE error <> 0
GROUP BY username WITH ROLLUP;

Never mind the fact that the table is not normalized - this will be much, much slower and harder to do with a flat file.
It's a no-brainer, really.

#4


1  

Writing to the filesystem should always be faster.

That, however, shouldn't be your concern. Both a simple insert and a write to the filesystem are quick operations. What you need to worry about is what happens when your database goes down. I personally like to write to both, so there is always a log if anything goes wrong, but you also have the ease of searching in a database.

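A hedged sketch of that "write to both" idea - the table, column names and file path are illustrative, and it assumes the PDO connection was opened with PDO::ERRMODE_EXCEPTION:

<?php
function logBoth($message, PDO $pdo = null, $file = '/var/log/myapp/cron.log')
{
    $line = date('Y-m-d H:i:s') . " $message\n";
    file_put_contents($file, $line, FILE_APPEND);   // always keep the file copy

    if ($pdo !== null) {
        try {
            $stmt = $pdo->prepare('INSERT INTO logs (logdate, message) VALUES (NOW(), ?)');
            $stmt->execute(array($message));
        } catch (PDOException $e) {
            // The database copy is best-effort; note the failure in the file instead.
            file_put_contents($file, date('Y-m-d H:i:s') . ' DB log failed: '
                . $e->getMessage() . "\n", FILE_APPEND);
        }
    }
}
?>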

#5


1  

It depends on the size of the logs and on the concurrency level. Because of the latter, your test is completely invalid: if there are 100 users on the site and, let's say, 10 threads writing to the same file, fwrite won't be that much faster. One of the things an RDBMS provides is concurrency control.

It also depends on the requirements and on what kind of analysis you want to perform. Just reading records is easy, but what about aggregating some data over a defined period?

Large scale web sites use systems like Scribe for writing their logs.

If you are talking about 5 records per minute, however, that is a really low load, so the main question is how you are going to read them. If a file is suitable for your needs, go with the file. Generally, append-only writes (usual for logs) are really fast.

#6


1  

Speed isn't everything. Yes, it's faster to write to files, but it's far faster for you to find what you need in the logs if they are in a database. Several years ago I converted our CMS from a file-based log to a MySQL table. The table is better.

#7


1  

I think storing logs in a database is not a good idea. The pro of storing logs in a database rather than in files is that you can analyse your logs much more easily with the power of SQL; the con, however, is that you have to spend much more time on database maintenance. You'd better set up a separate database server to store your logs, or you might get so many log INSERTs that they degrade your production database's performance; also, compared with files (logrotate, etc.), it's not easy to migrate or archive logs kept in a database.

Nowadays you should use a dedicated, feature-rich logging system to handle your logs. For example, logstash (http://logstash.net/) has a log collector and filters, and it can store logs in external systems such as elasticsearch, combined with a nice frontend for visualizing and analyzing them.

Ref:

#8


0  

Error logging is best limited to files in my opinion, because if there is a problem with the database, you can still log that. Obviously that's not an option if your error logging requires a connection to the database!

What I will also say, however, is that general logging is something I keep in the database - though this only applies if you are doing lots of logging for audit trails etc.

#9


0  

Personally, I prefer log files, so I've created two functions:

<?php
// Append a timestamped message to the given log file.
function logMessage($message=null, $filename=null)
{
    if (!is_null($filename))
    {
        $logMsg=date('Y/m/d H:i:s').": $message\n";
        error_log($logMsg, 3, $filename);   // message type 3 = append to $filename
    }
}

// Convenience wrapper that marks the message as an error.
function logError($message=null, $filename=null)
{
    if (!is_null($message))
    {
        logMessage("***ERROR*** {$message}", $filename);
    }
}
?>

I define a constant or two (I use ACTIVITY_LOG and ERROR_LOG, both set to the same file, so you don't need to look at two files side by side to get an overall view of a run) and call the functions as appropriate. I've also created a dedicated folder (/var/log/phplogs), and each application that I write has its own log file. Finally, I rotate the logs so that I have some history to refer back to for customers.

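A small sketch of how that wiring might look (the application name in the path is just illustrative):

<?php
// Both constants point at the same file, as described above.
define('ACTIVITY_LOG', '/var/log/phplogs/myapp.log');
define('ERROR_LOG',    '/var/log/phplogs/myapp.log');

logMessage('import started', ACTIVITY_LOG);
// ... do the work ...
logError('could not open data feed', ERROR_LOG);
?>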

Liberal use of the above functions means that I can trace the execution of apps fairly easily.
