Choosing a database for storing and analyzing logs

Time: 2022-10-17 17:54:02

I am building an internal tool, which will be open-sourced, that takes logs and puts them into a database, to put it simply. From there, the tool will also analyze the logs and alert the sysadmins and developers to ongoing issues, all in real time. Processing this will take a lot of CPU, but that is beyond the scope of this question.

What I would like to know is which database to choose so that a number of key tasks can be performed quickly:

  • Store a large number of events categorized by event types
  • Perform a large number of reads to develop charts to analyze the events that are being logged
  • Read in real-time to send and trigger automated alerts to the system.

Any other help would be greatly appreciated, too. Code on.

2 Answers

#1


2  

From what I have seen, MongoDB performs an order of magnitude better than an RDBMS for the task you describe: storing logs in bulk. Capped collections perform particularly well. The main performance lag I have seen with an RDBMS was in insert times. A big disadvantage of an RDBMS is the schema, which is a major pain to upgrade if needed. For these reasons we have started moving towards MongoDB; check out logFaces. If you are building your own tool for the open source community, try to make sure it will work with ANY database, not just a particular brand. But then it becomes a not-so-trivial task :)

(Disclosure: I am the original author of logFaces, so my opinion could be biased.)
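
To make the "work with ANY database" advice a bit more concrete, here is a minimal sketch of one way to hide the backend behind a small interface. Everything in it is an assumption for illustration: `LogEvent`, `EventStore`, `RingBufferStore` and `SqliteStore` are hypothetical names, not part of logFaces or MongoDB. The in-memory backend only loosely imitates a capped collection by dropping the oldest event once it is full (it caps by event count, not by bytes).

```python
import sqlite3
from collections import deque
from dataclasses import dataclass
from typing import List, Protocol


@dataclass(frozen=True)
class LogEvent:
    timestamp: float   # seconds since the epoch
    event_type: str    # e.g. "error", "login"
    message: str


class EventStore(Protocol):
    """Hypothetical interface the rest of the tool would code against."""

    def append(self, event: LogEvent) -> None: ...
    def recent(self, event_type: str, limit: int) -> List[LogEvent]: ...


class RingBufferStore:
    """In-memory backend that keeps only the newest `capacity` events.

    Loosely mimics a capped collection: once full, the oldest event is dropped.
    """

    def __init__(self, capacity: int) -> None:
        self._events: deque = deque(maxlen=capacity)

    def append(self, event: LogEvent) -> None:
        self._events.append(event)

    def recent(self, event_type: str, limit: int) -> List[LogEvent]:
        # Walk from newest to oldest and keep the first `limit` matches.
        matches = [e for e in reversed(self._events) if e.event_type == event_type]
        return matches[:limit]


class SqliteStore:
    """Relational backend with the same interface, using the stdlib sqlite3 module."""

    def __init__(self, path: str = ":memory:") -> None:
        self._conn = sqlite3.connect(path)
        self._conn.execute(
            "CREATE TABLE IF NOT EXISTS events ("
            "id INTEGER PRIMARY KEY AUTOINCREMENT, ts REAL, event_type TEXT, message TEXT)"
        )
        self._conn.execute(
            "CREATE INDEX IF NOT EXISTS idx_events_type_ts ON events (event_type, ts)"
        )

    def append(self, event: LogEvent) -> None:
        with self._conn:  # commits on success, rolls back on error
            self._conn.execute(
                "INSERT INTO events (ts, event_type, message) VALUES (?, ?, ?)",
                (event.timestamp, event.event_type, event.message),
            )

    def recent(self, event_type: str, limit: int) -> List[LogEvent]:
        # Newest first, by insertion order.
        rows = self._conn.execute(
            "SELECT ts, event_type, message FROM events "
            "WHERE event_type = ? ORDER BY id DESC LIMIT ?",
            (event_type, limit),
        ).fetchall()
        return [LogEvent(*row) for row in rows]
```

The analysis and alerting code would only ever call `append` and `recent`, so swapping in a MongoDB-backed class later should not touch it. The hard part hinted at above is that real backends differ in what they can query efficiently, so the interface tends to grow.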

#2


1  

Storing just events sounds like a simple model, so you might want to take a look at NoSQL databases. I think that for really large amounts of data, key-value stores or Bigtable-style stores will be better than document-based databases in this case.
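
As a rough illustration of how events categorized by type might map onto a key-value model, here is a toy sketch. It assumes an ordered store (Bigtable-style systems keep rows sorted by key) and a hypothetical key layout of event type followed by a zero-padded millisecond timestamp; `make_key` and `SortedKeyValueStore` are made-up names, and the sketch assumes event types never contain `#`.

```python
import bisect
from typing import Dict, List, Tuple


def make_key(event_type: str, timestamp_ms: int) -> str:
    """Build a row key that sorts by event type first, then by time."""
    # Zero-padding keeps lexicographic order equal to numeric order.
    return f"{event_type}#{timestamp_ms:013d}"


class SortedKeyValueStore:
    """Toy in-memory stand-in for an ordered key-value store."""

    def __init__(self) -> None:
        self._keys: List[str] = []          # kept sorted at all times
        self._values: Dict[str, str] = {}

    def put(self, key: str, value: str) -> None:
        if key not in self._values:
            bisect.insort(self._keys, key)
        self._values[key] = value

    def scan(self, start_key: str, end_key: str) -> List[Tuple[str, str]]:
        """Return all (key, value) pairs with start_key <= key < end_key."""
        lo = bisect.bisect_left(self._keys, start_key)
        hi = bisect.bisect_left(self._keys, end_key)
        return [(k, self._values[k]) for k in self._keys[lo:hi]]
```

With this layout, "all errors between two points in time" becomes a single contiguous range scan, for example `store.scan(make_key("error", t0), make_key("error", t1))`, which is the access pattern both the charts and the alerts would lean on.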

A large number of reads and analysis, on the other hand, sounds like you might want to build a data warehouse system. This is the good old SQL approach, with some denormalization for optimised reading. It can take some time to design and implement, though.
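
A minimal sketch of what a read-optimised SQL table could look like, using Python's built-in `sqlite3` module purely as a stand-in for a real warehouse. The table and column names, the per-minute bucket, and the helper functions are all assumptions for illustration. The table is deliberately not fully normalised: the host name and a precomputed minute bucket sit on every row, so chart queries need no joins.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# One wide table: every column a chart needs lives on the row itself.
conn.execute(
    """
    CREATE TABLE events (
        ts INTEGER NOT NULL,          -- Unix time in seconds
        minute INTEGER NOT NULL,      -- ts rounded down to the minute, precomputed
        event_type TEXT NOT NULL,
        host TEXT NOT NULL,
        message TEXT NOT NULL
    )
    """
)
conn.execute("CREATE INDEX idx_events_minute_type ON events (minute, event_type)")


def record(ts: int, event_type: str, host: str, message: str) -> None:
    """Insert one event, computing the minute bucket at write time."""
    with conn:
        conn.execute(
            "INSERT INTO events (ts, minute, event_type, host, message) "
            "VALUES (?, ?, ?, ?, ?)",
            (ts, ts - ts % 60, event_type, host, message),
        )


def counts_per_minute(event_type: str):
    """Return (minute, count) rows, the kind of series a chart would plot."""
    return conn.execute(
        "SELECT minute, COUNT(*) FROM events WHERE event_type = ? "
        "GROUP BY minute ORDER BY minute",
        (event_type,),
    ).fetchall()


def minutes_over_threshold(event_type: str, threshold: int):
    """Return the minutes in which the event count exceeded `threshold`."""
    return [minute for minute, n in counts_per_minute(event_type) if n > threshold]
```

The same per-minute series could feed both the charts and a simple alert rule such as `minutes_over_threshold("error", 100)`. The trade-off is more work at write time and some duplicated data in exchange for cheaper reads.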
