Parsing JSON data with a regex in fluentd

Time: 2022-01-26 19:34:18

I have fluentd working perfectly, and it is able to publish data to Elasticsearch. I modified the fluentd config file to tail a file, read the data, and publish it. Below is the source:

<source>
  @type tail
  format /^\[(?<logtime>[^\]]*)\] (?<name>[^ ]*) (?<title>[^ ]*) (?<id>\d*)$/
  time_key logtime
  time_format %Y-%m-%d %H:%M:%S %z
  path /home/user/file
  tag first
</source>

If the input is the data below:

[2013-02-28 12:00:00 +0900] alice engineer 1

This is read by fluentd without problems and is also published to Elasticsearch.
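
For reference, the working pattern can be checked outside fluentd. The following is a minimal Ruby sketch (fluentd is written in Ruby and its regexes use Ruby syntax); the `parse_line` helper is hypothetical and only roughly imitates what `time_key` and `time_format` do with the `logtime` capture.

```ruby
require 'time'

# Pattern and time format copied from the <source> block above.
LINE_PATTERN = /^\[(?<logtime>[^\]]*)\] (?<name>[^ ]*) (?<title>[^ ]*) (?<id>\d*)$/
TIME_FORMAT = '%Y-%m-%d %H:%M:%S %z'

# Hypothetical helper: returns the parsed time and the remaining named
# captures, or nil when the line does not match the pattern.
def parse_line(line)
  match = LINE_PATTERN.match(line)
  return nil unless match

  fields = match.named_captures
  # Roughly imitate time_key: take the timestamp out and parse it separately.
  time = Time.strptime(fields.delete('logtime'), TIME_FORMAT)
  { time: time, record: fields }
end

result = parse_line('[2013-02-28 12:00:00 +0900] alice engineer 1')
puts result[:time]
puts result[:record]
```

Each named group (`name`, `title`, `id`) ends up as a key in the record, which is why this version produces fields.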

I then modified the regex pattern to accept JSON data:

<source>
  @type tail
  format /(?:"Name":")(.*?)(?:")/ #CHANGE HERE
  time_key logtime
  time_format %Y-%m-%d %H:%M:%S %z
  path /home/user/file
  tag first
</source>

So if the input is:

{
    "Name":"Logger",
    "Type":"Logging"
}

Then no data shows up in Elasticsearch. Even the fluentd logs don't show any error or warning message. Is the regex pattern wrong? How can I resolve it?

Thanks


1 Answer

#1


2  

It seems you want to get data out of JSON into Elasticsearch. You may use a JSON parser to do the heavy lifting for you; see Getting Data From Json Into Elasticsearch Using Fluentd for the necessary details to get you started.
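
To illustrate what a JSON parser buys you over a hand-written regex, here is a small sketch in plain Ruby using the standard `json` library. It is not fluentd configuration, and the `parse_record` helper is a hypothetical name; it only shows that a parser returns every key of the sample input at once.

```ruby
require 'json'

# The sample input from the question, as one string.
SAMPLE_JSON = <<~JSON
  {
      "Name":"Logger",
      "Type":"Logging"
  }
JSON

# Hypothetical helper: parse a JSON document into a Ruby hash.
def parse_record(text)
  JSON.parse(text)
end

record = parse_record(SAMPLE_JSON)
puts record['Name']   # prints Logger
puts record['Type']   # prints Logging
```

Both fields come out under their own names without any pattern to maintain.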

If you want to fix the regex approach you have, use


format /"Name"\s*:\s*"(?<name>[^"]*)"/

Note that (?<name>...) is a named capturing group; the group names are used to create fields with the same names in Elasticsearch. The pattern matches:

  • "Name" - a literal "Name" substring
  • \s*:\s* - a colon surrounded by 0+ whitespace chars
  • " - a double quote
  • (?<name>[^"]*) - Group "name" matching 0+ chars other than "
  • " - a double quote (not strictly necessary).
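
The breakdown above can be checked in Ruby, whose regex syntax fluentd uses. This is a small sketch with hypothetical constant names: it compares the pattern from the question, which has only unnamed groups, with the corrected one. As far as I know, fluentd's regexp parser builds record fields from named captures, so a pattern without any gives it nothing to emit.

```ruby
# The pattern from the question: its groups are unnamed, so it defines no field names.
ORIGINAL_PATTERN = /(?:"Name":")(.*?)(?:")/
# The corrected pattern: the value is captured into a group called "name".
FIXED_PATTERN = /"Name"\s*:\s*"(?<name>[^"]*)"/

line = '    "Name":"Logger",'

p ORIGINAL_PATTERN.names                              # prints []
p FIXED_PATTERN.names                                 # prints ["name"]
puts FIXED_PATTERN.match(line)[:name]                 # prints Logger
puts FIXED_PATTERN.match('"Name" : "Logger"')[:name]  # prints Logger (whitespace around the colon is allowed)
```

The original pattern does match the line, but the value is only reachable by position (`match[1]`), not by a field name.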

If you want to capture the Type value into the same field as well, you may use

format /"(?:Name|Type)"\s*:\s*"(?<name>[^"]*)"/

where (?:Name|Type) is a non-capturing group matching either the Name or the Type substring (| is the alternation operator).
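
As a quick check of the alternation, the sketch below applies that pattern to the sample input one line at a time, assuming each line is handled separately as a line-oriented tail input would do. The `extract_names` helper and the constant names are hypothetical.

```ruby
EITHER_PATTERN = /"(?:Name|Type)"\s*:\s*"(?<name>[^"]*)"/

SAMPLE_INPUT = <<~JSON
  {
      "Name":"Logger",
      "Type":"Logging"
  }
JSON

# Hypothetical helper: apply the pattern to each line and collect the
# "name" capture from every line that matches.
def extract_names(text)
  text.each_line.map { |line| EITHER_PATTERN.match(line) }.compact.map { |m| m[:name] }
end

p extract_names(SAMPLE_INPUT)   # prints ["Logger", "Logging"]
```

Both values land in the same `name` group, and the brace-only lines do not match at all, so with this approach each matching line yields its own single-field result.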
