JSON fields with the same name

Date: 2022-12-27 11:48:12

In practice, keys have to be unique within a JSON object (see, e.g., "Does JSON syntax allow duplicate keys in an object?"). However, suppose I have a file with the following contents:

{
    "a" : "1",
    "b" : "2",
    "a" : "3"
}

Is there a simple way of converting the repeated keys to an array? So that the file becomes:

{
    "a" : [ {"key": "1"}, {"key": "3"}],
    "b" : "2"
}

Or something similar that combines the repeated keys into an array (or some alternative way to extract the values of the repeated keys).

Here's a solution in Java: "Convert JSON object with duplicate keys to JSON array"

Is there any way to do it with awk/bash/python?

3 solutions

#1


5  

If your input is really a flat JSON object with primitives as values, this should work:

jq -s --stream 'map(select(length == 2)) | group_by(.[0]) | map({"key": .[0][0][0], "value": map(.[1])}) | from_entries'

{
  "a": [
    "1",
    "3"
  ],
  "b": [
    "2"
  ]
}

Anything more complex would require actually understanding how --stream is supposed to be used, which is beyond me.
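Since the question also asks about Python, the same grouping can be sketched with the standard library's json module. This is only a minimal sketch and not part of the original answer: group_pairs is a hypothetical helper name, and the approach relies on the object_pairs_hook parameter of json.loads, which passes the decoder's raw list of key/value pairs to a callback before duplicates are collapsed. As with the jq command, every key ends up with a list, even when it occurs only once.

```python
import json
from collections import defaultdict


def group_pairs(pairs):
    """Collect every value seen for a key into a list, in input order.

    `pairs` is the list of (key, value) tuples that json.loads passes to
    object_pairs_hook for each JSON object it decodes.
    """
    grouped = defaultdict(list)
    for key, value in pairs:
        grouped[key].append(value)
    return dict(grouped)


# The sample document from the question, with "a" repeated.
text = '{"a": "1", "b": "2", "a": "3"}'

result = json.loads(text, object_pairs_hook=group_pairs)
print(json.dumps(result))  # {"a": ["1", "3"], "b": ["2"]}
```

The hook runs for every object in the document, so nested objects would be grouped the same way.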

#2


2  

Building on Santiago's answer using -s --stream, the following filter builds up the object one step at a time, thus preserving the order of the keys and of the values for a specific key:

reduce (.[] | select(length==2)) as $kv ({};
      $kv[0][0] as $k
      |$kv[1] as $v
      | (.[$k]|type) as $t
      | if $t == "null" then .[$k] = $v
        elif $t == "array" then .[$k] += [$v]
        else .[$k] = [ .[$k], $v ]
        end)

For the given input, the result is:

{
  "a": [
    "1",
    "3"
  ],
  "b": "2"
}

To illustrate that the ordering of values for each key is preserved, consider the following input:

{
    "c" : "C",
    "a" : "1",
    "b" : "2",
    "a" : "3",
    "b" : "1"
}

The output produced by the filter above is:

{
  "c": "C",
  "a": [
    "1",
    "3"
  ],
  "b": [
    "2",
    "1"
  ]
}
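For comparison, here is a rough Python counterpart of the reduce-based filter, again assuming the standard json module and its object_pairs_hook parameter. The name merge_duplicates is made up for this sketch. A key keeps its plain value until it is seen a second time, and the order of keys and of values follows the input.

```python
import json


def merge_duplicates(pairs):
    """Merge repeated keys into a list, leaving unique keys untouched."""
    merged = {}
    repeated = set()  # keys that have already been turned into a list
    for key, value in pairs:
        if key not in merged:
            merged[key] = value                  # first occurrence: keep as-is
        elif key in repeated:
            merged[key].append(value)            # third or later occurrence
        else:
            merged[key] = [merged[key], value]   # second occurrence: start a list
            repeated.add(key)
    return merged


text = '{"c": "C", "a": "1", "b": "2", "a": "3", "b": "1"}'

result = json.loads(text, object_pairs_hook=merge_duplicates)
print(json.dumps(result))  # {"c": "C", "a": ["1", "3"], "b": ["2", "1"]}
```

One design difference: the jq filter decides what to do by looking at the type of the value stored so far, so a key whose first value is null or an array is handled as "not seen yet" or "already merged". The sketch tracks repeated keys in a separate set instead, which sidesteps that ambiguity.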

#3


0  

Building on peak's answer, the following filter also works on input that contains multiple objects, handles nested objects, and does not need the slurp option (-s).

This is not an answer to the initial question, but because the jq FAQ links here, it might be useful for some visitors.

File jqmergekeys.txt

def consumestream($arr): # Reads stream elements from stdin until we have enough elements to build one object and returns them as array
input as $inp 
| if $inp|has(1) then consumestream($arr+[$inp]) # input=keyvalue pair => Add to array and consume more
  elif ($inp[0]|has(1)) then consumestream($arr) # input=closing subkey => Skip and consume more
  else $arr end; # input=closing root object => return array

def convert2obj($stream): # Converts an object in stream notation into an object, and merges the values of duplicate keys into arrays
reduce ($stream[]) as $kv ({}; # This function is based on http://*.com/a/36974355/2606757
      $kv[0] as $k
      | $kv[1] as $v
      | (getpath($k)|type) as $t # type of existing value under the given key
      | if $t == "null" then setpath($k;$v) # value not existing => set value
        elif $t == "array" then setpath($k; getpath($k) + [$v] ) # value is already an array => add value to array
        else setpath($k; [getpath($k), $v ]) # single value => put existing and new value into an array
        end);

def mainloop(f):  (convert2obj(consumestream([input]))|f),mainloop(f); # Consumes streams forever, converts them into an object and applies the user provided filter
def mergeduplicates(f): try mainloop(f) catch if .=="break" then empty else error end; # Catches the "break" thrown by jq if there's no more input

#---------------- User code below --------------------------    

mergeduplicates(.) # merge duplicate keys in input, without any additional filters

#mergeduplicates(select(.layers)|.layers.frame) # merge duplicate keys in input and apply some filter afterwards

Example:

tshark -T ek | jq -nc --stream -f ./jqmergekeys.txt
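The same idea (several objects in one input, duplicates merged at every nesting level) can also be sketched in Python. The sketch below is an illustration under stated assumptions, not a port of the jq file: iter_merged and merge_duplicates are hypothetical names, the sample text is invented and is not real tshark output, and the whole input is assumed to fit in memory as one string. It uses json.JSONDecoder.raw_decode to pull one object at a time out of a stream of concatenated or newline-separated objects.

```python
import json


def merge_duplicates(pairs):
    """Merge repeated keys of one JSON object into a list."""
    merged = {}
    repeated = set()
    for key, value in pairs:
        if key not in merged:
            merged[key] = value
        elif key in repeated:
            merged[key].append(value)
        else:
            merged[key] = [merged[key], value]
            repeated.add(key)
    return merged


def iter_merged(text):
    """Yield each top-level JSON value in `text`, with duplicate keys merged."""
    decoder = json.JSONDecoder(object_pairs_hook=merge_duplicates)
    pos = 0
    while True:
        # raw_decode does not skip leading whitespace, so do it by hand.
        while pos < len(text) and text[pos].isspace():
            pos += 1
        if pos >= len(text):
            return
        obj, pos = decoder.raw_decode(text, pos)
        yield obj


# Invented sample: two newline-separated objects, one with a nested duplicate.
sample = '{"layers": {"ip": "x", "ip": "y"}}\n{"a": 1, "a": 2}\n'

docs = list(iter_merged(sample))
for doc in docs:
    print(json.dumps(doc))
```

Because the hook is applied to every object the decoder builds, the nested "ip" keys are merged before the outer object is assembled.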
