How to Flatten JSON using jq and Bash into Bash Associative Array where Key=Selector?

Submitted by 无人久伴 on 2019-12-11 01:48:43

Question


As a follow-up to Flatten Arbitrary JSON, I'm looking to take the flattened results and make them suitable for doing queries and updates back to the original JSON file.

Motivation: I'm writing Bash (4.2+) scripts (on CentOS 7) that read JSON into a Bash associative array using the JSON selector/filter as the key. I do processing on the associative arrays, and in the end I want to update the JSON with those changes.

The preceding solution gets me close to this goal. I think there are two things that it doesn't do:

  1. It doesn't quote keys that require quoting. For example, the key com.acme would need to be quoted because it contains a special character.
  2. Array indexes are not represented in a form that can be used to query the original JSON.

Existing Solution

The solution from that earlier question is:

$ jq --stream -n --arg delim '.' 'reduce (inputs|select(length==2)) as $i ({};
[$i[0][]|tostring] as $path_as_strings
    | ($path_as_strings|join($delim)) as $key
    | $i[1] as $value
    | .[$key] = $value
)' input.json

For example, if input.json contains:

{
   "a.b":
   [
      "value"
   ]
}

then the output is:

{
  "a.b.0": "value"
}

What is Really Wanted

An improvement would have been:

{
  "\"a.b\"[0]": "value"
}

But what I really want is output formatted so that it could be sourced directly in a Bash program (implying the array name is passed to jq as an argument):

ArrayName['"a.b"[0]']='value'  # Note 'value' might need escapes for Bash

I'm looking to have the more human-readable syntax above as opposed to the more general:

ArrayName['.["a.b"][0]']='value'

I don't know if jq can handle all of this. My present solution is to take the output from the preceding solution and post-process it into the form that I want. Here's the work in progress:

#!/bin/bash
Flatten()
{
local -r OPTIONS=$(getopt -o d:m:f: -l "delimiter:,mapname:,file:" -n "${FUNCNAME[0]}" -- "$@")
eval set -- "$OPTIONS"

local Delimiter='.' MapName=map File=
while true ; do
   case "$1" in
   -d|--delimiter)   Delimiter="$2"; shift 2;;
   -m|--mapname)     MapName="$2"; shift 2;;
   -f|--file)        File="$2"; shift 2;;
   --)               shift; break;;
   esac
done

local -a Array=()
readarray -t Array <<<"$(
   jq -c -S --stream -n --arg delim "$Delimiter" 'reduce (inputs|select(length==2)) as $i ({}; .[[$i[0][]|tostring]|join($delim)] = $i[1])' <<<"$(sed 's|^\s*[#%].*||' "$File")" |
   jq -c "to_entries|map(\"\(.key)=\(.value|tostring)\")|.[]" |
   sed -e 's|^"||' -e 's|"$||' -e 's|=|\t|')"

if [[ ! -v $MapName ]]; then
   local -gA $MapName
fi

. <(
   IFS=$'\t'
   while read -r Key Value; do
      printf "$MapName[\"%s\"]=%q\n" "$Key" "$Value"
   done <<<"$(printf "%s\n" "${Array[@]}")"
)
}
declare -A Map
Flatten -m Map -f "$1"
declare -p Map

With the output:

$ ./Flatten.sh <(echo '{"a.b":["value"]}')
declare -A Map='([a.b.0]="value" )'

Answer 1:


1) jq is Turing complete, so it's all just a question of which hammer to use.

2)

> An improvement would have been:
>
> { "\"a.b\"[0]": "value" }

That is easily accomplished using a helper function along these lines:

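# Input: a path array such as ["a.b", 0]; output: a selector-like string.
def flattenPath(delim):
  reduce .[] as $s ("";
    if ($s|type) == "number"
    # Array index: append [n]; a leading index is written as .[n]
    then (if . == "" then "." else . end) + "[\($s)]"
    # Object key: put delim between segments (so ["c","d"] gives c.d, not cd)
    # and quote any key that itself contains delim
    else (if . == "" then "" else . + delim end)
         + ($s | tostring | if index(delim) then "\"\(.)\"" else . end)
    end );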
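
To show how such a helper might be wired into the streaming reduce from the question, here is a minimal sketch. It assumes jq 1.5 or later (for --stream) is on the PATH. The wrapper name flatten_json and the sample document are hypothetical, and placing the delimiter between consecutive object keys is my assumption about the intended output.

```shell
#!/usr/bin/env bash
# Sketch only: flatten JSON read from stdin into selector-style keys.
# flatten_json is a hypothetical wrapper name; requires jq 1.5+.
flatten_json() {
  jq -c -S --stream -n --arg delim '.' '
    def flattenPath(delim):
      reduce .[] as $s ("";
        if ($s|type) == "number"
        then (if . == "" then "." else . end) + "[\($s)]"
        else (if . == "" then "" else . + delim end)
             + ($s | tostring | if index(delim) then "\"\(.)\"" else . end)
        end);
    # Keep only the [path, leaf] stream events and build one flat object.
    reduce (inputs | select(length == 2)) as $i ({};
      .[$i[0] | flattenPath($delim)] = $i[1])
  '
}

result=$(printf '%s' '{"a.b":["value"],"c":{"d":1}}' | flatten_json)
printf '%s\n' "$result"
```

The key for the first leaf comes out as "a.b"[0] (with the quotes escaped inside the JSON string), and the nested key as c.d.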

3)

> I do processing on the associative arrays, and in the end I want to update the JSON with those changes.

This suggests you might have posed an XY problem. However, if you really do want to serialize and unserialize some JSON text, then the natural way to do so in jq is with leaf_paths, as illustrated by the following serialization/deserialization functions:

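# Emit (path, value) pairs
# Usage: jq -c -f serialize.jq input.json > serialized.json
# (serialize.jq needs a final line that calls the function: serialize)
def serialize: leaf_paths as $p | ($p, getpath($p));


# Usage: jq -n -f unserialize.jq serialized.json
# (unserialize.jq needs a final line that calls the function: unserialize)
def unserialize:
  # Regroup the stream s into consecutive two-element arrays
  def pairwise(s):
    foreach s as $i ([];
      if length == 1 then . + [$i] else [$i] end;
      select(length == 2));
  reduce pairwise(inputs) as $p (null; setpath($p[0]; $p[1]));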
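
As a quick check of the round trip, the two functions can be inlined into a small Bash script. This is a sketch that assumes jq 1.5 or later is installed; the variable names and the sample document are hypothetical. One caveat, to my knowledge: leaf_paths is paths(scalars), so leaves whose value is null or false, as well as empty arrays and objects, do not survive serialization. The sample avoids them.

```shell
#!/usr/bin/env bash
# Sketch only: round-trip JSON through (path, value) lines; requires jq 1.5+.
original='{"a.b":["value"],"c":{"d":1}}'

# Serialize: one line holding the path, then one line holding the leaf value.
serialized=$(jq -c 'leaf_paths as $p | ($p, getpath($p))' <<<"$original")

# Unserialize: regroup consecutive lines into [path, value] pairs
# and rebuild the document with setpath.
restored=$(jq -c -n '
  def pairwise(s):
    foreach s as $i ([];
      if length == 1 then . + [$i] else [$i] end;
      select(length == 2));
  reduce pairwise(inputs) as $p (null; setpath($p[0]; $p[1]))
' <<<"$serialized")

printf '%s\n' "$serialized"
printf '%s\n' "$restored"
```

For this sample the restored text is identical to the original compact JSON.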

If using bash, you could use readarray (mapfile) to read the paths and values into a single array, or if you want to distinguish between the paths and values more easily, you could (for example) use the approach illustrated by the following:

# Read alternating lines: a path (JSON array), then its value (JSON).
path=() value=()
while IFS= read -r p && IFS= read -r v; do
  path+=("$p")
  value+=("$v")
done < serialized.json

But there are many other alternatives.
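
One such alternative, closer to the associative array the question asks for, is to key the array by the serialized path line itself. The following is a sketch under assumptions: it needs Bash 4 or later, the names Map and Order are hypothetical, and the inline sample stands in for the contents of serialized.json.

```shell
#!/usr/bin/env bash
# Sketch only: hold serialized (path, value) pairs in a Bash 4+ associative array.
# The sample text stands in for the output of the serialize step.
serialized='["a.b",0]
"value"
["c","d"]
1'

declare -A Map=()     # key: path as compact JSON, value: leaf as JSON text
declare -a Order=()   # remembers the order in which the paths were read

while IFS= read -r p && IFS= read -r v; do
  Map[$p]=$v          # the subscript is expanded once, so quotes and brackets are safe
  Order+=("$p")
done <<<"$serialized"

# Update one leaf; the new value must itself be valid JSON text.
k='["c","d"]'
Map[$k]=2

# Emit the pairs again, in a form the unserialize step can read.
updated=$(for p in "${Order[@]}"; do printf '%s\n%s\n' "$p" "${Map[$p]}"; done)
printf '%s\n' "$updated"
```

Keeping the path as a JSON array sidesteps the quoting question entirely: the key is exactly what setpath expects, at the cost of being less readable than a selector such as "a.b"[0].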



Source: https://stackoverflow.com/questions/42553309/how-to-flatten-json-using-jq-and-bash-into-bash-associative-array-where-key-sele
