Read any JSON into list of key-value pairs (EAV format) in SQL Server

佐手、 提交于 2020-05-15 07:56:33

问题


Looking for a way to read any JSON without knowing the internals with SQL Server's JSON methods, I came up with an approach I'd like to share.

Here is the question bringing up this issue.

The question is: How can I transform an unknown JSON into a structured EAV format while keeping all the information about sort order and nest levels.

The ideal output should carry the original row's id as the entity, the Json's key and value as the attribute and value together with the JsonPath for the specific object in a sorted list.

Find the MCVE (sample data from the linked question) embedded into my self-answer.


回答1:


First we create a declared table variable and fill it with some sample JSONs to simulate the issue (I added some arrays to the samples to reflect JSON paths for arrays):

DECLARE @table TABLE(ID INT IDENTITY, AnyJSON NVARCHAR(MAX));
INSERT INTO @table VALUES
(N' {
    "correlationId": "c3xOeEEQQCCA9sEx7-u6FA",
    "eventCreateTime": "2020-05-12T15:38:23.717Z",
    "time": 1589297903717,
    "owner": {
        "ownergeography": {
            "city": "abc",
            "country": "abc"
        },
        "ownername": {
            "firstname": "abc",
            "lastname": "def"
        },
        "clientApiKey": "xxxxx",
        "businessProfileApiKey": null,
        "userId": null
    },
    "campaignType": "Mobile push"
}')
,(N'[{
    "correlationIds": [
        {
            "campaignId": [1,2,3],
            "correlationId": [{"a":"b"},{"c":"d"},{"e":"f"}]
        }
    ],
    "variantId": 1278915,
    "utmCampaign": "",
    "ua.os.major": "8"
    }
    ,{
    "correlationIds": [
        {
            "campaignId": [1,2,3],
            "correlationId": [{"a":"b"},{"c":"d"},{"e":"f"}]
        }
    ],
    "variantId": 1278915,
    "utmCampaign": "",
    "ua.os.major": "8"
    }]')
,(N'{
    "correlationId": "ls7XmuuiThWzktUeewqgWg",
    "eventCreateTime": "2020-05-12T12:40:20.786Z",
    "time": 1589287220786,
    "modifiedBy": {
        "clientId": null,
        "clientApiKey": "xxx",
        "businessProfileApiKey": null,
        "userId": null
    },
    "campaignType": "Mobile push"
}');

--The query

WITH recCTE AS
(
    SELECT ID
          ,NestLevel   = 1                                                            
          ,SortString  = CAST(N'0'                          COLLATE DATABASE_DEFAULT AS NVARCHAR(MAX)) 

          ,JsonPath    = CAST(N'$'                          COLLATE DATABASE_DEFAULT AS NVARCHAR(MAX))
          ,JsonKey     = CAST(N'$'                          COLLATE DATABASE_DEFAULT AS NVARCHAR(MAX)) 
          ,JsonValue   = CAST(AnyJSON                       COLLATE DATABASE_DEFAULT AS NVARCHAR(MAX)) 
          ,JsonType    = CAST(CASE WHEN LEFT(TRIM(AnyJSON),1)=N'[' THEN 4 ELSE 0 END AS TINYINT)
          ,NestedJSON  = CAST(CASE WHEN ISJSON(AnyJSON)=1 
                                   THEN AnyJSON 
                                   ELSE NULL END            COLLATE DATABASE_DEFAULT AS NVARCHAR(MAX)) 
    FROM @table t

    UNION ALL

    SELECT r.ID
          ,r.NestLevel+1
          ,CAST(CONCAT(r.SortString,STR(ROW_NUMBER() OVER(ORDER BY (SELECT NULL)),5)) COLLATE DATABASE_DEFAULT AS NVARCHAR(MAX))

          ,CAST(CONCAT(r.JsonPath, CASE WHEN r.JsonType=4 --<-- see the docs for OPENJSON()
                                        THEN CONCAT('[',A.[key],']') 
                                        ELSE '.' + A.[key] END)                       COLLATE DATABASE_DEFAULT AS NVARCHAR(MAX))
          ,CAST(A.[key]                                                               COLLATE DATABASE_DEFAULT AS NVARCHAR(MAX))
          ,CAST(r.JsonValue                                                           COLLATE DATABASE_DEFAULT AS NVARCHAR(MAX))
          ,A.[type] 
          ,CAST(A.[value]                                                             COLLATE DATABASE_DEFAULT AS NVARCHAR(MAX))
    FROM recCTE r
    CROSS APPLY OPENJSON(r.NestedJSON) A
    WHERE ISJSON(r.NestedJSON)=1
)
SELECT ID
      ,JsonPath
      ,JsonKey
      ,NestedJSON AS JsonValue
      ,SortString --<-- just to illustrate the sorting, not needed in the output
FROM recCTE 
WHERE ISJSON(NestedJSON)=0
ORDER BY ID,SortString;

The result

+----+-------------------------------------------+-----------------+--------------------------+---------------------------------+
| ID | JsonPath                                  | JsonKey         | JsonValue                | SortString                      |
+----+-------------------------------------------+-----------------+--------------------------+---------------------------------+
| 1  | $.correlationId                           | correlationId   | c3xOeEEQQCCA9sEx7-u6FA   | 0    1                          |
+----+-------------------------------------------+-----------------+--------------------------+---------------------------------+
| 1  | $.eventCreateTime                         | eventCreateTime | 2020-05-12T15:38:23.717Z | 0    2                          |
+----+-------------------------------------------+-----------------+--------------------------+---------------------------------+
| 1  | $.time                                    | time            | 1589297903717            | 0    3                          |
+----+-------------------------------------------+-----------------+--------------------------+---------------------------------+
| 1  | $.owner.ownergeography.city               | city            | abc                      | 0    4    1    1                |
+----+-------------------------------------------+-----------------+--------------------------+---------------------------------+
| 1  | $.owner.ownergeography.country            | country         | abc                      | 0    4    1    2                |
+----+-------------------------------------------+-----------------+--------------------------+---------------------------------+
| 1  | $.owner.ownername.firstname               | firstname       | abc                      | 0    4    2    1                |
+----+-------------------------------------------+-----------------+--------------------------+---------------------------------+
| 1  | $.owner.ownername.lastname                | lastname        | def                      | 0    4    2    2                |
+----+-------------------------------------------+-----------------+--------------------------+---------------------------------+
| 1  | $.owner.clientApiKey                      | clientApiKey    | xxxxx                    | 0    4    3                     |
+----+-------------------------------------------+-----------------+--------------------------+---------------------------------+
| 1  | $.campaignType                            | campaignType    | Mobile push              | 0    5                          |
+----+-------------------------------------------+-----------------+--------------------------+---------------------------------+
| 2  | $[0].correlationIds[0].campaignId[0]      | 0               | 1                        | 0    1    1    1    1    1      |
+----+-------------------------------------------+-----------------+--------------------------+---------------------------------+
| 2  | $[0].correlationIds[0].campaignId[1]      | 1               | 2                        | 0    1    1    1    1    2      |
+----+-------------------------------------------+-----------------+--------------------------+---------------------------------+
| 2  | $[0].correlationIds[0].campaignId[2]      | 2               | 3                        | 0    1    1    1    1    3      |
+----+-------------------------------------------+-----------------+--------------------------+---------------------------------+
| 2  | $[0].correlationIds[0].correlationId[0].a | a               | b                        | 0    1    1    1    2    1    1 |
+----+-------------------------------------------+-----------------+--------------------------+---------------------------------+
| 2  | $[0].correlationIds[0].correlationId[1].c | c               | d                        | 0    1    1    1    2    2    1 |
+----+-------------------------------------------+-----------------+--------------------------+---------------------------------+
| 2  | $[0].correlationIds[0].correlationId[2].e | e               | f                        | 0    1    1    1    2    3    1 |
+----+-------------------------------------------+-----------------+--------------------------+---------------------------------+
| 2  | $[0].variantId                            | variantId       | 1278915                  | 0    1    2                     |
+----+-------------------------------------------+-----------------+--------------------------+---------------------------------+
| 2  | $[0].utmCampaign                          | utmCampaign     |                          | 0    1    3                     |
+----+-------------------------------------------+-----------------+--------------------------+---------------------------------+
| 2  | $[0].ua.os.major                          | ua.os.major     | 8                        | 0    1    4                     |
+----+-------------------------------------------+-----------------+--------------------------+---------------------------------+
| 2  | $[1].correlationIds[0].campaignId[0]      | 0               | 1                        | 0    2    1    1    1    1      |
+----+-------------------------------------------+-----------------+--------------------------+---------------------------------+
| 2  | $[1].correlationIds[0].campaignId[1]      | 1               | 2                        | 0    2    1    1    1    2      |
+----+-------------------------------------------+-----------------+--------------------------+---------------------------------+
| 2  | $[1].correlationIds[0].campaignId[2]      | 2               | 3                        | 0    2    1    1    1    3      |
+----+-------------------------------------------+-----------------+--------------------------+---------------------------------+
| 2  | $[1].correlationIds[0].correlationId[0].a | a               | b                        | 0    2    1    1    2    1    1 |
+----+-------------------------------------------+-----------------+--------------------------+---------------------------------+
| 2  | $[1].correlationIds[0].correlationId[1].c | c               | d                        | 0    2    1    1    2    2    1 |
+----+-------------------------------------------+-----------------+--------------------------+---------------------------------+
| 2  | $[1].correlationIds[0].correlationId[2].e | e               | f                        | 0    2    1    1    2    3    1 |
+----+-------------------------------------------+-----------------+--------------------------+---------------------------------+
| 2  | $[1].variantId                            | variantId       | 1278915                  | 0    2    2                     |
+----+-------------------------------------------+-----------------+--------------------------+---------------------------------+
| 2  | $[1].utmCampaign                          | utmCampaign     |                          | 0    2    3                     |
+----+-------------------------------------------+-----------------+--------------------------+---------------------------------+
| 2  | $[1].ua.os.major                          | ua.os.major     | 8                        | 0    2    4                     |
+----+-------------------------------------------+-----------------+--------------------------+---------------------------------+
| 3  | $.correlationId                           | correlationId   | ls7XmuuiThWzktUeewqgWg   | 0    1                          |
+----+-------------------------------------------+-----------------+--------------------------+---------------------------------+
| 3  | $.eventCreateTime                         | eventCreateTime | 2020-05-12T12:40:20.786Z | 0    2                          |
+----+-------------------------------------------+-----------------+--------------------------+---------------------------------+
| 3  | $.time                                    | time            | 1589287220786            | 0    3                          |
+----+-------------------------------------------+-----------------+--------------------------+---------------------------------+
| 3  | $.modifiedBy.clientApiKey                 | clientApiKey    | xxx                      | 0    4    2                     |
+----+-------------------------------------------+-----------------+--------------------------+---------------------------------+
| 3  | $.campaignType                            | campaignType    | Mobile push              | 0    5                          |
+----+-------------------------------------------+-----------------+--------------------------+---------------------------------+

The idea in short:

  • we use a recursive CTE to walk this down.
  • The query will test any fragment ([value] coming from OPENJSON) for being valid JSON.
  • If the fragment is valid, this walks deeper and deeper.
  • The column SortString is needed to get a final sort order.
  • The CAST() and COLLATE helps to avoid data type mismatch. Recursive CTEs are very picky with this...

Hint: If you deal with bigger JSONs you might need to set OPTION (MAXRECURSION 0) at the end of your query.

Enjoy :-)



来源:https://stackoverflow.com/questions/61800927/read-any-json-into-list-of-key-value-pairs-eav-format-in-sql-server

易学教程内所有资源均来自网络或用户发布的内容,如有违反法律规定的内容欢迎反馈
该文章没有解决你所遇到的问题?点击提问,说说你的问题,让更多的人一起探讨吧!