Can using multiple JSON_VALUE-s in MSSQL query harm performance

為{幸葍}努か 提交于 2021-01-24 19:04:49

问题


I have table in sql server database which have column of json type.

Table - SomeTable
Id | Properties
1  | {"field1":"value1", "field2":"value2", "field3":"value3", "field4":"value4"}
2  | {"field1":"value1", "field2":"value2", "field3":"value3", "field4":"value4"}
...|...

I wrote select query which selects value of each field separately:

SELECT
    JSON_VALUE(Properties, '$.field1') as field1,
    JSON_VALUE(Properties, '$.field2') as field2,
    JSON_VALUE(Properties, '$.field3') as field3,
    JSON_VALUE(Properties, '$.field4') as field4
FROM SomeTable

I found this approach in Microsoft's docs (https://docs.microsoft.com/en-us/sql/relational-databases/json/json-data-sql-server?view=sql-server-ver15)

Can writing many JSON_VALUE-s in query harm performance ? Is SQL doing deserialization of string for each JSON_VALUE written in query.


回答1:


You may try to use OPENJSON() with an explicit schema to parse the JSON stored in the Properties column with one function call (against four or more JSON_VALUE() calls):

Table:

CREATE TABLE SomeTable (
   Id int,
   Properties varchar(1000)
)
INSERT INTO SomeTable (Id, Properties)
VALUES
   (1, '{"field1":"value1", "field2":"value2", "field3":"value3", "field4":"value4"}'),
   (2, '{"field1":"value1", "field2":"value2", "field3":"value3", "field4":"value4"}')

Statement:

SELECT s.Id, j.*
FROM SomeTable s
CROSS APPLY OPENJSON(s.Properties) WITH (
   field1 varchar(100) '$.field1',
   field2 varchar(100) '$.field2',
   field3 varchar(100) '$.field3',
   field4 varchar(100) '$.field4'
) j

Result:

Id  field1  field2  field3  field4
----------------------------------
1   value1  value2  value3  value4
2   value1  value2  value3  value4

As an additional note, the result from the JSON_VALUE() is a scalar value of type nvarchar(4000). With OPENJSON() and explicit schema you may define the appropriate data type for the returned columns.




回答2:


I thought I would do a test in regards to your question:

Can writing many JSON_VALUE-s in query harm performance ? Is SQL doing deserialization of string for each JSON_VALUE written in query.

So I create the below script and tested on my home work station:

/*
Id | Properties
1  | {"field1":"value1", "field2":"value2", "field3":"value3", "field4":"value4"}
2  | {"field1":"value1", "field2":"value2", "field3":"value3", "field4":"value4"}
*/

CREATE TABLE dbo.YourTable (ID int IDENTITY,
                            Properties nvarchar(4000));

WITH N AS(
    SELECT N
    FROM (VALUES(NULL),(NULL),(NULL),(NULL),(NULL),(NULL),(NULL),(NULL),(NULL),(NULL))N(N)),
Tally AS(
    SELECT ROW_NUMBER() OVER (ORDER BY (SELECT NULL)) AS I
    FROM N N1, N N2, N N3, N N4, N N5, N N6)
INSERT INTO dbo.YourTable (Properties)
SELECT N'{"field1":"value1", "field2":"value2", "field3":"value3", "field4":"value4"}'
FROM Tally;
GO

DECLARE @StartTime datetime2(7) = SYSDATETIME();

SELECT
    JSON_VALUE(Properties, '$.field1') as field1,
    JSON_VALUE(Properties, '$.field2') as field2,
    JSON_VALUE(Properties, '$.field3') as field3,
    JSON_VALUE(Properties, '$.field4') as field4
FROM dbo.YourTable;

PRINT DATEDIFF(MILLISECOND,@StartTime, SYSDATETIME());

SET @StartTime = SYSDATETIME();

SELECT YT.Id,
       OJ.field1,
       OJ.field2,
       OJ.field3,
       OJ.field4
FROM dbo.YourTable YT
     CROSS APPLY OPENJSON(YT.Properties)
                 WITH(field1 varchar(50),
                      field2 varchar(50),
                      field3 varchar(50),
                      field4 varchar(50)) OJ;

PRINT DATEDIFF(MILLISECOND,@StartTime, SYSDATETIME());

GO 10

DROP TABLE dbo.YourTable

This uses your solution, with JSON_VALUE and also an OPENJSON solution (which I'd written around the same time as Zhorov posted his answer). On my workstation, this resulted in the below output from the messages:

(1000000 rows affected)
5273
(1000000 rows affected)
3560
(1000000 rows affected)
5196
(1000000 rows affected)
3329
(1000000 rows affected)
5097
(1000000 rows affected)
3320
(1000000 rows affected)
5219
(1000000 rows affected)
3379
(1000000 rows affected)
5133
(1000000 rows affected)
3239
(1000000 rows affected)
5137
(1000000 rows affected)
3352
(1000000 rows affected)
5080
(1000000 rows affected)
3348
(1000000 rows affected)
5126
(1000000 rows affected)
3320
(1000000 rows affected)
5146
(1000000 rows affected)
3360
(1000000 rows affected)
5310
(1000000 rows affected)
3354

Notice that the second row, with OPENJSON, is about 2 second faster for every run. This is, however, with 1,000,000 million rows, but it does confirm that a single parse of the JSON is far faster than 4.



来源:https://stackoverflow.com/questions/63153959/can-using-multiple-json-value-s-in-mssql-query-harm-performance

易学教程内所有资源均来自网络或用户发布的内容,如有违反法律规定的内容欢迎反馈
该文章没有解决你所遇到的问题?点击提问,说说你的问题,让更多的人一起探讨吧!