Remove comments from JSON data

你离开我真会死。 Submitted on 2019-12-02 13:05:01

Question


I need to remove all /*...*/ style comments from JSON data. How do I do it with regular expressions so that string values like this

{
    "propName": "Hello \" /* hi */ there."
}

remain unchanged?


Answer 1:


First, skip over everything that is inside double quotes, using either the backtracking control verbs (*SKIP) and (*FAIL) or a capturing group:

$string = <<<'LOD'
{
    "propName": "Hello \" /* don't remove **/ there." /*this must be removed*/
}
LOD;

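// Quoted strings are matched first and then rejected with (*SKIP)(*FAIL),
// so only comments outside strings are replaced by the empty string.
$result = preg_replace('~"(?:[^\\\"]+|\\\.)*+"(*SKIP)(*FAIL)|/\*(?:[^*]+|\*+(?!/))*+\*/~s', '', $string);

// The same with a capture: quoted strings are put back via $1,
// comments leave $1 empty and are therefore removed.
$result = preg_replace('~("(?:[^\\\"]+|\\\.)*+")|/\*(?:[^*]+|\*+(?!/))*+\*/~s', '$1', $string);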

Pattern details:

"(?:[^\\\"]+|\\\.)*+"

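This part describes the possible content inside quotes. It is shown as written in the PHP single-quoted string; after PHP's own unescaping, the regex engine receives "(?:[^\\"]+|\\.)*+".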

"              # literal quote
(?:            # open a non-capturing group
    [^\\\"]+   # all characters that are not \ or "
  |            # OR
    \\\.)*+    # escaped char (that can be a quote)
"

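You can then make this subpattern fail with (*SKIP)(*FAIL) or (*SKIP)(?!). (*FAIL) forces the pattern to fail at that point. (*SKIP) means that, when the pattern fails after it, the regex engine does not retry from any position before the point where (*SKIP) was reached; the next attempt starts there. Quoted parts are therefore skipped as a whole, and they can never appear in the result because the branch that matches them always fails.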
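The effect of (*SKIP)(*FAIL) can be checked by listing what the full pattern actually matches. The following is a minimal sketch that reuses the sample data and the pattern from the answer; the variable names are only illustrative.

```php
<?php
// Nowdoc: the content is taken literally, so \" stays a backslash plus a quote.
$json = <<<'JSON'
{
    "propName": "Hello \" /* don't remove **/ there." /*this must be removed*/
}
JSON;

$pattern = '~"(?:[^\\\"]+|\\\.)*+"(*SKIP)(*FAIL)|/\*(?:[^*]+|\*+(?!/))*+\*/~s';

// Collect every overall match. Quoted strings never show up here,
// because the branch that matches them always ends in (*FAIL).
preg_match_all($pattern, $json, $matches);

print_r($matches[0]); // contains only: /*this must be removed*/
```

Only the comment outside the string is reported, so replacing each match with an empty string leaves the quoted value intact.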

Alternatively, wrap the quoted-string subpattern in a capturing group and put its reference ($1) in the replacement, so that every quoted string is replaced by itself.
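As a sketch of the capture variant in use, the helper below (the name stripJsonComments is hypothetical, not part of PHP) wraps the pattern from the answer and then feeds the result to json_decode to confirm that the string value survived. The null check is an added precaution, since preg_replace returns null on a PCRE error.

```php
<?php
// Hypothetical helper; the pattern is the capture variant from the answer.
function stripJsonComments(string $json): string
{
    $pattern = '~("(?:[^\\\"]+|\\\.)*+")|/\*(?:[^*]+|\*+(?!/))*+\*/~s';
    // $1 holds the quoted string when the first branch matched
    // and is empty when a comment matched.
    $result = preg_replace($pattern, '$1', $json);
    if ($result === null) {
        throw new RuntimeException('preg_replace failed: ' . preg_last_error());
    }
    return $result;
}

$input = <<<'JSON'
{
    "propName": "Hello \" /* hi */ there." /* trailing comment */
}
JSON;

$clean = stripJsonComments($input);
$data  = json_decode($clean, true);

echo $data['propName'], PHP_EOL; // prints: Hello " /* hi */ there.
```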

/\*(?:[^*]+|\*+(?!/))*+\*/

This part describes the content inside comments:

/\*           # open the comment
(?:           
    [^*]+     # all characters except *
  |           # OR
    \*+(?!/)  # one or more * not followed by / (note that you can't
              # use a possessive quantifier here: \*+ must be able to
              # give back a * so the last one is left for the closing */)
)*+           # repeat the group zero or more times
\*/           # close the comment
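To see how the comment subpattern behaves on runs of asterisks, here is a small sketch that tries it on a few sample strings. The samples are made up for illustration.

```php
<?php
// The comment subpattern on its own.
$comment = '~/\*(?:[^*]+|\*+(?!/))*+\*/~';

$samples = [
    '/**/',               // empty comment
    '/***/',              // extra * before the closing */
    '/* a * b ** c **/',  // asterisks scattered inside the comment
    '/* never closed',    // unterminated comment
];

$results = [];
foreach ($samples as $sample) {
    // The sample counts as recognized only if the match covers all of it.
    $results[$sample] = preg_match($comment, $sample, $m) === 1 && $m[0] === $sample;
    printf("%-20s %s\n", $sample, $results[$sample] ? 'full match' : 'no full match');
}
```

The first three samples are matched in full. The unterminated one is not matched at all, and because the group is possessive the engine gives up on it without retrying shorter repetitions.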

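The s modifier (dot matches newline) matters here only in one case: when a backslash is directly followed by a newline inside quotes, so that the dot in the escaped-character branch can match that newline.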
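The difference can be shown with a contrived input in which a backslash is directly followed by a line feed inside the quotes. Such a string is not valid JSON, so this sketch only illustrates the regex behaviour; the variable names are illustrative.

```php
<?php
// Double-quoted PHP string: \" is a quote, \\ is one backslash, \n is a line feed.
// The resulting text is:  "abc \<LF> /* x */ def"
$text = "\"abc \\\n /* x */ def\"";

$withS    = '~"(?:[^\\\"]+|\\\.)*+"(*SKIP)(*FAIL)|/\*(?:[^*]+|\*+(?!/))*+\*/~s';
$withoutS = '~"(?:[^\\\"]+|\\\.)*+"(*SKIP)(*FAIL)|/\*(?:[^*]+|\*+(?!/))*+\*/~';

$kept    = preg_replace($withS, '', $text);
$damaged = preg_replace($withoutS, '', $text);

var_dump($kept === $text);    // bool(true): the whole quoted part is skipped
var_dump($damaged === $text); // bool(false): /* x */ was removed from inside the quotes
```

Without s, the dot cannot consume the line feed, the quoted-string branch fails at the opening quote, and the comment-like text inside the string is then matched by the second branch.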



Source: https://stackoverflow.com/questions/19910002/remove-comments-from-json-data
