Remove fields from JSON dynamically using Json.Net

深忆病人 2020-12-07 03:13

I have some JSON input whose shape I cannot predict, and I need to transform it so that certain fields are not logged. For instance,

2 Answers
  • 2020-12-07 03:17

    You can parse your JSON to a JContainer (which is either an object or array), then search the JSON hierarchy using DescendantsAndSelf() for properties with names that match some Regex, or string values that match a Regex, and remove those items with JToken.Remove().

    For instance, given the following JSON:

    {
      "Items": [
        {
          "id": 5,
          "name": "Peter",
          "password": "some pwd"
        },
        {
          "id": 5,
          "name": "Peter",
          "password": "some pwd"
        }
      ],
      "RootPasswrd2": "some pwd",
      "SecretData": "This data is secret",
      "StringArray": [
        "I am public",
        "This is also secret"
      ]
    }
    

    You can remove all properties whose name matches the regular expression "pass.*w.*r.*d" (case-insensitively, so that both "password" and "RootPasswrd2" are caught) as follows:

    // Requires: using System.Linq; using System.Text.RegularExpressions; using Newtonsoft.Json.Linq;
    var root = (JContainer)JToken.Parse(jsonString);
    
    var nameRegex = new Regex(".*pass.*w.*r.*d.*", RegexOptions.IgnoreCase | RegexOptions.CultureInvariant);
    var query = root.DescendantsAndSelf()
        .OfType<JProperty>()
        .Where(p => nameRegex.IsMatch(p.Name));
    query.RemoveFromLowestPossibleParents();
    

    This results in:

    {
      "Items": [
        {
          "id": 5,
          "name": "Peter"
        },
        {
          "id": 5,
          "name": "Peter"
        }
      ],
      "SecretData": "This data is secret",
      "StringArray": [
        "I am public",
        "This is also secret"
      ]
    }
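If the extension methods used above are not available, the property-name pass can also be sketched with plain `JProperty.Remove()`. This is a minimal sketch rather than the answer's own helper; the class name `RemoveByNameSketch`, the method `RemoveByName`, and the shortened sample JSON are assumptions made for illustration. The matches are copied into a list first so that the tree is not modified while `DescendantsAndSelf()` is still being enumerated.

```csharp
using System;
using System.Linq;
using System.Text.RegularExpressions;
using Newtonsoft.Json.Linq;

public static class RemoveByNameSketch
{
    // Removes every property whose name matches nameRegex, at any depth.
    // Assumes the JSON root is an object or an array.
    public static JContainer RemoveByName(string json, Regex nameRegex)
    {
        var root = (JContainer)JToken.Parse(json);

        var matches = root.DescendantsAndSelf()
            .OfType<JProperty>()
            .Where(p => nameRegex.IsMatch(p.Name))
            .ToList(); // materialize before mutating the tree

        foreach (var property in matches)
            property.Remove(); // detaches the property from its parent object

        return root;
    }

    public static void Main()
    {
        var json = @"{
          ""Items"": [ { ""id"": 5, ""name"": ""Peter"", ""password"": ""some pwd"" } ],
          ""RootPasswrd2"": ""some pwd""
        }";
        var nameRegex = new Regex("pass.*w.*r.*d",
            RegexOptions.IgnoreCase | RegexOptions.CultureInvariant);

        Console.WriteLine(RemoveByName(json, nameRegex));
    }
}
```

Because `Regex.IsMatch` looks for a match anywhere in the name, the leading and trailing `.*` in the original pattern are optional.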
    

    You can also remove all string values that contain the substring "secret" (case-insensitively):

    var valueRegex = new Regex(".*secret.*", RegexOptions.IgnoreCase);
    var query2 = root.DescendantsAndSelf()
        .OfType<JValue>()
        .Where(v => v.Type == JTokenType.String && valueRegex.IsMatch((string)v));
    query2.RemoveFromLowestPossibleParents();
    
    var finalJsonString = root.ToString();
    

    Applied after the first transform, this results in:

    {
      "Items": [
        {
          "id": 5,
          "name": "Peter"
        },
        {
          "id": 5,
          "name": "Peter"
        }
      ],
      "StringArray": [
        "I am public"
      ]
    }
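The value-based pass can be sketched without the extension methods as well. As the answer's helper notes, removing a value does not by itself remove the property that contains it, so this sketch removes the whole property when the matching value sits directly under one, and removes only the element when it sits in an array. The names `RemoveByValueSketch` and `RemoveByValue` are hypothetical, and unlike the answer's helper this version leaves the removed value attached to its detached property.

```csharp
using System;
using System.Linq;
using System.Text.RegularExpressions;
using Newtonsoft.Json.Linq;

public static class RemoveByValueSketch
{
    // Removes every string value matching valueRegex, at any depth.
    // Assumes the JSON root is an object or an array.
    public static JContainer RemoveByValue(string json, Regex valueRegex)
    {
        var root = (JContainer)JToken.Parse(json);

        var matches = root.DescendantsAndSelf()
            .OfType<JValue>()
            .Where(v => v.Type == JTokenType.String && valueRegex.IsMatch((string)v))
            .ToList(); // materialize before mutating the tree

        foreach (var value in matches)
        {
            if (value.Parent is JProperty property)
                property.Remove(); // drop the name together with the value
            else
                value.Remove();    // array element: drop just this entry
        }

        return root;
    }

    public static void Main()
    {
        var json = @"{
          ""SecretData"": ""This data is secret"",
          ""StringArray"": [ ""I am public"", ""This is also secret"" ]
        }";
        var valueRegex = new Regex("secret", RegexOptions.IgnoreCase);

        Console.WriteLine(RemoveByValue(json, valueRegex));
    }
}
```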
    

    For convenience, I am using the following extension methods:

    using System.Collections.Generic;
    using System.Linq;
    using Newtonsoft.Json.Linq;

    public static partial class JsonExtensions
    {
        public static TJToken RemoveFromLowestPossibleParent<TJToken>(this TJToken node) where TJToken : JToken
        {
            if (node == null)
                return null;
            JToken toRemove;
            var property = node.Parent as JProperty;
            if (property != null)
            {
                // Also detach the node from its immediate containing property -- Remove() does not do this even though it seems like it should
                toRemove = property;
                property.Value = null;
            }
            else
            {
                toRemove = node;
            }
            if (toRemove.Parent != null)
                toRemove.Remove();
            return node;
        }
    
        public static IEnumerable<TJToken> RemoveFromLowestPossibleParents<TJToken>(this IEnumerable<TJToken> nodes) where TJToken : JToken
        {
            var list = nodes.ToList();
            foreach (var node in list)
                node.RemoveFromLowestPossibleParent();
            return list;
        }
    }
    


  • 2020-12-07 03:19

    You can parse your JSON into a JToken, then use a recursive helper method to match property names to your regexes. Wherever there's a match, you can remove the property from its parent object. After all sensitive info has been removed, just use JToken.ToString() to get the redacted JSON.

    Here is what the helper method might look like:

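    // Requires: using System.Collections.Generic; using System.Linq;
    //           using System.Text.RegularExpressions; using Newtonsoft.Json.Linq;
    public static string RemoveSensitiveProperties(string json, IEnumerable<Regex> regexes)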
    {
        JToken token = JToken.Parse(json);
        RemoveSensitiveProperties(token, regexes);
        return token.ToString();
    }
    
    public static void RemoveSensitiveProperties(JToken token, IEnumerable<Regex> regexes)
    {
        if (token.Type == JTokenType.Object)
        {
            foreach (JProperty prop in token.Children<JProperty>().ToList())
            {
                bool removed = false;
                foreach (Regex regex in regexes)
                {
                    if (regex.IsMatch(prop.Name))
                    {
                        prop.Remove();
                        removed = true;
                        break;
                    }
                }
                if (!removed)
                {
                    RemoveSensitiveProperties(prop.Value, regexes);
                }
            }
        }
        else if (token.Type == JTokenType.Array)
        {
            foreach (JToken child in token.Children())
            {
                RemoveSensitiveProperties(child, regexes);
            }
        }
    }
    

    And here is a short demo of its use:

    public static void Test()
    {
        string json = @"
        {
          ""users"": [
            {
              ""id"": 5,
              ""name"": ""Peter Gibbons"",
              ""company"": ""Initech"",
              ""login"": ""pgibbons"",
              ""password"": ""Sup3rS3cr3tP@ssw0rd!"",
              ""financialDetails"": {
                ""creditCards"": [
                  {
                    ""vendor"": ""Viza"",
                    ""cardNumber"": ""1000200030004000"",
                    ""expDate"": ""2017-10-18"",
                    ""securityCode"": 123,
                    ""lastUse"": ""2016-10-15""
                  },
                  {
                    ""vendor"": ""MasterCharge"",
                    ""cardNumber"": ""1001200230034004"",
                    ""expDate"": ""2018-05-21"",
                    ""securityCode"": 789,
                    ""lastUse"": ""2016-10-02""
                  }
                ],
                ""bankAccounts"": [
                  {
                    ""accountType"": ""checking"",
                    ""accountNumber"": ""12345678901"",
                    ""financialInsitution"": ""1st Bank of USA"",
                    ""routingNumber"": ""012345670""
                  }
                ]
              },
              ""securityAnswers"":
              [
                  ""Constantinople"",
                  ""Goldfinkle"",
                  ""Poppykosh""
              ],
              ""interests"": ""Computer security, numbers and passwords""
            }
          ]
        }";
    
        Regex[] regexes = new Regex[]
        {
            new Regex("^.*password.*$", RegexOptions.IgnoreCase),
            new Regex("^.*number$", RegexOptions.IgnoreCase),
            new Regex("^expDate$", RegexOptions.IgnoreCase),
            new Regex("^security.*$", RegexOptions.IgnoreCase),
        };
    
        string redactedJson = RemoveSensitiveProperties(json, regexes);
        Console.WriteLine(redactedJson);
    }
    

    Here is the resulting output:

    {
      "users": [
        {
          "id": 5,
          "name": "Peter Gibbons",
          "company": "Initech",
          "login": "pgibbons",
          "financialDetails": {
            "creditCards": [
              {
                "vendor": "Viza",
                "lastUse": "2016-10-15"
              },
              {
                "vendor": "MasterCharge",
                "lastUse": "2016-10-02"
              }
            ],
            "bankAccounts": [
              {
                "accountType": "checking",
                "financialInsitution": "1st Bank of USA"
              }
            ]
          },
          "interests": "Computer security, numbers and passwords"
        }
      ]
    }
    

    Fiddle: https://dotnetfiddle.net/KcSuDt
