Question
Say there is a string in the loose "format",
string str = "V1,B=V1,C=V1,V2,V3,D=V1,V2,A=V1,=V2,V3";
and a known set of Keys
List<string> lst = new List<string>() { "A", "B", "C", "D" };
How can the Key-Value pairs shown below be extracted? (Any text before the first Key should be treated as the Value for the null Key. Also the Values shown below have any trailing comma removed.)
Key     Value
(null)  V1
A       V1,=V2,V3  (The = here is, unfortunately, part of the value)
B       V1
C       V1,V2,V3
D       V1,V2
This problem is difficult because it is not possible to split immediately on either the = or the , character.
Answer 1:
Ignoring the known set of keys, and assuming each key appears only once:
string str = "V1,B=V1,C=V1,V2,V3,D=V1,V2,A=V1,=V2,V3";
var splitByEqual = new[] {'='};
var values = Regex.Split(str, @",(?=\w+=)")
    .Select(token => token.Split(splitByEqual, 2))
    .ToDictionary(pair => pair.Length == 1 ? "" : pair.First(),
                  pair => pair.Last());
- The regex is pretty simple: split by commas that are followed by a key (any alphanumeric) and an equal sign. (If we allow A=V1,V2=V3 this wouldn't work.)
- Now we have the collection { "V1", "B=V1", "C=V1,V2,V3", "D=V1,V2", "A=V1,=V2,V3" }. We split that by =, but not more than once.
- Next we create a dictionary. This line is a little ugly, but isn't too important - we already have the data we need. I'm also using an empty string instead of null.
If we do want to use the known list of keys, we can change the pattern to:
var splitPattern = @",(?=(?:" + String.Join("|", keys.Select(Regex.Escape)) + ")=)";
and use Regex.Split(str, splitPattern).
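For reference, here is a minimal, self-contained sketch of the known-keys variant. The class and method names (KeyValueSplitter, Parse) are made up for illustration, and it assumes the key list is non-empty and that each key appears at most once (ToDictionary throws on duplicates). Unlike the snippet above, it only treats the text before the first = as a key if that text is in the known set.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using System.Text.RegularExpressions;

public static class KeyValueSplitter
{
    // Hypothetical helper: splits on commas that are followed by "<known key>=".
    public static Dictionary<string, string> Parse(string input, IEnumerable<string> keys)
    {
        var keySet = new HashSet<string>(keys);

        // Lookahead keeps the key in the token; only the separating comma is consumed.
        string splitPattern = @",(?=(?:" + string.Join("|", keySet.Select(Regex.Escape)) + ")=)";

        return Regex.Split(input, splitPattern)
            .Select(token =>
            {
                int eq = token.IndexOf('=');
                // Only accept the prefix as a key if it is one of the known keys.
                bool hasKey = eq > 0 && keySet.Contains(token.Substring(0, eq));
                return new KeyValuePair<string, string>(
                    hasKey ? token.Substring(0, eq) : "",       // "" stands in for the null key
                    hasKey ? token.Substring(eq + 1) : token);
            })
            .ToDictionary(p => p.Key, p => p.Value);
    }
}
```

Called with the string and key list from the question, this should yield five entries, with "" mapped to V1 and A mapped to V1,=V2,V3.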
Answer 2:
Assuming the keys do not also occur in the values:
- For each key, search for the regexp "(,|^)" + KEY + "=" (the alternation must be grouped, otherwise the pattern matches any comma)
- split the string at the found locations
- then process each split string individually: anything before the first = is the key, anything after is the value
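The steps above could be sketched roughly as follows. This is only one way to do it: the names KeyBoundarySplitter and Parse are invented for the example, and it uses a lookbehind, (?<=^|,), instead of a consuming group so that each match index points directly at the key. It relies on the same assumption as the answer, namely that "KEY=" never occurs right after a comma inside a value.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using System.Text.RegularExpressions;

public static class KeyBoundarySplitter
{
    // Hypothetical helper illustrating the three steps listed above.
    public static List<KeyValuePair<string, string>> Parse(string input, IEnumerable<string> keys)
    {
        // Step 1: for each key, record every position where "KEY=" starts,
        // either at the start of the string or right after a comma.
        var starts = new SortedSet<int>();
        foreach (string key in keys)
        {
            string pattern = "(?<=^|,)" + Regex.Escape(key) + "=";
            foreach (Match m in Regex.Matches(input, pattern))
            {
                starts.Add(m.Index);
            }
        }

        // Step 2: split the string at the found locations.
        var cuts = starts.ToList();
        var result = new List<KeyValuePair<string, string>>();
        int firstCut = cuts.Count > 0 ? cuts[0] : input.Length;
        if (firstCut > 0)
        {
            // Text before the first key is the value of the null key.
            result.Add(new KeyValuePair<string, string>(
                null, input.Substring(0, firstCut).TrimEnd(',')));
        }

        for (int i = 0; i < cuts.Count; i++)
        {
            int end = i + 1 < cuts.Count ? cuts[i + 1] : input.Length;
            string segment = input.Substring(cuts[i], end - cuts[i]).TrimEnd(',');

            // Step 3: everything before the first '=' is the key, the rest is the value.
            int eq = segment.IndexOf('=');
            result.Add(new KeyValuePair<string, string>(
                segment.Substring(0, eq), segment.Substring(eq + 1)));
        }
        return result;
    }
}
```

For the string in the question this returns the pairs in order of appearance: (null, V1), (B, V1), (C, V1,V2,V3), (D, V1,V2), (A, V1,=V2,V3).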
Answer 3:
Can't you remove the leading = before you split? Here's an approach using String.Split and LINQ:
var pairs = str.Split(new[] { ',' }, StringSplitOptions.RemoveEmptyEntries)
    .Select(x => new { KeyVals = x.TrimStart('=').Split('=') })
    .Select(x => new
    {
        Key = x.KeyVals.Length == 1 ? null : x.KeyVals[0].Trim(),
        Value = x.KeyVals.Last().Trim()
    })
    .GroupBy(x => x.Key)
    .Select(g => new { g.Key, Values = g.Select(x => x.Value) });
Output:
foreach (var keyVal in pairs)
    Console.WriteLine("Key:{0} Values:{1}", keyVal.Key, string.Join(",", keyVal.Values));
Key: Values:V1,V2,V3,V2,V2,V3
Key:B Values:V1
Key:C Values:V1
Key:D Values:V1
Key:A Values:V1
The result differs from your desired output, so maybe I'm on the wrong track. It's also not clear why you need the "known set of Keys". If you want to filter by them, add a Where before the GroupBy.
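As a rough illustration of that last suggestion, the filter might look like the sketch below. The wrapper names (FilteredGrouping, Group) are hypothetical, and keeping the entries with a null key is an assumption; the answer does not say what should happen to them.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class FilteredGrouping
{
    // Hypothetical wrapper around the query above, with a Where added before the GroupBy.
    public static List<KeyValuePair<string, List<string>>> Group(string str, ICollection<string> keys)
    {
        return str.Split(new[] { ',' }, StringSplitOptions.RemoveEmptyEntries)
            .Select(x => x.TrimStart('=').Split('='))
            .Select(kv => new
            {
                Key = kv.Length == 1 ? null : kv[0].Trim(),
                Value = kv[kv.Length - 1].Trim()
            })
            // Assumption: keep unkeyed entries, drop entries whose key is not in the known set.
            .Where(x => x.Key == null || keys.Contains(x.Key))
            .GroupBy(x => x.Key)
            .Select(g => new KeyValuePair<string, List<string>>(
                g.Key, g.Select(x => x.Value).ToList()))
            .ToList();
    }
}
```

Note that the filter only removes unknown keys. Because the string is still split on every comma, values such as V2 and V3 continue to land under the null key rather than under C.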
Answer 4:
I hate myself for going all old-school, but try replacing the leading = with another character before the split then put it back afterwards:
private static List<KeyValuePair<string, string>> ExtractData(string dataString, List<string> keys)
{
    // Convert any leading "=" to another character to avoid losing it :)
    // (assumes '+' does not otherwise appear in the data)
    dataString = dataString.Replace(",=", ",+");
    List<KeyValuePair<string, string>> result = new List<KeyValuePair<string, string>>();
    // Split on equals and comma
    var entries = dataString.Split(new char[] { '=', ',' }, StringSplitOptions.RemoveEmptyEntries);
    // Start with null key
    string key = null;
    // Start with blank value for each key
    string value = "";
    foreach (string entry in entries)
    {
        // Put back any removed '='
        string text = entry.Replace('+', '=');
        if (keys.Contains(entry))
        {
            // Save previous key value
            if (!string.IsNullOrEmpty(value))
            {
                result.Add(new KeyValuePair<string, string>(key, value.TrimEnd(new char[] { ',' })));
            }
            key = entry;
            value = "";
        }
        else
        {
            value += text + ",";
        }
    }
    // Save last result
    result.Add(new KeyValuePair<string, string>(key, value.TrimEnd(new char[] { ',' })));
    return result;
}
I know this can be shortened with LINQ etc, but no time to make it pretty :)
Source: https://stackoverflow.com/questions/24136021/how-to-extract-key-value-pairs-from-a-string-when-the-key-itself-is-the-separat