Default values for empty groups in Linq GroupBy query

Posted by 烈酒焚心 on 2019-12-06 10:55:23

Question


I have a data set of values that I want to summarise in groups. For each group, I want to create an array big enough to hold the values of the largest group. When a group contains fewer values than this maximum, I want to insert a default value of zero for each missing key.

Dataset

Col1    Col2    Value
--------------------
A       X       10
A       Z       15
B       X       9
B       Y       12
B       Z       6

Desired result

X, [10, 9]
Y, [0, 12]
Z, [15, 6]

Note that value "A" in Col1 has no entry for "Y" in Col2. "A" is the first group in the outer series, so it is the first element of the Y array that is missing.

The following query creates the result dataset, but does not insert the default zero values for the Y group.

var result = data.GroupBy(item => item.Col2)
                 .Select(group => new
                 {
                     name = group.Key,
                     data = group.Select(item => item.Value)
                                 .ToArray()
                 });

Actual result

X, [10, 9]
Y, [12]
Z, [15, 6]

What do I need to do to insert a zero as the missing group value?


Answer 1:


Here is how I understand it.

Let's say we have this:

class Data
{
    public string Col1, Col2;
    public decimal Value;
}

Data[] source =
{
    new Data { Col1="A", Col2 = "X", Value = 10 },
    new Data { Col1="A", Col2 = "Z", Value = 15 },
    new Data { Col1="B", Col2 = "X", Value = 9 },
    new Data { Col1="B", Col2 = "Y", Value = 12 },
    new Data { Col1="B", Col2 = "Z", Value = 6 },
};

First we need to determine the "fixed" part, the ordered list of distinct Col1 values:

var columns = source.Select(e => e.Col1).Distinct().OrderBy(c => c).ToList();

Then we can proceed with the normal grouping, but inside each group we left join the columns with the group's elements, which gives the desired behavior:

var result = source.GroupBy(e => e.Col2, (key, elements) => new
{
    Key = key,
    Elements = (from c in columns
             join e in elements on c equals e.Col1 into g
             from e in g.DefaultIfEmpty()
             select e != null ? e.Value : 0).ToList()
})
.OrderBy(e => e.Key)
.ToList();
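
The pieces above can be assembled into one self-contained sketch. The Summarise helper and the LeftJoinDemo class are hypothetical names introduced here for illustration; the query itself is the one shown above. It assumes each (Col1, Col2) pair occurs at most once, because a duplicate pair would contribute more than one value to the list.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public class Data
{
    public string Col1, Col2;
    public decimal Value;
}

public static class LeftJoinDemo
{
    // Hypothetical helper: one entry per Col2 value, each holding
    // one slot per distinct Col1 value (0 where no row exists).
    public static List<KeyValuePair<string, List<decimal>>> Summarise(IEnumerable<Data> source)
    {
        var rows = source.ToList();

        // The "fixed" part: every distinct Col1 value, in sorted order.
        var columns = rows.Select(e => e.Col1).Distinct().OrderBy(c => c).ToList();

        return rows
            .GroupBy(e => e.Col2, (key, elements) => new KeyValuePair<string, List<decimal>>(
                key,
                // Left join: every column appears once, matched or not.
                (from c in columns
                 join e in elements on c equals e.Col1 into g
                 from e in g.DefaultIfEmpty()
                 select e != null ? e.Value : 0m).ToList()))
            .OrderBy(p => p.Key)
            .ToList();
    }
}
```

Called with the sample source array above, Summarise should return X, [10, 9] then Y, [0, 12] then Z, [15, 6], with the zero in the slot that belongs to "A".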



Answer 2:


It won't be pretty, but you can do something like this:

var groups = data.GroupBy(d => d.Col2, d => d.Value)
                 .Select(g => new { g, count = g.Count() })
                 .ToList();
int maxG = groups.Max(p => p.count);
var paddedGroups = groups.Select(p => new {
                     name = p.g.Key,
                     data = p.g.Concat(Enumerable.Repeat(0, maxG - p.count)).ToArray() });
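
One thing to be aware of: this approach pads by count only, so the zeros are appended at the end of a short group rather than placed at the position of the missing Col1 value. The sketch below wraps the same query in a hypothetical PadGroups helper to make that visible. The tuple-based input type and the int values are assumptions, since the question does not show the element type.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class TailPaddingDemo
{
    // Hypothetical helper: pads every group with trailing zeros up to the
    // size of the largest group. Throws on empty input because Max() does.
    public static Dictionary<string, int[]> PadGroups(
        IEnumerable<(string Col1, string Col2, int Value)> data)
    {
        var groups = data.GroupBy(d => d.Col2, d => d.Value)
                         .Select(g => new { g, count = g.Count() })
                         .ToList();

        int maxG = groups.Max(p => p.count);

        // Zeros are appended after the existing values, never in between.
        return groups.ToDictionary(
            p => p.g.Key,
            p => p.g.Concat(Enumerable.Repeat(0, maxG - p.count)).ToArray());
    }
}
```

With the question's data, the Y entry comes out as [12, 0], not the [0, 12] asked for, so this fits only when the position within the array does not matter.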



Answer 3:


You can do it like this:

int maxCount = 0;
var result = data.GroupBy(x => x.Col2)
             .OrderByDescending(x => x.Count())
             .Select(x => 
                {
                   if (maxCount == 0)
                       maxCount = x.Count();
                   var Value = x.Select(z => z.Value);
                   return new 
                   {
                      name = x.Key,
                      data = maxCount == x.Count() ? Value.ToArray() : 
                                 Value.Concat(new int[maxCount - Value.Count()]).ToArray()
                   };
                });

Code explanation:

Since zeros must be appended whenever a group has fewer items than the largest one, I store the largest group size in the variable maxCount. To get it, I order the groups by item count in descending order, so the first group projected is the largest and its count becomes maxCount. While projecting, I check whether the number of items in the group differs from maxCount; if it does, I create an integer array of size (maxCount - x.Count()), i.e. the maximum count minus the number of items in the current group, and append it to the group's values. A new int array is zero-filled, so this supplies the default values.




Source: https://stackoverflow.com/questions/34334911/default-values-for-empty-groups-in-linq-groupby-query
