Question
I have two tables, A{int id, int grp} and B{int aid, int cat}.
Table B lists the categories that each record of table A belongs to, so B.aid is a foreign key that references A.id.
A.id is the unique primary key of table A.
B.cat contains a category number from 1 to 5; A.grp contains numbers from 1 to 1000.
Table A has 3 million records; table B has about 5 million.
For each group A.grp and each category B.cat, I need to calculate the percentage of records in that group that belong to the category, relative to the total number of records in the group.
For example, if A:[{1,1},{2,1},{3,2}] and B:[{1,3},{1,4},{2,3},{3,4}], then the query should return the following three-column table R{int grp, int cat, double percent}: [{1,3,100},{1,4,50},{2,4,100}]
How can I do this with a single LINQ query?
Ideally A should appear only once in that query, because I want to be able to replace A with A.Where(e => some complicated expression) without repeating that expression several times in the query.
Tables A and B are imported into LINQ to Entities with foreign keys, so it is possible to navigate with from a in A from b in a.B select b.cat
or from b in B select b.A.grp
Answer 1:
You can combine your queries like this:
var query = from g in
                (from a in db.A
                 group a by new
                 {
                     grp = a.grp
                 })
            join c in
                (from a in db.A
                 from b in a.B
                 group b by new
                 {
                     a.grp,
                     b.cat
                 })
            on g.Key.grp equals c.Key.grp
            select new
            {
                g.Key.grp,
                c.Key.cat,
                percent = c.Count() * 100 / g.Count()
            };
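To sanity-check the shape of this query against the example data from the question, here is a minimal LINQ to Objects sketch. The record types, the JoinSketch class, and the in-memory arrays are assumptions made for illustration; they stand in for db.A and db.B and say nothing about how LINQ to Entities translates the query. The sketch also multiplies by 100.0 rather than 100, since integer division would truncate fractional percentages.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical in-memory stand-ins for the two tables in the question.
public record A(int id, int grp);
public record B(int aid, int cat);

public static class JoinSketch
{
    public static List<(int grp, int cat, double percent)> Compute(
        IEnumerable<A> tableA, IEnumerable<B> tableB)
    {
        // Number of A records per group.
        var groups = from a in tableA
                     group a by a.grp;

        // B rows per (group, category) pair.
        var cats = from a in tableA
                   join b in tableB on a.id equals b.aid
                   group b by new { a.grp, b.cat };

        // Join the two groupings on the group number and compute the share.
        var query = from g in groups
                    join c in cats on g.Key equals c.Key.grp
                    orderby g.Key, c.Key.cat
                    select (grp: g.Key,
                            cat: c.Key.cat,
                            percent: c.Count() * 100.0 / g.Count());

        return query.ToList();
    }

    public static void Main()
    {
        var tableA = new[] { new A(1, 1), new A(2, 1), new A(3, 2) };
        var tableB = new[] { new B(1, 3), new B(1, 4), new B(2, 3), new B(3, 4) };

        // Prints 1,3,100 then 1,4,50 then 2,4,100 (one row per line).
        foreach (var r in Compute(tableA, tableB))
            Console.WriteLine($"{r.grp},{r.cat},{r.percent}");
    }
}
```

Note that the source A still appears twice here, exactly as in the query above, so a filter such as A.Where(...) would have to be applied in both places or stored in a variable first.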
Answer 2:
Here is SQL code that generates the desired result:
with grp as (
    select a.grp, cnt = count(*)
    from a
    group by a.grp
),
cat as (
    select a.grp, b.cat, cnt = count(*) * 100 / grp.cnt
    from a
    join b on b.aid = a.id
    join grp on grp.grp = a.grp
    group by a.grp, b.cat, grp.cnt
)
select * from cat
Here is LINQ code that generates the desired result:
var grp =
    from a in db.A
    group a by new { grp = a.grp };
var cat =
    from a in db.A
    from b in a.B
    group b by new { a.grp, b.cat };
var q = from g in grp
        join c in cat on g.Key.grp equals c.Key.grp
        select new { g.Key.grp, c.Key.cat, percent = c.Count() * 100 / g.Count() };
But it would be nice to have something like this:
from a in db.A
group a by new { grp = a.grp } into grp
from g in grp
from c in g.B
group c by new { gcnt = grp.Count(), c.cat } into cat
from c in cat
select new { c.A.grp, c.cat, cnt = cat.Count() * 100 / cat.Key.gcnt }
But it gives me the following runtime exception: The nested query is not supported. Operation1='GroupBy' Operation2='MultiStreamNest'
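For comparison, one way to reference the source only once is to count each group with let and then regroup that group's child rows by category. The sketch below runs on in-memory objects (LINQ to Objects) only; whether LINQ to Entities can translate the same shape without raising the nested-query exception is not verified here and may depend on the provider version. The record types and the SingleSourceSketch class are hypothetical stand-ins for the entity classes, with A.B playing the role of the navigation property.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical in-memory stand-ins; A.B mimics the navigation property a.B.
public record B(int aid, int cat);
public record A(int id, int grp, List<B> B);

public static class SingleSourceSketch
{
    public static List<(int grp, int cat, double percent)> Compute(IEnumerable<A> source)
    {
        var query =
            from a in source                       // the only reference to the source
            group a by a.grp into g
            let total = g.Count()                  // A records in this group
            from c in g.SelectMany(row => row.B)   // all B rows of the group
                       .GroupBy(b => b.cat)        // regrouped by category
            orderby g.Key, c.Key
            select (grp: g.Key, cat: c.Key, percent: c.Count() * 100.0 / total);

        return query.ToList();
    }

    public static void Main()
    {
        var source = new[]
        {
            new A(1, 1, new List<B> { new B(1, 3), new B(1, 4) }),
            new A(2, 1, new List<B> { new B(2, 3) }),
            new A(3, 2, new List<B> { new B(3, 4) }),
        };

        // Prints 1,3,100 then 1,4,50 then 2,4,100 (one row per line).
        foreach (var r in Compute(source))
            Console.WriteLine($"{r.grp},{r.cat},{r.percent}");
    }
}
```

Because the source is named once, replacing it with source.Where(...) needs a change in a single place.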
Source: https://stackoverflow.com/questions/5981476/using-result-of-aggregate-from-top-level-group-inside-lower-level-group