group by in Matlab to find the value that resulted minimum similar to SQL

感情迁移 submitted on 2019-12-12 03:54:03

Question


I have a dataset with columns a, b, c and d. I want to group the dataset by a and b and, for each group, find the value of c at which d is minimal. I can do the "group by" with grpstats:

grpstats(M,[M(:,1) M(:,2) ],{'min'});

I don't know how to find the value of M(:,3) that corresponds to the minimum of d.

In SQL I suppose we would use nested queries for that, together with the primary keys. How can I solve it in MATLAB?

Here is an example:

>> M =[4,1,7,0.3;
2,1,8,0.4;
2,1,9,0.2;
4,2,1,0.2;
2,2,2,0.6;
4,2,3,0.1;
4,3,5,0.8;
5,3,6,0.2;
4,3,4,0.5;]

>> grpstats(M,[M(:,1) M(:,2)],'min')
ans =

2.0000    1.0000    8.0000    0.2000
2.0000    2.0000    2.0000    0.6000
4.0000    1.0000    7.0000    0.3000
4.0000    2.0000    1.0000    0.1000
4.0000    3.0000    4.0000    0.5000
5.0000    3.0000    6.0000    0.2000

But the third-column values in rows 1 and 4 of this result are wrong. The correct answer that I am looking for is:

2.0000    1.0000    9.0000    0.2000
2.0000    2.0000    2.0000    0.6000
4.0000    1.0000    7.0000    0.3000
4.0000    2.0000    3.0000    0.1000
4.0000    3.0000    4.0000    0.5000
5.0000    3.0000    6.0000    0.2000

To conclude, I don't want the minimum of the third column; I want its value corresponding to the minimum in the 4th column.


Answer 1:


grpstats won't do this, and MATLAB doesn't make it as easy as you might hope.

Sometimes brute force is best, even if it doesn't feel like great MATLAB style:

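[b,~,n] = unique(M(:,1:2),'rows');   % b: unique (a,b) pairs, n: group index of each row of M
a = zeros(size(b,1),2);              % preallocate one [c, d] pair per group
for i = 1:size(b,1)
    idx = find(n==i);                % rows of M that belong to group i
    [~,subidx] = min(M(idx,4));      % position of the minimum d within the group
    a(i,:) = M(idx(subidx),3:4);     % keep c and d from that row
end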

>> [b,a]
ans =
        2            1            9          0.2
        2            2            2          0.6
        4            1            7          0.3
        4            2            3          0.1
        4            3            4          0.5
        5            3            6          0.2
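
A loop-free variant of the same idea can be sketched with sortrows, assuming the example matrix M from the question. This is an alternative technique, not the method above: sort by a, b and then d, so that the minimum-d row comes first in each group, then keep the first row of each (a,b) group. The names S, isFirst and result are only illustrative.

```matlab
M = [4,1,7,0.3; 2,1,8,0.4; 2,1,9,0.2;
     4,2,1,0.2; 2,2,2,0.6; 4,2,3,0.1;
     4,3,5,0.8; 5,3,6,0.2; 4,3,4,0.5];

% Sort by column 1 (a), then column 2 (b), then column 4 (d).
% Within each (a,b) group the row with the smallest d now comes first.
S = sortrows(M, [1 2 4]);

% A row starts a new group when its (a,b) pair differs from the previous row.
isFirst = [true; any(diff(S(:,1:2), 1, 1) ~= 0, 2)];

% Keep the first row of every group: [a, b, c at min d, min d].
result = S(isFirst, :);
```

If several rows in a group tie for the minimum d, this keeps the one that appears first in M, since sortrows preserves the original order of equal rows.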



Answer 2:


I believe that

temp = grpstats(M(:, [1 2 4 3]),[M(:,1) M(:,2) ],{'min'});
result = temp(:, [1 2 4 3]);

would do what you require. If it doesn't, please explain in the comments and we can figure it out...

If I understand the documentation correctly, even

temp = grpstats(M(:, [1 2 4 3]), [1 2], {'min'});
result = temp(:, [1 2 4 3]);

should work (giving column numbers rather than full contents of columns)... Can't test right now, so can't vouch for that.
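
A caveat worth checking before relying on this: as far as I know, grpstats applies 'min' to each column separately, so reordering the columns would only reorder the per-column minima rather than pair c with the row where d is smallest. Here is a minimal toolbox-free sketch of that check on the example data, using accumarray as a stand-in for the per-group column minimum (keys, g, minC, minD and perColumn are illustrative names, not part of the answer):

```matlab
M = [4,1,7,0.3; 2,1,8,0.4; 2,1,9,0.2;
     4,2,1,0.2; 2,2,2,0.6; 4,2,3,0.1;
     4,3,5,0.8; 5,3,6,0.2; 4,3,4,0.5];

% g assigns every row of M to its (a,b) group; keys lists the groups.
[keys, ~, g] = unique(M(:,1:2), 'rows');

% Minimum of each column taken independently within each group.
minC = accumarray(g, M(:,3), [], @min);   % smallest c per group
minD = accumarray(g, M(:,4), [], @min);   % smallest d per group

% The same values come out whichever order the columns are processed in.
perColumn = [keys, minC, minD];
```

On the example data the c column of perColumn is 8 and 1 for groups (2,1) and (4,2), which is the output the question already calls wrong, not the desired 9 and 3.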



Source: https://stackoverflow.com/questions/18605141/group-by-in-matlab-to-find-the-value-that-resulted-minimum-similar-to-sql
