Question
I have two vectors of length 16. The first one, r, for example is:
r = [1;3;5;7;1;3;6;7;9;11;13;16;9;11;13;16];
r contains a list of IDs. I want to collect the indices of the duplicate IDs in r so that each group is a list of indices for one ID. I would then use these indices to access a second vector a and find the maximum value of a over the indices in each group.
Therefore, I would like to produce an output vector using r and a such that:
max(a(1),a(5)), max(a(2),a(6)), a(3), a(7), max(a(4),a(8)), max(a(9),a(13)), max(a(10),a(14)), max(a(11),a(15)), max(a(12),a(16))
I also want to keep the indices of the maximum values. How can I efficiently implement this in MATLAB?
Answer 1:
You can use the third output of unique to assign each unique number in r a unique ID. You can then bin all of the values of a that share the same ID with an accumarray call, where the key is the unique ID and the value is the element of a at the corresponding position. Passing @max as the function handle makes accumarray return the maximum element of each bin:
%// Define r and a
r = [1;3;5;7;1;3;6;7;9;11;13;16;9;11;13;16];
a = [...];
%// Relevant code
[~,~,id] = unique(r, 'stable');
out = accumarray(id(:), a(:), [], @max);
The 'stable' flag in unique matters because it makes unique assign IDs in order of first occurrence, so the output follows the order in which each ID first appears in r (1, 3, 5, 7, 6, ...). Without it, unique sorts the values in r before assigning IDs, and the groups come out in sorted order (1, 3, 5, 6, 7, ...) instead.
Here's a quick example. Let me set up your problem by generating a random 16-element array, stored in a, which is the vector you ultimately want to index. We'll also set up r:
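To make the effect of the flag concrete, here is a minimal sketch that compares the IDs unique produces with and without 'stable' for the r from the question. The variable names idSorted and idStable are just illustrative.

```matlab
r = [1;3;5;7;1;3;6;7;9;11;13;16;9;11;13;16];

% Default behaviour: IDs follow the sorted unique values 1,3,5,6,7,9,11,13,16
[~,~,idSorted] = unique(r);

% 'stable': IDs follow order of first appearance 1,3,5,7,6,9,11,13,16
[~,~,idStable] = unique(r, 'stable');

% Show r next to both ID vectors; they differ wherever r is 6 or 7
disp([r idSorted idStable]);
```

The two ID vectors only disagree for the values 6 and 7, because 7 appears in r before 6 does. Note that the listing in the question places a(7) before max(a(4),a(8)), which corresponds to the sorted variant, so dropping 'stable' is an option if that ordering is the one you need.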
rng(123);
a = rand(16,1);
r = [1;3;5;7;1;3;6;7;9;11;13;16;9;11;13;16];
This is what a looks like:
>> a
a =
0.6965
0.2861
0.2269
0.5513
0.7195
0.4231
0.9808
0.6848
0.4809
0.3921
0.3432
0.7290
0.4386
0.0597
0.3980
0.7380
After running through the code, we get this:
out =
0.7195
0.4231
0.2269
0.6848
0.9808
0.4809
0.3921
0.3980
0.7380
You can verify for yourself that this gives the right result. Specifically, the first element is the maximum of a(1) and a(5), which are 0.6965 and 0.7195 respectively, so the maximum is 0.7195. Similarly, the second element is the maximum of a(2) and a(6), which are 0.2861 and 0.4231, so the maximum is 0.4231, and so on.
If you also want to remember which indices were used to select the maximum elements, this will be slightly more complicated. What you need to do is call accumarray once again, but the values won't be those of a but the actual index values instead. You'd use the second output of max to get the actual location of the value chosen. However, with the nature of max, we can't grab only its second output without explicitly calling the two-output version of max (I really wish there was another way around this... Python has a function in NumPy called numpy.argmax), and because an anonymous function (i.e. @(x) ...) is a single expression, this can't be properly encapsulated in one, so you're going to need to create a custom function to do that.
Create a new function called maxmod and save it to a file called maxmod.m, with the following contents:
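If you'd rather not check every group by hand, a plain loop can do the same comparison. This is only a sketch for verification, not part of the solution itself; the names u and check are made up for the example, and it relies on the first output of unique with 'stable' listing the unique IDs in the same order as the group IDs.

```matlab
rng(123);
a = rand(16,1);
r = [1;3;5;7;1;3;6;7;9;11;13;16;9;11;13;16];

% accumarray version from above; u holds the unique IDs in order of appearance
[u,~,id] = unique(r, 'stable');
out = accumarray(id(:), a(:), [], @max);

% Reference version: one explicit max per unique ID
check = zeros(numel(u), 1);
for k = 1:numel(u)
    check(k) = max(a(r == u(k)));
end

% Both approaches pick the same elements of a, so the results are identical
disp(isequal(out, check));
```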
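function p = maxmod(vals, ind)
% Return the element of ind that points at the largest entry of vals(ind).
[~,ii] = max(vals(ind));  % ii is the position of the maximum within ind
p = ind(ii);              % map it back to an index into vals
This takes in an array, called vals, and a set of indices into that array, called ind. We find the maximum of the selected elements, then return which index gave us that maximum.
Afterwards, you'd call accumarray like so: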
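To see what this does for a single group, the two lines of the function body can be run by hand. The sketch below assumes the first five values of a from the example above and uses the group for ID 1, which sits at positions 1 and 5 of r.

```matlab
% First five values of a from the example above
a = [0.6965; 0.2861; 0.2269; 0.5513; 0.7195];

ind = [1; 5];            % positions in r that hold the ID 1
[~, ii] = max(a(ind));   % ii = 2, because a(5) = 0.7195 > a(1) = 0.6965
p = ind(ii);             % p = 5, an index into a rather than into ind
disp(p);
```

The last step is the important one: max reports a position within the selected subset, so it has to be mapped back through ind to get a position in the original array.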
%// Define r and a
r = [1;3;5;7;1;3;6;7;9;11;13;16;9;11;13;16];
a = [...];
%// Relevant code
[~,~,id] = unique(r, 'stable');
out = accumarray(id(:), (1:numel(r)).', [], @(x) maxmod(a,x));
This is now what I get:
>> out
out =
5
6
3
8
7
9
10
15
16
Each value is the location in a of the maximum element for the corresponding group.
Source: https://stackoverflow.com/questions/32913945/grouping-elements-with-the-same-id-and-finding-the-maximum-value-as-well-as-its
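If you'd rather avoid a separate function file, one possible alternative (not the approach described above, just a sketch) is to sort by group ID and then by descending value, and take the first row of each group. The names order, first, idx and vals are made up for this example. Once the indices are known, the maxima themselves follow by indexing a.

```matlab
rng(123);
a = rand(16,1);
r = [1;3;5;7;1;3;6;7;9;11;13;16;9;11;13;16];

[~,~,id] = unique(r, 'stable');

% Sort rows by group ID (ascending), then by value (descending);
% the second output gives the original row positions in sorted order
[~, order] = sortrows([id a], [1 -2]);

% The first row of each group in the sorted list is that group's maximum
first = [true; diff(id(order)) ~= 0];
idx  = order(first);   % index into a of each group's maximum
vals = a(idx);         % the maximum values themselves

disp([idx vals]);
```

For the example data this gives the same indices and maxima as the two accumarray calls. The trade-off is a sort over all elements in exchange for not calling a function handle once per group.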