Question
I have a table like the one below (these are just a few rows from my table):
T = table({'A';'A';'A';'B';'B';'B';'C';'C';'C';'C'}, {'x';'y';'z';'x';'w';'t';'z';'x';'t';'o'},[5;1;2;2;4;2;2;5;4;1], ...
'VariableNames', {'memberId', 'productId','Rating'});
T:
A x 5
A y 1
A z 2
B x 2
B w 4
B t 2
C z 2
C x 5
C t 4
C o 1
C u 3
D r 1
D t 2
D w 5
.
.
.
.
I need to take user A, create a new table with the same layout as table T, and add all rows related to user A to it. At this point the new table contains the following rows:
A x 5
A y 1
A z 2
Next, consider the products related to this user, i.e. x, y and z. All rows that contain x, then y, then z are added to the table. At this point the table contains the following rows:
A x 5
A y 1
A z 2
B x 2
C z 2
C x 5
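The two steps described so far can be sketched with logical indexing. This is only an illustration of the selection, not a full solution; the names is_user, user_products, is_related and step2 are made up for the example.

```matlab
% The sample table from the top of the question.
T = table({'A';'A';'A';'B';'B';'B';'C';'C';'C';'C'}, ...
    {'x';'y';'z';'x';'w';'t';'z';'x';'t';'o'}, [5;1;2;2;4;2;2;5;4;1], ...
    'VariableNames', {'memberId', 'productId', 'Rating'});

% Step 1: all rows that belong to user A.
is_user = strcmp(T.memberId, 'A');

% Step 2: rows of other users that contain one of A's products.
user_products = unique(T.productId(is_user), 'stable');
is_related = ismember(T.productId, user_products) & ~is_user;

% A's rows first, then the related rows in their original order in T.
step2 = [T(is_user, :); T(is_related, :)];
```

For the sample table, step2 holds the six rows listed above, in the same order.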
Then the other users that have been added to the table, i.e. B and C, are considered, and the same procedure that was applied to the first user (A) is applied to each of them (first B, then C). This continues until the required number of rows has been added to the table. Here, for example, 8 rows are required, so the end result is as follows:
A x 5
A y 1
A z 2
B x 2
C z 2
C x 5
B w 4
B t 2
In other words, the work is finished once the requested number of rows has been imported into the second table.
I would be grateful if anybody could help me with this.
Answer 1:
Here is one way to do what you ask (though some cases are not well defined in your question):
% I added user 'D' for the scenario of an unconnected node
T = table({'A';'A';'A';'B';'B';'B';'C';'C';'C';'C';'D';'D';'D';'D'},...
    {'x';'y';'z';'x';'w';'t';'z';'x';'t';'o';'q';'p';'f';'v'},...
    [5;1;2;2;4;2;2;5;4;1;4;5;2;1], ...
    'VariableNames', {'memberId', 'productId','Rating'});
% initial preparations:
rows_limit = 8;
first_user = 'B'; % this is just for readability
newT = table(cell(rows_limit,1),cell(rows_limit,1),zeros(rows_limit,1),...
    'VariableNames',{'memberId', 'productId','Rating'});
% We need an index vector so we won't add the same row twice:
added = false(height(T),1);
row_count = 1;
last_user_row = 1; % defined up front in case first_user has no rows in T
users_list = {first_user};
% now we start adding rows to newT until it's full
% (or until every row of T has been added, so the loop always terminates):
while row_count<=rows_limit && ~all(added)
    while numel(users_list)>=1
        % get all the user's rows
        next_to_add = strcmp(T.memberId,users_list{1}) & ~added;
        % if this user has any rows to be added:
        if sum(next_to_add)>0
            % if there are enough empty rows in newT add them to it:
            if sum(next_to_add) <= rows_limit-row_count+1
                newT(row_count:row_count+sum(next_to_add)-1,:) = T(next_to_add,:);
                % and update the index vector:
                added = added | strcmp(T.memberId,users_list{1});
            else
                % otherwise - fill the empty rows and quit the loop:
                if row_count <= rows_limit
                    end_to_add = find(next_to_add,rows_limit-row_count+1);
                    newT(row_count:rows_limit,:) = T(end_to_add,:);
                end
                row_count = rows_limit+1; % to exit the outer loop
                break
            end
            row_count = row_count+sum(next_to_add);
            % Add related products:
            % ====================
            % save the row of the first new user to be added by related products:
            last_user_row = row_count;
            % get all the products we already added to newT:
            products = unique(newT.productId(1:row_count-1),'stable');
            % although we want only the last user's products, all the earlier
            % products were added before, so our index vector ('added') will
            % eliminate them
            for p = 1:numel(products)
                % get all the product's rows
                next_to_add = strcmp(T.productId,products{p}) & ~added;
                % if this product has any rows to be added:
                if sum(next_to_add)>0
                    % if there are enough empty rows in newT add them to it:
                    if sum(next_to_add) <= rows_limit-row_count+1
                        newT(row_count:row_count+sum(next_to_add)-1,:) = T(next_to_add,:);
                        % and update the index vector:
                        added = added | strcmp(T.productId,products{p});
                    else
                        % otherwise - fill the empty rows and quit the loop:
                        if row_count <= rows_limit
                            end_to_add = find(next_to_add,rows_limit-row_count+1);
                            newT(row_count:rows_limit,:) = T(end_to_add,:);
                        end
                        row_count = rows_limit+1; % to exit the outer loop
                        break
                    end
                end
                row_count = row_count+sum(next_to_add);
            end
        end
        % get the list of new users we just added, keep only those that still
        % have rows left to add (otherwise the loop could stall on a user with
        % nothing to add), and concat to the users left in the original list:
        new_users = unique(newT.memberId(last_user_row:row_count-1),'stable');
        left_users = unique(T.memberId(~added),'stable');
        users_list = unique([new_users(ismember(new_users,left_users)); left_users],'stable');
    end
end
% drop preallocated rows that were never filled (only when T ran out of rows):
newT = newT(1:min(row_count-1,rows_limit),:);
Which gives newT:
memberId productId Rating
________ _________ ______
'B' 'x' 2
'B' 'w' 4
'B' 't' 2
'A' 'x' 5
'C' 'x' 5
'C' 't' 4
'A' 'y' 1
'A' 'z' 2
In this implementation, the rows are added user by user and product by product. If the next user/product to be added has more rows than what's available in newT, we add as many rows as we can until we reach rows_limit, and then the loop quits.
So for rows_limit = 4, you will get newT as:
memberId productId Rating
________ _________ ______
'B' 'x' 2
'B' 'w' 4
'B' 't' 2
'A' 'x' 5
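The truncation relies on the two-argument form of find, which returns only the first k indices of the true entries. A minimal sketch with a made-up mask (the values below are illustrative and not taken from T):

```matlab
% Hypothetical mask over six rows: true marks a row that is still to be added.
next_to_add = logical([0; 1; 0; 1; 1; 1]);

% Suppose only two empty rows are left in the target table.
free_slots = 2;

% find(mask, k) returns the indices of the first k true entries.
end_to_add = find(next_to_add, free_slots);
```

Here end_to_add is [2; 4], so only the first two candidate rows are copied and the rest are dropped.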
As long as there are connections between users, so that each user's related products bring new users to the list, the loop continues with the new users in newT. However, we might start from a node whose network does not include all the other nodes. For instance, have a look at the following graph, which illustrates the connections in the extended example I used in the code above:
Node D is not connected to any of the others, so unless we actively look for new unrelated users in T, we will never get to it. The implementation above does look for such users.
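To check which users can be reached from a given starting user, the table can be turned into a graph with one edge per row. This is only a sketch, assuming a MATLAB release that provides graph and conncomp (R2015b or later) and assuming user names and product names never coincide; the variable names are illustrative.

```matlab
% Same extended example table as in the code above.
T = table({'A';'A';'A';'B';'B';'B';'C';'C';'C';'C';'D';'D';'D';'D'}, ...
    {'x';'y';'z';'x';'w';'t';'z';'x';'t';'o';'q';'p';'f';'v'}, ...
    [5;1;2;2;4;2;2;5;4;1;4;5;2;1], ...
    'VariableNames', {'memberId', 'productId', 'Rating'});

% One edge per row, connecting a user node to a product node.
% Assumption: no user shares a name with a product.
G = graph(T.memberId, T.productId);

% Component index of every node, then the nodes in the same component as 'B'.
bins = conncomp(G);
in_B_component = bins == bins(findnode(G, 'B'));
reachable_nodes = G.Nodes.Name(in_B_component);

% Users reachable from 'B', and users that are only found by the fallback
% lookup for unrelated users.
all_users = unique(T.memberId);
reachable_users = intersect(all_users, reachable_nodes);
unreachable_users = setdiff(all_users, reachable_nodes);
```

For the example table, A, B and C end up in reachable_users and D ends up in unreachable_users, which matches the description of node D above.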
Source: https://stackoverflow.com/questions/39722872/adding-multiple-rows-of-a-table-to-another-table