Question
My data is every three days, but in my cell array, there are sometimes missing days. How can I make the matrix add dates when it skips a day and put a NaN into the Sample Measurement cell?
Here's an example. I put 2 lines from each of the 4 sites. There aren't any empty rows between the different sites - they are just there for clarity.
Latitude    Longitude    SiteID    Date Local  Sample Measurement
43.435      -88.527778   027-0007  4/12/2007   4.3
43.435      -88.527778   027-0007  4/15/2007   9.3

43.060975   -87.913504   079-0026  4/12/2007   7.9
43.060975   -87.913504   079-0026  4/15/2007   11.3

45.203885   -90.600123   119-8001  4/12/2007   3.3
45.203885   -90.600123   119-8001  4/18/2007   9.5

43.020075   -88.21507    133-0027  4/12/2007   7.3
43.020075   -88.21507    133-0027  4/18/2007   5.6
Here is sort of what I want - NaN's where there are missing days. As you can see, there are different SiteID's so I will need to maybe do unique to run through the sites separately.
Latitude    Longitude    SiteID    Date Local  Sample Measurement
43.435      -88.527778   027-0007  4/12/2007   4.3
43.435      -88.527778   027-0007  4/15/2007   9.3

43.060975   -87.913504   079-0026  4/12/2007   7.9
43.060975   -87.913504   079-0026  4/15/2007   11.3

45.203885   -90.600123   119-8001  4/12/2007   3.3
45.203885   -90.600123   119-8001  4/15/2007   NaN

43.020075   -88.21507    133-0027  4/12/2007   7.3
43.020075   -88.21507    133-0027  4/15/2007   NaN
I began something like this:
Set = datenum(2007,4,12):2:datenum(2007,10,15);
B = cat(2,PM25data(:,1:2), PM25data(:,6), PM25data(:,12), PM25data(:,16)); % Pull out only the columns needed
% B = {'Lat', 'Lon', 'SiteID', 'Date', 'Data'};
E = zeros(63, 5);
i = 1;
j = 1;
k = 1;
while i <= length(PM25site) && j <= length(E) && k <= length(B) % i = 1:4, j = 1:63, k = 1:32
    if datenum(B(j,4)) ~= datenum(Set(j))
        C = datenum(Set(j));
        D = NaN;
        E(j,:) = cat(2, str2double(B(j,1:3)), C, D);
        j = j+1;
    else
        E(j,:) = str2double(B(k,:));
        k = k+1;
        j = j+1;
    end
    E(:,3) = PM25site(i);
    i = i+1;
end
This code is not advancing correctly. I think I'm not indexing it correctly and the else branch is not correct. It puts down what I want, but only replaces the zeros for the first few rows and then leaves zeros all the way down.
Here's an example section:
45.203885 -90.600123 NaN 733144 3.3
45.203885 -90.600123 NaN 733146 NaN
45.203885 -90.600123 NaN 733148 NaN
45.203885 -90.600123 NaN 733150 NaN
0 0 0 0 0
0 0 0 0 0
0 0 0 0 0
0 0 0 0 0
I don't know if this is the best way to approach it. I just want to add NaN's where there is no data based on the dates.
Answer 1:
I don't think you need to iterate through with a while-loop. It will be slow, and doesn't utilise MATLAB's matrix capabilities. Here's how I would do it.
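% The data is sampled every three days, so step by 3 (the question's code steps by 2).
% Transpose so that all_dates is a column vector.
all_dates = (datenum(2007,4,12):3:datenum(2007,10,15))';
% Assuming PM25data is a cell array of strings, convert the columns we need to numbers.
% SiteIDs such as '027-0007' are not numeric, so use each row's index into site_names instead.
[site_names,~,site_idx] = unique(PM25data(:,6));
% Note that we take the datenum of column 4 here now
B = [str2double(PM25data(:,1:2)), site_idx(:), datenum(PM25data(:,12)), str2double(PM25data(:,16))];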
% First, generate a list of all siteIDs
[uID,ia] = unique(B(:,3));
% Now, preallocate the result matrix.
% Use NaNs, since we will overwrite all non-nan values in the final matrix
E = nan(length(all_dates)*length(uID),5);
% Set the date column
E(:,4) = repmat(all_dates,length(uID),1);
% Set the lat, long and ID columns
E(:,1) = reshape(repmat(B(ia,1)',length(all_dates),1),[],1);
E(:,2) = reshape(repmat(B(ia,2)',length(all_dates),1),[],1);
E(:,3) = reshape(repmat(uID',length(all_dates),1),[],1);
% Find the row of E that matches each site/date combination in B
[has_data,row] = ismember(B(:,3:4),E(:,3:4),'rows');
% And then set the data values, so each measurement lands on its own site/date row
E(row(has_data),5) = B(has_data,5);
Most of this should be pretty clear, but I'll just clarify a few points.
The second output of unique is an index vector that locates each unique value in the original matrix. This means that B(ia,3) generates a list of all the unique siteIDs. Additionally, B(ia,1) will generate a list of the latitudes for these siteIDs, and similarly for longitude.
repmat(all_dates,length(uID),1) repeats the list of all of the dates as many times as we have siteIDs. Essentially, we're making sure that we have a list containing all date+siteID combinations.
reshape(repmat(uID',length(all_dates),1),[],1) is a neat little one-liner that will generate the list of siteIDs repeating like [1;1;1;2;2;2;3;3;3;...] instead of [1;2;3;1;2;3;1;2;3;...].
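As a small illustration of the two orderings, here is a sketch with made-up values (three numeric site identifiers and two dates). The names ids, dates, id_col, date_col and pairs exist only for this example.

```matlab
% Hypothetical toy values, not taken from the question's data
ids   = [1; 2; 3];            % numeric site identifiers (column vector)
dates = [733144; 733147];     % two datenums three days apart (column vector)

% Each ID repeated once per date: [1;1;2;2;3;3]
id_col = reshape(repmat(ids', length(dates), 1), [], 1);

% The whole date list repeated once per ID: [d1;d2;d1;d2;d1;d2]
date_col = repmat(dates, length(ids), 1);

% Side by side, the two columns list every ID/date pair exactly once
pairs = [id_col, date_col];
disp(pairs)
```

Because one column repeats element by element and the other repeats as a whole list, the rows line up into the full set of combinations.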
Finally, we use the 'rows' option to get ismember to search for a combination of date and siteID. Using this, we determine which date and siteID combinations we have data for, and copy this data to our final matrix. Any date+siteIDs for which we do not have data will be left as NaNs.
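To make this last step concrete, here is a minimal sketch on made-up numbers. It uses the second output of ismember, the row position of each match, so each measurement lands on its own row even if the input is not sorted. The names obs, found and pos are hypothetical, and E here is a cut-down three-column grid rather than the five-column matrix above.

```matlab
% Hypothetical observations: [site index, datenum, measurement], deliberately unsorted
obs = [2 733144 7.9;
       1 733147 9.3;
       1 733144 4.3];

% Full grid of site/date pairs with NaN in the measurement column
E = [1 733144 NaN;
     1 733147 NaN;
     2 733144 NaN;
     2 733147 NaN];

% For each observation, look up the grid row with the same site and date
[found, pos] = ismember(obs(:,1:2), E(:,1:2), 'rows');

% Copy the measurements across; pairs with no observation stay NaN
E(pos(found), 3) = obs(found, 3);
disp(E)
```

In this toy grid, site 2 has no sample on 733147, so the last row keeps its NaN.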
Source: https://stackoverflow.com/questions/21390103/matlab-create-table-with-nans-inserted-based-on-date-column