MATLAB Create table with NaN's inserted based on Date column

Submitted by 為{幸葍}努か on 2019-12-11 12:42:57

Question


My data is every three days, but in my cell array, there are sometimes missing days. How can I make the matrix add dates when it skips a day and put a NaN into the Sample Measurement cell?

Here's an example. I put 2 lines from each of the 4 sites. There aren't any empty rows between the different sites - they are just there for clarity.

Latitude     Longitude  SiteID          Date Local  Sample Measurement
43.435  -88.527778  027-0007    4/12/2007   4.3
43.435  -88.527778  027-0007    4/15/2007   9.3

43.060975   -87.913504  079-0026    4/12/2007   7.9
43.060975   -87.913504  079-0026    4/15/2007   11.3

45.203885   -90.600123  119-8001    4/12/2007   3.3
45.203885   -90.600123  119-8001    4/18/2007   9.5

43.020075   -88.21507   133-0027    4/12/2007   7.3
43.020075   -88.21507   133-0027    4/18/2007   5.6

Here is sort of what I want - NaN's where there are missing days. As you can see, there are different SiteID's so I will need to maybe do unique to run through the sites separately.
Latitude     Longitude  SiteID          Date Local  Sample Measurement
43.435  -88.527778  027-0007    4/12/2007   4.3
43.435  -88.527778  027-0007    4/15/2007   9.3

43.060975   -87.913504  079-0026    4/12/2007   7.9
43.060975   -87.913504  079-0026    4/15/2007   11.3

45.203885   -90.600123  119-8001    4/12/2007   3.3
45.203885   -90.600123  119-8001    4/15/2007   NaN

43.020075   -88.21507   133-0027    4/12/2007   7.3
43.020075   -88.21507   133-0027    4/15/2007   NaN

I began something like this:

Set = datenum(2007,4,12):2:datenum(2007,10,15);

B = cat(2,PM25data(:,1:2), PM25data(:,6), PM25data(:,12), PM25data(:,16)); % Pull out only the columns needed
% B = {'Lat', 'Lon', 'SiteID', 'Date', 'Data'};
E = zeros(63, 5);

i = 1;
j = 1;
k = 1;
while i <= length(PM25site) && j <= length(E) && k <= length(B) % i = 1:4, j = 1:63, k = 1:32

    if datenum(B(j,4)) ~= datenum(Set(j))
        C = datenum(Set(j));
        D = NaN;
        E(j,:) = cat(2, str2double(B(j,1:3)), C, D);
        j = j+1;
    else
        E(j,:) = str2double(B(k,:));
        k = k+1;
        j = j+1;
    end
    E(:,3) = PM25site(i);
    i = i+1;

end

This code is not advancing correctly. I think I'm not indexing it correctly and the else branch is not correct. It writes what I want, but only replaces the zeros for the first few rows and then leaves zeros all the way down.

Here's an example section:

45.203885   -90.600123  NaN 733144  3.3
45.203885   -90.600123  NaN 733146  NaN
45.203885   -90.600123  NaN 733148  NaN
45.203885   -90.600123  NaN 733150  NaN
0   0   0   0   0
0   0   0   0   0
0   0   0   0   0
0   0   0   0   0

I don't know if this is the best way to approach it. I just want to add NaN's where there is no data based on the dates.


Answer 1:


I don't think you need to iterate through with a while-loop. It will be slow, and doesn't utilise MATLAB's matrix capabilities. Here's how I would do it.

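% Column vector of expected sampling dates. The step is 3 days, matching the
% three-day sampling interval and the 63 rows preallocated in the question.
all_dates = (datenum(2007,4,12):3:datenum(2007,10,15))';

% Assumption: PM25data is a cell array of strings, as the use of str2double in the question suggests.
% Map each string SiteID to a numeric code so that B can be a plain numeric matrix;
% siteNames{k} is the original ID string for code k.
[siteNames,~,siteCode] = unique(PM25data(:,6));
% Note that column 4 of B now holds the datenum of the date
B = [str2double(PM25data(:,1:2)), siteCode, datenum(PM25data(:,12)), str2double(PM25data(:,16))];

% First, generate a list of all siteIDs
[uID,ia] = unique(B(:,3));
% Now, preallocate the result matrix.
% Use NaNs, since we will overwrite all non-nan values in the final matrix
E = nan(length(all_dates)*length(uID),5);

% Set the date column
E(:,4) = repmat(all_dates,length(uID),1);

% Set the lat, long and ID columns
E(:,1) = reshape(repmat(B(ia,1)',length(all_dates),1),[],1);
E(:,2) = reshape(repmat(B(ia,2)',length(all_dates),1),[],1);
E(:,3) = reshape(repmat(uID',length(all_dates),1),[],1);

% Find the rows which we have data for, and where each one sits in B
[data_ind,loc] = ismember(E(:,3:4),B(:,3:4),'rows');
% And then set the data values, matched by position in B
E(data_ind,5) = B(loc(data_ind),5);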

Most of this should be pretty clear, but I'll just clarify a few points.

The second output of unique generates an index vector which can be used to find the unique results in the original matrix. This means that B(ia,3) generates a list of all the unique siteIDs. Additionally, B(ia,1) will generate a list of the latitudes for these siteIDs, and similarly for longitude.
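To see what that index vector does, here is a minimal sketch on made-up numbers. The variable names and values are hypothetical and are not taken from the PM25 data.

```matlab
% Toy data (hypothetical): column 1 = latitude, column 2 = numeric site code
sites_demo = [43.4 7; 43.4 7; 45.2 26; 45.2 26; 43.0 3];

% uID_demo holds the sorted unique site codes; ia_demo holds one row index
% into sites_demo for each unique code
[uID_demo, ia_demo] = unique(sites_demo(:,2));

% Indexing with ia_demo recovers the unique codes and their latitudes
codes_demo = sites_demo(ia_demo, 2);   % same values as uID_demo
lats_demo  = sites_demo(ia_demo, 1);   % one latitude per unique site
```

Which of the duplicate rows ia_demo points at can depend on the MATLAB version, but since all rows of one site share the same latitude, the result is the same either way.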

repmat(all_dates,length(uID),1) repeats the list of all of the dates as many times as we have siteIDs. Essentially, we're making sure that we have a list containing all date+siteID combinations.

reshape(repmat(uID',length(all_dates),1),[],1) is a neat little one-liner that will generate the list of siteIDs repeating like [1;1;1;2;2;2;3;3;3;...] instead of [1;2;3;1;2;3;1;2;3;...].
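The two repetition patterns can be compared on a tiny example. This is only a sketch with hypothetical site codes and date numbers, assuming both lists are column vectors.

```matlab
ids_demo   = [1; 2; 3];    % hypothetical numeric site codes
dates_demo = [10; 13];     % hypothetical date numbers
n_dates    = numel(dates_demo);

% Each code repeated n_dates times in a block: [1;1;2;2;3;3]
blocked = reshape(repmat(ids_demo', n_dates, 1), [], 1);

% Plain vertical tiling cycles through the list instead: [10;13;10;13;10;13]
cycled = repmat(dates_demo, numel(ids_demo), 1);

% Side by side, the two columns cover every site/date combination once
grid_demo = [blocked, cycled];
```

Pairing a blocked column with a cycled column is what produces one row per site and date.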

Finally, we use the 'rows' option to get ismember to search for a combination of date and siteID. Using this, we determine which date and siteID combinations we have data for, and copy this data to our final matrix. Any date+siteIDs for which we do not have data will be left as NaNs.
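Here is a small sketch of the 'rows' lookup on made-up numbers (all names and values are hypothetical). It also requests the second output of ismember, which gives the matching row in the measurement list, so values are copied by position rather than relying on both lists being in the same order.

```matlab
% Hypothetical full grid: [site code, date number, value placeholder]
grid_rows = [1 10 NaN; 1 13 NaN; 2 10 NaN; 2 13 NaN];
% Hypothetical measurements: [site code, date number, value]
% Site 2 has no measurement for date 10
meas_rows = [1 10 4.3; 1 13 9.3; 2 13 9.5];

% tf marks grid rows that have a measurement;
% loc gives the matching row of meas_rows (0 where there is none)
[tf, loc] = ismember(grid_rows(:,1:2), meas_rows(:,1:2), 'rows');

% Copy the values across; unmatched rows keep their NaN
grid_rows(tf,3) = meas_rows(loc(tf),3);
```

After this, the row for site 2 on date 10 still holds NaN, while the other three rows carry their measurements.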



Source: https://stackoverflow.com/questions/21390103/matlab-create-table-with-nans-inserted-based-on-date-column
