Calculating rolling correlations in SAS

后端 未结 1 1392
旧时难觅i
旧时难觅i 2020-12-22 12:29

I have a data set here.

An excerpt of my data looks like this: (For an enlarged version: http://puu.sh/79NCK.jpg)

相关标签:
1条回答
  • 2020-12-22 13:08

    Think about how you might do if you had immense amount of patience.

    proc corr data = mm.rolling out = correlation_as_of_01jan2008;
    where date between '01jan2007'd and '01jan2008'd;
    run;
    

    Similarly,

    proc corr data = mm.rolling out = correlation_as_of_02jan2008;
    where date between '02jan2007'd and '02jan2008'd;
    run;
    

    Thankfully you can use SAS macro programming to achieve a similar effect as shown in this macro:

    %macro rollingCorrelations(inputDataset=, refDate=);
    /*first get a list of unique dates on or after the reference date*/
    proc freq data = &inputDataset. noprint;
    where date >="&refDate."d;
    table date/out = dates(keep = date);
    run;
    
    
    /*for each date calculate what the window range is, here using a year's length*/
    data dateRanges(drop = date);
    set dates end = endOfFile 
                    nobs= numDates;
    format toDate fromDate date9.;
    
    toDate=date;
    fromDate = intnx('year', toDate, -1, 's');
    
    call symputx(compress("toDate"!!_n_), put(toDate,date9.));
    call symputx(compress("fromDate"!!_n_), put(fromDate, date9.) );
    
    /*find how many times(numberOfWindows) we need to iterate through*/
    if endOfFile then do;
    call symputx("numberOfWindows", numDates);
    end;
    
    run;
    %do i = 1 %to &numberOfWindows.;
    /*create a temporary view which has the filtered data that is passed to PROC CORR*/
    data windowedDataview / view = windowedDataview;
    set  &inputDataset.;
    where date between "&&fromDate&i."d and "&&toDate&i."d;
    drop date;
    run;
        /*the output dataset from each PROC CORR run will be 
    correlation_DDMMMYYY<from date>_DDMMMYY<start date>*/
    proc corr data = windowedDataview 
    outp = correlations_&&fromDate&i.._&&toDate&i. (where=(_type_ = 'CORR'))
    
            noprint;
    run;
    
    %end;
    
    /*append all datasets into a single table*/
    data all_correlations;
    format from to date9.;
    set correlations_:
         indsname = datasetname
    ;
    from = input(substr(datasetname,19,9),date9.);
    to = input(substr(datasetname,29,9), date9.);
    run;
    
    
    %mend rollingCorrelations;
    %rollingCorrelations(inputDataset=rolling, refDate=01JAN2008)
    

    The final output from the above macro will have from & to identifier to identify which date range each correlation matrix refers. Run it and examine the results.

    I dont think excel can accomodate over 1500 tabs anyway, so best to keep it in a single table. The final table had 81K rows and the whole process ran in 2.5 mins.

    update: to sort them by from & to

    proc sort data = ALL_CORRELATIONS;
    by from to;
    run;
    
    0 讨论(0)
提交回复
热议问题