SAS creating a dynamic interval

Posted by 天涯浪子 on 2020-01-15 11:04:29

Question


This is somewhat complex (well, to me at least).

Here is what I have to do: Say that I have the following dataset:

date    price   volume
02-Sep  40  100
03-Sep  45  200
04-Sep  46  150
05-Sep  43  300

Say I have a breakpoint at which I want to start a new interval in my dataset. For instance, let the breakpoint be 200 units of traded volume.

I want to create an id column that takes the values 1, 2, 3, ... and moves to the next value each time the cumulative volume reaches the breakpoint of 200. When you sum the volume within each id, the total must be the same (200) for every id.

So using my example above, my final dataset should look like the following:

date    price   volume  id
02-Sep  40  100 1
03-Sep  45  100 1
03-Sep  45  100 2
04-Sep  46  100 2
04-Sep  46  50  3
05-Sep  43  150 3
05-Sep  43  150 4 

(The last id can fall short of the full 200, but that is fine. I will discard the last id.)

As you can see, I had to "decompose" some rows (the second row, for instance, where I split the 200 into two rows of 100 volume each) so that the volume sums to a constant 200 within every id.


Answer 1:


Looks like you're doing volume bucketing for a flow toxicity VPIN calculation. I think this works:

%let bucketsize = 200;

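data buckets(drop=bucket volume rename=(vol=volume));
    set tmp;
    /* bucket = volume still needed to fill the current id; both values persist across rows. */
    retain bucket &bucketsize id 1;

    /* Split the row until all of its volume has been assigned to a bucket. */
    /* Assumes volume is never missing: a missing value never equals 0, so the loop would not end. */
    do until(volume=0);
        vol=min(volume,bucket);    /* take as much as fits in the current bucket */
        output;
        volume=volume-vol;
        bucket=bucket-vol;
        /* The bucket is full: start the next one. */
        if bucket=0 then do;
            bucket=&bucketsize;
            id=id+1;
        end;
    end;
run;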

I tested this with your dataset and the output looks right, but I would check several more cases carefully to confirm that it behaves correctly.
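
One way to run such a check, sketched here rather than taken from the original answer: load the question's four rows into tmp, run the bucketing step above, then total the volume per id and list every id whose total differs from &bucketsize. Reading date as plain text and the column name total_volume are assumptions made for illustration.

```sas
/* Sample input from the question; date is read as text for simplicity (an assumption). */
data tmp;
    input date $ price volume;
    datalines;
02-Sep 40 100
03-Sep 45 200
04-Sep 46 150
05-Sep 43 300
;
run;

/* ... run the %let and the bucketing data step from this answer here ... */

/* List every id whose total volume differs from the bucket size. */
proc sql;
    select id, sum(volume) as total_volume
    from buckets
    group by id
    having sum(volume) ne &bucketsize;
quit;
```

If the step behaves as intended, only the final, partially filled id should be listed. Any other id in the result points to a case that needs a closer look.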




Answer 2:


If you have a variable that indicates 'Buy' or 'Sell', then you can try this. Let's say this variable is called type and takes the values 'B' or 'S'. One advantage of this method is that it makes BY-group processing easier, if you need it.

%let bucketsize = 200;

data tmp2;
  set tmp;
  /* Running totals and ids persist across rows; the initial values are applied once, not on every row. */
  retain volsumb 0 idb 1 volsums 0 ids 1;

  /* Store the current total for each type. */
  if type = 'B' then volsumb = volsumb + volume;
  else if type = 'S' then volsums = volsums + volume;

  /* Write the row before any reset, so the row that completes a bucket keeps that bucket's id. */
  output;

  /* If the total has reached the bucket size, then reset and increment id for the following rows. */
  /* You have not said what should happen if the total exceeds 200, for example when the first two values are 150 and 75. */
  if volsumb = &bucketsize then do; idb = idb + 1; volsumb = 0; end;
  if volsums = &bucketsize then do; ids = ids + 1; volsums = 0; end;

  drop volsumb volsums;
run;
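
For the overshoot case, one possible approach is to combine the row-splitting loop from Answer 1 with BY-group processing on type. The following is a minimal sketch under stated assumptions, not part of either answer: the type column and its values in the sample rows are invented for illustration, and the names tmp_sorted and buckets_by_type are hypothetical. It uses do while(volume > 0), so rows with zero or missing volume are skipped instead of being written out.

```sas
%let bucketsize = 200;

/* Hypothetical input: the question's rows plus an invented type column (B = buy, S = sell). */
data tmp;
    input date $ price volume type $;
    datalines;
02-Sep 40 100 B
03-Sep 45 200 S
04-Sep 46 150 B
05-Sep 43 300 S
;
run;

/* Group the rows by type. By default (EQUALS) PROC SORT keeps the original order within each type. */
proc sort data=tmp out=tmp_sorted;
    by type;
run;

data buckets_by_type(drop=bucket volume rename=(vol=volume));
    set tmp_sorted;
    by type;
    retain bucket id;

    /* Start a fresh bucket and id sequence for each type. */
    if first.type then do;
        bucket = &bucketsize;
        id = 1;
    end;

    /* Split the row across buckets until its volume is used up. */
    do while(volume > 0);
        vol = min(volume, bucket);
        output;
        volume = volume - vol;
        bucket = bucket - vol;
        if bucket = 0 then do;
            bucket = &bucketsize;
            id = id + 1;
        end;
    end;
run;
```

In this sketch the ids restart at 1 within each type, so a bucket is identified by the pair (type, id), not by id alone.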


Source: https://stackoverflow.com/questions/11067853/sas-creating-a-dynamic-interval
