SAS PROC IMPORT Multiple SAV Files- Force SPSS Value Labels to Create UNIQUE SAS Format Names

Posted by 放肆的年华 on 2021-01-07 02:52:14

Question


Sometimes if I import multiple SAV files into the SAS work library, one variable imported later on overwrites the display text (i.e., the format) of an earlier imported variable with a similar name.

I've determined that this happens because the later dataset's variable produces a custom format name (from the SPSS Value Labels) that is identical to the format name generated for the earlier variable, even though the two variables have different Value Labels definitions in their SAV files.

Is there a way to force SAS to not re-use the same format names by automatically checking at PROC IMPORT whether a format name already exists in the work library format library before auto-naming a new custom format? Or is there any other way of preventing this from happening?

Here is my code as well as an example of the variable names, format names, etc.

proc import out=Dataset1 datafile="S:\folder\Dataset1.SAV"
dbms=SAV replace; 
run;
proc import out=DatasetA datafile="S:\folder\DatasetA.SAV"
dbms=SAV replace; 
run;

Dataset1 contains the variable Question_1. The original SPSS Value Labels are 1=Yes 2=No. When this dataset is imported, SAS automatically generates the format name QUESTION. for Question_1. When only Dataset1 is imported, the definition of format QUESTION. corresponds to the SPSS Value Labels for Question_1 in Dataset1.SAV.

DatasetA contains the variable Question_A with SPSS Value Labels 1=Agree 2=Unsure 3=Disagree. When this dataset is imported after Dataset1, SAS automatically generates the format name QUESTION. for Question_A, even though the work library already contains a format named QUESTION.. This overwrites the definition of format QUESTION. that was generated when Dataset1 was imported. Once DatasetA is imported, the definition of format QUESTION. corresponds to the SPSS Value Labels for Question_A in DatasetA.SAV.

Therefore, when Dataset1 and DatasetA are both imported, the variables Question_1 and Question_A both have the format name QUESTION. assigned to them, and the definition of the format QUESTION. in the SAS work library corresponds to the SPSS Value Labels in DatasetA.SAV, not Dataset1.SAV. As a result, Question_1 displays as 1=Agree 2=Unsure, even though its values actually mean 1=Yes 2=No.

Ideally, I would like these two variables to get distinct custom format names automatically at the import step. Is there any way to make this happen? Alternatively, is there any other way to prevent this type of overwriting from occurring?

Thank you.


Answer 1:


The way to prevent literal overwriting is to point to a different format catalog for each SPSS file that is being read using the FMTLIB= optional statement.

proc import out=dataset1 replace 
   datafile="S:\folder\Dataset1.SAV" dbms=SAV 
; 
   fmtlib=work.fmtcat1;
run;
proc import out=dataset2 replace 
   datafile="S:\folder\Dataset2.SAV" dbms=SAV 
; 
   fmtlib=work.fmtcat2;
run;

You can then work later to rename the conflicting formats (and change the attached format in the dataset to use the new name).
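Before renaming anything, it can help to look at what each catalog actually holds. The following is a minimal sketch, assuming the two imports above ran and created the catalogs work.fmtcat1 and work.fmtcat2. Note that SAS only looks in WORK.FORMATS and LIBRARY.FORMATS by default, so formats stored in other catalogs are not found until they are added to the FMTSEARCH= system option. Even then, a name that exists in both catalogs resolves to the first catalog in the search order, which is why the renaming step is still needed.

```sas
/* Print the definitions stored in each catalog to see which names collide */
proc format library=work.fmtcat1 fmtlib;
run;
proc format library=work.fmtcat2 fmtlib;
run;

/* Make both catalogs searchable; for a format name present in both,
   the catalog that comes first in the search order is the one used */
options fmtsearch=(work.fmtcat1 work.fmtcat2);
```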

If the member name and the format name are short enough, you should be able to generate a unique new name by concatenating the two (with a separator in between to avoid conflicts). Something like this will rename the formats, change the format names attached to the variables, and rebuild the formats into the WORK.FORMATS catalog.

%macro sav_import(file,memname);
%if 0=%length(&memname) %then %let memname=%scan(&file,-2,\./);

proc import datafile=%sysfunc(quote(&file)) dbms=sav
  out=&memname replace
; 
  fmtlib=work.&memname ;
run;

proc format lib=work.&memname cntlout=formats;
run;

data formats ;
  set formats end=eof;
  by fmtname type notsorted;
  oldname=fmtname;
  fmtname=catx('_',"&memname",oldname);
run;

proc contents data=&memname noprint out=contents;
run;

proc sql noprint;
  select distinct catx(' ',c.name,cats(f.fmtname,'.'))
    into :fmtlist separated by ' '
  from contents c inner join formats f
  on c.format = f.oldname
  ;
quit;

proc datasets nolist lib=work;
  modify &memname;
    format &fmtlist ;
  run;
quit;

proc format lib=work.formats cntlin=formats;
run;

%mend sav_import;

%sav_import(S:\folder\Dataset1.SAV);
%sav_import(S:\folder\Dataset2.SAV);
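
To check the result, one option is to list the rebuilt formats and the formats attached to the variables. This is only a sketch, assuming the two macro calls above completed and created the dataset work.Dataset1; with the naming scheme used in the macro, the attached formats would be expected to carry the dataset name as a prefix (for example Dataset1_QUESTION.).

```sas
/* List every format now stored in WORK.FORMATS with its value labels */
proc format library=work.formats fmtlib;
run;

/* Show which format name is attached to each variable in the dataset */
proc contents data=work.Dataset1;
run;
```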


Source: https://stackoverflow.com/questions/62999858/sas-proc-import-multiple-sav-files-force-spss-value-labels-to-create-unique-sas
