sas

SAS not recognizing date format

假如想象 submitted on 2020-02-24 10:26:07
Question: I have the following character date format: "3/1990" "4/1990" "5/1990" ... I tried the following code: data work.temps; set indata; newdate = input(strip(Date), MMYYSw.); rename newdate = date; run; I keep getting the following error message: Informat MMYYSW was not found or could not be loaded. Answer 1: You may have to use a different informat to read in the character dates so that SAS can interpret them as numeric (since dates in SAS are actually numeric values), and then format them as
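A minimal sketch of one way to read such values, not necessarily what the truncated answer goes on to recommend. MMYYS is a display format rather than an informat, which fits the error message. This sketch avoids date informats altogether: it splits the string on the slash and builds the date with MDY. The sample rows in `indata` and the choice of the first day of the month are assumptions made for illustration.

```sas
/* Hypothetical sample data standing in for the asker's indata */
data indata;
    input Date $;
    datalines;
3/1990
4/1990
5/1990
;
run;

data work.temps;
    set indata;
    /* Split "m/yyyy" on the slash; assume day 1 of each month */
    newdate = mdy(input(scan(Date, 1, '/'), 2.),
                  1,
                  input(scan(Date, 2, '/'), 4.));
    /* MMYYS is valid as a format: month and year separated by a slash */
    format newdate mmyys7.;
    /* Drop the character variable so its name can be reused by RENAME */
    drop Date;
    rename newdate = Date;
run;

proc print data=work.temps;
run;
```

Dropping the original character variable matters here: renaming `newdate` to `Date` while the character `Date` is still being written to the output data set would collide with the existing name.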

SAS Data Cleaning: Handling Character and Numeric Values

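孤街醉人 submitted on 2020-02-23 16:02:21
SAS data cleaning: Relationships between SAS data sets are rarely needed. They come up only in proc sql, and I have not yet used them for data analysis, so this post covers only the handling of a single data set. In proc sql we can see that defining a data set most often involves three things: the column name, the column attributes, and the column label. The column handling that data cleaning involves likewise revolves around column names, column attributes, and column labels. (Working on the structure of a data set means adding, deleting, and modifying columns.) Modifying data set names and labels. Adding a column is simple: create a new variable in the data step and assign it a value. To delete variables, use either drop or keep. Now for the most involved part, modifying columns. The simplest, if clumsy, way to change a column name is to combine adding a column with deleting a column. The other way is the SAS rename keyword, in the form rename=(old_name=new_name). Modifying a column's label: (a label is defined with label or attrib, as label variable=label text). To change it, use modify dataset_name; label variable=label text. modify can be used to change rename, format, and label. Changing SAS data types. SAS has only two data types: numeric and character. Dates are stored in SAS as numeric values and are merely displayed with a date format. So the task mainly comes down to converting between numeric and character. Numeric to character: data; x=2557898; y=put(x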
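The conversions and the modify usage described above can be sketched as follows. The data set name `work.demo`, the variable names, the label text, and the widths in `8.` and `comma10.` are assumptions chosen for illustration; the original snippet is cut off before it shows its own format.

```sas
data work.demo;
    x = 2557898;
    y = put(x, 8.);      /* numeric -> character: PUT with a numeric format */
    c = '12345';
    n = input(c, 8.);    /* character -> numeric: INPUT with a numeric informat */
run;

/* Change names, labels and formats in place, without rewriting the data */
proc datasets library=work nolist;
    modify demo;
    rename y = x_char;
    label n = 'Value converted from text';
    format n comma10.;
quit;

proc contents data=work.demo;
run;
```

PUT always returns a character value and INPUT returns a value of the type implied by the informat, so the pair covers both directions of the conversion.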

SAS Proc Import CSV and missing data

穿精又带淫゛_ submitted on 2020-02-23 06:27:50
Question: So, I'm trying to import some datasets in SAS and join them, the only problem is that I get this error after joining them - proc import datafile='filepath/datasetA.csv' out = dataA dbms= csv replace; run; proc import datafile='filepath\datasetB.csv' out = dataB dbms= csv replace; run; /* combine them all into one dataset*/ data DataC; set &dataA. &dataB; run; ERROR: Variable column_k has been defined as both character and numeric The column in question looks something like this in both of the
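A hedged sketch of one common way around this error: PROC IMPORT guesses each column's type per file, so the same column can come out numeric in one data set and character in the other. The sample rows below, the `NA` marker for missing data, and the helper names `dataB_fixed` and `column_k_chr` are hypothetical stand-ins, since the question's actual data is cut off. The idea is to convert the character copy to numeric before stacking, and to reference the data sets by name rather than through macro variables.

```sas
/* Hypothetical stand-ins for the two imported data sets */
data dataA;
    input id column_k;
    datalines;
1 10
2 20
;
run;

data dataB;
    input id column_k $;
    datalines;
3 30
4 NA
;
run;

/* Convert the character column to numeric; ?? suppresses the
   invalid-data notes for values such as NA, which become missing */
data dataB_fixed;
    set dataB(rename=(column_k = column_k_chr));
    column_k = input(column_k_chr, ?? best32.);
    drop column_k_chr;
run;

/* Stack the two data sets now that the types agree */
data DataC;
    set dataA dataB_fixed;
run;
```

An alternative is to keep PROC IMPORT out of the type decision entirely and read both files with a DATA step and an explicit INPUT statement.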

Think in SAS

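孤人 submitted on 2020-02-18 09:57:33
This article is published under a Creative Commons license. First, SAS can be a profession. In practical terms, there is a job title called SAS programmer (SAS Programmer, also called Statistical SAS Programmer or Statistical Analyst). On www.monster.com, the world's largest job site, searching with SAS and other tools as the skill keyword gives the following results (tested on 2010-04-13; what you see now will differ slightly): 1645 SAS jobs, 577 Matlab jobs, 329 SPSS jobs, 87 Fortran jobs, 59 STATA jobs, 59 Maple jobs, 24 Mathematica jobs. That is roughly the picture in the English-speaking world. Now for the opportunities closer to home. SAS users are concentrated mostly in industries such as pharmaceuticals and finance. The large international pharmaceutical companies (Pfizer, Bayer, Novartis, ...) are opening R&D centers in China one after another, and demand for biostatisticians is growing (the term covers both statisticians and programmers; a programmer here means a SAS programmer, and SAS programming is also a basic requirement for statisticians). In finance, taking credit scoring as an example since I know it a little, people who know SAS and data mining are also in short supply. Plenty of opportunities can also be found in China's booming internet companies and telecom industry. In fact, even if you do not make SAS your profession, for an analytical job, having SAS on your résumé will look better than other similar things (Excel, ……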

Variable check and summary out

不羁岁月 submitted on 2020-02-08 02:47:19
Question: Problem/Question I'm trying to do a simple check on a list of variables in a data set (revenue, costs, profits, and vcosts) that grabs the largest and second largest from each variable, checks if their total is greater than 90% of the sum of the variable, and if so, flags that variable. I want to also check that the largest variable is not larger than 60% of the total sum. I got a bit of help from this: Macro that outputs table with testing results of SAS table

[SAS BASE] FORMAT Statement and PROC FORMAT

亡梦爱人 submitted on 2020-02-06 16:32:43
/* FORMAT statement */
FORMAT Profit Loss DOLLAR8.2 Saledate MMDDYY8.;
PUT Profit DOLLAR8.2 Loss DOLLAR8.2 Saledate MMDDYY8.;

The FORMAT statement assigns a specific format to each variable. Note in particular that in the FORMAT statement above, Profit and Loss share one format, DOLLAR8.2.

FORMAT procedure

DATA Carsurvey;
   INFILE 'c:\myrawdata\cars.dat';
   INPUT Age Sex Income Color $;
PROC FORMAT;
   VALUE gender 1 = 'Male'
                2 = 'Female';
   VALUE agegroup 13 -< 20 = 'Teen'
                  20 -< 65 = 'Adult'
                  65 - HIGH = 'Senior';
   VALUE $col 'W' = 'Moon White'   /* note where the $ goes */
              'B' = 'Sky Blue'
              'Y' = 'Sunburst Yellow'
              'G' = 'Rain Cloud Gray';
PROC PRINT DATA = Carsurvey;
   FORMAT Sex gender. Age agegroup. Color $col.
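A self-contained sketch of the same idea that runs without the external file. The inline survey rows and the DOLLAR8. format on Income are assumptions added for illustration, since the original snippet is cut off before the end of its FORMAT statement. The age ranges use `-<` so that each boundary value belongs to exactly one group.

```sas
/* Hypothetical survey rows in place of c:\myrawdata\cars.dat */
data work.carsurvey;
    input Age Sex Income Color $;
    datalines;
19 1 14000 Y
45 1 65000 G
72 2 35000 B
31 1 44000 Y
20 2 28000 W
;
run;

proc format;
    value gender   1 = 'Male'
                   2 = 'Female';
    /* -< excludes the upper bound, so 20 is Adult and 65 is Senior */
    value agegroup 13 -< 20  = 'Teen'
                   20 -< 65  = 'Adult'
                   65 - high = 'Senior';
    /* Character formats start with $ in both definition and use */
    value $col     'W' = 'Moon White'
                   'B' = 'Sky Blue'
                   'Y' = 'Sunburst Yellow'
                   'G' = 'Rain Cloud Gray';
run;

proc print data=work.carsurvey;
    format Sex gender. Age agegroup. Color $col. Income dollar8.;
run;
```

The formats change only how the values are displayed; the stored values of Age, Sex and Color in `work.carsurvey` stay as they were read.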

SAS . Are variables set to missing at every iteration of a data step?

走远了吗. submitted on 2020-02-03 07:56:27
Question: I always thought that variables are set to missing at every iteration of the data step. However, in the following code, it looks like the value that the variable gets at the very beginning is retained. I can't understand why this happens. data one; input x $ y; datalines; a 10 a 13 a 14 b 9 ; run; data two; input z; datalines; 45 ; run; data test; if _n_ = 1 then set two; /* when _n_=2 the PDV assigns missing values, right ? */ set one; run; proc print; run; The outcome is z x y 45 a 10 45
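The behaviour in the question matches how SAS treats variables that come from a SET statement: they are retained automatically, and only variables created by INPUT or assignment statements are reset to missing at the top of each iteration. The sketch below repeats the question's data and adds a contrast variable `w`, which is an addition for illustration and not part of the original post.

```sas
data one;
    input x $ y;
    datalines;
a 10
a 13
a 14
b 9
;
run;

data two;
    input z;
    datalines;
45
;
run;

data test;
    /* z is read by SET, so its value is retained across iterations */
    if _n_ = 1 then set two;
    set one;
    /* w is created by assignment, so it is reset to missing
       at the start of every iteration after the first */
    if _n_ = 1 then w = 99;
run;

proc print data=test;
run;
```

In the printed output, z carries 45 on every row, while w holds 99 only on the first row and is missing afterwards.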

PROC SQL in SAS - All Pairs of Items

∥☆過路亽.° submitted on 2020-01-31 12:05:52
Question: I have a dataset in which I need to look at all pairs of items that are together from within another group. I've created a toy example below to further explain.

BUNCH  FRUITS
1      apples
1      bananas
1      mangos
2      apples
3      bananas
3      apples
4      bananas
4      apples

What I want is a listing of all possible pairs and sum the frequency they occur together within a bunch. My output would ideally look like this:

FRUIT1  FRUIT2   FREQUENCY
APPLES  BANANAS  3
APPLES  MANGOS   1

My end goal is to make something that I'll
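One way to sketch this is a self-join in PROC SQL, offered as an illustration and not as the answer the thread settled on. The table name `work.bunches` and the output name `work.pairs` are assumptions. Joining the table to itself on the bunch and keeping only rows where the first fruit sorts before the second yields each unordered pair once per bunch, and counting the rows gives the frequency.

```sas
data work.bunches;
    input bunch fruits $;
    datalines;
1 apples
1 bananas
1 mangos
2 apples
3 bananas
3 apples
4 bananas
4 apples
;
run;

proc sql;
    create table work.pairs as
    select a.fruits as fruit1,
           b.fruits as fruit2,
           count(*) as frequency
    from work.bunches as a
         inner join work.bunches as b
         /* same bunch, and a < b so each pair is counted once */
         on a.bunch = b.bunch and a.fruits < b.fruits
    group by a.fruits, b.fruits
    order by frequency desc;
quit;

proc print data=work.pairs;
run;
```

With the toy data, this also returns the bananas and mangos pair from bunch 1, which the desired output in the question does not list.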