Importing SPSS dataset into Python

Submitted by 爱⌒轻易说出口 on 2019-12-07 02:42:49

Question


Is there any way to import an SPSS dataset into Python, preferably as a NumPy recarray? I have looked around but could not find an answer.

Joon


Answer 1:


Maybe this will help: Python reader + writer for spss sav files (Linux, Mac & Windows) http://code.activestate.com/recipes/577811-python-reader-writer-for-spss-sav-files-linux-mac-/




Answer 2:


SPSS has an extensive integration with Python, but that is meant to be used with SPSS (now known as IBM SPSS Statistics). There is an SPSS ODBC driver that could be used with Python ODBC support to read a sav file.
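As a rough sketch of the ODBC route: once a connection exists, reading the data is ordinary DB-API 2.0 code. The helper name `fetch_rows` is made up for illustration, and the connection string and table name for the SPSS driver depend on how the driver is configured, so they are not shown here. The function only assumes a DB-API connection, such as one returned by an ODBC package like pyodbc.

```python
def fetch_rows(connection, query):
    """Run a query on a DB-API 2.0 connection; return (column_names, rows).

    Hypothetical helper: 'connection' is assumed to come from an ODBC
    package (for example pyodbc) pointed at the SPSS ODBC driver.
    """
    cursor = connection.cursor()
    try:
        cursor.execute(query)
        # cursor.description holds one sequence per column; item 0 is the name.
        names = [column[0] for column in cursor.description]
        rows = [tuple(row) for row in cursor.fetchall()]
    finally:
        cursor.close()
    return names, rows
```

The returned names and row tuples are in the shape that `numpy.rec.fromrecords(rows, names=names)` accepts, which would give the recarray the question asks for.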




Answer 3:


Option 1: As rkbarney pointed out, there is the Python package savReaderWriter, available via PyPI. I've run into two issues:

  1. It relies on a lot of extra libraries beyond the seemingly pure-Python implementation. SPSS files are read and written in nearly every case by the IBM-provided SPSS I/O modules. These modules differ by platform, and in my experience "pip install savReaderWriter" doesn't get them running out of the box (on OS X).
  2. Development on savReaderWriter is, while not dead, less up to date than one might hope. This complicates the first issue. It relies on some deprecated packages to increase speed and gives warnings every time you import savReaderWriter if they're not available. Not a huge issue today, but it could be trouble in the future as IBM continues to update the SPSS I/O modules to deal with new SPSS formats (they're on version 21 or 22 already, if memory serves).

Option 2: I've chosen to use R as a middleman. Using rpy2, I set up a simple function that reads the file into an R data frame and writes it out again as a CSV file, which I subsequently import into Python. It's a bit Rube Goldberg, but it works. Of course, this requires R, which may also be a hassle to install in your environment (and has different binaries for different platforms).
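A minimal sketch of the R round trip, under some assumptions: it calls the `Rscript` command through `subprocess` rather than going through rpy2 as described above, and it assumes R's `foreign` package is installed. The function names are made up for illustration.

```python
import csv
import subprocess

# R code run via Rscript; the two paths arrive as trailing arguments.
R_CONVERT = (
    "args <- commandArgs(trailingOnly = TRUE); "
    "df <- foreign::read.spss(args[1], to.data.frame = TRUE); "
    "write.csv(df, args[2], row.names = FALSE)"
)


def build_command(sav_path, csv_path, rscript="Rscript"):
    """Return the argument list that asks R to convert a .sav file to CSV."""
    return [rscript, "-e", R_CONVERT, sav_path, csv_path]


def sav_to_csv(sav_path, csv_path):
    """Run the conversion; raises CalledProcessError if R exits with an error."""
    subprocess.run(build_command(sav_path, csv_path), check=True)


def load_csv(csv_path):
    """Read the exported CSV into a list of dicts keyed by column name."""
    with open(csv_path, newline="", encoding="utf-8") as handle:
        return list(csv.DictReader(handle))
```

Every value comes back as a string here. If a typed NumPy structured array is wanted instead, `numpy.genfromtxt(csv_path, delimiter=",", names=True, dtype=None, encoding="utf-8")` is one way to read the same file.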




Answer 4:


gretl claims to import SPSS and export in a variety of formats, as does the R statistical suite. I've never dealt with SPSS data so cannot speak to their relative merits.




Answer 5:


You could have Python make an external call to spssread, a Perl script that outputs the content of SPSS files in the way you want.
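One way the external call could look, as a sketch only: the helper below runs any command and parses its standard output as delimited text. The name `run_and_parse` is made up, and the tab delimiter is an assumption. spssread's actual command-line options and output format are not covered here, so check its documentation before relying on this.

```python
import csv
import io
import subprocess


def run_and_parse(command, delimiter="\t"):
    """Run an external command and parse its stdout as delimited text.

    Returns a list of rows (lists of strings); blank lines are skipped.
    Raises CalledProcessError if the command exits with a non-zero status.
    """
    result = subprocess.run(command, capture_output=True, text=True, check=True)
    reader = csv.reader(io.StringIO(result.stdout), delimiter=delimiter)
    return [row for row in reader if row]
```

A call might look like `run_and_parse(["perl", "spssread.pl", "survey.sav"])`, where the script path, file name, and argument order are all hypothetical.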




Answer 6:


To be clear, the SPSS ODBC driver does not require an SPSS installation.




Answer 7:


Maybe this will be helpful for someone:

http://sourceforge.net/search/?q=python+SPSS

Good luck!

Michal



Source: https://stackoverflow.com/questions/3639639/importing-spss-dataset-into-python
