Question
I'm trying to read some data from SQL (using pyodbc) into a numpy structured array (I believe a structured array is required due to the multiple dtypes).
import pyodbc
import numpy as np
cnxn = pyodbc.connect('DRIVER={SQL Server};Server=SERVER;Database=DB;Trusted_Connection=Yes;')
cursor = cnxn.cursor()
sql_ps = "select a, b from table"
cursor.execute(sql_ps)
p_data = cursor.fetchall()
cnxn.close()
ndtype = np.dtype([('f1','>f8'),('f2','|S22')])
p_data = np.asarray(p_data, dtype=ndtype)
However, this raises:
TypeError: expected a readable buffer object
If I convert each row to a tuple before loading it into the array:
p_data_tuple = np.asarray([tuple(i) for i in p_data], dtype=ndtype)
It works; however, p_data_tuple is an array of tuples rather than a 2d array, meaning I cannot access elements using p_data_tuple[0,1].
Does anyone know how I can either put the returned data directly into a structured array with multiple dtypes, or convert the array of tuples into a 2d array of multiple dtypes, or some other solution?
Thanks
Answer 1:
Your cursor.fetchall returns a list of records. According to the pyodbc documentation, 'Row objects are similar to tuples, but they also allow access to columns by name' (http://mkleehammer.github.io/pyodbc/). That sounds like a namedtuple to me, though the class details may be different.
sql_ps = "select a, b from table"
cursor.execute(sql_ps)
p_data = cursor.fetchall()
cnxn.close()
Just for fun, let's change the dtype to use the same field names as the SQL query:
ndtype = np.dtype([('a','>f8'),('b','|S22')])
This doesn't work, presumably because the tuple-like record isn't a real tuple:
p_data = np.array(p_data, dtype=ndtype)
So instead we convert each record to a tuple. Structured arrays take their data as a list of tuples.
p_data = np.array([tuple(i) for i in p_data], dtype=ndtype)
Now you can access the data by field or by row:
p_data['a'] # 1d array of floats
p_data['b'][1] # one string
p_data[10] # one record
A record from p_data displays as a tuple, though it does actually have a dtype like the parent array.
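To try the conversion without a database, here is a minimal sketch. It assumes the fetched rows can be mimicked by plain Python lists; the rows variable and its sample values are made up for illustration and stand in for what cursor.fetchall() would return.

```python
import numpy as np

# Hypothetical stand-in for cursor.fetchall(): row-like sequences that are not tuples.
rows = [[1.5, b"alpha"], [2.5, b"beta"], [3.5, b"gamma"]]

# Field names match the columns selected in the SQL query.
ndtype = np.dtype([("a", ">f8"), ("b", "|S22")])

# Convert each row to a real tuple so NumPy fills one record per row.
p_data = np.array([tuple(r) for r in rows], dtype=ndtype)

col_a = p_data["a"]        # 1d array holding the float column
one_str = p_data["b"][1]   # a single value from the string column
record = p_data[2]         # one record; it carries the same dtype as p_data
cell = p_data[0][1]        # row 0, field 1: the closest match to a 2d-style [0, 1]
```

The array is 1d with shape (3,), so p_data[0, 1] is not available, but p_data[0][1] or p_data["b"][0] reaches the same element.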
There's a variant on structured arrays, recarray, that adds the ability to access fields by attribute name, e.g. p_rec.a. That's even closer to the database cursor record, but doesn't add much otherwise.
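As a rough illustration of the recarray variant, the sketch below views a structured array as a recarray. The sample data is invented, and p_rec is simply the name used above for the recarray.

```python
import numpy as np

ndtype = np.dtype([("a", ">f8"), ("b", "|S22")])
# Made-up sample rows; real data would come from the database cursor.
p_data = np.array([(1.5, b"alpha"), (2.5, b"beta")], dtype=ndtype)

# View the same data as a recarray; no copy is made.
p_rec = p_data.view(np.recarray)

floats = p_rec.a          # attribute access, same data as p_data["a"]
second_b = p_rec[1].b     # attribute access also works on a single record
```

Field-name indexing such as p_rec["a"] continues to work on the recarray as well.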
So this structured array is quite similar to your source sql table - with fields and rows. It's not a 2d array, but indexing by field name is similar to indexing a 2d array by column number.
pandas does something similar, though it often resorts to using dtype=object (like the pointers of Python lists). And it keeps track of 'row' labels.
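If positional [row, column] access is what matters most, a pandas DataFrame may be worth a look. This is only a sketch with invented sample rows; in practice the list of tuples would be built from the cursor records.

```python
import pandas as pd

# Made-up rows standing in for [tuple(r) for r in cursor.fetchall()].
rows = [(1.5, "alpha"), (2.5, "beta"), (3.5, "gamma")]

# Each column keeps its own dtype; the column names mirror the SQL fields.
df = pd.DataFrame.from_records(rows, columns=["a", "b"])

cell = df.iloc[0, 1]   # positional [row, column] access
col_a = df["a"]        # column access by name
```

Here df.iloc[0, 1] plays the role of the p_data_tuple[0,1] lookup from the question.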
Source: https://stackoverflow.com/questions/35174979/read-data-into-structured-array-with-multiple-dtypes