Question
I have written the following snippet to import a CSV file into an MS SQL Server database, but it raises an error. It is based on code originally written for SQLite in Python and adapted for MSSQL.
import csv, pyodbc
import logging

def _get_col_datatypes(fin):
    dr = csv.DictReader(fin) # comma is default delimiter
    fieldTypes = {}
    for entry in dr:
        feildslLeft = [f for f in dr.fieldnames if f not in fieldTypes.keys()]
        if not feildslLeft: break # We're done
        for field in feildslLeft:
            data = entry[field]
            # Need data to decide
            if len(data) == 0:
                continue
            if data.isdigit():
                fieldTypes[field] = "INTEGER"
            else:
                fieldTypes[field] = "TEXT"
        # TODO: Currently there's no support for DATE in sqllite
    if len(feildslLeft) > 0:
        raise Exception("Failed to find all the columns data types - Maybe some are empty?")
    return fieldTypes

def escapingGenerator(f):
    for line in f:
        yield line.encode("ascii", "xmlcharrefreplace").decode("ascii")

def csvToDb(csvFile, outputToFile = False):
    # TODO: implement output to file
    with open(csvFile,mode='r') as fin:
        dt = _get_col_datatypes(fin)
        fin.seek(0)
        reader = csv.DictReader(fin)
        # Keep the order of the columns name just as in the CSV
        fields = reader.fieldnames
        cols = []
        # Set field and type
        for f in fields:
            cols.append("%s %s" % (f, dt[f]))
        # Generate create table statement:
        stmt = "CREATE TABLE ads (%s)" % ",".join(cols)
        con = pyodbc.connect('DRIVER={SQL Server};SERVER=localhost;DATABASE=sd;UID=Test;PWD=11')
        cur = con.cursor()
        cur.execute(stmt)
        fin.seek(0)
        reader = csv.reader(escapingGenerator(fin))
        # Generate insert statement:
        stmt = "INSERT INTO ads VALUES(%s);" % ','.join('?' * len(cols))
        cur.executemany(stmt, reader)
        con.commit()
    return con

csvToDb('Books.csv')
The error I am getting is:
pyodbc.DataError: ('22018', "[22018] [Microsoft][ODBC SQL Server Driver][SQL Server]Conversion failed when converting the varchar value 'a' to data type int. (245) (SQLExecDirectW)")
Also, please suggest any other methods you know of for dynamically importing CSV or text files into an MSSQL database.
Answer 1:
The error message
Conversion failed when converting the varchar value 'a' to data type int.
reveals that your code can be "fooled" into thinking that a column is integer when it is really text, because it decides each column's type from the first non-empty value it encounters, which is usually the first row of data. Testing reveals that both
ID,txt1,txt2,int1
1,foo,123,3
2,bar,abc,4
and
"ID","txt1","txt2","int1"
1,"foo","123",3
2,"bar","abc",4
result in your code producing the CREATE TABLE statement:
CREATE TABLE ads (ID INTEGER,txt1 TEXT,txt2 INTEGER,int1 INTEGER)
which is wrong because the [txt2] column is not really INTEGER.
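The behaviour can be reproduced without a database. The following is a minimal sketch, not the asker's exact code: `first_value_types` is a hypothetical helper that condenses the same rule (type each column from its first non-empty value) and reads the first sample file from a string.

```python
import csv
import io

def first_value_types(csv_text):
    """Mimic the question's rule: type each column from its first non-empty value."""
    reader = csv.DictReader(io.StringIO(csv_text))
    field_types = {}
    for row in reader:
        for field in reader.fieldnames:
            value = row[field]
            # Skip columns that are already typed, and cells with no data.
            if field in field_types or not value:
                continue
            field_types[field] = "INTEGER" if value.isdigit() else "TEXT"
    return field_types

# The first sample file from above.
SAMPLE = "ID,txt1,txt2,int1\n1,foo,123,3\n2,bar,abc,4\n"

types = first_value_types(SAMPLE)
print(types)
# {'ID': 'INTEGER', 'txt1': 'TEXT', 'txt2': 'INTEGER', 'int1': 'INTEGER'}
```

The value `123` in the first data row fixes [txt2] as INTEGER, so the later value `abc` is never examined before the table is created.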
You could investigate tweaking your code to look at more than the first data row. (Microsoft's own import routines often default to the first eight rows when attempting to auto-detect data types.) You could also just import all columns as text and then convert them later in SQL Server.
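One way to look at more than the first data row is to treat a column as integer only if every non-empty value in the sample is an integer. The sketch below makes several assumptions: the names `looks_like_int`, `infer_column_types`, `build_create_table` and the `max_rows` parameter are hypothetical, the target types `INT` and `VARCHAR(MAX)` are just one reasonable choice, and column names are quoted with square brackets.

```python
import csv
import io
import re

INT_PATTERN = re.compile(r"-?[0-9]+")
INT_MIN, INT_MAX = -2**31, 2**31 - 1  # range of SQL Server's INT type

def looks_like_int(value):
    """True if value is a plain integer literal that fits in a SQL Server INT."""
    return bool(INT_PATTERN.fullmatch(value)) and INT_MIN <= int(value) <= INT_MAX

def infer_column_types(csv_text, max_rows=None):
    """Return {column: SQL type}; max_rows=None scans every data row."""
    reader = csv.DictReader(io.StringIO(csv_text))
    fieldnames = list(reader.fieldnames or [])
    all_int = dict.fromkeys(fieldnames, True)
    seen_value = dict.fromkeys(fieldnames, False)
    for index, row in enumerate(reader):
        if max_rows is not None and index >= max_rows:
            break
        for name in fieldnames:
            value = (row[name] or "").strip()
            if not value:
                continue  # empty cells carry no type information
            seen_value[name] = True
            if not looks_like_int(value):
                all_int[name] = False
    # Columns with no data at all fall back to text rather than raising.
    return {
        name: "INT" if seen_value[name] and all_int[name] else "VARCHAR(MAX)"
        for name in fieldnames
    }

def build_create_table(table, column_types):
    """Build a CREATE TABLE statement with bracket-quoted identifiers."""
    def quote(identifier):
        return "[%s]" % identifier.replace("]", "]]")
    cols = ", ".join(
        "%s %s" % (quote(name), sql_type) for name, sql_type in column_types.items()
    )
    return "CREATE TABLE %s (%s)" % (quote(table), cols)

SAMPLE = "ID,txt1,txt2,int1\n1,foo,123,3\n2,bar,abc,4\n"
statement = build_create_table("ads", infer_column_types(SAMPLE))
print(statement)
# CREATE TABLE [ads] ([ID] INT, [txt1] VARCHAR(MAX), [txt2] VARCHAR(MAX), [int1] INT)
```

Sampling only part of the file keeps the original risk: with `max_rows=1` the [txt2] column is still typed as INT. At the other extreme, `max_rows=0` types every column as `VARCHAR(MAX)`, which corresponds to importing everything as text and converting later.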
However, given that there must be hundreds, if not thousands, of examples out there for importing CSV data into SQL Server, you should also consider doing a more exhaustive search for existing (debugged) code before you continue investing time and effort into "rolling your own solution".
Source: https://stackoverflow.com/questions/37418527/pyodbc-dataerror-22018-22018-microsoftodbc-sql-server-driversql-serv