Question
I'm trying to import a CSV into MySQL using odo but am getting a datashape error.
My understanding is that datashape takes the format:
var * {
column: type
...
}
where var means a variable number of rows. I'm getting the following error:
AssertionError: datashape must be Record type, got 0 * {
tod: ?string,
interval: ?string,
iops: float64,
mb_per_sec: float64
}
I'm not sure where the 0 for the number of rows is coming from. I've tried explicitly setting the datashape using dshape(), but I continue to get the same error.
Here's a stripped-down version of the code that recreates the error:
from odo import odo
odo('test.csv', mysql_database_uri)
I'm running Ubuntu 16.04 and Python 3.6.1 using Conda.
Thanks for any input.
Answer 1:
I had this error and needed to specify the table in the target URI:
# error
odo('data.csv', 'postgresql://usr:pwd@ip/db')
# works
odo('data.csv', 'postgresql://usr:pwd@ip/db::table')
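The fix above amounts to appending `::table` to the database URI. Here is a minimal sketch of a helper that does this. The name `table_uri` and the table name `iops_stats` are made up for illustration and are not part of odo.

```python
def table_uri(database_uri: str, table: str) -> str:
    """Build a 'database_uri::table' target string (hypothetical helper, not an odo API)."""
    if not table:
        raise ValueError("a table name is required")
    # Only inspect the part after the last '/', so a host such as [::1] is not misread.
    last_segment = database_uri.rsplit("/", 1)[-1]
    if "::" in last_segment:
        raise ValueError("the URI already names a table")
    return f"{database_uri}::{table}"


# Intended use with the question's code (assumes odo is installed and the database is reachable):
#   odo('test.csv', table_uri(mysql_database_uri, 'iops_stats'))
```

Raising an error when the URI already names a table is only one possible design; returning the URI unchanged would work as well.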
Answer 2:
Try replacing
odo('test.csv', mysql_database_uri)
with
odo(pandas.read_csv('test.csv'), mysql_database_uri)
Answer 3:
Odo seems to be buggy and discontinued. As an alternative, you can use d6tstack, which offers fast pandas-to-SQL functionality because it uses native DB import commands. It supports Postgres, MySQL, and MS SQL:
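import glob
import d6tstack.combine_csv
cfg_uri_mysql = 'mysql+mysqlconnector://testusr:testpwd@localhost/testdb'
# apply_fun is a user-defined function applied to each DataFrame after it is read
d6tstack.combine_csv.CombinerCSV(glob.glob('*.csv'),
    apply_after_read=apply_fun).to_mysql_combine(cfg_uri_mysql, 'table')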
It is also particularly useful for importing multiple CSV files whose data schema changes, and/or for preprocessing with pandas before writing to the database; see further down in the examples notebook.
Source: https://stackoverflow.com/questions/44598799/python-odo-sql-assertionerror-datashape-must-be-record-type-got-0