Question:
I can connect to my local mysql database from python, and I can create, select from, and insert individual rows.
My question is: can I directly instruct mysqldb to take an entire dataframe and insert it into an existing table, or do I need to iterate over the rows?
In either case, what would the python script look like for a very simple table with ID and two data columns, and a matching dataframe?
Answer 1:
Update:
There is now a to_sql method, which is the preferred way to do this, rather than write_frame:
df.to_sql(con=con, name='table_name_for_df', if_exists='replace', flavor='mysql')
Also note: the syntax may change in pandas 0.14...
You can set up the connection with MySQLdb:
from pandas.io import sql
import MySQLdb
con = MySQLdb.connect()  # may need to add some other options to connect
Setting the flavor of write_frame to 'mysql' means you can write to mysql:
sql.write_frame(df, con=con, name='table_name_for_df', if_exists='replace', flavor='mysql')
The if_exists argument tells pandas what to do if the table already exists:
if_exists: {'fail', 'replace', 'append'}, default 'fail'
- fail: If the table exists, do not write anything (to_sql raises a ValueError in this case).
- replace: If the table exists, drop it, recreate it, and insert the data.
- append: If the table exists, insert the data. Create the table if it does not exist.
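The three if_exists modes can be tried without a MySQL server. The following is a minimal sketch that assumes a reasonably recent pandas and uses an in-memory sqlite3 connection as a stand-in for MySQL; the table name demo and the sample data are made up for illustration.

```python
import sqlite3

import pandas as pd

# Stand-in database: in-memory SQLite instead of MySQL (assumption for the demo).
con = sqlite3.connect(":memory:")
df = pd.DataFrame({"id": [1, 2], "value": [10.0, 20.0]})

# The table does not exist yet, so the default mode ('fail') simply creates it.
df.to_sql("demo", con, index=False)

# 'append' adds the rows to the existing table.
df.to_sql("demo", con, index=False, if_exists="append")
rows_after_append = con.execute("SELECT COUNT(*) FROM demo").fetchone()[0]

# 'replace' drops the table, recreates it, and inserts the rows again.
df.to_sql("demo", con, index=False, if_exists="replace")
rows_after_replace = con.execute("SELECT COUNT(*) FROM demo").fetchone()[0]

# 'fail' refuses to touch an existing table and raises ValueError.
try:
    df.to_sql("demo", con, index=False, if_exists="fail")
    fail_raised = False
except ValueError:
    fail_raised = True

print(rows_after_append, rows_after_replace, fail_raised)
```

With a MySQL target, only the con argument should need to change; the if_exists values are the same.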
Although the write_frame docs currently suggest it only works on sqlite, mysql appears to be supported and in fact there is quite a bit of mysql testing in the codebase.
Answer 2:
You might output your DataFrame as a CSV file and then use mysqlimport to import the CSV into your MySQL database.
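A rough sketch of the CSV route follows. mysqlimport derives the target table name from the file name with its extension stripped, so the file is named after the table. The table name my_table, the sample data, and the USER and DATABASE_NAME placeholders are assumptions for illustration. The script only prints the command rather than running it.

```python
import os
import tempfile

import pandas as pd

# Sample frame matching the question: an ID and two data columns (made-up values).
df = pd.DataFrame({"id": [1, 2, 3], "col_a": [0.1, 0.2, 0.3], "col_b": ["x", "y", "z"]})

# Hypothetical target table; mysqlimport takes the table name from the file name.
table_name = "my_table"
csv_path = os.path.join(tempfile.mkdtemp(), table_name + ".csv")

# Write the frame without the pandas index so the columns match the table.
df.to_csv(csv_path, index=False)

# Command to run by hand; USER and DATABASE_NAME are placeholders.
# --ignore-lines=1 skips the CSV header row, --password prompts for the password.
command = (
    "mysqlimport --local --fields-terminated-by=, --ignore-lines=1 "
    "--user=USER --password DATABASE_NAME " + csv_path
)
print(command)
```

The table must already exist with matching columns, since mysqlimport loads data but does not create tables.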
EDIT
It seems pandas's built-in sql util provides a write_frame function, but it only works with sqlite.
I found something useful, you might try this
Answer 3:
The to_sql method works for me.
However, keep in mind that it looks like passing a DBAPI connection with the 'mysql' flavor is going to be deprecated in favor of SQLAlchemy:
FutureWarning: The 'mysql' flavor with DBAPI connection is deprecated and will be removed in future versions. MySQL will be further supported with SQLAlchemy connectables.
  chunksize=chunksize, dtype=dtype)
Answer 4:
Andy Hayden mentioned the correct function (to_sql). In this answer, I'll give a complete example, which I tested with Python 3.5 but which should also work with Python 2.7 (and other Python 3.x versions):
First, let's create the dataframe:
# Create dataframe
import pandas as pd
import numpy as np
np.random.seed(0)
number_of_samples = 10
frame = pd.DataFrame({
    'feature1': np.random.random(number_of_samples),
    'feature2': np.random.random(number_of_samples),
    'class': np.random.binomial(2, 0.1, size=number_of_samples),
}, columns=['feature1', 'feature2', 'class'])
print(frame)
Which gives:
   feature1  feature2  class
0  0.548814  0.791725      1
1  0.715189  0.528895      0
2  0.602763  0.568045      0
3  0.544883  0.925597      0
4  0.423655  0.071036      0
5  0.645894  0.087129      0
6  0.437587  0.020218      0
7  0.891773  0.832620      1
8  0.963663  0.778157      0
9  0.383442  0.870012      0
To import this dataframe into a MySQL table:
# Import dataframe into MySQL
import sqlalchemy
database_username = 'ENTER USERNAME'
database_password = 'ENTER USERNAME PASSWORD'
database_ip = 'ENTER DATABASE IP'
database_name = 'ENTER DATABASE NAME'
database_connection = sqlalchemy.create_engine(
    'mysql+mysqlconnector://{0}:{1}@{2}/{3}'.format(
        database_username, database_password, database_ip, database_name))
frame.to_sql(con=database_connection, name='table_name_for_df', if_exists='replace')
One trick is that MySQLdb doesn't work with Python 3.x. So instead we use mysqlconnector, which may be installed as follows:
pip install mysql-connector==2.1.4 # version avoids Protobuf error
Note that to_sql creates the table as well as the columns if they do not already exist in the database.
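The table-creation behaviour can be checked locally. This is a minimal sketch that assumes an in-memory sqlite3 connection as a stand-in for the MySQL engine. The two sample rows are made up, and PRAGMA table_info is specific to SQLite (on MySQL, DESCRIBE would play the same role).

```python
import sqlite3

import pandas as pd

# Stand-in for the MySQL engine: an empty in-memory SQLite database (assumption).
con = sqlite3.connect(":memory:")

# Small made-up frame with the same column names as the example above.
frame = pd.DataFrame({"feature1": [0.5, 0.7], "feature2": [0.8, 0.5], "class": [1, 0]})

# No CREATE TABLE was issued beforehand; to_sql creates the table and columns.
frame.to_sql("table_name_for_df", con, index=False)

# Inspect the schema that to_sql generated (SQLite-specific statement).
column_names = [row[1] for row in con.execute("PRAGMA table_info(table_name_for_df)")]
print(column_names)

# Read the rows back to confirm they were inserted.
round_trip = pd.read_sql_query("SELECT * FROM table_name_for_df", con)
print(round_trip)
```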
Answer 5:
Python 2 + 3
Prerequisites
- Pandas
- MySQL server
- sqlalchemy
- pymysql: pure python mysql client
Code
from pandas.io import sql
from sqlalchemy import create_engine
engine = create_engine("mysql+pymysql://{user}:{pw}@localhost/{db}"
                       .format(user="root", pw="your_password", db="pandas"))
df.to_sql(con=engine, name='table_name', if_exists='replace')
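One caveat with formatting credentials straight into the URL: a password containing characters such as @ or / would break URL parsing unless it is percent-encoded first. Below is a small sketch using only the standard library (Python 3). The helper name build_mysql_url and the sample password are hypothetical.

```python
from urllib.parse import quote_plus


def build_mysql_url(user, password, host, database, driver="pymysql"):
    """Build a SQLAlchemy-style MySQL URL with percent-encoded credentials.

    Hypothetical helper for illustration; not part of pandas or SQLAlchemy.
    """
    return "mysql+{driver}://{user}:{pw}@{host}/{db}".format(
        driver=driver,
        user=quote_plus(user),
        pw=quote_plus(password),  # '@' becomes %40, '/' becomes %2F
        host=host,
        db=database,
    )


url = build_mysql_url("root", "p@ss/word", "localhost", "pandas")
print(url)  # mysql+pymysql://root:p%40ss%2Fword@localhost/pandas
```

The resulting string can be passed to create_engine in place of the hand-formatted URL.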
Answer 6:
You can do it by using pymysql:
For example, suppose you have a MySQL server with the following user, password, host and port, and you want to write to the database 'data_2', whether or not it already exists.
import pymysql
user = 'root'
passw = 'my-secret-pw-for-mysql-12ud'
host = '172.17.0.2'
port = 3306
database = 'data_2'
If you already have the database created:
conn = pymysql.connect(host=host, port=port, user=user, passwd=passw, db=database, charset='utf8')
data.to_sql(name=database, con=conn, if_exists = 'replace', index=False, flavor = 'mysql')
If you do NOT have the database created (this also works when the database is already there):
conn = pymysql.connect(host=host, port=port, user=user, passwd=passw)
conn.cursor().execute("CREATE DATABASE IF NOT EXISTS {0} ".format(database))
conn = pymysql.connect(host=host, port=port, user=user, passwd=passw, db=database, charset='utf8')
data.to_sql(name=database, con=conn, if_exists = 'replace', index=False, flavor = 'mysql')
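If to_sql is not an option with a raw DBAPI connection (the flavor argument is no longer accepted by recent pandas releases), rows can be inserted into an existing table with the cursor's executemany, which avoids an explicit Python loop over the rows. This is a sketch under stated assumptions. sqlite3 stands in for pymysql so the snippet runs anywhere, the helper insert_rows and the table demo are hypothetical, and with MySQLdb or pymysql the placeholder would be %s instead of ?.

```python
import sqlite3


def insert_rows(conn, table, columns, rows, placeholder="?"):
    """Insert many rows into an existing table through a DB-API connection.

    Hypothetical helper. Only the values are parameterized; the table and
    column names are interpolated, so they must come from trusted code.
    """
    column_list = ", ".join(columns)
    placeholders = ", ".join([placeholder] * len(columns))
    statement = "INSERT INTO {0} ({1}) VALUES ({2})".format(table, column_list, placeholders)
    cursor = conn.cursor()
    cursor.executemany(statement, rows)  # one call for all rows
    conn.commit()


# Stand-in for pymysql.connect(...): an in-memory SQLite database (assumption).
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE demo (id INTEGER PRIMARY KEY, col_a REAL, col_b REAL)")

# With a DataFrame, list(df.itertuples(index=False, name=None)) yields rows like these.
rows = [(1, 0.1, 1.5), (2, 0.2, 2.5), (3, 0.3, 3.5)]
insert_rows(conn, "demo", ["id", "col_a", "col_b"], rows)

stored = conn.execute("SELECT id, col_a, col_b FROM demo ORDER BY id").fetchall()
print(stored)
```

Unlike if_exists='replace', this leaves the existing table definition untouched and only adds rows.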
Similar threads:
- Writing to MySQL database with pandas using SQLAlchemy, to_sql
- Writing a Pandas Dataframe to MySQL