How to insert pandas dataframe via mysqldb into database?

Anonymous (unverified), submitted 2019-12-03 02:08:02

Question:

I can connect to my local mysql database from python, and I can create, select from, and insert individual rows.

My question is: can I directly instruct mysqldb to take an entire dataframe and insert it into an existing table, or do I need to iterate over the rows?

In either case, what would the python script look like for a very simple table with ID and two data columns, and a matching dataframe?

Answer 1:

Update:

There is now a to_sql method, which is the preferred way to do this, rather than write_frame:

df.to_sql(con=con, name='table_name_for_df', if_exists='replace', flavor='mysql')

Also note: the syntax may change in pandas 0.14...

You can set up the connection with MySQLdb:

from pandas.io import sql
import MySQLdb
con = MySQLdb.connect()  # may need to add some other options to connect

Setting the flavor of write_frame to 'mysql' means you can write to mysql:

sql.write_frame(df, con=con, name='table_name_for_df',
                if_exists='replace', flavor='mysql')

The if_exists argument tells pandas what to do if the table already exists:

if_exists: {'fail', 'replace', 'append'}, default 'fail'
     fail: If table exists, do nothing.
     replace: If table exists, drop it, recreate it, and insert data.
     append: If table exists, insert data. Create if does not exist.
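
As a rough illustration of the three if_exists modes, here is a minimal sketch that uses an in-memory SQLite database as a stand-in for MySQL (an assumption made only so the example runs without a server). The table name demo, the column names, and the row_count helper are all hypothetical. Note that in recent pandas versions 'fail' raises a ValueError rather than silently doing nothing.

```python
import sqlite3

import pandas as pd

# In-memory SQLite stands in for MySQL here (assumption for the demo only).
con = sqlite3.connect(":memory:")
df = pd.DataFrame({"id": [1, 2], "col_a": [0.1, 0.2], "col_b": [1.5, 2.5]})


def row_count(connection, table):
    """Return the number of rows currently stored in `table` (hypothetical helper)."""
    return connection.execute(f"SELECT COUNT(*) FROM {table}").fetchone()[0]


# First write: the table does not exist yet, so it is created.
df.to_sql("demo", con, index=False)
count_created = row_count(con, "demo")

# if_exists='fail' (the default): a second write to the same table raises.
try:
    df.to_sql("demo", con, index=False, if_exists="fail")
    fail_raised = False
except ValueError:
    fail_raised = True

# 'append' keeps the existing rows and adds the new ones.
df.to_sql("demo", con, index=False, if_exists="append")
count_appended = row_count(con, "demo")

# 'replace' drops the table, recreates it, and inserts only the new rows.
df.to_sql("demo", con, index=False, if_exists="replace")
count_replaced = row_count(con, "demo")

print(count_created, fail_raised, count_appended, count_replaced)
```

For inserting into an existing table without touching its current rows, 'append' together with index=False is probably the combination to reach for.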

Although the write_frame docs currently suggest it only works on sqlite, mysql appears to be supported and in fact there is quite a bit of mysql testing in the codebase.



Answer 2:

You might output your DataFrame as a csv file and then use mysqlimport to import your csv into your mysql.
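
A minimal sketch of the CSV route, under some assumptions: the target table is called my_table (hypothetical), and the file is named after it because mysqlimport derives the table name from the file name. The shell command in the comment uses placeholder credentials and has not been run here; treat it as a starting point rather than a tested recipe.

```python
import os
import tempfile

import pandas as pd

df = pd.DataFrame({"id": [1, 2], "col_a": [0.1, 0.2], "col_b": [1.5, 2.5]})

# mysqlimport strips the extension from the file name and uses the rest
# as the table name, so the file is named after the hypothetical `my_table`.
out_dir = tempfile.mkdtemp()
csv_path = os.path.join(out_dir, "my_table.csv")
df.to_csv(csv_path, index=False)  # index=False: do not write the row index

# Then, from a shell (USER and DATABASE are placeholders, not real values):
#   mysqlimport --local --fields-terminated-by=, --ignore-lines=1 \
#       -u USER -p DATABASE /path/to/my_table.csv
# --ignore-lines=1 skips the header row written by to_csv.

with open(csv_path) as fh:
    csv_lines = fh.read().splitlines()
print(csv_lines[0])
```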

EDIT

It seems pandas's built-in sql util provides a write_frame function, but it only works with sqlite.

I found something useful, you might try this



Answer 3:

The to_sql method works for me.

However, keep in mind that using it with the 'mysql' flavor and a plain DBAPI connection looks like it's going to be deprecated in favor of SQLAlchemy:

FutureWarning: The 'mysql' flavor with DBAPI connection is deprecated and will be removed in future versions. MySQL will be further supported with SQLAlchemy connectables.
  chunksize=chunksize, dtype=dtype)
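
If you want to stay on a raw DBAPI connection, the other option raised in the question is to insert the rows yourself. The sketch below batches them with executemany instead of looping over single inserts. It runs against an in-memory SQLite database purely as a stand-in (assumption), and the insert_rows helper, the table name demo, and the column names are hypothetical. With MySQLdb or pymysql the placeholder would be "%s" rather than "?".

```python
import sqlite3


def insert_rows(con, table, columns, rows, placeholder="?"):
    """Insert an iterable of row tuples into an existing table in one batch.

    `table` and `columns` are interpolated into the SQL text, so they must
    come from trusted code, never from user input. The values themselves go
    through DBAPI parameter binding.
    """
    column_list = ", ".join(columns)
    marks = ", ".join([placeholder] * len(columns))
    statement = f"INSERT INTO {table} ({column_list}) VALUES ({marks})"
    cursor = con.cursor()
    cursor.executemany(statement, list(rows))
    con.commit()


# Stand-in for the existing table with an ID and two data columns.
con = sqlite3.connect(":memory:")
con.execute("CREATE TABLE demo (id INTEGER PRIMARY KEY, col_a REAL, col_b REAL)")

rows = [(1, 0.1, 1.5), (2, 0.2, 2.5)]
insert_rows(con, "demo", ["id", "col_a", "col_b"], rows)

stored = con.execute("SELECT id, col_a, col_b FROM demo ORDER BY id").fetchall()
print(stored)
```

With a DataFrame, the rows argument could be df.itertuples(index=False, name=None), which yields one plain tuple per row.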


Answer 4:

Andy Hayden mentioned the correct function (to_sql). In this answer, I'll give a complete example, which I tested with Python 3.5 but should also work for Python 2.7 (and Python 3.x):

First, let's create the dataframe:

# Create dataframe
import pandas as pd
import numpy as np
np.random.seed(0)
number_of_samples = 10
frame = pd.DataFrame({
    'feature1': np.random.random(number_of_samples),
    'feature2': np.random.random(number_of_samples),
    'class':    np.random.binomial(2, 0.1, size=number_of_samples),
    }, columns=['feature1', 'feature2', 'class'])
print(frame)

Which gives:

   feature1  feature2  class
0  0.548814  0.791725      1
1  0.715189  0.528895      0
2  0.602763  0.568045      0
3  0.544883  0.925597      0
4  0.423655  0.071036      0
5  0.645894  0.087129      0
6  0.437587  0.020218      0
7  0.891773  0.832620      1
8  0.963663  0.778157      0
9  0.383442  0.870012      0

To import this dataframe into a MySQL table:

# Import dataframe into MySQL
import sqlalchemy
database_username = 'ENTER USERNAME'
database_password = 'ENTER USERNAME PASSWORD'
database_ip       = 'ENTER DATABASE IP'
database_name     = 'ENTER DATABASE NAME'
database_connection = sqlalchemy.create_engine(
    'mysql+mysqlconnector://{0}:{1}@{2}/{3}'.format(
        database_username, database_password,
        database_ip, database_name))
frame.to_sql(con=database_connection, name='table_name_for_df', if_exists='replace')

One catch is that the original MySQLdb package doesn't work with Python 3.x, so we use mysqlconnector instead, which may be installed as follows:

pip install mysql-connector==2.1.4  # version avoids Protobuf error

Note that to_sql creates the table as well as the columns if they do not already exist in the database.



Answer 5:

Python 2 + 3

Prerequisites

  • Pandas
  • MySQL server
  • sqlalchemy
  • pymysql: pure python mysql client

Code

from pandas.io import sql
from sqlalchemy import create_engine
engine = create_engine("mysql+pymysql://{user}:{pw}@localhost/{db}"
                       .format(user="root",
                               pw="your_password",
                               db="pandas"))
df.to_sql(con=engine, name='table_name', if_exists='replace')
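
One thing to watch when building the connection URL with plain string formatting: a password containing characters such as @, / or : will break URL parsing unless it is percent-encoded. The sketch below shows one way to do that with the standard library. The mysql_url helper and the sample credentials are hypothetical, and the resulting string is only meant to be passed on to create_engine.

```python
from urllib.parse import quote_plus


def mysql_url(user, password, host, database, driver="pymysql"):
    """Build a SQLAlchemy-style MySQL URL, percent-encoding the credentials."""
    return "mysql+{driver}://{user}:{pw}@{host}/{db}".format(
        driver=driver,
        user=quote_plus(user),
        pw=quote_plus(password),
        host=host,
        db=database,
    )


# Sample values only; the password is made up to include special characters.
url = mysql_url("root", "p@ss/w:rd", "localhost", "pandas")
print(url)  # mysql+pymysql://root:p%40ss%2Fw%3Ard@localhost/pandas
```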


Answer 6:

You can do it by using pymysql:

For example, suppose you have a MySQL server with the following user, password, host and port, and you want to write to the database 'data_2', whether or not it already exists.

import pymysql
user = 'root'
passw = 'my-secret-pw-for-mysql-12ud'
host = '172.17.0.2'
port = 3306
database = 'data_2'

If you already have the database created:

conn = pymysql.connect(host=host,
                       port=port,
                       user=user,
                       passwd=passw,
                       db=database,
                       charset='utf8')
data.to_sql(name=database, con=conn, if_exists='replace', index=False, flavor='mysql')

If the database has NOT been created yet (this also works when it already exists):

conn = pymysql.connect(host=host, port=port, user=user, passwd=passw)
conn.cursor().execute("CREATE DATABASE IF NOT EXISTS {0} ".format(database))
conn = pymysql.connect(host=host,
                       port=port,
                       user=user,
                       passwd=passw,
                       db=database,
                       charset='utf8')
data.to_sql(name=database, con=conn, if_exists='replace', index=False, flavor='mysql')

Similar threads:

  1. Writing to MySQL database with pandas using SQLAlchemy, to_sql
  2. Writing a Pandas Dataframe to MySQL

