How to use `charset` and `encoding` in `create_engine` of SQLAlchemy (to create a pandas dataframe)?

Posted by 坚强是说给别人听的谎言 on 2019-12-01 01:50:48

Question


I am very confused with the way charset and encoding work in SQLAlchemy. I understand (and have read) the difference between charsets and encodings, and I have a good picture of the history of encodings.

I have a table in MySQL in latin1_swedish_ci (Why? Possibly because of this). I need to create a pandas dataframe in which I get the proper characters (and not weird symbols). Initially, this was in the code:

import pandas
from sqlalchemy import create_engine

connect_engine = create_engine('mysql://user:password@1.1.1.1/db')
sql_query = "select * from table1"
df = pandas.read_sql(sql_query, connect_engine)

We started having trouble with the Š character: instead of u'\u0160', its Unicode code point, we get '\x8a'. I expected this to work:

connect_engine = create_engine('mysql://user:password@1.1.1.1/db', encoding='utf8') 

but I keep getting '\x8a', which, I realized, makes sense given that utf8 is already the default for the encoding parameter. So I then tried encoding='latin1' to tackle the problem:

connect_engine = create_engine('mysql://user:password@1.1.1.1/db', encoding='latin1')

but I still get the same '\x8a'. To be clear, in both cases (encoding='utf8' and encoding='latin1'), I can do mystring.decode('latin1') but not mystring.decode('utf8').
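The behaviour described above can be reproduced without a database. The following is a minimal Python 3 sketch (the question itself uses Python 2 byte strings); the variable names are made up for illustration. It relies on the fact that MySQL's latin1 is really cp1252, whereas Python's "latin1" codec is ISO-8859-1, so the same byte decodes differently.

```python
# The single byte the asker keeps getting back for the Š character.
raw = b"\x8a"

# MySQL's latin1 is cp1252, where 0x8A is LATIN CAPITAL LETTER S WITH CARON.
as_cp1252 = raw.decode("cp1252")

# Python's "latin1" codec is ISO-8859-1, where 0x8A is a C1 control character.
# The decode succeeds, but the result is not the letter Š.
as_iso_8859_1 = raw.decode("latin1")

# A lone 0x8A is a UTF-8 continuation byte, so decoding it as UTF-8 fails.
try:
    raw.decode("utf8")
    utf8_ok = True
except UnicodeDecodeError:
    utf8_ok = False

print(repr(as_cp1252), repr(as_iso_8859_1), utf8_ok)
```

This suggests why decode('latin1') "works" while decode('utf8') raises: the first accepts any byte, the second rejects a stray continuation byte.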

And then I rediscovered the charset parameter in the connection string, i.e. 'mysql://user:password@1.1.1.1/db?charset=latin1'. After trying all possible combinations of charset and encoding, I found that this one works:

connect_engine = create_engine('mysql://user:password@1.1.1.1/db?charset=utf8')

I would appreciate it if somebody could explain to me how to correctly use charset in the connection string and the encoding parameter in create_engine.


Answer 1:


encoding is the codec used for encoding/decoding within SQLAlchemy. From the documentation:

For those scenarios where the DBAPI is detected as not supporting a Python unicode object, this encoding is used to determine the source/destination encoding. It is not used for those cases where the DBAPI handles unicode directly.

[...]

To properly configure a system to accommodate Python unicode objects, the DBAPI should be configured to handle unicode to the greatest degree as is appropriate [...]

mysql-python handles unicode directly, so there's no need to use this setting.

charset is a setting specific to the mysql-python driver. From the documentation:

This charset is the client character set for the connection.

This setting controls three session variables on the server (character_set_client, character_set_connection and character_set_results); character_set_results is the one you are interested in. When charset is set, strings are returned as unicode objects.
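To make the role of character_set_results more concrete, here is a rough pure-Python model of the conversion. It is only a sketch: simulate_result_conversion is a hypothetical helper, not part of MySQL, mysql-python or SQLAlchemy, and it assumes the column holds genuine latin1 (cp1252) data.

```python
def simulate_result_conversion(stored_bytes, column_charset, results_charset):
    """Hypothetical model of the server side: interpret the stored bytes in
    the column's charset, then re-encode them in character_set_results."""
    text = stored_bytes.decode(column_charset)
    return text.encode(results_charset)


stored = b"\x8a"  # 'Š' as stored in a MySQL latin1 (cp1252) column

# With ?charset=utf8 the results are sent as UTF-8 and decoded as UTF-8.
wire_utf8 = simulate_result_conversion(stored, "cp1252", "utf8")
value_utf8 = wire_utf8.decode("utf8")

# When the results charset matches the column, the raw byte comes back unchanged.
wire_latin1 = simulate_result_conversion(stored, "cp1252", "cp1252")

print(wire_utf8, repr(value_utf8), wire_latin1)
```

In this model the application never has to know how the column is stored; it only has to agree with the server on the connection charset.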

Note that this applies only if the data in the database really is latin1-encoded. If UTF-8 bytes were stored in a latin1 column, you may have better luck using encoding instead.
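The "UTF-8 bytes stored as latin1" case can be illustrated in plain Python 3. This is a sketch of the general mojibake round trip, not something taken from the question; the variable names are illustrative and cp1252 stands in for MySQL's latin1.

```python
original = "\u0160"  # the character Š

# An application sent UTF-8 bytes over a connection declared as latin1,
# so each byte was treated as one separate cp1252 character.
utf8_bytes = original.encode("utf8")
misread = utf8_bytes.decode("cp1252")

# Undo the damage: turn the characters back into bytes, then decode as UTF-8.
repaired = misread.encode("cp1252").decode("utf8")

print(repr(misread), repr(repaired))
```

If reading with the correct charset yields two-character sequences like this instead of Š, the stored data was probably written this way.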




Answer 2:


The encoding parameter does not work as expected here.

So, as @doru said in this link, you should add ?charset=utf8mb4 at the end of the connection string, like this:

connect_string = 'mysql+pymysql://{}:{}@{}:{}/{}?charset=utf8mb4'.format(DB_USER, DB_PASS, DB_HOST, DB_PORT, DATABASE)
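One caveat when formatting credentials into the URL: characters such as @ or / in the user name or password need to be percent-encoded, otherwise the URL is parsed incorrectly. A minimal sketch using only the standard library follows; build_mysql_url and the sample credentials are hypothetical, not part of SQLAlchemy.

```python
from urllib.parse import quote_plus


def build_mysql_url(user, password, host, port, database, charset="utf8mb4"):
    """Hypothetical helper: build a connection string with escaped
    credentials and an explicit connection charset."""
    return "mysql+pymysql://{}:{}@{}:{}/{}?charset={}".format(
        quote_plus(user), quote_plus(password), host, port, database, charset
    )


# Made-up credentials containing characters that must be escaped.
url = build_mysql_url("user", "p@ss/word", "1.1.1.1", 3306, "db")
print(url)
```

The resulting string can then be passed to create_engine in the same way as connect_string above.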



Answer 3:


This works for me.

from sqlalchemy import create_engine
from sqlalchemy.engine.url import URL

db_url = {
    'database': "dbname",
    'drivername': 'mysql',
    'username': 'myname',
    'password': 'mypassword',
    'host': '127.0.0.1',
    'query': {'charset': 'utf8'},  # the key-point setting
}

engine = create_engine(URL(**db_url), encoding="utf8")



Answer 4:


I had the same problem. I just added ?charset=utf8mb4 at the end of the URL.

Here is mine:

Before

SQL_ENGINE = sqlalchemy.create_engine('mysql+pymysql://'+MySQL.USER+':'+MySQL.PASSWORD+'@'+MySQL.HOST+':'+str(MySQL.PORT)+'/'+MySQL.DB_NAME)

After

SQL_ENGINE = sqlalchemy.create_engine('mysql+pymysql://'+MySQL.USER+':'+MySQL.PASSWORD+'@'+MySQL.HOST+':'+str(MySQL.PORT)+'/'+MySQL.DB_NAME + "?charset=utf8mb4")
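Some background on why utf8mb4 rather than utf8 may matter: MySQL's utf8 is an alias for utf8mb3, which stores at most three bytes per character, so characters outside the Basic Multilingual Plane (emoji, for example) need utf8mb4. The following sketch checks this in plain Python; fits_utf8mb3 is a hypothetical helper, not a MySQL or SQLAlchemy function.

```python
def fits_utf8mb3(text):
    """Hypothetical check: True if every character needs at most three
    bytes in UTF-8, i.e. it would fit MySQL's utf8 (utf8mb3) charset."""
    return all(len(ch.encode("utf-8")) <= 3 for ch in text)


s_caron = "\u0160"      # Š, two bytes in UTF-8
emoji = "\U0001F600"    # an emoji, four bytes in UTF-8

print(fits_utf8mb3(s_caron), fits_utf8mb3(emoji))
```

For the Š character from the question either charset is enough; utf8mb4 only makes a difference for four-byte characters.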


Source: https://stackoverflow.com/questions/45279863/how-to-use-charset-and-encoding-in-create-engine-of-sqlalchemy-to-create
