Import .csv files into SQL database using SQLite in Python

☆樱花仙子☆ submitted on 2021-02-08 07:32:59

Question


I have 2 .txt files, and I converted them into .csv files using https://convertio.co/csv-xlsx/. Now, I would like to import these two .csv files into two databases using SQLite in Python (UI is Jupyter Notebook). These two .csv files are labeled person.csv and person_votes.csv. So, I did it by following the code given here (Importing a CSV file into a sqlite3 database table using Python):

import sqlite3, csv

con = sqlite3.connect(":memory:")
cur = con.cursor()
cur.execute("CREATE TABLE person (personid STR,age STR,sex STR,primary_voting_address_id STR,state_code STR,state_fips STR,county_name STR,county_fips STR,city STR,zipcode STR, zip4 STR,  PRIMARY KEY(personid))") 

with open('person.csv','r') as person_table: # `with` statement available in 2.5+
    # csv.DictReader uses first line in file for column headings by default
    dr = csv.DictReader(person_table) # comma is default delimiter
#personid   age sex primary_voting_address_id   state_code  state_fips  county_name county_fips city    zipcode zip4
    to_db = [(i['personid'], i['age'], i['sex'], i['primary_voting_address_id'], i['state_code'], i['state_flips'], i['county_name'], i['county_fips'], i['city'], i['zipcode'], i['zip4']) for i in dr]

cur.executemany("INSERT INTO t (age, sex) VALUES (?, ?);", to_db)
con.commit()

When I execute the code above, I keep getting the error message "KeyError: 'personid'", and I don't understand why. Could someone please help?

Also, if I create another database table named to_db2 for the file person_votes.csv in the same Python file, would the following query give me all the common elements between two tables:

select ID from to_db, to_db2 WHERE to_db.ID ==  to_db2

The link to the two .csv files above is here: https://drive.google.com/open?id=0B-cyvC6eCsyCQThUeEtGcWdBbXc.


Answer 1:


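This works for me on Windows 10 and should work under Linux/Unix too. There are several problems:

  1. The last two rows of person.csv are not in the correct format, but this does not prevent the program from working. You can fix this with a text editor.
  2. person.csv uses tabs as the delimiter, not commas. With the default comma delimiter the whole header line is read as a single column name, which is why i['personid'] raises the KeyError.
  3. There is a typo in the line that starts with "to_db =": 'state_flips' should be 'state_fips'.
  4. There is a mismatch in the number of columns to import: the INSERT statement names 2 columns (age, sex), but each tuple in to_db has 11 values.
  5. The table name in executemany is wrong: t instead of person.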
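
The KeyError follows directly from problem 2. Here is a minimal sketch of what csv.DictReader sees with and without the tab delimiter. It uses a made-up three-column sample in place of person.csv, so the sample data is an assumption and not the real file:

```python
import csv
import io

# Hypothetical tab-delimited sample standing in for person.csv
sample = "personid\tage\tsex\n1\t34\tF\n"

# Default delimiter (comma): the whole header line becomes one column name
comma_reader = csv.DictReader(io.StringIO(sample))
comma_fields = comma_reader.fieldnames

# Tab delimiter: the header splits into the expected column names
tab_reader = csv.DictReader(io.StringIO(sample), delimiter="\t")
tab_fields = tab_reader.fieldnames
first_row = next(tab_reader)

print(comma_fields)
print(tab_fields)
print(first_row["personid"])
```

Printing dr.fieldnames right after creating the reader is a quick way to check which delimiter a file really uses.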

In addition, I create the database in a file (person.db) rather than in memory. The data is small enough that performance should not be a problem, and any changes you make are saved to disk.
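
To illustrate the difference, here is a rough sketch with a throwaway table. The table name t and the temporary path are made up for the example. A file-backed database keeps committed rows after the connection is closed, while each new ":memory:" connection starts empty:

```python
import os
import sqlite3
import tempfile

# Hypothetical database path in a temporary directory
db_path = os.path.join(tempfile.mkdtemp(), "demo.db")

# File-backed database: committed rows survive closing the connection
con = sqlite3.connect(db_path)
con.execute("CREATE TABLE t (x TEXT)")
con.execute("INSERT INTO t VALUES ('kept')")
con.commit()
con.close()

con = sqlite3.connect(db_path)
file_rows = con.execute("SELECT x FROM t").fetchall()
con.close()

# In-memory database: each new connection starts with no tables
mem = sqlite3.connect(":memory:")
mem.execute("CREATE TABLE t (x TEXT)")
mem.close()
mem = sqlite3.connect(":memory:")
mem_tables = mem.execute("SELECT name FROM sqlite_master").fetchall()
mem.close()

print(file_rows)
print(mem_tables)
```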

Here is my corrected file (you can do the other table yourself):

import sqlite3, csv

# con = sqlite3.connect(":memory:")
con = sqlite3.connect("person.db")
cur = con.cursor()
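# TEXT is used instead of STR: SQLite gives an unrecognized type name such as STR
# numeric affinity, so a value like the zipcode 02134 would be stored as the integer 2134
cur.execute("CREATE TABLE person (personid TEXT,age TEXT,sex TEXT,primary_voting_address_id TEXT,state_code TEXT,state_fips TEXT,county_name TEXT,county_fips TEXT,city TEXT,zipcode TEXT, zip4 TEXT,  PRIMARY KEY(personid))")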

with open('person.csv','r') as person_table:
    dr = csv.DictReader(person_table, delimiter='\t') # the file is tab-delimited, so override the default comma
    to_db = [(i['personid'], i['age'], i['sex'], i['primary_voting_address_id'], i['state_code'], i['state_fips'], i['county_name'], i['county_fips'], i['city'], i['zipcode'], i['zip4']) for i in dr]

cur.executemany("INSERT INTO person VALUES (?,?,?,?,?,?,?,?,?,?,?);", to_db)
con.commit()
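
As for the second part of the question: the query there refers to the Python lists (to_db, to_db2) rather than to table names, and its WHERE clause compares a column against a whole table. Below is a minimal sketch of one way to get the IDs common to both tables. It assumes person_votes also has a personid column, which I have not verified against the file, and uses cut-down tables with made-up rows:

```python
import sqlite3

con = sqlite3.connect(":memory:")
cur = con.cursor()

# Cut-down stand-ins for the two tables; the person_votes columns are assumed
cur.execute("CREATE TABLE person (personid TEXT PRIMARY KEY, age TEXT)")
cur.execute("CREATE TABLE person_votes (personid TEXT, election TEXT)")
cur.executemany(
    "INSERT INTO person VALUES (?, ?)",
    [("1", "34"), ("2", "51"), ("3", "27")],
)
cur.executemany(
    "INSERT INTO person_votes VALUES (?, ?)",
    [("2", "2016"), ("3", "2016"), ("3", "2012"), ("9", "2016")],
)

# IDs present in both tables; DISTINCT collapses people with several votes
cur.execute(
    "SELECT DISTINCT p.personid "
    "FROM person AS p "
    "JOIN person_votes AS v ON v.personid = p.personid "
    "ORDER BY p.personid"
)
common_ids = [row[0] for row in cur.fetchall()]
print(common_ids)
con.close()
```

Note that SQL uses a single = for comparison in the join condition, and the query names the tables as created with CREATE TABLE, not the Python variables that held the rows.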



Answer 2:


It looks like you might be missing some column names in your INSERT INTO ... statement.

It is also probably not good practice to leave the primary key as NULL.
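
To show why the NULL primary key matters, here is a small sketch. The table names unchecked and checked are made up for the example. In an ordinary SQLite table a non-INTEGER primary key silently accepts NULL unless the column is also declared NOT NULL:

```python
import sqlite3

con = sqlite3.connect(":memory:")

# Without NOT NULL, inserting only age leaves personid as NULL, and SQLite allows it
con.execute("CREATE TABLE unchecked (personid TEXT PRIMARY KEY, age TEXT)")
con.execute("INSERT INTO unchecked (age) VALUES ('34')")
con.execute("INSERT INTO unchecked (age) VALUES ('51')")
null_keys = con.execute(
    "SELECT COUNT(*) FROM unchecked WHERE personid IS NULL"
).fetchone()[0]

# With NOT NULL, the same kind of insert is rejected
con.execute("CREATE TABLE checked (personid TEXT PRIMARY KEY NOT NULL, age TEXT)")
try:
    con.execute("INSERT INTO checked (age) VALUES ('34')")
    rejected = False
except sqlite3.IntegrityError:
    rejected = True

print(null_keys, rejected)
con.close()
```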



Source: https://stackoverflow.com/questions/46028456/import-csv-files-into-sql-database-using-sqlite-in-python
