How to import CSV file data into a PostgreSQL table?


Question


How can I write a stored procedure that imports data from a CSV file and populates the table?


Answer 1:


Take a look at this short article.


Solution paraphrased here:

Create your table:

CREATE TABLE zip_codes 
(ZIP char(5), LATITUDE double precision, LONGITUDE double precision, 
CITY varchar, STATE char(2), COUNTY varchar, ZIP_CLASS varchar);

Copy data from your CSV file to the table:

COPY zip_codes FROM '/path/to/csv/ZIP_CODES.txt' WITH (FORMAT csv);
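
If the first line of your file contains the column names, you can have COPY skip it with the HEADER option; a minimal sketch using the same hypothetical path:

COPY zip_codes FROM '/path/to/csv/ZIP_CODES.txt' WITH (FORMAT csv, HEADER);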



Answer 2:


If you don't have permission to use COPY (which works on the db server), you can use \copy instead (which works in the db client). Using the same example as Bozhidar Batsov:

Create your table:

CREATE TABLE zip_codes 
(ZIP char(5), LATITUDE double precision, LONGITUDE double precision, 
CITY varchar, STATE char(2), COUNTY varchar, ZIP_CLASS varchar);

Copy data from your CSV file to the table:

\copy zip_codes FROM '/path/to/csv/ZIP_CODES.txt' DELIMITER ',' CSV

You can also specify the columns to read:

\copy zip_codes(ZIP,CITY,STATE) FROM '/path/to/csv/ZIP_CODES.txt' DELIMITER ',' CSV



Answer 3:


One quick way of doing this is with the Python pandas library (version 0.15 or above works best). This will handle creating the columns for you, although obviously the choices it makes for data types might not be what you want. If it doesn't quite do what you want, you can always use the 'create table' code it generates as a template (see the sketch after the example below).

Here's a simple example:

import pandas as pd
df = pd.read_csv('mypath.csv')
df.columns = [c.lower() for c in df.columns]  # PostgreSQL folds unquoted identifiers to lowercase, so avoid capitals/spaces

from sqlalchemy import create_engine
engine = create_engine('postgresql://username:password@localhost:5432/dbname')

df.to_sql("my_table_name", engine)
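
If you want to inspect the 'create table' statement pandas would generate, to use as a template as mentioned above, one option is pandas' get_schema helper; a minimal sketch reusing the df from the example above:

print(pd.io.sql.get_schema(df, 'my_table_name'))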

And here's some code that shows you how to set various options:

# Set it so the raw sql output is logged
import logging
logging.basicConfig()
logging.getLogger('sqlalchemy.engine').setLevel(logging.INFO)

import sqlalchemy  # needed for the dtype mapping below

df.to_sql("my_table_name2",
          engine,
          if_exists="append",  # options are 'fail', 'replace', 'append'; default 'fail'
          index=False,  # do not output the index of the dataframe
          dtype={'col1': sqlalchemy.types.NUMERIC,
                 'col2': sqlalchemy.types.String})  # datatypes should be SQLAlchemy types



Answer 4:


You could also use pgAdmin, which offers a GUI to do the import. That's shown in this SO thread. The advantage of using pgAdmin is that it also works for remote databases.

Much like the previous solutions, though, you would need to have your table in the database already. Everyone has their own approach, but what I usually do is open the CSV in Excel, copy the headers, paste-special with transposition onto a different worksheet, put the corresponding data type in the next column, and then copy and paste that into a text editor together with the appropriate SQL table-creation query, like so:

CREATE TABLE my_table (
    /*paste data from Excel here for example ... */
    col_1 bigint,
    col_2 bigint,
    /* ... */
    col_n bigint 
);



Answer 5:


Most other solutions here require that you create the table in advance/manually. This may not be practical in some cases (e.g., if you have a lot of columns in the destination table). So, the approach below may come in handy.

Given the path and the column count of your CSV file, you can use the following function to load your data into a staging table that is then renamed to target_table:

The top row is assumed to have the column names.

create or replace function data.load_csv_file
(
    target_table text,
    csv_path text,
    col_count integer
)

returns void as $$

declare

iter integer; -- dummy integer to iterate columns with
col text; -- variable to keep the column name at each iteration
col_first text; -- first column name, e.g., top left corner on a csv file or spreadsheet

begin
    create table temp_table ();

    -- add just enough number of columns
    for iter in 1..col_count
    loop
        execute format('alter table temp_table add column col_%s text;', iter);
    end loop;

    -- copy the data from csv file
    execute format('copy temp_table from %L with delimiter '','' quote ''"'' csv ', csv_path);

    iter := 1;
    col_first := (select col_1 from temp_table limit 1);

    -- update the column names based on the first row which has the column names
    for col in execute format('select unnest(string_to_array(trim(temp_table::text, ''()''), '','')) from temp_table where col_1 = %L', col_first)
    loop
        -- %I quotes the new column name safely (it may contain spaces or capitals)
        execute format('alter table temp_table rename column col_%s to %I', iter, col);
        iter := iter + 1;
    end loop;

    -- delete the columns row
    execute format('delete from temp_table where %I = %L', col_first, col_first);

    -- change the temp table name to the name given as parameter, if not blank
    if length(target_table) > 0 then
        execute format('alter table temp_table rename to %I', target_table);
    end if;

end;

$$ language plpgsql;
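
A hypothetical invocation, assuming the seven-column ZIP_CODES.txt file from the earlier answers (with the column names in the top row):

select data.load_csv_file('zip_codes', '/path/to/csv/ZIP_CODES.txt', 7);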



Answer 6:


As Paul mentioned, import works in pgAdmin:

right click on table -> import

select local file, format and encoding

(The original answer included a German pgAdmin GUI screenshot here.)

You can do a similar thing with DbVisualizer (I have a license; I'm not sure about the free version):

right click on a table -> Import Table Data...




Answer 7:


COPY table_name FROM 'path/to/data.csv' DELIMITER ',' CSV HEADER;



Answer 8:


  1. Create a table first

  2. Then use the copy command to load the CSV data into it (a concrete example follows):

copy table_name (C1,C2,C3....)
from 'path to your csv file' delimiter ',' csv header;
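
For instance, with the zip_codes table from the earlier answers (path hypothetical, assuming the file has a header row):

copy zip_codes (zip, latitude, longitude, city, state, county, zip_class)
from '/path/to/csv/ZIP_CODES.txt' delimiter ',' csv header;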

Thanks




Answer 9:


From personal experience with PostgreSQL; I'm still waiting for a faster way.

1. Create the table skeleton first if the file is stored locally:

    drop table if exists ur_table;
    CREATE TABLE ur_table
    (
        id serial NOT NULL,
        log_id numeric, 
        proc_code numeric,
        date timestamp,
        qty int,
        name varchar,
        price money
    );
    COPY 
        ur_table(id, log_id, proc_code, date, qty, name, price)
    FROM '\path\xxx.csv' DELIMITER ',' CSV HEADER;

2. If the \path\xxx.csv file is somewhere PostgreSQL does not have permission to read (for example, on a remote server), you will have to import the .csv file through pgAdmin's built-in functionality.

Right click the table name and choose Import.

If you still have problems, please refer to this tutorial: http://www.postgresqltutorial.com/import-csv-file-into-posgresql-table/




Answer 10:


Use this SQL code:

    copy table_name(attribute1,attribute2,attribute3...)
    from 'E:\test.csv' delimiter ',' csv header

The header keyword lets the DBMS know that the csv file has a header row with attribute names.

For more, visit http://www.postgresqltutorial.com/import-csv-file-into-posgresql-table/




Answer 11:


IMHO, the most convenient way is to follow "Import CSV data into postgresql, the comfortable way ;-)", using csvsql from csvkit, which is a Python package installable via pip (a sketch follows).
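
A minimal sketch, with a hypothetical connection string, table name, and file:

pip install csvkit
csvsql --db "postgresql://username:password@localhost:5432/dbname" --tables my_table --insert my_file.csv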




Answer 12:


How to import CSV file data into a PostgreSQL table?

steps:

  1. Connect to the PostgreSQL database in the terminal

    psql -U postgres -h localhost
    
  2. Create a database

    create database mydb;
    
  3. Create a user

    create user siva with password 'mypass';
    
  4. Connect to the database

    \c mydb;
    
  5. Create a schema

    create schema trip;
    
  6. Create a table

    create table trip.test (
        VendorID int, passenger_count int, trip_distance decimal,
        RatecodeID int, store_and_fwd_flag varchar, PULocationID int,
        DOLocationID int, payment_type decimal, fare_amount decimal,
        extra decimal, mta_tax decimal, tip_amount decimal,
        tolls_amount int, improvement_surcharge decimal, total_amount decimal
    );
    
  7. Import the csv file data into the table (note that the COPY column list takes column names only, without data types)

    COPY trip.test(VendorID,passenger_count,trip_distance,RatecodeID,store_and_fwd_flag,PULocationID,DOLocationID,payment_type,fare_amount,extra,mta_tax,tip_amount,tolls_amount,improvement_surcharge,total_amount) FROM '/home/Documents/trip.csv' DELIMITER ',' CSV HEADER;
    
  8. Query the imported data

    select * from trip.test;
    



Answer 13:


In Python, you can use this code for automatic PostgreSQL table creation with column names:

import pandas, csv

from io import StringIO
from sqlalchemy import create_engine

def psql_insert_copy(table, conn, keys, data_iter):
    # get the raw DBAPI (psycopg2) connection out of the SQLAlchemy connection
    dbapi_conn = conn.connection
    with dbapi_conn.cursor() as cur:
        # write the batch of rows into an in-memory CSV buffer
        s_buf = StringIO()
        writer = csv.writer(s_buf)
        writer.writerows(data_iter)
        s_buf.seek(0)
        columns = ', '.join('"{}"'.format(k) for k in keys)
        if table.schema:
            table_name = '{}.{}'.format(table.schema, table.name)
        else:
            table_name = table.name
        # stream the buffer to the server with COPY ... FROM STDIN
        sql = 'COPY {} ({}) FROM STDIN WITH CSV'.format(table_name, columns)
        cur.copy_expert(sql=sql, file=s_buf)

engine = create_engine('postgresql://user:password@localhost:5432/my_db')

df = pandas.read_csv("my.csv")
df.to_sql('my_table', engine, schema='my_schema', method=psql_insert_copy)

It's also relatively fast; I can import more than 3.3 million rows in about 4 minutes.




Answer 14:


Create the table first, with the columns that appear in the csv file.

  1. Open pgAdmin, right click on the target table you want to load, select Import, and update the following settings in the File Options section

  2. Browse to your file in Filename

  3. Select csv as the Format

  4. Set the Encoding (ISO_8859_5 in this example)

Now go to Misc. Options, check Header, and click Import.




Answer 15:


If you need a simple mechanism to import from text / parse multiline CSV, you could use:

CREATE TABLE t   -- OR INSERT INTO tab(col_names)
AS
SELECT
   t.f[1] AS col1
  ,t.f[2]::int AS col2
  ,t.f[3]::date AS col3
  ,t.f[4] AS col4
FROM (
  SELECT regexp_split_to_array(l, ',') AS f
  FROM regexp_split_to_table(
$$a,1,2016-01-01,bbb
c,2,2018-01-01,ddd
e,3,2019-01-01,eee$$, '\n') AS l) t;

DBFiddle Demo




Answer 16:


I created a small tool that imports a csv file into PostgreSQL super easily: just one command, and it will create and populate the tables. Unfortunately, at the moment, all automatically created fields use the type TEXT:

csv2pg users.csv -d ";" -H 192.168.99.100 -U postgres -B mydatabase

The tool can be found on https://github.com/eduardonunesp/csv2pg




Answer 17:


You can also use pgfutter, or, even better, pgcsv.

pgfutter is quite buggy; I'd recommend pgcsv.

Here's how to do it with pgcsv:

sudo pip install pgcsv
pgcsv --db 'postgresql://localhost/postgres?user=postgres&password=...' my_table my_file.csv


Source: https://stackoverflow.com/questions/2987433/how-to-import-csv-file-data-into-a-postgresql-table
