I have a CSV file with 10 columns. After creating a PostgreSQL table with 4 columns, I want to copy some of the 10 columns into the table.
The columns of my CSV file are …
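For concreteness, suppose the target table was created along these lines (hypothetical column names and types, chosen to match the COPY column list in the answer below):

CREATE TABLE "table" (  -- "table" is a placeholder name, quoted because TABLE is a reserved word
    col1 text,
    col2 text,
    col3 text,
    col4 text
);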
Just arrived here in pursuit of a solution to load only a subset of columns, but apparently it's not possible. So, use awk (or cut) to extract the wanted columns to a new file new_file:
$ awk -F, -v OFS=, '{print $2, $5, $7, $10}' file > new_file
and load new_file. You could also pipe the output straight to psql:
$ cut -d, -f2,5,7,10 file |
  psql -h host -U user -c "COPY table(col1,col2,col3,col4) FROM STDIN DELIMITER ','" database
Notice COPY, not \COPY.
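For comparison, if you load the intermediate file from the client side instead of piping, the psql meta-command \copy reads a file local to the client. A sketch, using the same hypothetical host, user, and table names:

$ psql -h host -U user -d database \
    -c "\copy table(col1,col2,col3,col4) FROM 'new_file' DELIMITER ','"

The difference is that server-side COPY reads from the server's filesystem (or STDIN, as above), while \copy ships a client-side file over the connection.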
Update:
As pointed out in the comments, neither of the above examples can handle quoted delimiters in the data. The same goes for embedded newlines, since awk and cut are not CSV-aware. Quoted delimiters can be handled with GNU awk, though.
This is a three-column file:
$ cat file
1,"2,3",4
Using GNU awk's FPAT variable we can change the order of the fields (or get a subset of them) even when the quoted fields have field separators in them:
$ gawk 'BEGIN{FPAT="([^,]*)|(\"[^\"]+\")";OFS=","}{print $2,$1,$3}' file
"2,3",1,4
Explained:
$ gawk '
BEGIN {                              # instead of field separator FS
    FPAT = "([^,]*)|(\"[^\"]+\")"    # ... we define field pattern FPAT
    OFS = ","                        # output field separator OFS
}
{
    print $2, $1, $3                 # change field order
    # print $2                       # or get a subset of fields
}' file
Notice that FPAT is GNU awk only. For other awks it's just a regular variable.
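Putting it together for the original question, a rough sketch (same hypothetical host, user, and table names as above; the column positions 2, 5, 7, 10 are assumed): gawk selects the wanted fields while respecting quoted commas, and a CSV-format COPY parses the quoting on the server side. Embedded newlines inside quoted fields are still not handled, since gawk reads line by line.

$ gawk 'BEGIN{FPAT="([^,]*)|(\"[^\"]+\")";OFS=","}{print $2,$5,$7,$10}' file |
  psql -h host -U user \
       -c "COPY table(col1,col2,col3,col4) FROM STDIN WITH (FORMAT csv)" database

Because FPAT keeps the surrounding quotes on quoted fields, the piped output is still valid CSV, and COPY's csv format strips the quotes on insert.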