Copy records with dynamic column names

问题

I have two tables with different columns in PostgreSQL 9.3:

CREATE TABLE person1(
   NAME           TEXT    NOT NULL,
   AGE            INT     NOT NULL
);

CREATE TABLE person2(
   NAME           TEXT    NOT NULL,
   AGE            INT     NOT NULL,
   ADDRESS        CHAR(50),
   SALARY         REAL
);

INSERT INTO person2 (Name,    Age, ADDRESS, SALARY)
             VALUES ('Piotr', 20, 'London', 80);

I would like to copy records from person2 to person1, but column names can change in program, so I would like to select joint column names in program. So I create an array containing the intersection of column names. Next I use a function: insert into .... select, but I get an error, when I pass the array variable to the function by name. Like this:

select column_name into name1 from information_schema.columns where table_name = 'person1';
select column_name into name2 from information_schema.columns where table_name = 'person2';
select * into cols from ( select * from name1 intersect select * from name2) as tmp;
-- Create array with name of columns 
select array (select column_name::text from cols) into cols2;

CREATE OR REPLACE FUNCTION f_insert_these_columns(VARIADIC _cols text[])
  RETURNS void AS
$func$
BEGIN
   EXECUTE (
      SELECT 'INSERT INTO person1 SELECT '
          || string_agg(quote_ident(col), ', ')
          || ' FROM person2'
      FROM   unnest(_cols) col
      );
END
$func$  LANGUAGE plpgsql;


select * from cols2;

  array    
------------
 {name,age}
(1 row)

SELECT f_insert_these_columns(VARIADIC cols2);

ERROR:  column "cols2" does not exist

What's wrong here?

回答1:

Answer

You seem to assume that SELECT INTO in SQL would assign a variable. But that is not so.

It creates a new table and its use is discouraged in Postgres. Use the superior CREATE TABLE AS instead. Not least, because the meaning of SELECT INTO inside plpgsql is different:

Combine two tables into a new one so that select rows from the other one are ignored

I have suggested this related question about SQL variables before:

User defined variables in PostgreSQL

Hence you cannot call the function like this:

SELECT f_insert_these_columns(VARIADIC cols2);

This would work:

SELECT f_insert_these_columns(VARIADIC (TABLE cols2 LIMIT 1));

About the short TABLE syntax:

Is there a shortcut for SELECT * FROM in psql?

Better solution

To copy all rows with columns sharing the same name between two tables:

CREATE OR REPLACE FUNCTION f_copy_rows_with_shared_cols(
    IN  _tbl1 regclass
  , IN  _tbl2 regclass
  , OUT rows int
  , OUT columns text) AS
$func$
BEGIN
   SELECT INTO columns                           -- proper use of SELECT INTO!
          string_agg(quote_ident(attname), ', ')
   FROM  (
      SELECT attname
      FROM   pg_attribute
      WHERE  attrelid IN (_tbl1, _tbl2)
      AND    NOT attisdropped  -- no dropped (dead) columns
      AND    attnum > 0        -- no system columns
      GROUP  BY 1
      HAVING count(*) = 2
      ) sub;

   EXECUTE format('INSERT INTO %1$s(%2$s) SELECT %2$s FROM %3$s'
                  , _tbl1, columns, _tbl2);

   GET DIAGNOSTICS rows = ROW_COUNT;  -- return number of rows copied
END
$func$  LANGUAGE plpgsql;

Call:

SELECT * FROM f_copy_rows_with_shared_cols('public.person2', 'public.person1');

Result:

rows | columns
-----+---------
3    | name, age

Major points

Note the proper use of SELECT INTO for assignment inside plpgsql.
Note the use of the data type regclass. This allows to use schema-qualified table names (optionally) and defends against SQL injection attempts:
- Table name as a PostgreSQL function parameter
About GET DIAGNOSTICS:
- Count rows affected by DELETE
About OUT parameters:
- Returning from a function with OUT parameter
The manual about format().
Information schema vs. system catalogs.

来源：https://stackoverflow.com/questions/37907182/copy-records-with-dynamic-column-names

标签

arrays

postgresql

variables

plpgsql

dynamic-sql