Array of arrays in PostgreSQL

Posted by 柔情痞子 on 2019-11-29 07:55:50

From a curiosity standpoint, does anyone know why these are not supported?

One generic answer is that arrays are intrinsically anti-relational. Removing repeating groups is how you achieve first normal form, so having repeating groups of repeating groups seems quite insane from a relational-theory standpoint.

In general, the relationally-correct thing to do is to extract a table for your repeating values. So if you modeled something like this:

CREATE TABLE users (
  id integer primary key,
  name varchar,
  favorite_colors varchar[],
  ...
);

it would behoove you to redefine this relationally like so:

CREATE TABLE users (
  id integer primary key,
  name varchar,
  ...
);

CREATE TABLE favorite_colors (
  user_id integer references users,
  color varchar
);

Or even:

CREATE TABLE users (
  id integer primary key,
  name varchar,
  ...
);

CREATE TABLE colors (
  color varchar primary key
);

CREATE TABLE favorite_colors (
  user_id integer references users,
  color varchar references colors,
  primary key (user_id, color)
);
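As a rough illustration of why the normalized layout loses nothing, the sketch below rebuilds a per-user array from the child table with array_agg. It is self-contained only because it uses temporary tables and made-up sample rows ('alice', 'bob' and their colors are assumptions, not part of the original schema), and it leaves out the elided columns.

```sql
-- Temporary stand-ins for the tables above (elided columns omitted).
CREATE TEMP TABLE users (
  id integer PRIMARY KEY,
  name varchar
);

CREATE TEMP TABLE favorite_colors (
  user_id integer REFERENCES users,
  color varchar,
  PRIMARY KEY (user_id, color)
);

-- Hypothetical sample data.
INSERT INTO users VALUES (1, 'alice'), (2, 'bob');
INSERT INTO favorite_colors VALUES (1, 'red'), (1, 'blue'), (2, 'green');

-- Rebuild one array per user on demand.
SELECT u.name,
       array_agg(fc.color ORDER BY fc.color) AS favorite_colors
FROM users u
JOIN favorite_colors fc ON fc.user_id = u.id
GROUP BY u.id, u.name
ORDER BY u.id;
--  alice | {blue,red}
--  bob   | {green}
```

The array is then a presentation detail of one query rather than the storage format, so rows with different numbers of colors cause no trouble.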

Hstore supports a lot of functions, many of which would make it easy to integrate it into a relational worldview. I think the simplest way to solve your problem would be to use the each function to convert your hstore values into relations you can then use like a normal set of values. This is how you address having multiple values in other databases anyway: querying, and working with result sets.
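A minimal sketch of the each approach, assuming the hstore extension can be installed in the current database. The items data and its keys are invented for illustration.

```sql
CREATE EXTENSION IF NOT EXISTS hstore;

-- each() expands one hstore value into (key, value) rows.
SELECT key, value
FROM each('color=>red, size=>large'::hstore);

-- Applied per row of a (hypothetical) table, the pairs can be filtered
-- and joined like any other relation.
WITH items(id, attrs) AS (
  VALUES (1, 'color=>red, size=>large'::hstore),
         (2, 'color=>blue'::hstore)
)
SELECT i.id, kv.key, kv.value
FROM items i, each(i.attrs) AS kv
WHERE kv.key = 'color';
```

The second query relies on the implicit lateral call of a set-returning function in FROM, which needs PostgreSQL 9.3 or later.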

Peter Krauss

PostgreSQL has limited "array of arrays" support

See the "Arrays" chapter of the PostgreSQL manual.

It is a restricted form of "array of arrays". As Pavel says in his answer, it is called a "multidimensional array" but is really a matrix, so every sub-array at a given dimension must have the same number of elements.

You can use this kind of structure to map multidimensional and heterogeneous Cartesian coordinates in scientific applications, but not to store arbitrary vectors of vectors, as XML or JSON data can.

NOTE: a well-known two-dimensional (2D) homogeneous array is the mathematical matrix. In fact, it was the scientific applications of matrices that motivated the "PostgreSQL constrained multidimensional array" datatype and the behaviour of the array functions with this kind of array. Think of a "3D array" as a "3D matrix", a "4D array" as a "4D matrix", and so on.
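The matrix nature can be checked directly with the built-in inspection functions. The sketch below uses an arbitrary 2x3 literal. Note that a single subscript on a 2D array does not return a "row": it yields NULL, which is one more sign that the value is a matrix rather than an array whose elements are arrays.

```sql
SELECT array_ndims(m)     AS ndims,   -- 2
       array_dims(m)      AS dims,    -- [1:2][1:3]
       array_length(m, 1) AS n_rows,  -- 2
       array_length(m, 2) AS n_cols,  -- 3
       m[2][1]            AS elem,    -- 4
       m[1] IS NULL       AS no_row   -- true: too few subscripts gives NULL
FROM (SELECT ARRAY[[1,2,3],[4,5,6]] AS m) AS t;
```

To get a whole row, use a slice such as m[1:1], which returns the one-row matrix {{1,2,3}}.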

EXAMPLES:

SELECT array_cat(ARRAY[[1,2],[3,4]], ARRAY[5,6]);
---------------------
 {{1,2},{3,4},{5,6}}
SELECT array_cat(ARRAY[[1,2],[3,4]], ARRAY[[5,6]]); -- SAME RESULT

SELECT ARRAY[ARRAY[1,2],ARRAY[5,6]];
---------------
 {{1,2},{5,6}}

SELECT array_cat(ARRAY[ARRAY[1,2]],ARRAY[3]); -- ERROR1
SELECT ARRAY[ARRAY[1,2],ARRAY[4]];  -- ERROR2 

The comments of @Daniel_Lyons about "why these are not supported" concern non-uniform arrays of arrays (see the error cases above).

ERROR1 above: array_cat only accepts arrays with compatible dimensions; the one-element array {3} cannot become a row of a two-column array.

ERROR2 above: all arrays for a specific dimension must have the same length, like a matrix.

Another curious thing about the built-in functions and operators: the "default behaviour" in PostgreSQL is for one-dimensional arrays and scalar elements. There is no overload of the standard array_append() that accepts an array as the element to append:

SELECT array_append(ARRAY[1,2],5); -- ok, 5 is an element
 {1,2,5}

SELECT array_cat(ARRAY[1,2], ARRAY[5,6]);
----------
 {1,2,5,6}

SELECT array_append(ARRAY[[1,2],[3,4]], ARRAY[5,6]); -- ERROR3 
SELECT array_append(ARRAY[1,2],ARRAY[5,6]); -- ERROR4

ERROR3 above: there is NO OVERLOAD of array_append that appends an array as an element (not even in PostgreSQL 9.2).

ERROR4 above: you must use array_cat to "merge everything into one array".

The "merge behaviour" of the last array_cat example is curious: it does not produce an array of arrays. To get one, use array_cat(a1, ARRAY[a2]):

SELECT array_cat(ARRAY[1,2], ARRAY[ARRAY[5,6]]);  -- seems illogical...
---------------
{{1,2},{5,6}}
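The same rules apply to the || concatenation operator, which may read more naturally than array_cat. A short sketch with arbitrary literals:

```sql
-- 1D || 1D merges into a single 1D array.
SELECT ARRAY[1,2] || ARRAY[5,6];            -- {1,2,5,6}

-- 1D || 2D treats the 1D array as a new first row.
SELECT ARRAY[1,2] || ARRAY[[5,6]];          -- {{1,2},{5,6}}

-- 2D || 1D appends the 1D array as a new last row.
SELECT ARRAY[[1,2],[3,4]] || ARRAY[5,6];    -- {{1,2},{3,4},{5,6}}
```

In each case the row being added must have the same length as the existing rows, exactly as with array_cat.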

Sparse matrix

To avoid problems with sparse matrices and similar data structures, use the function below. It pads an array up to a given length, setting the added elements to NULL (or to any constant value).

 CREATE OR REPLACE FUNCTION array_fillTo(
    p_array anyarray, p_len integer, p_null anyelement DEFAULT NULL
 ) RETURNS anyarray AS $f$
   -- Pad p_array with p_null up to p_len elements;
   -- arrays that are already long enough are returned unchanged.
   SELECT CASE 
       WHEN len=0 THEN array_fill(p_null,array[p_len])  -- empty or NULL input
       WHEN len<p_len THEN p_array || array_fill(p_null,array[p_len-len])
       ELSE p_array END
   FROM ( SELECT COALESCE( array_length(p_array,1), 0) ) t(len)
 $f$ LANGUAGE SQL IMMUTABLE;


Returning to the first examples, we can now avoid the errors (see ERROR1):

SELECT array_cat(ARRAY[ARRAY[1,2]],array_fillTo(ARRAY[3],2));
-- {{1,2},{3,NULL}}
SELECT array_cat(
   ARRAY[ARRAY[1.1::float,2.0]],
   array_fillTo(ARRAY[]::float[],2,0::float)
);
-- {{1.1,2},{0,0}}
SELECT array_fillto(array['Hello'],2,'');
-- {Hello,""}
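Putting the pieces together, ragged rows can be padded to a common length and then stacked into one matrix. This is only a sketch: it assumes the array_fillTo() function defined above has already been created, it uses made-up sample rows with a hypothetical id column for ordering, and it relies on array_agg accepting array inputs, which requires PostgreSQL 9.5 or later.

```sql
WITH fx(id, a) AS (
  VALUES (1, ARRAY[1,3,4]),
         (2, ARRAY[6,7])
),
widest AS (
  -- Length of the longest row; every row is padded up to this.
  SELECT max(array_length(a, 1)) AS n FROM fx
)
SELECT array_agg(array_fillTo(a, n, NULL::int) ORDER BY id) AS matrix
FROM fx, widest;
-- {{1,3,4},{6,7,NULL}}
```

The filler is passed explicitly as NULL::int so that the element type is unambiguous.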

NOTE about old array_fillTo()

array_fill() became a built-in function in PostgreSQL 8.4. For 8.3 or older:

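 -- array_length() and parameter defaults also first appeared in 8.4, so this
 -- version uses array_upper() (assuming the default lower bound of 1) and
 -- requires the third argument.
 CREATE FUNCTION array_fillTo(anyarray,integer,anyelement) 
 RETURNS anyarray AS $$
   DECLARE
     i integer;
     len integer;
     ret ALIAS FOR $0;  -- $0 is the result, typed like the input array
   BEGIN
     len := COALESCE(array_upper($1,1), 0);  -- an empty array counts as length 0
     ret := $1;
     IF len<$2 THEN
         FOR i IN 1..($2-len) LOOP
           ret := ret || $3;  -- append one filler element per missing position
         END LOOP;
     END IF;
     RETURN ret;
   END;
 $$ LANGUAGE plpgsql IMMUTABLE;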

PostgreSQL supports multidimensional arrays instead. Arrays are a rather special type in relational databases, and they are somewhat limited compared with arrays in general-purpose programming languages. If you need an array of arrays, you can use a workaround with arrays of rows:

postgres=# create table fx(a int[]);
CREATE TABLE
postgres=# insert into fx values(array[1,3,4]);
INSERT 0 1
postgres=# insert into fx values(array[6,7]);
INSERT 0 1
postgres=# select array_agg(row(a)) from fx;
            array_agg            
---------------------------------
 {"(\"{1,3,4}\")","(\"{6,7}\")"}
(1 row)
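To read the inner arrays back out of such a row array, it helps to give the row a named composite type. The following is a sketch under that assumption: the type name int_list is hypothetical, and the sample rows mirror the fx table above but are supplied inline so the example does not depend on it.

```sql
-- Hypothetical composite type wrapping one int[] value.
CREATE TYPE int_list AS (a int[]);

WITH fx(a) AS (
  VALUES (ARRAY[1,3,4]),
         (ARRAY[6,7])
),
packed AS (
  -- An array of rows; the inner arrays may have different lengths.
  SELECT array_agg(ROW(a)::int_list) AS lists FROM fx
)
SELECT l.a AS inner_array,
       array_length(l.a, 1) AS len
FROM packed, unnest(packed.lists) AS l;
```

Because each element is a row rather than an array dimension, the matrix constraint does not apply, at the cost of an extra wrapping and unwrapping step.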