I have a table that looks like this:
id feh bar
1 10 A
2 20 A
3 3 B
4 4 B
5 5 C
6 6 D
7 7
Although this is an old question, I would like to add another solution made possible by recent improvements in PostgreSQL. This solution achieves the same goal of returning a structured result from a dynamic data set without using the crosstab function at all. In other words, this is a good example of re-examining unintentional and implicit assumptions that prevent us from discovering new solutions to old problems. ;)
To illustrate, you asked for a method to transpose data with the following structure:
id feh bar
1 10 A
2 20 A
3 3 B
4 4 B
5 5 C
6 6 D
7 7 D
8 8 D
into this format:
bar val1 val2 val3
A 10 20
B 3 4
C 5
D 6 7 8
The conventional solution is a clever (and incredibly knowledgeable) approach to creating dynamic crosstab queries that is explained in exquisite detail in Erwin Brandstetter's answer.
However, if your particular use case is flexible enough to accept a slightly different result format, then another solution is possible that handles dynamic pivots beautifully. This technique, which I learned of here
uses PostgreSQL's new jsonb_object_agg function to construct pivoted data on the fly in the form of a JSON object.
I will use Mr. Brandstetter's "simpler test case" to illustrate:
CREATE TEMP TABLE tbl (row_name text, attrib text, val int);
INSERT INTO tbl (row_name, attrib, val) VALUES
('A', 'val1', 10)
, ('A', 'val2', 20)
, ('B', 'val1', 3)
, ('B', 'val2', 4)
, ('C', 'val1', 5)
, ('D', 'val3', 8)
, ('D', 'val1', 6)
, ('D', 'val2', 7);
Using the jsonb_object_agg function, we can create the required pivoted result set with this pithy beauty:
SELECT
row_name AS bar,
json_object_agg(attrib, val) AS data
FROM tbl
GROUP BY row_name
ORDER BY row_name;
Which outputs:
bar | data
-----+----------------------------------------
A | { "val1" : 10, "val2" : 20 }
B | { "val1" : 3, "val2" : 4 }
C | { "val1" : 5 }
D | { "val3" : 8, "val1" : 6, "val2" : 7 }
As you can see, this function works by creating key/value pairs in the JSON object from the attrib and value columns in the sample data, all grouped by row_name.
Although this result set obviously looks different, I believe it will actually satisfy many (if not most) real world use cases, especially those where the data requires a dynamically-generated pivot, or where resulting data is consumed by a parent application (e.g., needs to be re-formatted for transmission in a http response).
Benefits of this approach:
Cleaner syntax. I think everyone would agree that the syntax for this approach is far cleaner and easier to understand than even the most basic crosstab examples.
Completely dynamic. No information about the underlying data need be specified beforehand. Neither the column names nor their data types need be known ahead of time.
Handles large numbers of columns. Since the pivoted data is saved as a single jsonb column, you will not run up against PostgreSQL's column limit (≤1,600 columns, I believe). There is still a limit, but I believe it is the same as for text fields: 1 GB per JSON object created (please correct me if I am wrong). That's a lot of key/value pairs!
Simplified data handling. I believe that the creation of JSON data in the DB will simplify (and likely speed up) the data conversion process in parent applications. (You will note that the integer data in our sample test case was correctly stored as such in the resulting JSON objects. PostgreSQL handles this by automatically converting its intrinsic data types to JSON in accordance with the JSON specification.) This will effectively eliminate the need to manually cast data passed to parent applications: it can all be delegated to the application's native JSON parser.
Differences (and possible drawbacks):
It looks different. There's no denying that the results of this approach look different. The JSON object is not as pretty as the crosstab result set; however, the differences are purely cosmetic. The same information is produced--and in a format that is probably more friendly for consumption by parent applications.
Missing keys. Missing values in the crosstab approach are filled in with nulls, while the JSON objects are simply missing the applicable keys. You will have to decide for your self if this is an acceptable trade off for your use case. It seems to me that any attempt to address this problem in PostgreSQL will greatly complicate the process and likely involve some introspection in the form of additional queries.
Key order is not preserved. I don't know if this can be addressed in PostgreSQL, but this issue is mostly cosmetic also, since any parent applications are either unlikely to rely on key order, or have the ability to determine proper key order by other means. The worst case will probably only require an addition query of the database.
Conclusion
I am very curious to hear the opinions of others (especially @ErwinBrandstetter's) on this approach, especially as it pertains to performance. When I discovered this approach on Andrew Bender's blog, it was like getting hit in the side of the head. What a beautiful way to take a fresh approach to a difficult problem in PostrgeSQL. It solved my use case perfectly, and I believe it will likewise serve many others as well.