Question
I'm trying to understand the array_agg function in PostgreSQL 9.3. I've put together a fun example for everyone who may be interested in participating.
Any fan of American films from the 1980s may be familiar with the "brat pack", who appeared in many hit films together. Using the information about the brat pack films on Wikipedia, I've created tables that, when joined together, can tell us who worked with whom -- if we have the right query!
/*
See: http://en.wikipedia.org/wiki/Brat_Pack_(actors)
*/
CREATE TABLE actor(
id SERIAL PRIMARY KEY,
name VARCHAR(50)
);
insert into actor(name) values ('Emilio Estevez'),('Anthony Michael Hall'),('Rob Lowe'),('Andrew McCarthy'),('Demi Moore'),('Judd Nelson'),('Molly Ringwald'),('Ally Sheedy');
CREATE TABLE movie(
id SERIAL PRIMARY KEY,
title VARCHAR(200)
);
insert into movie(title) values ('The Outsiders'),('Class'),('Sixteen Candles'),('Oxford Blues'),('The Breakfast Club'),('St. Elmos Fire'),
('Pretty in Pink'),('Blue City'),('About Last Night'),('Wisdom'), ('Fresh Horses'),('Betsys Wedding'),('Hail Caesar');
CREATE TABLE movie_brats(
id SERIAL PRIMARY KEY,
movie_id INT REFERENCES movie(id),
actor_id INT REFERENCES actor(id)
);
insert into movie_brats(movie_id, actor_id) values (1,1),(1,3),(2,3),(2,4),(3,2),(3,7),(4,3),(4,8),(5,1),(5,2),(5,6),
(5,7),(5,8),(6,1),(6,3),(6,4),(6,5),(6,6),(6,8),(7,4),(7,7),(8,6),(8,8),(9,3),(9,5),(10,1),(10,5),(11,4),(11,7),
(12,7),(12,8),(13,2),(13,6);
/*
Query: Show a distinct list of who each member of the brat pack worked with, ordered by name in both columns
Name           | Worked With
----------------------------------------------------------------------------------------------------------------
Emilio Estevez | Emilio Estevez, Anthony Michael Hall, Rob Lowe, Andrew McCarthy, Demi Moore, Judd Nelson, Molly Ringwald, Ally Sheedy
*/
My broken query:
select a1.name, array_to_string(array_agg(a2.name),', ') as Co_Stars
from actor a1, actor a2, movie m, movie_brats mb
where
m.id = mb.movie_id
and a1.id = mb.actor_id
and a2.id = mb.actor_id
group by a1.id
Answer 1:
with v as (
select
a.id as actor_id,
a.name as actor_name,
m.id as m_id
from
actor a
inner join
movie_brats mb on a.id = mb.actor_id
inner join
movie m on m.id = mb.movie_id
)
select
v1.actor_name as "Name",
string_agg(
distinct v2.actor_name, ', ' order by v2.actor_name
) as "Worked With"
from
v v1
left join
v v2 on v1.m_id = v2.m_id and v1.actor_id != v2.actor_id
group by 1
order by 1
The distinct aggregation above is necessary to avoid repeating a name when two actors worked together in more than one movie.
The left join is necessary so that an actor who did not work with any of the others in the list is not suppressed, as would happen with an inner join.
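To make the roles of distinct and left join concrete, here is a minimal Python sketch of the same self-join. It is only an illustration, not the query itself: the rows are a small subset of the question's data, and "Solo Actor" (movie 99) is an invented entry added to show the left-join case. It also compares names rather than actor ids, for brevity.

```python
from collections import defaultdict

# Rows of the CTE v as (actor_name, m_id). "Solo Actor" is hypothetical,
# added only so that one actor has no co-stars.
v = [
    ("Rob Lowe", 6), ("Demi Moore", 6),
    ("Rob Lowe", 9), ("Demi Moore", 9),
    ("Solo Actor", 99),
]

worked_with = defaultdict(set)  # a set plays the role of DISTINCT
for name1, m1 in v:
    worked_with[name1]  # create the key even with no match (LEFT JOIN behaviour)
    for name2, m2 in v:
        if m1 == m2 and name1 != name2:
            worked_with[name1].add(name2)

# Sort names in both "columns", as the ORDER BY clauses do.
result = {
    name: ", ".join(sorted(costars))
    for name, costars in sorted(worked_with.items())
}

for name, costars in result.items():
    print(f"{name} | {costars}")
```

Rob Lowe and Demi Moore share two movies here, yet each appears only once in the other's list. "Solo Actor" is kept with an empty list; in PostgreSQL the corresponding string_agg value would be NULL rather than an empty string.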
If you want to show in which movies they worked together:
with v as (
select
a.id as actor_id,
a.name as actor_name,
m.id as m_id,
m.title as title
from
actor a
inner join
movie_brats mb on a.id = mb.actor_id
inner join
movie m on m.id = mb.movie_id
)
select
a1 as "Name",
string_agg(
format('%s (in %s)', a2, title), ', '
order by format('%s (in %s)', a2, title)
) as "Worked With"
from (
select
v1.actor_name as a1,
v2.actor_name as a2,
string_agg(v1.title, ', ' order by v1.title) as title
from
v v1
left join
v v2 on v1.m_id = v2.m_id and v1.actor_id != v2.actor_id
group by 1, 2
) s
group by 1
order by 1
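The two-level aggregation in the query above (titles per pair of actors first, then one formatted entry per co-star) can be sketched in Python as follows. This is a rough illustration under stated assumptions: the rows are a hand-picked subset of the question's data, the names titles_by_pair and worked_with_titles are made up for this sketch, and the left-join case (an actor with no co-stars) is left out.

```python
from collections import defaultdict

# Subset of the CTE v as (actor_name, title); not the full cast lists.
v = [
    ("Rob Lowe", "St. Elmos Fire"), ("Demi Moore", "St. Elmos Fire"),
    ("Rob Lowe", "About Last Night"), ("Demi Moore", "About Last Night"),
    ("Emilio Estevez", "Wisdom"), ("Demi Moore", "Wisdom"),
]

# Inner query: group by (a1, a2) and collect the shared titles.
titles_by_pair = defaultdict(set)
for a1, t1 in v:
    for a2, t2 in v:
        if t1 == t2 and a1 != a2:
            titles_by_pair[(a1, a2)].add(t1)

# Outer query: group by a1 and format one "a2 (in titles)" entry per co-star.
entries = defaultdict(list)
for (a1, a2), titles in titles_by_pair.items():
    entries[a1].append(f"{a2} (in {', '.join(sorted(titles))})")

worked_with_titles = {a1: ", ".join(sorted(items)) for a1, items in entries.items()}

for name in sorted(worked_with_titles):
    print(f"{name} | {worked_with_titles[name]}")
```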
Answer 2:
The main problem with your query is that you (cross) join movie_brats only once. Both a1 and a2 are matched against the same movie_brats row, so each actor is paired only with himself/herself, once for every movie he/she played in. This becomes more obvious if you rewrite the query to use inner joins (instead of cross joins + where).
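A small Python sketch can make the effect of joining movie_brats only once visible. It mimics the cross join plus where clause of the broken query on the first four movie_brats rows from the question; the variable names are illustrative only.

```python
# First four (movie_id, actor_id) rows of movie_brats from the question.
movie_brats = [(1, 1), (1, 3), (2, 3), (2, 4)]
actor_ids = [1, 2, 3, 4]

# Cross join of a1, a2 and mb, filtered the way the broken query filters:
# a1.id = mb.actor_id AND a2.id = mb.actor_id (the same mb row for both).
broken_pairs = [
    (a1, a2)
    for (movie_id, actor_id) in movie_brats
    for a1 in actor_ids
    for a2 in actor_ids
    if a1 == actor_id and a2 == actor_id
]

print(broken_pairs)  # [(1, 1), (3, 3), (3, 3), (4, 4)]
```

Every pair consists of an actor and himself/herself, and actor 3 appears twice because he played in two of the listed movies.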
Tips:
- there is no need to join the movie table, unless you want to print all movie titles by actor
- use distinct to avoid duplicate names
- filter by a1.id <> a2.id to avoid an actor being listed as having worked with himself/herself.
Here is a working example:
select a1.name, string_agg(distinct a2.name, ', ') as co_names
from actor a1
inner join movie_brats mb1 on a1.id = mb1.actor_id
inner join movie_brats mb2 on mb1.movie_id = mb2.movie_id
inner join actor a2 on a2.id = mb2.actor_id
where a1.id <> a2.id
group by a1.id
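As a cross-check of what the query above should return, the same double join over movie_brats can be sketched in plain Python using the question's data. The names cast_by_movie, co_star_ids and co_stars are made up for this sketch. The co-star names are sorted here for readability, which the query as written does not explicitly request.

```python
from collections import defaultdict

# Data copied from the question's insert statements.
actors = {
    1: "Emilio Estevez", 2: "Anthony Michael Hall", 3: "Rob Lowe",
    4: "Andrew McCarthy", 5: "Demi Moore", 6: "Judd Nelson",
    7: "Molly Ringwald", 8: "Ally Sheedy",
}
movie_brats = [
    (1, 1), (1, 3), (2, 3), (2, 4), (3, 2), (3, 7), (4, 3), (4, 8),
    (5, 1), (5, 2), (5, 6), (5, 7), (5, 8),
    (6, 1), (6, 3), (6, 4), (6, 5), (6, 6), (6, 8),
    (7, 4), (7, 7), (8, 6), (8, 8), (9, 3), (9, 5), (10, 1), (10, 5),
    (11, 4), (11, 7), (12, 7), (12, 8), (13, 2), (13, 6),
]

# Group the link table by movie: the analogue of joining mb1 to mb2 on movie_id.
cast_by_movie = defaultdict(set)
for movie_id, actor_id in movie_brats:
    cast_by_movie[movie_id].add(actor_id)

# Pair every actor with every other actor in the same movie (a1.id <> a2.id).
co_star_ids = defaultdict(set)  # the set removes duplicates, like DISTINCT
for cast in cast_by_movie.values():
    for a1 in cast:
        for a2 in cast:
            if a1 != a2:
                co_star_ids[a1].add(a2)

co_stars = {
    actors[a1]: ", ".join(sorted(actors[a2] for a2 in ids))
    for a1, ids in co_star_ids.items()
}

for name in sorted(co_stars):
    print(f"{name} | {co_stars[name]}")
```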
Source: https://stackoverflow.com/questions/24431169/postgresql-9-3-array-agg-challenge