PostgreSQL 9.3 - array_agg challenge

左心房为你撑大大i submitted on 2019-12-23 10:41:48

Question


I'm trying to understand the array_agg function in PostgreSQL 9.3. I've put together a fun example for anyone who may be interested in participating.

Any fan of American films from the 1980s may be familiar with the "brat pack", who appeared together in many hit films. Using the information about the brat pack films on Wikipedia, I've created tables that, when joined together, can tell us who worked with whom, if we have the right query!

/*
See:  http://en.wikipedia.org/wiki/Brat_Pack_(actors)
*/

CREATE TABLE actor(
    id SERIAL PRIMARY KEY, 
    name VARCHAR(50)
);
insert into actor(name) values ('Emilio Estevez'),('Anthony Michael Hall'),('Rob Lowe'),('Andrew McCarthy'),('Demi Moore'),('Judd Nelson'),('Molly Ringwald'),('Ally Sheedy');

CREATE TABLE movie(
    id SERIAL PRIMARY KEY, 
    title VARCHAR(200)
);
insert into movie(title) values ('The Outsiders'),('Class'),('Sixteen Candles'),('Oxford Blues'),('The Breakfast Club'),('St. Elmos Fire'),
('Pretty in Pink'),('Blue City'),('About Last Night'),('Wisdom'), ('Fresh Horses'),('Betsys Wedding'),('Hail Caesar');

CREATE TABLE movie_brats(
    id SERIAL PRIMARY KEY, 
    movie_id INT REFERENCES movie(id), 
    actor_id INT REFERENCES actor(id)
);
insert into movie_brats(movie_id, actor_id) values (1,1),(1,3),(2,3),(2,4),(3,2),(3,7),(4,3),(4,8),(5,1),(5,2),(5,6),
(5,7),(5,8),(6,1),(6,3),(6,4),(6,5),(6,6),(6,8),(7,4),(7,7),(8,6),(8,8),(9,3),(9,5),(10,1),(10,5),(11,4),(11,7),
(12,7),(12,8),(13,2),(13,6);

/*
Query: Show a distinct list of who each member of the brat pack worked with, ordered by name in both columns

 Name                 |  Worked With
 ----------------------------------------------------------------------------------------------------------------
 Emilio Estevez       |  Emilio Estevez, Anthony Michael Hall, Rob Lowe, Andrew McCarthy, Demi Moore, Judd Nelson, Molly Ringwald, Ally Sheedy
*/

My broken query:

select a1.name, array_to_string(array_agg(a2.name),', ') as Co_Stars
from actor a1, actor a2, movie m, movie_brats mb
where 
    m.id = mb.movie_id
    and a1.id = mb.actor_id
    and a2.id = mb.actor_id
group by a1.id

Answer 1:



with v as (
    select
        a.id as actor_id,
        a.name as actor_name,
        m.id as m_id
    from
        actor a
        inner join
        movie_brats mb on a.id = mb.actor_id
        inner join
        movie m on m.id = mb.movie_id
)
select
    v1.actor_name as "Name",
    string_agg(
        distinct v2.actor_name, ', ' order by v2.actor_name
    ) as "Worked With"
from
    v v1
    left join
    v v2 on v1.m_id = v2.m_id and v1.actor_id != v2.actor_id
group by 1
order by 1

The distinct inside the aggregate is needed so that a name is not repeated when two actors worked together in more than one movie.

The left join is needed so that an actor who did not work with any of the others in the list still appears in the result; an inner join would drop that actor's row.
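The effect of the left join does not show up with the brat pack data, because every actor there shares at least one film with another. A minimal sketch with made-up rows can illustrate it; the actor names and ids below are hypothetical and stand in for the v CTE above:

-- Hypothetical stand-in for the v CTE: 'Actor C' is alone in movie 2.
with v(actor_id, actor_name, m_id) as (
    values
        (1, 'Actor A'::text, 1),
        (2, 'Actor B'::text, 1),
        (3, 'Actor C'::text, 2)
)
select
    v1.actor_name as "Name",
    string_agg(
        distinct v2.actor_name, ', ' order by v2.actor_name
    ) as "Worked With"
from
    v v1
    left join
    v v2 on v1.m_id = v2.m_id and v1.actor_id != v2.actor_id
group by 1
order by 1;

With the left join, Actor C is kept with a null "Worked With" value, since string_agg returns null when it has no non-null input. Changing left join to inner join removes the Actor C row altogether.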

If you also want to show the movies in which they worked together:

with v as (
    select
        a.id as actor_id,
        a.name as actor_name,
        m.id as m_id,
        m.title as title
    from
        actor a
        inner join
        movie_brats mb on a.id = mb.actor_id
        inner join
        movie m on m.id = mb.movie_id
)
select
    a1 as "Name",
    string_agg(
        format('%s (in %s)', a2, title), ', '
        order by format('%s (in %s)', a2, title)
    ) as "Worked With"
from (
    select 
        v1.actor_name as a1,
        v2.actor_name as a2,
        string_agg(v1.title, ', ' order by v1.title) as title    
    from
        v v1
        left join
        v v2 on v1.m_id = v2.m_id and v1.actor_id != v2.actor_id
    group by 1, 2
) s
group by 1
order by 1



Answer 2:


The main problem with your query is that you (cross) join movie_brats only once. Both a1 and a2 are matched against the same movie_brats row, so a2 is always the same actor as a1, and every actor is listed once for every movie he/she appeared in. This is more obvious if you rewrite the query to use inner joins instead of cross joins plus a where clause.
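As a rough sketch of that rewrite (not a fix, just the question's query expressed with explicit joins and otherwise unchanged), the problem becomes visible in the two join conditions on actor:

-- The query from the question with explicit inner joins: same result, same bug.
select a1.name, array_to_string(array_agg(a2.name), ', ') as co_stars
from movie_brats mb
inner join movie m  on m.id  = mb.movie_id
inner join actor a1 on a1.id = mb.actor_id
inner join actor a2 on a2.id = mb.actor_id  -- same mb row as a1, so a2 is always a1
group by a1.id;

With the sample data, Emilio Estevez has four rows in movie_brats (movies 1, 5, 6 and 10), so his co_stars column should contain only his own name, repeated four times.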

Tips:

  • there is no need to join the movie table, unless you want to print the movie titles for each actor
  • use distinct to avoid duplicate names
  • filter by a1.id <> a2.id so that an actor is not listed as having worked with himself/herself.

Here is a working example:

select a1.name, string_agg(distinct a2.name, ', ') as co_names
from actor a1
inner join movie_brats mb1 on a1.id = mb1.actor_id
inner join movie_brats mb2 on mb1.movie_id = mb2.movie_id
inner join actor a2 on a2.id = mb2.actor_id
where a1.id <> a2.id
group by a1.id
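Since the question asks about array_agg specifically, the same joins can also be sketched with array_agg in place of string_agg. This variant is not part of the original answer; the column aliases and the order by clauses are additions here, chosen to match the "ordered by name in both columns" requirement in the question:

select
    a1.name as "Name",
    array_to_string(
        -- distinct removes repeated co-stars; order by sorts names inside the list
        array_agg(distinct a2.name order by a2.name), ', '
    ) as "Worked With"
from actor a1
inner join movie_brats mb1 on a1.id = mb1.actor_id
inner join movie_brats mb2 on mb1.movie_id = mb2.movie_id
inner join actor a2 on a2.id = mb2.actor_id
where a1.id <> a2.id
group by a1.id
order by a1.name;

Because this uses inner joins, an actor with no co-stars in the list would not appear at all. Dropping array_to_string returns the co-stars as a text array instead of a comma-separated string.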


Source: https://stackoverflow.com/questions/24431169/postgresql-9-3-array-agg-challenge
