Question
The database looks like this:
CREATE TABLE artists (
artist_id SERIAL PRIMARY KEY,
artist TEXT UNIQUE NOT NULL
);
CREATE TABLE artistalias (
artistalias_id SERIAL PRIMARY KEY,
artist_id INTEGER NOT NULL REFERENCES artists (artist_id),
alias TEXT UNIQUE NOT NULL
);
CREATE TABLE songs (
song_id SERIAL PRIMARY KEY,
song TEXT NOT NULL,
artist_id INTEGER NOT NULL REFERENCES artists (artist_id)
);
- one artist can have zero, one or many aliases
- one alias belongs to exactly one artist
- one song has one artist
- one artist can have one or many songs
My problem is that the database holds artists who use two or more pseudonyms. For example, one artist uses the stage name Assassin for songs in one genre and the name Agent Sasco for songs in another. Some artists simply change their pseudonyms every now and then.
Later a website should display the data in the following format:
Artist      | Song
------------+-------------------
Assassin    | Anywhere We Go
Agent Sasco | We Dem A Watch
Clicking on the artist should lead to a page showing all the aliases that artist has performed under. It is important that each song is displayed with the pseudonym it was released under.
The dummy data I work with looks like:
INSERT INTO artists (artist) VALUES
('Assassin'), ('Agent Sasco'), ('Sizzla');
-- This bothers me as it's a lot of redundant data
INSERT INTO artistalias (artist_id, alias) VALUES
(1, 'Agent Sasco'), (2, 'Assassin');
INSERT INTO songs (song, artist_id) VALUES
('Anywhere We Go', 1), ('We Dem A Watch', 2), ('Only Takes Love', 3);
What bothers me about this database layout is that I have to add redundant data to artistalias. Is there a better way to link the artists table to artistalias and songs without having to add one specific artist and his aliases multiple times?
The query to display the data in the desired format looks like:
SELECT
artist AS pseudonyme_song_was_performed_with,
string_agg(alias, ' & ') AS other_pseudonymes,
song
FROM
artists
LEFT JOIN artistalias USING (artist_id)
LEFT JOIN songs USING (artist_id)
GROUP BY artist, song;
Here is a SQLFiddle with the layout and data as described above.
Answer 1:
You should store the real name of the artist in the artists table.
Then store all the aliases for that artist in the artistalias table.
Then store the artistalias_id in the songs table.
That way, you won't have any duplicate data.
CREATE TABLE artists (
artist_id SERIAL PRIMARY KEY,
artist TEXT UNIQUE NOT NULL
);
CREATE TABLE artistalias (
artistalias_id SERIAL PRIMARY KEY,
artist_id INTEGER NOT NULL REFERENCES artists (artist_id),
alias TEXT UNIQUE NOT NULL
);
CREATE TABLE songs (
song_id SERIAL PRIMARY KEY,
song TEXT NOT NULL,
artistalias_id INTEGER NOT NULL REFERENCES artistalias (artistalias_id)
);
Then insert the data this way:
INSERT INTO artists (artist) VALUES
('Jeffrey Campbell'), ('Sizzla');
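-- Both aliases point to the same artist row, so no name is stored twice.
-- On freshly created tables 'Assassin' gets artistalias_id 1 and 'Agent Sasco' gets 2.
INSERT INTO artistalias (artist_id, alias) VALUES
(1, 'Assassin'), (1, 'Agent Sasco');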
INSERT INTO songs (song, artistalias_id) VALUES
('Anywhere We Go', 1), ('We Dem A Watch', 2);
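The literal IDs in the statement above depend on the order in which the SERIAL columns hand out values. As a hedged alternative, assuming the schema from this answer, the artistalias_id can be looked up by alias name instead, which is unambiguous because alias is declared UNIQUE. The inline table name v is only an illustrative choice:

```sql
-- Alternative to the INSERT above (run one or the other, not both):
-- resolve artistalias_id from the alias name instead of hard-coding it.
INSERT INTO songs (song, artistalias_id)
SELECT v.song, aa.artistalias_id
FROM (VALUES
    ('Anywhere We Go', 'Assassin'),
    ('We Dem A Watch', 'Agent Sasco')
) AS v (song, alias)
JOIN artistalias aa ON aa.alias = v.alias;
```

A song whose alias is missing from artistalias is silently skipped by the inner join, so it may be worth checking the number of inserted rows.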
And query this way:
SELECT
a1.alias AS pseudonyme_song_was_performed_with,
string_agg(a2.alias, ' & ') AS other_pseudonymes,
song
FROM
artistalias a1
LEFT JOIN artistalias a2 ON a2.artist_id = a1.artist_id
LEFT JOIN songs s ON s.artistalias_id = a1.artistalias_id
GROUP BY a1.alias, song;
Fiddle: http://sqlfiddle.com/#!15/3a78c/8/0
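Note that the self-join in the query above also matches each alias with itself, so other_pseudonymes includes the name the song was performed under. It also starts from artistalias, so an alias without songs would appear with an empty song. A possible variant, sketched here under the assumption that the tables are exactly those defined in this answer, starts from songs and excludes the displayed alias. The column names performed_as and other_aliases are illustrative, not part of the original answer:

```sql
-- One row per song: the alias it was released under plus the
-- artist's remaining aliases (NULL when the artist has no others).
SELECT
    a1.alias AS performed_as,
    string_agg(a2.alias, ' & ' ORDER BY a2.alias) AS other_aliases,
    s.song
FROM songs s
JOIN artistalias a1 ON a1.artistalias_id = s.artistalias_id
LEFT JOIN artistalias a2
       ON a2.artist_id = a1.artist_id
      AND a2.artistalias_id <> a1.artistalias_id  -- skip the alias already shown
GROUP BY s.song_id, s.song, a1.alias;

-- Artist page: every alias of the artist behind a given alias.
SELECT a2.alias
FROM artistalias a1
JOIN artistalias a2 ON a2.artist_id = a1.artist_id
WHERE a1.alias = 'Assassin'
ORDER BY a2.alias;
```

The second query covers the page that lists all aliases of an artist; the literal 'Assassin' stands in for whatever alias the visitor clicked on.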
Source: https://stackoverflow.com/questions/32419155/how-do-i-remove-redundancy-from-this-database-layouts-1m-relationship-and-stil