I have 2 tables - course that contains id and name of the courses and tagCourse that contains tags for each course.
course tagcourse
------------ ----------------
PK id_course PK tag
name PK, FK id_course
I'd like to write a function that searches courses by given array of tags and returns them ordered by quantity of matching tags. However I don't know how to write it correctly and in an efficient way. Please help me.
ie.
CREATE OR REPLACE FUNCTION searchByTags(tags varchar[])
RETURNS SETOF.....
RETURN QUERY SELECT * FROM course c INNER JOIN tagcourse tc ON c.id_course = tc.id_course
WHERE ??? ORDER BY ???
END....
CREATE OR REPLACE FUNCTION search_by_tags(tags varchar[])
RETURNS TABLE (id_course integer, name text, tag_ct integer) AS
$func$
SELECT id_course, c.name, ct.tag_ct
FROM (
SELECT tc.id_course, count(*)::int AS tag_ct
FROM unnest($1) x(tag)
JOIN tagcourse tc USING (tag)
GROUP BY 1 -- first aggregate ..
) AS ct
JOIN course c USING (id_course) -- .. then join
ORDER BY ct.tag_ct DESC -- more columns to break ties?
$func$ LANGUAGE sql;
Use
unnest()
to produce a table from your input array, like already demonstrated by @Clodoaldo.You don't need plpgsql for this. Simpler with a plain SQL function.
I use
unnest($1)
(with positional parameter) instead ofunnest(tags)
, since the later is only valid for PostgreSQL 9.2+ in SQL functions (unlike plpgsql). I quote the manual here:
In the older numeric approach, arguments are referenced using the syntax
$n
:$1
refers to the first input argument,$2
to the second, and so on. This will work whether or not the particular argument was declared with a name.
count()
returnsbigint
. You need to cast it toint
to match the declared return type or declare the the returned column asbigint
to begin with.Perfect occasion to simplify the syntax a bit with
USING
(equi-joins):USING (tag)
instead ofON tc.tag = c.tag
.It's regularly faster to first aggregate, then join to another table. Reduces the needed join operations.
As per question of @Clodoaldo in the comments, here is an SQL Fiddle to demonstrate the difference.OTOH, if you aggregate after the join, you don't need a subquery. Shorter, but probably slower:
SELECT c.id_course, c.name, count(*)::int AS tag_ct
FROM unnest($1) x(tag)
JOIN tagcourse tc USING (tag)
JOIN course c USING (id_course)
GROUP BY 1
ORDER BY 3 DESC; -- more columns to break ties?
create or replace function searchByTags(tags varchar[])
returns table (id_course integer, name text, quantitiy integer)
as $$
select *
from (
select c.id_course, c.name, count(*) quantity
from
course c
inner join
tagcourse tc on c.id_course = tc.id_course
inner join
unnest(tags) s(tag) on s.tag = tc.tag
group by c.id_course, c.name
) s
order by quantity desc, name
;
$$ language sql;
来源:https://stackoverflow.com/questions/15664373/postgresql-order-by-an-array