Given the following table (conversations):
id | record_id | is_response | text |
---+------------+---------------+------------
There is a simple and fast solution:
SELECT record_id, string_agg(text, ' ') AS context
FROM  (
   SELECT c.*, count(r.text) OVER (PARTITION BY c.record_id ORDER BY c.id DESC) AS grp
   FROM   conversations c
   LEFT   JOIN responses r ON r.text = c.text AND c.is_response
   ORDER  BY record_id, id
   ) sub
WHERE  grp > 0  -- ignore conversation part that does not end with a response
GROUP  BY record_id, grp
ORDER  BY record_id, grp;
count() only counts non-null values. r.text is NULL if the LEFT JOIN to responses comes up empty, so only rows that are flagged as a response and have a matching text in responses are counted.
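To see the difference between counting a nullable column and counting rows, here is a minimal sketch. The sample rows are made up for illustration (the original table data is not shown here), and both tables are emulated with VALUES lists so the statement runs on its own in PostgreSQL:

```sql
-- Hypothetical sample data; these rows are assumptions, not from the question.
WITH conversations(id, record_id, is_response, text) AS (
   VALUES (1, 1, false, 'Hello')
        , (2, 1, true , 'Hi, how can I help?')
        , (3, 1, false, 'Thanks')
   )
, responses(text) AS (
   VALUES ('Hi, how can I help?')
   )
SELECT c.id
     , r.text                 AS matched_response  -- NULL unless the row is a known response
     , count(r.text) OVER ()  AS cnt_non_null      -- 1: only id 2 finds a match
     , count(*)      OVER ()  AS cnt_rows          -- 3: every row counts
FROM   conversations c
LEFT   JOIN responses r ON r.text = c.text AND c.is_response
ORDER  BY c.id;
```

Only the row with id 2 satisfies the join condition, so count(r.text) yields 1 while count(*) yields 3.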
The value in grp (short for "group") only increases when a new output row is triggered. All rows belonging to the same output row end up with the same grp number, so it is then easy to aggregate in the outer SELECT.
The special trick is to count conversation ends in reverse order. Everything after the last end (which comes first when starting from the end) gets grp = 0 and is removed in the outer SELECT.
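The reverse running count is easier to follow on concrete rows. The following sketch runs only the inner query against hypothetical sample data (again emulated with VALUES lists, which are an assumption for illustration) and lists the resulting grp per row:

```sql
-- Hypothetical sample data for illustration; not from the original question.
WITH conversations(id, record_id, is_response, text) AS (
   VALUES (1, 1, false, 'question A')
        , (2, 1, true , 'answer A')
        , (3, 1, false, 'question B')
        , (4, 1, true , 'answer B')
        , (5, 1, false, 'trailing remark')  -- no response follows
   )
, responses(text) AS (
   VALUES ('answer A'), ('answer B')
   )
SELECT c.id, c.text
     , count(r.text) OVER (PARTITION BY c.record_id ORDER BY c.id DESC) AS grp
FROM   conversations c
LEFT   JOIN responses r ON r.text = c.text AND c.is_response
ORDER  BY c.record_id, c.id;

-- id | text            | grp
-- ---+-----------------+----
--  1 | question A      |   2
--  2 | answer A        |   2
--  3 | question B      |   1
--  4 | answer B        |   1
--  5 | trailing remark |   0
```

Counting from the highest id downwards, the trailing remark is seen before any response and keeps grp = 0. Each response then bumps the count by one, and the rows leading up to it share that number. With WHERE grp > 0 and GROUP BY record_id, grp, rows 3 and 4 become one output row, rows 1 and 2 another, and row 5 is dropped.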
Similar cases with more explanation: