I'm working with PostgreSQL 9 and I want to find the nearest neighbor inside table RP for all tuples in RQ, comparing the dates (t), but
The correlated subqueries, without an index, are going to do a cross join anyway. So, another way of expressing the query is:
select rp.*, min(abs(rp.t - rq.t))
from rp cross join
rq
group by -- <== need to replace with all columns
There is another method, which is a bit more complicated. This requires using the cumulative sum.
Here is the idea. Combine all the rp and rq values together and order them by t. Now, number each row by the most recent rp value at or before it. That is, create a flag for rp and take the cumulative sum of that flag. As a result, all the rq values between two rp values have the same rp index.
The closest rp value to a given rq value has an rp index that is either the same as the rq row's index (the rp value at or before it) or one more (the next rp value). Calculating the rp index uses the cumulative sum.
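The indexing is easier to follow on a handful of values outside the database. Below is a minimal Python sketch of the same idea, not the query itself. It assumes t can be treated as plain integers (for example day numbers standing in for dates), and the function name `nearest_rp_distance` is made up for illustration.

```python
from itertools import accumulate

def nearest_rp_distance(rq_ts, rp_ts):
    """For each rq value, return (t, distance to the closest rp value).

    The distance is None when rp is empty. Values are plain integers here,
    an assumption made to keep the sketch short.
    """
    rp_sorted = sorted(set(rp_ts))  # distinct rp values keep the index gap-free
    # Combine both inputs; at equal t the rp row sorts first, so it counts
    # as the "previous" rp value for an rq row with the same t.
    rows = sorted([(t, 1) for t in rp_sorted] + [(t, 0) for t in rq_ts],
                  key=lambda r: (r[0], -r[1]))
    # Cumulative sum of the rp flag gives every row its rp index.
    rp_index = list(accumulate(flag for _, flag in rows))

    result = []
    for (t, flag), idx in zip(rows, rp_index):
        if flag == 1:
            continue  # only rq rows are reported
        candidates = []
        if idx >= 1:
            # rp value with the same index (at or before t)
            candidates.append(abs(t - rp_sorted[idx - 1]))
        if idx < len(rp_sorted):
            # rp value with index + 1 (the next one after t)
            candidates.append(abs(rp_sorted[idx] - t))
        result.append((t, min(candidates) if candidates else None))
    return result

print(nearest_rp_distance([5, 12, 20, 31, 50], [10, 20, 40]))
# → [(5, 5), (12, 2), (20, 0), (31, 9), (50, 10)]
```

Only two candidates are ever compared per rq value, which is why the sort dominates the cost instead of a pairwise comparison of both tables.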
The following query puts this together:
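with rpi as (select u.*,
                    -- running count of rp rows up to and including this t
                    sum(isRP) over (order by t) as rp_index
             from (select rq.t, 0 as isRP
                   from rq
                   union all
                   -- distinct keeps rp_index free of gaps when rp.t repeats
                   select distinct rp.t, 1 as isRP
                   from rp
                  ) u
            )
-- only t is carried through; join back to rq on t if other columns are needed
select rq.t,
       (case when rpnext.t is null or abs(rpprev.t - rq.t) < abs(rpnext.t - rq.t)
             then abs(rpprev.t - rq.t)
             else abs(rpnext.t - rq.t)
        end) as closest_value
from (select *
      from rpi
      where isRP = 0
     ) rq left join  -- left joins keep rq rows before the first or after the last rp row
     (select *
      from rpi
      where isRP = 1
     ) rpprev
     on rq.rp_index = rpprev.rp_index left join
     (select *
      from rpi
      where isRP = 1
     ) rpnext
     on rq.rp_index + 1 = rpnext.rp_index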
The advantage of this approach is that it needs neither a cross join nor correlated subqueries.