Query one document per association from MongoDB

别来无恙 提交于 2019-12-12 02:21:28

问题


I'm investigating how MongoDB would work for us. One of the most used queries is used to get latest (or from a given time) measurements for each station. There is thousands of stations and each station has tens of thousands of measurements.

So we plan to have one collection for stations and another for measurements.

In SQL we would do the query with

SELECT * FROM measurements 
INNER JOIN ( 
  SELECT max(meas_time) station_id 
  FROM measurements 
  WHERE meas_time <= 'time_to_query' 
  GROUP BY station_id 
) t2 ON t2.station_id = measurements.station_id 
    AND t2.meas_time = measurements.meas_time

This returns one measurement for each station, and the measurement is the newest one before the 'time_to_query'.

What query should be used in MongoDB to produce the same result? We are really using Rails and MongoId, but it should not matter.

update: This question is not about how to perform a JOIN in MongoDB. The fact that in SQL getting the right data out of the table requires a join doesn't necessary mean that in MongoDB we would also need a join. There is only one table used in the query.


回答1:


We came up with this query

db.measurements.aggregate([{$group:{ _id:{'station_id':"$station_id"}, time:{$max:'$meas_time'}}}]);

with indexes

db.measurements.createIndex({ station_id: 1, meas_time: -1 });

Even though it seems to give the right data it is really slow. Takes roughly a minute to get a bit over 3000 documents from a collection of 65 million.

Just found that MongoDB is not using the index in this query even though we are using the 3.2 version.




回答2:


I guess worst case solution would be something like this (out of my head):

meassures = []
StationId.all.each do |station|
  meassurement = Meassurment.where(station_id: station.id, meas_time <= 'time_to_query').order_by(meas_time: -1).limit(1)
  meassures << [station.name, meassurement.measure, ....]
end

It depends on how much time query can take. Data should anyway be indexed by station_id and meas_time.

How much time does the SQL query take?



来源:https://stackoverflow.com/questions/33936129/query-one-document-per-association-from-mongodb

易学教程内所有资源均来自网络或用户发布的内容,如有违反法律规定的内容欢迎反馈
该文章没有解决你所遇到的问题?点击提问,说说你的问题,让更多的人一起探讨吧!