Question
I have a database table that holds users' vehicles (cars, motorcycles). I want to get the vehicles most similar to a given one out of that table. Let's say the table has the following columns (with example values to give the idea):
table: vehicles
vehicle_id (pk, auto-increment)
model_id (BMW 3er, Honda Accord)
fuel_type (gasoline, diesel)
body_style (sedan, coupe)
year
engine_size (2.0L)
engine_power (150hp)
In short, I want to select N (usually 3) rows that have at least the same model_id as the seed vehicle and rank them by how many attributes they share with it. For example, if the fuel_type matches I'd add 3 rank points, but if the body_style matches I'd add only 1. Ideally I would get N vehicles with the maximum score, but I still want to get something back when there are no perfect matches.
Answer 1:
My table currently has only around 5k rows and is growing slowly, so I decided to use the following simple approach (it came to me just after I wrote the question).
Let's say the seed is a Honda Accord (model_id 456), 2004, gasoline, 2.0L, 155hp, sedan, with auto-increment ID 123.
SELECT vehicles.*,
  (IF(`fuel_type` = 'gasoline', 3, 0) +                        -- same fuel type: +3
   IF(`body_style` = 'sedan', 1, 0) +                          -- same body style: +1
   IF(`year` > 2001 AND `year` < 2007, 2, 0) +                 -- within 2 years of 2004: +2
   IF(`engine_size` >= 1.8 AND `engine_size` <= 2.2, 1, 0) +   -- within 0.2L of 2.0L: +1
   IF(`engine_power` = 155, 3,                                 -- exact power match: +3
      IF(`engine_power` > 124 AND `engine_power` < 186, 1, 0)) -- within 20% of 155hp: +1
  ) AS `rank`
FROM vehicles
WHERE vehicle_id != 123 AND model_id = 456
ORDER BY `rank` DESC
LIMIT 3;
That query will work as long as I don't have too many rows. If the table grows to 50-100k rows, I will probably have to switch to something like Lucene?
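The query above hard-codes the seed's values (gasoline, sedan, 2004 and so on), so it has to be rebuilt by hand for every seed. As a rough sketch only, the same scoring could instead read those values from the seed row through a self-join. The example below runs against an in-memory SQLite database purely so that it is self-contained, and it uses standard CASE expressions (which MySQL also accepts) in place of IF(). The names SCORED_QUERY, make_demo_db, most_similar and the score alias are made up for illustration, and the weights and tolerances are simply copied from the query above.

```python
import sqlite3

# Weights and tolerances mirror the hand-written query: fuel +3, body +1,
# year within 2 -> +2, engine size within 0.2L -> +1, power exact +3 or
# strictly within 20% -> +1. "s" is the seed row, "v" is each candidate.
SCORED_QUERY = """
SELECT v.vehicle_id,
       (CASE WHEN v.fuel_type = s.fuel_type THEN 3 ELSE 0 END
      + CASE WHEN v.body_style = s.body_style THEN 1 ELSE 0 END
      + CASE WHEN ABS(v.year - s.year) <= 2 THEN 2 ELSE 0 END
      + CASE WHEN v.engine_size BETWEEN s.engine_size - 0.2
                                    AND s.engine_size + 0.2 THEN 1 ELSE 0 END
      + CASE WHEN v.engine_power = s.engine_power THEN 3
             WHEN v.engine_power > s.engine_power * 0.8
              AND v.engine_power < s.engine_power * 1.2 THEN 1
             ELSE 0 END) AS score
FROM vehicles AS v
JOIN vehicles AS s ON s.vehicle_id = :seed_id
WHERE v.vehicle_id != s.vehicle_id
  AND v.model_id = s.model_id
ORDER BY score DESC, v.vehicle_id
LIMIT :n
"""


def make_demo_db():
    """Build a throwaway in-memory table with a few invented rows."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        """CREATE TABLE vehicles (
               vehicle_id INTEGER PRIMARY KEY,
               model_id INTEGER,
               fuel_type TEXT,
               body_style TEXT,
               year INTEGER,
               engine_size REAL,
               engine_power INTEGER)"""
    )
    conn.executemany(
        "INSERT INTO vehicles VALUES (?, ?, ?, ?, ?, ?, ?)",
        [
            (123, 456, "gasoline", "sedan", 2004, 2.0, 155),  # the seed
            (1, 456, "gasoline", "sedan", 2005, 2.0, 155),
            (2, 456, "diesel", "sedan", 2004, 2.0, 140),
            (3, 456, "gasoline", "coupe", 1998, 2.4, 190),
            (4, 456, "diesel", "coupe", 1998, 3.0, 240),
            (5, 789, "gasoline", "sedan", 2004, 2.0, 155),  # other model
        ],
    )
    return conn


def most_similar(conn, seed_id, n=3):
    """Return up to n (vehicle_id, score) tuples, best score first."""
    return conn.execute(SCORED_QUERY, {"seed_id": seed_id, "n": n}).fetchall()


if __name__ == "__main__":
    print(most_similar(make_demo_db(), 123))  # [(1, 10), (2, 5), (3, 3)]
```

Vehicles that score 0 are still returned when nothing better exists, which matches the goal of always getting something back. The alias is called score rather than rank only to avoid backtick quoting.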
Source: https://stackoverflow.com/questions/17632113/getting-most-similar-rows-in-mysql-table-and-order-them-by-similarity