Question
I have a Sphinx index on schools, and when I run a query I always receive the same results, in the same order. I've tried every imaginable combination of ranking, sorting, and matching, and always get the same ordering.
A sample of the bad data I'm getting is below:
"albany high"
Albany Junior High School | Auckland, NZ | 2001 (shouldn't be first)
Albany High School | Albany, NY | 2001
South Albany High School | Albany, OR | 2001
Albany High School | Albany, CA | 1001 (shouldn't be last)
As you can see, the highest-ranked school is not in a city named "Albany", and should be lower, while the lowest-ranked "Albany High School" should be ranked higher than it is. This problem is replicated across many search terms.
The Sphinx index looks like this:
source schools : root
{
    sql_query = \
        SELECT schools.id, schools.name, schools.state, schools.country, schools.city, \
        (select COUNT(*) from user2school WHERE school_id = schools.id) as user_count \
        FROM schools
    sql_attr_uint = user_count
}
index schools
{
    source = schools
    path = /var/db/sphinx/data/schools
    min_infix_len = 3
    infix_fields = name
}
The code that generates the results is as follows:
$sphinx->SetMatchMode(SPH_MATCH_EXTENDED);
$sphinx->SetRankingMode(SPH_RANK_WORDCOUNT);
$sphinx->SetSortMode(SPH_SORT_RELEVANCE);
$sphinx->SetFieldWeights(array(
    'id' => 0,
    'name' => 1000,
    'city' => 0,
    'state' => 0,
    'user_count' => 0
));
How can I get Sphinx to recognize my custom weights? Every combination I've tried seems to fail.
Edit:
Here is another example with the same ordering, but totally different settings. The only option I have turned on here is:
$sphinx->SetRankingMode(SPH_RANK_SPH04);
The results:
"albany high"
Albany Junior High School | Auckland, NZ | 3 (still shouldn't be first)
Albany High School | Albany, NY | 3
South Albany High School | Albany, OR | 2
Albany High School | Albany, CA | 1 (still shouldn't be last)
As you can see, the ordering is identical. It is identical in every combination of ranking, sorting, and weighting I have tried. Is there anything I can try to debug this problem?
Answer 1:
Perhaps it's a logic error in your application. Sphinx gives you a list of IDs, which you would then use to retrieve data from the original database. Maybe you aren't sorting those rows correctly.
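To illustrate the kind of reordering step meant here, below is a minimal sketch in plain PHP. It assumes the rows come back from the database as associative arrays with an 'id' key; the function name orderRowsByRank and the sample rows are hypothetical, not taken from the asker's application.

```php
<?php
/**
 * Return $rows reordered to follow $rankedIds.
 * IDs that have no matching row are skipped.
 * Assumption: each row is an associative array with an 'id' key.
 */
function orderRowsByRank(array $rankedIds, array $rows)
{
    // Index the database rows by their ID for direct lookup.
    $byId = array();
    foreach ($rows as $row) {
        $byId[$row['id']] = $row;
    }

    // Walk the IDs in the order the search engine ranked them.
    $ordered = array();
    foreach ($rankedIds as $id) {
        if (isset($byId[$id])) {
            $ordered[] = $byId[$id];
        }
    }
    return $ordered;
}

// Hypothetical ranked IDs, best match first.
$rankedIds = array(2, 4, 3, 1);

// Rows as a database might return them for "WHERE id IN (2, 4, 3, 1)":
// the order of an IN list is not preserved without an explicit ORDER BY.
$rows = array(
    array('id' => 1, 'name' => 'Albany Junior High School'),
    array('id' => 2, 'name' => 'Albany High School'),
    array('id' => 3, 'name' => 'South Albany High School'),
    array('id' => 4, 'name' => 'Albany High School'),
);

foreach (orderRowsByRank($rankedIds, $rows) as $row) {
    echo $row['id'], ' ', $row['name'], "\n";
}
```

If the application skips a step like this (or an equivalent such as MySQL's ORDER BY FIELD(id, ...)), the rows would appear in the database's order no matter which ranker Sphinx used, which would match the symptom described.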
I just tried inserting your data into a test RT index (including a string attribute, so I could see the data):
mysql> insert into rttest values (1,'Albany Junior High School','Auckland','NZ','Albany Junior High School, Auckland, NZ');
... etc ...
mysql> select * from rttest where match('albany high');
+------+--------+-----------------------------------------+
| id   | weight | value                                   |
+------+--------+-----------------------------------------+
|    2 |   3267 | Albany High School, Albany, NY          |
|    3 |   3267 | South Albany High School, Albany, OR    |
|    4 |   3267 | Albany High School, Albany, CA          |
|    1 |   1304 | Albany Junior High School, Auckland, NZ |
+------+--------+-----------------------------------------+
4 rows in set (0.15 sec)
mysql> select * from rttest where match('albany high') option ranker=sph04;
+------+--------+-----------------------------------------+
| id   | weight | value                                   |
+------+--------+-----------------------------------------+
|    2 |  12267 | Albany High School, Albany, NY          |
|    4 |  12267 | Albany High School, Albany, CA          |
|    3 |  10267 | South Albany High School, Albany, OR    |
|    1 |   6304 | Albany Junior High School, Auckland, NZ |
+------+--------+-----------------------------------------+
4 rows in set (0.00 sec)
mysql> select * from rttest where match('albany high') option ranker=wordcount;
+------+--------+-----------------------------------------+
| id   | weight | value                                   |
+------+--------+-----------------------------------------+
|    2 |      3 | Albany High School, Albany, NY          |
|    3 |      3 | South Albany High School, Albany, OR    |
|    4 |      3 | Albany High School, Albany, CA          |
|    1 |      2 | Albany Junior High School, Auckland, NZ |
+------+--------+-----------------------------------------+
4 rows in set (0.00 sec)
Changing the ranking mode does work.
Answer 2:
The 0s in your SetFieldWeights call look odd. Either list only the fields whose weight you want to set, or use 1 as the default. I suspect 0 will cause issues.
I suspect SPH_RANK_SPH04 would be the most suitable ranker for this particular case.
You also shouldn't need your setSelect call.
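As a rough sketch of what those suggestions might look like with the PHP API, and as a way to check what Sphinx itself returns, something like the following could be tried. It assumes sphinxapi.php from the Sphinx distribution is on the include path and that searchd is listening on localhost:9312; both are assumptions, not details from the question.

```php
<?php
// Assumption: sphinxapi.php ships with Sphinx and is on the include path.
require_once 'sphinxapi.php';

$sphinx = new SphinxClient();
// Assumption: searchd runs locally on the default API port.
$sphinx->SetServer('localhost', 9312);

$sphinx->SetMatchMode(SPH_MATCH_EXTENDED);
$sphinx->SetRankingMode(SPH_RANK_SPH04);
$sphinx->SetSortMode(SPH_SORT_RELEVANCE);

// List only the field that needs a non-default weight; no zero weights.
$sphinx->SetFieldWeights(array('name' => 1000));

$result = $sphinx->Query('albany high', 'schools');
if ($result === false) {
    die('Query failed: ' . $sphinx->GetLastError() . "\n");
}

// Print document ID and weight in the order Sphinx returned them,
// before any database lookup has a chance to reorder the rows.
if (!empty($result['matches'])) {
    foreach ($result['matches'] as $id => $match) {
        echo $id, "\t", $match['weight'], "\n";
    }
}
```

Printing the raw IDs and weights like this separates the two possible causes: if the weights change when the ranker changes but the page still shows the old order, the problem is in the application rather than in Sphinx.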
Source: https://stackoverflow.com/questions/11518313/sphinx-ignores-ranking-always-sorts-in-the-same-order