Question
I'm developing a search feature where users can look up people by first name, last name, or email. For the search I'm using Solr 4.0.0-ALPHA with the edismax query parser.
The problem is that when a user searches with a partial email address, I need to return only the records that contain exactly that partial address.
For example, the query: lastname@gmail
should return only users whose email contains "lastname@gmail",
for example: firstname.lastname@gmail.com
but at the moment it matches every record containing either "lastname" or "gmail", which in our database is a huge number of results, even though only one record matches "lastname@gmail". I know I can get the exact match by putting the query in double quotes, like "lastname@gmail", and I could of course force the email address into that format on the client before sending the search to Solr, but is it possible to achieve this in schema.xml instead?
Here is my current schema.xml:
<schema name="example" version="1.5">
  <fields>
    <field name="id" type="string" indexed="true" stored="true" required="true" />
    <field name="firstName" type="string_ci" indexed="true" stored="true" />
    <field name="lastName" type="string_ci" indexed="true" stored="true" />
    <field name="email" type="string_email" indexed="true" stored="true" />
  </fields>
  <uniqueKey>id</uniqueKey>
  <types>
    <fieldType name="string" class="solr.StrField" sortMissingLast="true" />
    <fieldType name="string_ci" class="solr.TextField" sortMissingLast="true" omitNorms="true">
      <analyzer>
        <tokenizer class="solr.KeywordTokenizerFactory" />
        <filter class="solr.LowerCaseFilterFactory" />
      </analyzer>
    </fieldType>
    <fieldType name="string_email" class="solr.TextField" sortMissingLast="true" omitNorms="true">
      <analyzer>
        <tokenizer class="solr.StandardTokenizerFactory" />
        <filter class="solr.LowerCaseFilterFactory" />
        <filter class="solr.WordDelimiterFilterFactory" />
      </analyzer>
    </fieldType>
  </types>
</schema>
I know the issue is that I'm using StandardTokenizerFactory, which splits the email address into separate tokens, so the query is parsed like this:
<str name="parsedquery_toString">
+(lastName:lastname@gmail | id:lastname@gmail | (email:lastname email:gmail) | firstName:lastname@gmail)
</str>
What I want is something more like this, which is what I get when I put the query in double quotes ("lastname@gmail"):
<str name="parsedquery_toString">
+(lastName:lastname@gmail | id:lastname@gmail | email:"lastname gmail" | firstName:lastname@gmail)
</str>
Here is the search I'm doing:
/select?q=lastname@gmail&qf=id+firstName+lastName+email&defType=edismax&debugQuery=true
Answer 1:
I got the answer on the #solr IRC channel. Adding autoGeneratePhraseQueries="true" to the field type makes Solr build a phrase query from the tokens, just as if the query had been put in double quotes, and with that I got the correct results.
<fieldType name="text_email" class="solr.TextField" sortMissingLast="true" omitNorms="true" autoGeneratePhraseQueries="true">
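The answer shows only the opening tag, so for context here is a sketch of what the complete definition might look like. It assumes the analyzer chain is carried over unchanged from the string_email type in the question, and that the email field is switched to the new text_email type name; neither detail is confirmed by the answer.

```xml
<!-- Sketch only: the analyzer is copied from the question's string_email type (an assumption). -->
<fieldType name="text_email" class="solr.TextField" sortMissingLast="true" omitNorms="true" autoGeneratePhraseQueries="true">
  <analyzer>
    <tokenizer class="solr.StandardTokenizerFactory" />
    <filter class="solr.LowerCaseFilterFactory" />
    <filter class="solr.WordDelimiterFilterFactory" />
  </analyzer>
</fieldType>

<!-- The email field has to reference the type that carries the attribute. -->
<field name="email" type="text_email" indexed="true" stored="true" />
```

With a definition along these lines, rerunning the unquoted search with debugQuery=true is a quick way to check that the email clause is now parsed as email:"lastname gmail" rather than as two separate terms.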
Source: https://stackoverflow.com/questions/12101639/solr-partial-email-search-with-exact-match