Question
I want to run range queries on double fields where four values belong together as a group, and each document can contain several such groups. So what I need is a field where I can store something like this:
<doc>
<field name="id">id</field>
<field name="valueGroup">1 2 3 4</field>
<field name="valueGroup">5 6 7 8</field>
</doc>
And then run range queries like this: valueGroup: [0,0,0,0 to 3,8,8,8]. I cannot index the values as separate fields with multiValued="true" because each group needs to be treated separately. I know there is a LatLon field type, but it holds only two values. How can I get a field with more than two dimensions?
Answer 1:
As I mentioned in a response to your comment on my SO question, I also had quite niche requirements for performing some complex filtering. Eventually, I had to create a custom field class which allowed me to override the method responsible for returning a query object containing the custom logic to filter results. This method should suit you perfectly:
public class MyCustomFieldType extends FieldType {
/**
* {@inheritDoc}
*/
@Override
protected void init(final IndexSchema schema, final Map<String, String> args) {
trueProperties |= TOKENIZED;
super.init(schema, args);
}
/**
* {@inheritDoc}
*/
@Override
public void write(final XMLWriter xmlWriter, final String name, final Fieldable fieldable)
throws IOException
{
xmlWriter.writeStr(name, fieldable.stringValue());
}
/**
* {@inheritDoc}
*/
@Override
public void write(final TextResponseWriter writer, final String name, final Fieldable fieldable)
throws IOException
{
writer.writeStr(name, fieldable.stringValue(), true);
}
/**
* {@inheritDoc}
*/
@Override
public SortField getSortField(final SchemaField field, final boolean reverse) {
return getStringSort(field, reverse);
}
/**
* {@inheritDoc}
*/
@Override
public void setAnalyzer(final Analyzer analyzer) {
this.analyzer = analyzer;
}
/**
* {@inheritDoc}
*/
@Override
public void setQueryAnalyzer(final Analyzer queryAnalyzer) {
this.queryAnalyzer = queryAnalyzer;
}
/**
* {@inheritDoc}
*/
@Override
public Query getFieldQuery(
final QParser parser, final SchemaField field, final String externalVal)
{
// Do some parsing of the user's input (if necessary) from the query string (externalVal)
final String parsedInput = ...
// Instantiate your custom filter, taking note to wrap it in a caching wrapper!
final Filter filter = new CachingWrapperFilter(
new MyCustomFilter(field, parsedInput));
// Return a query that runs your filter against all docs in the index
// NOTE: depending on your needs, you may be able to do a more fine grained query here
// instead of a MatchAllDocsQuery!!
return new FilteredQuery(new MatchAllDocsQuery(), filter);
}
}
Now you need a custom Filter...
public class MyCustomFilter extends Filter {
/**
* The field that is being filtered.
*/
private final SchemaField field;
/**
* The value to filter against.
*/
private final String filterBy;
/**
* Creates a filter for the given field and value.
*
* @param field The field to perform filtering against.
* @param filterBy A value to filter by.
*/
public MyCustomFilter(
final SchemaField field,
final String filterBy)
{
this.field = field;
this.filterBy = filterBy;
}
/**
* {@inheritDoc}
*/
@Override
public DocIdSet getDocIdSet(final IndexReader reader) throws IOException {
final FixedBitSet bitSet = new FixedBitSet(reader.maxDoc());
// find all the docs you want to run the filter against
final Weight weight = new IndexSearcher(reader).createNormalizedWeight(
new SOME_QUERY_TYPE_HERE());
final Scorer docIterator = weight.scorer(reader, true, false);
if (docIterator == null) {
return bitSet;
}
int docId;
while ((docId = docIterator.nextDoc()) != Scorer.NO_MORE_DOCS) {
final Document doc = reader.document(docId);
for (final String indexFieldValue : doc.getValues(field.getName())) {
// CUSTOM LOGIC GOES HERE
// If your criteria are met, consider the doc a match
bitSet.set(docId);
}
}
return bitSet;
}
/**
* {@inheritDoc}
*/
@Override
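public boolean equals(final Object other) {
// NEEDED FOR CACHING: filters on the same field and value must compare equal
if (this == other) {
return true;
}
if (!(other instanceof MyCustomFilter)) {
return false;
}
final MyCustomFilter that = (MyCustomFilter) other;
return field.getName().equals(that.field.getName()) && filterBy.equals(that.filterBy);
}
/**
* {@inheritDoc}
*/
@Override
public int hashCode() {
// NEEDED FOR CACHING: must be consistent with equals
return 31 * field.getName().hashCode() + filterBy.hashCode();
}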
}
The example above is obviously very basic, but if you use it as a template and tweak it to improve performance and add your custom logic, you should get what you need. Also be sure to implement the hashCode and equals methods in your filter, as these will be used for caching. In the query string, you can supply the fq param like so: `?q=some query&fq=myfield:[0,0,0,0 to 3,8,8,8]`.
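For the four-value groups in the question, the custom logic could look roughly like the sketch below. It is only an illustration under some assumptions: ValueGroupRange is a hypothetical helper (not part of Solr or Lucene), the filter value is assumed to arrive as "[0,0,0,0 to 3,8,8,8]", each stored value is assumed to be a whitespace-separated group such as "1 2 3 4", and both bounds are treated as inclusive.

```java
/** Hypothetical helper: an inclusive, per-component range over a group of doubles. */
final class ValueGroupRange {
    private final double[] lower;
    private final double[] upper;

    private ValueGroupRange(double[] lower, double[] upper) {
        this.lower = lower;
        this.upper = upper;
    }

    /** Parses the assumed syntax "[0,0,0,0 to 3,8,8,8]"; the brackets are optional. */
    static ValueGroupRange parse(String input) {
        String body = input.trim();
        if (body.startsWith("[") && body.endsWith("]")) {
            body = body.substring(1, body.length() - 1);
        }
        // Split on the word "to" (case-insensitive) surrounded by whitespace.
        String[] bounds = body.split("(?i)\\s+to\\s+");
        if (bounds.length != 2) {
            throw new IllegalArgumentException("Expected '<lower> to <upper>': " + input);
        }
        double[] lower = parseGroup(bounds[0], ",");
        double[] upper = parseGroup(bounds[1], ",");
        if (lower.length != upper.length) {
            throw new IllegalArgumentException("Bounds differ in size: " + input);
        }
        return new ValueGroupRange(lower, upper);
    }

    /** Returns true if every component of a stored group such as "1 2 3 4" lies within the bounds. */
    boolean contains(String storedGroup) {
        double[] values = parseGroup(storedGroup, "\\s+");
        if (values.length != lower.length) {
            return false; // a group of the wrong size never matches
        }
        for (int i = 0; i < values.length; i++) {
            if (values[i] < lower[i] || values[i] > upper[i]) {
                return false;
            }
        }
        return true;
    }

    /** Splits the text on the given separator regex and parses each part as a double. */
    private static double[] parseGroup(String text, String separatorRegex) {
        String[] parts = text.trim().split(separatorRegex);
        double[] values = new double[parts.length];
        for (int i = 0; i < parts.length; i++) {
            values[i] = Double.parseDouble(parts[i].trim());
        }
        return values;
    }
}
```

Inside the loop in getDocIdSet you would then parse filterBy once before iterating and call bitSet.set(docId) only when range.contains(indexFieldValue) is true. Because each stored value is checked on its own, the groups of a multi-valued field never get mixed together. Note that reader.document(docId) only returns stored fields, so the field has to be stored for this to work.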
As I mentioned, this approach worked great for me and my team, as we had quite specific requirements around the filtering of our content.
Good luck. :)
Source: https://stackoverflow.com/questions/19862278/solr-index-multiple-values-as-one-field