I'm trying to understand how fieldNorm is calculated (at index time) and then used (and apparently re-calculated) at query time.
In all the examples I
The documentation of encodeNormValue describes the encoding step (which is where the precision is lost) and, in particular, the final representation of the value:
The encoding uses a three-bit mantissa, a five-bit exponent, and the zero-exponent point at 15, thus representing values from around 7x10^9 to 2x10^-9 with about one significant decimal digit of accuracy. Zero is also represented. Negative numbers are rounded up to zero. Values too large to represent are rounded down to the largest representable value. Positive values too small to represent are rounded up to the smallest positive representable value.
The most relevant piece to understand is that the mantissa is only 3 bits, which means the precision is about one significant decimal digit.
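To make the precision loss concrete, here is a minimal, self-contained sketch (not the actual Lucene source) that mirrors the 3-bit-mantissa / 5-bit-exponent scheme used by org.apache.lucene.util.SmallFloat (floatToByte315 / byte315ToFloat in classic, pre-7 Lucene versions), applied to length norms computed the way DefaultSimilarity does (1 / sqrt(numTerms), with any boost assumed to be 1). The class name NormEncodingDemo is mine, purely for illustration.

```java
// Sketch: re-implements the single-byte norm encoding (3 mantissa bits,
// 5 exponent bits, zero-exponent point at 15) to show where precision is lost.
public class NormEncodingDemo {

  // Encode a float into one byte. Mirrors SmallFloat.floatToByte315.
  static byte floatToByte315(float f) {
    int bits = Float.floatToRawIntBits(f);
    int smallfloat = bits >> (24 - 3);            // keep exponent + top mantissa bits
    if (smallfloat <= ((63 - 15) << 3)) {
      return (bits <= 0) ? (byte) 0               // zero and negatives map to 0
                         : (byte) 1;              // positive underflow -> smallest nonzero value
    }
    if (smallfloat >= ((63 - 15) << 3) + 0x100) {
      return -1;                                  // overflow -> largest representable value
    }
    return (byte) (smallfloat - ((63 - 15) << 3));
  }

  // Decode the byte back to a float. Mirrors SmallFloat.byte315ToFloat.
  static float byte315ToFloat(byte b) {
    if (b == 0) return 0.0f;
    int bits = (b & 0xff) << (24 - 3);
    bits += (63 - 15) << 24;
    return Float.intBitsToFloat(bits);
  }

  public static void main(String[] args) {
    // Length norms as DefaultSimilarity computes them: 1 / sqrt(numTerms).
    for (int numTerms : new int[] {1, 2, 3, 4, 10, 100}) {
      float norm = (float) (1.0 / Math.sqrt(numTerms));
      byte encoded = floatToByte315(norm);
      float decoded = byte315ToFloat(encoded);
      System.out.printf("terms=%3d  norm=%.5f  byte=%4d  decoded fieldNorm=%.5f%n",
                        numTerms, norm, encoded, decoded);
    }
  }
}
```

Running this shows, for example, that 1/sqrt(3) ≈ 0.57735 and 1/sqrt(4) = 0.5 both encode to the same byte and both come back as a fieldNorm of 0.5 at query time, while 1/sqrt(2) ≈ 0.70711 comes back as 0.625. Small differences in field length are simply collapsed by the encoding.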
An important note on the rationale comes a few sentences after the end of your quote, where the Lucene docs say:
The rationale supporting such lossy compression of norm values is that given the difficulty (and inaccuracy) of users to express their true information need by a query, only big differences matter.