Translation of Lucene's NumericRangeQuery javadoc


Date: 2014-05-16

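We have developed an extension package for Lucene that stores numeric values in a special variable-precision string encoding. All numeric values, such as double, long, float, and int, are converted to lexicographically sortable string representations and stored at several precisions; see NumericUtils for the details of how they are stored. To search, a range is recursively split into multiple smaller intervals: the middle of the range is matched in the trie at low precision, while the boundaries are matched at high precision. This sharply reduces the number of terms.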

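For the variant that stores long values at 8 different precisions (each reduced by a further 8 bits), the lowest precision is a single byte, so the index holds at most 256 distinct values at that precision. Overall, a range can consist of at most 7*255*2 + 255 = 3825 distinct terms. This bound is reached only when the index holds a term for every distinct 8-byte value and the range covers almost all of them; at most 255 distinct values are used per level because a full set of 256 values can always be replaced by a single term at lower precision. In practice, we have seen up to 300 terms in most cases (an index of 500,000 metadata records with a uniform value distribution).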

We have developed an extension to Apache Lucene that stores the numerical values in a special string-encoded format with variable precision (all numerical values like doubles, longs, floats, and ints are converted to lexicographic sortable string representations and stored with different precisions, for a more detailed description of how the values are stored, see NumericUtils). A range is then divided recursively into multiple intervals for searching: The center of the range is searched only with the lowest possible precision in the trie, while the boundaries are matched more exactly. This reduces the number of terms dramatically.
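The recursive splitting described above can be illustrated with a small Python sketch. It is not Lucene's Java implementation: the function name `split_range`, its parameters, and the `(shift, prefix)` term representation are assumptions made for illustration, and it handles only unsigned integers with `lo <= hi`. Each `(shift, prefix)` pair stands for one trie term covering the values `prefix << shift` through `((prefix + 1) << shift) - 1`.

```python
def split_range(lo, hi, precision_step=8, value_bits=64):
    """Cover the inclusive range [lo, hi] of unsigned integers with trie terms.

    Illustrative sketch only (assumes 0 <= lo <= hi < 2**value_bits).
    Returns a list of (shift, prefix) pairs, one per term.
    """
    terms = []

    def emit(first, last, shift):
        # One term per distinct prefix at this precision level.
        for prefix in range(first >> shift, (last >> shift) + 1):
            terms.append((shift, prefix))

    shift = 0
    while True:
        diff = 1 << (shift + precision_step)
        mask = ((1 << precision_step) - 1) << shift
        # A boundary needs its own terms at this level only if it is not
        # aligned with a block of the next, coarser level.
        has_lower = (lo & mask) != 0
        has_upper = (hi & mask) != mask
        next_lo = ((lo + diff) if has_lower else lo) & ~mask
        next_hi = ((hi - diff) if has_upper else hi) & ~mask
        if shift + precision_step >= value_bits or next_lo > next_hi:
            # Coarsest level reached, or nothing is left for a coarser level:
            # cover what remains at the current precision and stop.
            emit(lo, hi, shift)
            break
        if has_lower:
            emit(lo, lo | mask, shift)      # lower boundary, finer precision
        if has_upper:
            emit(hi & ~mask, hi, shift)     # upper boundary, finer precision
        lo, hi = next_lo, next_hi           # remaining center of the range
        shift += precision_step
    return terms


if __name__ == "__main__":
    # 16-bit values with a 4-bit step keep the example small.
    terms = split_range(0x1234, 0x5678, precision_step=4, value_bits=16)
    print(len(terms), "terms cover", 0x5678 - 0x1234 + 1, "values")
```

In this 16-bit example the 17477 values between 0x1234 and 0x5678 are covered by 62 terms, with only 3 of them at the coarsest level, which is the effect the paragraph describes.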

For the variant that stores long values in 8 different precisions (each reduced by 8 bits) that uses a lowest precision of 1 byte, the index contains only a maximum of 256 distinct values in the lowest precision. Overall, a range could consist of a theoretical maximum of 7*255*2 + 255 = 3825 distinct terms (when there is a term for every distinct value of an 8-byte-number in the index and the range covers almost all of them; a maximum of 255 distinct values is used because it would always be possible to reduce the full 256 values to one term with degraded precision). In practice, we have seen up to 300 terms in most cases (index with 500,000 metadata records and a uniform value distribution).
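The arithmetic behind the 3825 figure can be checked with a few lines of Python. The helper `max_terms` and its parameter names are hypothetical; it generalizes the expression in the text on the assumption that every level below the coarsest contributes up to two boundaries of `2**step - 1` terms each, and the coarsest level contributes up to `2**step - 1` terms.

```python
def max_terms(bits_per_value=64, precision_step=8):
    """Theoretical upper bound on the terms a single range can expand to."""
    levels = bits_per_value // precision_step     # 8 precisions for 64/8
    per_boundary = (1 << precision_step) - 1      # 255 when the step is 8
    # (levels - 1) finer levels, two boundaries each, plus the coarsest level.
    return (levels - 1) * per_boundary * 2 + per_boundary


if __name__ == "__main__":
    print(max_terms(64, 8))  # 7*255*2 + 255 = 3825
```

The bound grows quickly with the step size, which is why the figure quoted for real indexes (about 300 terms) is far below it: real ranges rarely have the worst-case alignment at every level.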
