public class DFRSimilarityFactory extends SimilarityFactory
Factory for DFRSimilarity.
You must specify the implementations for all three components of DFR (strings). In general the models are parameter-free, but two of the normalizations take floating point parameters (see below):
basicModel: Basic model of information content:
Be: Limiting form of Bose-Einstein
G: Geometric approximation of Bose-Einstein
P: Poisson approximation of the Binomial
D: Divergence approximation of the Binomial
I(n): Inverse document frequency
I(ne): Inverse expected document frequency [mixture of Poisson and IDF]
I(F): Inverse term frequency [approximation of I(ne)]
afterEffect: First normalization of information gain:
L: Laplace's law of succession
B: Ratio of two Bernoulli processes
none: no first normalization
normalization: Second (length) normalization:
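H1: Uniform distribution of term frequency
parameter c (float): controls the term frequency normalization with respect to the document length. The default is 1
H2: term frequency density inversely related to length
parameter c (float): controls the term frequency normalization with respect to the document length. The default is 1
H3: term frequency normalization provided by Dirichlet prior
parameter mu (float): smoothing parameter μ. The default is 800
Z: term frequency normalization provided by a Zipfian relation
parameter z (float): represents A/(A+1), where A measures the specificity of the language. The default is 0.3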
none: no second normalization
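To make the role of the floating point parameters concrete, the following is a minimal, stand-alone Java sketch of the four length normalizations in their commonly cited DFR forms. It is not taken from the Lucene source. The class name `DfrNormalizationSketch`, the method signatures, and the `collectionProb` input are hypothetical, and the way `c` enters H1 (as a plain multiplier) is an assumption. The constants mirror the documented defaults for `c`, `mu`, and `z`.

```java
// Sketch only: commonly cited DFR length normalizations, not Lucene's source.
final class DfrNormalizationSketch {

    // Defaults as documented for the factory parameters c, mu and z.
    static final double DEFAULT_C = 1.0;
    static final double DEFAULT_MU = 800.0;
    static final double DEFAULT_Z = 0.3;

    static double log2(double x) {
        return Math.log(x) / Math.log(2.0);
    }

    // H1: assumes c is a plain multiplier on tf * avgLen / docLen.
    static double h1(double tf, double docLen, double avgLen, double c) {
        return tf * c * (avgLen / docLen);
    }

    // H2: logarithmic damping of the length ratio.
    static double h2(double tf, double docLen, double avgLen, double c) {
        return tf * log2(1.0 + c * avgLen / docLen);
    }

    // H3: Dirichlet-prior smoothing; collectionProb is the term's relative
    // frequency in the whole collection (hypothetical input for this sketch).
    static double h3(double tf, double docLen, double collectionProb, double mu) {
        return (tf + mu * collectionProb) / (docLen + mu) * mu;
    }

    // Z: power-law (Zipfian) scaling of the length ratio.
    static double z(double tf, double docLen, double avgLen, double z) {
        return tf * Math.pow(avgLen / docLen, z);
    }

    public static void main(String[] args) {
        // A document twice as long as average, containing the term 3 times.
        double tf = 3.0, docLen = 200.0, avgLen = 100.0;
        System.out.println("H1: " + h1(tf, docLen, avgLen, DEFAULT_C));
        System.out.println("H2: " + h2(tf, docLen, avgLen, DEFAULT_C));
        System.out.println("H3: " + h3(tf, docLen, 0.001, DEFAULT_MU));
        System.out.println("Z:  " + z(tf, docLen, avgLen, DEFAULT_Z));
    }
}
```

In this sketch a document longer than average receives a smaller normalized term frequency under H1, H2, and Z, while a document of exactly average length keeps its raw term frequency when the defaults are used.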
Optional settings:
discountOverlaps (bool): Sets SimilarityBase.setDiscountOverlaps(boolean)
Fields inherited from class SimilarityFactory: CLASS_NAME, params

| Constructor and Description |
|---|
| DFRSimilarityFactory() |

| Modifier and Type | Method and Description |
|---|---|
| Similarity | getSimilarity() |
| void | init(SolrParams params) |
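
The two methods summarized above suggest an init-then-build lifecycle: `init` receives the configuration, and `getSimilarity` returns the configured object. The sketch below illustrates that pattern in plain Java under stated assumptions. `DfrFactorySketch` is a hypothetical class, a `Map<String, String>` stands in for `SolrParams`, and a descriptive `String` stands in for the returned `Similarity`. It does not reproduce the real factory's code.

```java
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Sketch only: mimics the init-then-getSimilarity pattern with plain Java types.
final class DfrFactorySketch {
    // Component names listed on this page.
    private static final List<String> BASIC_MODELS =
            Arrays.asList("Be", "G", "P", "D", "I(n)", "I(ne)", "I(F)");
    private static final List<String> NORMALIZATIONS =
            Arrays.asList("H1", "H2", "H3", "Z", "none");

    private String basicModel;
    private String afterEffect;
    private String normalization;

    // Reads the three required component names and rejects unknown ones.
    // afterEffect is only checked for presence in this sketch.
    void init(Map<String, String> params) {
        String model = require(params, "basicModel");
        String effect = require(params, "afterEffect");
        String norm = require(params, "normalization");
        if (!BASIC_MODELS.contains(model)) {
            throw new IllegalArgumentException("Unknown basicModel: " + model);
        }
        if (!NORMALIZATIONS.contains(norm)) {
            throw new IllegalArgumentException("Unknown normalization: " + norm);
        }
        basicModel = model;
        afterEffect = effect;
        normalization = norm;
    }

    // Returns a description of the configured components; a real factory
    // would return a Similarity instance here.
    String getSimilarity() {
        if (basicModel == null) {
            throw new IllegalStateException("init() has not been called");
        }
        return "basicModel=" + basicModel
                + ", afterEffect=" + afterEffect
                + ", normalization=" + normalization;
    }

    private static String require(Map<String, String> params, String key) {
        String value = params.get(key);
        if (value == null) {
            throw new IllegalArgumentException("Missing required parameter: " + key);
        }
        return value;
    }

    public static void main(String[] args) {
        Map<String, String> params = new HashMap<String, String>();
        params.put("basicModel", "I(F)");
        params.put("afterEffect", "B");
        params.put("normalization", "H2");

        DfrFactorySketch factory = new DfrFactorySketch();
        factory.init(params);
        System.out.println(factory.getSimilarity());
    }
}
```

Validating in `init` rather than in `getSimilarity` means a misconfigured component name fails as soon as the configuration is read, which is the design choice this sketch assumes.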
Methods inherited from class SimilarityFactory: getClassArg, getNamedPropertyValues, getParams

public void init(SolrParams params)
Overrides: init in class SimilarityFactory

public Similarity getSimilarity()
Specified by: getSimilarity in class SimilarityFactory

Copyright © 2000-2015 Apache Software Foundation. All Rights Reserved.