public class TikaLanguageIdentifierUpdateProcessor extends LanguageIdentifierUpdateProcessor

Fields inherited from class LanguageIdentifierUpdateProcessor: allMapFieldsSet, docIdField, enabled, enableMapping, enforceSchema, fallbackFields, fallbackValue, inputFields, langField, langPattern, langsField, langWhitelist, lcMap, log, mapFields, mapIndividual, mapIndividualFieldsSet, mapKeepOrig, mapLcMap, mapOverwrite, mapPattern, mapReplaceStr, maxFieldValueChars, maxTotalChars, overwrite, schema, threshold, tikaSimilarityPattern

Fields inherited from class UpdateRequestProcessor: next

Fields inherited from interface LangIdParams: DOCID_FIELD_DEFAULT, DOCID_LANGFIELD_DEFAULT, DOCID_LANGSFIELD_DEFAULT, DOCID_PARAM, DOCID_THRESHOLD_DEFAULT, ENFORCE_SCHEMA, FALLBACK, FALLBACK_FIELDS, FIELDS_PARAM, LANG_FIELD, LANG_WHITELIST, LANGS_FIELD, LANGUAGE_ID, LCMAP, MAP_ENABLE, MAP_FL, MAP_INDIVIDUAL, MAP_INDIVIDUAL_FL, MAP_KEEP_ORIG, MAP_LCMAP, MAP_OVERWRITE, MAP_PATTERN, MAP_PATTERN_DEFAULT, MAP_REPLACE, MAP_REPLACE_DEFAULT, MAX_FIELD_VALUE_CHARS, MAX_FIELD_VALUE_CHARS_DEFAULT, MAX_TOTAL_CHARS, MAX_TOTAL_CHARS_DEFAULT, OVERWRITE, THRESHOLD

| Constructor and Description |
|---|
| TikaLanguageIdentifierUpdateProcessor(SolrQueryRequest req, SolrQueryResponse rsp, UpdateRequestProcessor next) |

| Modifier and Type | Method and Description |
|---|---|
| protected String | concatFields(SolrInputDocument doc) Concatenates content from multiple fields |
| protected List<DetectedLanguage> | detectLanguage(SolrInputDocument doc) Detects language(s) from a string. |

Methods inherited from class LanguageIdentifierUpdateProcessor: getMappedField, isEnabled, normalizeLangCode, process, processAdd, resolveLanguage, resolveLanguage, setEnabled

Methods inherited from class UpdateRequestProcessor: finish, processCommit, processDelete, processMergeIndexes, processRollback

public TikaLanguageIdentifierUpdateProcessor(SolrQueryRequest req, SolrQueryResponse rsp, UpdateRequestProcessor next)

protected List<DetectedLanguage> detectLanguage(SolrInputDocument doc)
detectLanguage in class LanguageIdentifierUpdateProcessor
Parameters: doc - The content to identify

protected String concatFields(SolrInputDocument doc)
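
The page describes concatFields only as concatenating content from multiple fields. The sketch below illustrates that idea in plain Java. It is not Solr's implementation: the class ConcatFieldsSketch is hypothetical, a Map stands in for SolrInputDocument, and the way the two limits (named after the inherited maxFieldValueChars and maxTotalChars fields) are applied is an assumption.

```java
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical illustration only; not part of Solr.
public class ConcatFieldsSketch {

    // Joins the String values of the given input fields with single spaces.
    // Assumption: each value is cut to maxFieldValueChars and the combined
    // result is cut to maxTotalChars.
    static String concatFields(Map<String, Object> doc,
                               List<String> inputFields,
                               int maxFieldValueChars,
                               int maxTotalChars) {
        StringBuilder sb = new StringBuilder();
        for (String field : inputFields) {
            Object value = doc.get(field);
            if (!(value instanceof String)) {
                continue; // skip missing fields and non-text values
            }
            String text = (String) value;
            if (text.length() > maxFieldValueChars) {
                text = text.substring(0, maxFieldValueChars);
            }
            if (sb.length() > 0) {
                sb.append(' '); // separator between field values
            }
            int remaining = maxTotalChars - sb.length();
            if (remaining <= 0) {
                break; // total budget used up
            }
            if (text.length() > remaining) {
                text = text.substring(0, remaining);
            }
            sb.append(text);
        }
        return sb.toString().trim(); // drop a trailing separator, if any
    }

    public static void main(String[] args) {
        Map<String, Object> doc = new LinkedHashMap<>();
        doc.put("title", "Hello world");
        doc.put("body", "Bonjour le monde");
        doc.put("id", 42); // non-text value, ignored by the sketch

        List<String> inputFields = Arrays.asList("title", "body", "id");
        System.out.println(concatFields(doc, inputFields, 7, 100)); // prints "Hello w Bonjour"
    }
}
```

In this sketch the concatenated string is what a detector such as detectLanguage would then examine; capping the input keeps detection cost bounded for very large documents.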
Copyright © 2000-2015 Apache Software Foundation. All Rights Reserved.