T - The type of the tokens returned by the Tokenizer

public interface TokenizerFactory<T> extends IteratorFromReaderFactory<T>
A TokenizerFactory should also provide two static methods:

public static TokenizerFactory<? extends HasWord> newTokenizerFactory();
public static TokenizerFactory<Word> newWordTokenizerFactory(String options);

These are expected by certain JavaNLP code (e.g., LexicalizedParser),
which wants to produce a TokenizerFactory by reflection.

| Modifier and Type | Method and Description |
|---|---|
| Tokenizer<T> | getTokenizer(Reader r): Get a tokenizer for this reader. |
| Tokenizer<T> | getTokenizer(Reader r, String extraOptions): Get a tokenizer for this reader. |
| void | setOptions(String options): Sets default options for how tokenizers built from this factory should behave. |
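
The static-method convention described above can be illustrated with a minimal sketch of a reflective lookup. Everything in it is an assumption made for illustration: `ReflectiveFactorySketch`, `DemoTokenizerFactory`, and `loadFactory` are hypothetical names, and the `TokenizerFactory` interface here is a local stand-in rather than the library's own type. How LexicalizedParser actually performs the lookup is not shown in this document.

```java
import java.lang.reflect.Method;
import java.lang.reflect.Modifier;

// Local stand-ins only; none of these types are the library's own classes.
class ReflectiveFactorySketch {

  /** Marker stand-in for the documented TokenizerFactory interface. */
  interface TokenizerFactory<T> {}

  /** Hypothetical implementation that follows the static-method convention. */
  public static class DemoTokenizerFactory implements TokenizerFactory<String> {
    private DemoTokenizerFactory() {}

    public static TokenizerFactory<String> newTokenizerFactory() {
      return new DemoTokenizerFactory();
    }
  }

  /** Finds the public static no-argument newTokenizerFactory() on the named class and calls it. */
  static TokenizerFactory<?> loadFactory(String className) throws ReflectiveOperationException {
    Class<?> clazz = Class.forName(className);
    Method method = clazz.getMethod("newTokenizerFactory");
    if (!Modifier.isStatic(method.getModifiers())) {
      throw new NoSuchMethodException(className + ".newTokenizerFactory() must be static");
    }
    // A static method is invoked with a null receiver.
    return (TokenizerFactory<?>) method.invoke(null);
  }

  public static void main(String[] args) throws Exception {
    TokenizerFactory<?> factory = loadFactory(DemoTokenizerFactory.class.getName());
    System.out.println(factory.getClass().getSimpleName()); // prints "DemoTokenizerFactory"
  }
}
```

A caller that knows only a class name can obtain a factory this way without a compile-time dependency on the implementation. The `String`-argument variant could be looked up the same way by passing `String.class` to `getMethod`.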
Methods inherited from interface IteratorFromReaderFactory: getIterator

Tokenizer<T> getTokenizer(Reader r)
r - A Reader (which is assumed to already be buffered, if appropriate)

Tokenizer<T> getTokenizer(Reader r, String extraOptions)
r - A Reader (which is assumed to already be buffered, if appropriate)
extraOptions - Options for how this tokenizer should behave

void setOptions(String options)
options - Options for how this tokenizer should behave
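
To show how the three documented methods might fit together, here is a minimal, self-contained sketch. It does not use the library: `Tokenizer` and `TokenizerFactory` are local stand-ins that mirror the signatures listed in this document, `WhitespaceFactory` and `tokenize` are hypothetical, and the `lowercase` option string is invented for the example. The way `extraOptions` is combined with the defaults is likewise an assumption, not documented behavior.

```java
import java.io.Reader;
import java.io.StringReader;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Iterator;
import java.util.List;
import java.util.Locale;
import java.util.Scanner;

// All types below are local stand-ins mirroring the documented signatures;
// they are NOT the library's own classes.
class TokenizerFactorySketch {

  /** Stand-in for the library's Tokenizer: here simply an Iterator. */
  interface Tokenizer<T> extends Iterator<T> {}

  /** Stand-in carrying the three methods from the method summary. */
  interface TokenizerFactory<T> {
    Tokenizer<T> getTokenizer(Reader r);
    Tokenizer<T> getTokenizer(Reader r, String extraOptions);
    void setOptions(String options);
  }

  /** Hypothetical factory: splits on whitespace; "lowercase" is an invented option. */
  static class WhitespaceFactory implements TokenizerFactory<String> {
    private String defaultOptions = "";

    @Override
    public void setOptions(String options) {
      this.defaultOptions = (options == null) ? "" : options;
    }

    @Override
    public Tokenizer<String> getTokenizer(Reader r) {
      return getTokenizer(r, "");
    }

    @Override
    public Tokenizer<String> getTokenizer(Reader r, String extraOptions) {
      // Assumption: extra options apply to this call only, on top of the defaults.
      String combined = defaultOptions + "," + (extraOptions == null ? "" : extraOptions);
      boolean lowercase = Arrays.asList(combined.split(",")).contains("lowercase");
      Scanner scanner = new Scanner(r); // default delimiter is whitespace
      return new Tokenizer<String>() {
        @Override public boolean hasNext() { return scanner.hasNext(); }
        @Override public String next() {
          String token = scanner.next();
          return lowercase ? token.toLowerCase(Locale.ROOT) : token;
        }
      };
    }
  }

  /** Drains a tokenizer into a list. */
  static <T> List<T> tokenize(Tokenizer<T> tokenizer) {
    List<T> tokens = new ArrayList<>();
    while (tokenizer.hasNext()) {
      tokens.add(tokenizer.next());
    }
    return tokens;
  }

  public static void main(String[] args) {
    WhitespaceFactory factory = new WhitespaceFactory();
    // Default behavior: prints [Hello, World]
    System.out.println(tokenize(factory.getTokenizer(new StringReader("Hello World"))));
    // Per-call options: prints [hello, world]
    System.out.println(tokenize(factory.getTokenizer(new StringReader("Hello World"), "lowercase")));
    // Factory-wide defaults: prints [hello, world]
    factory.setOptions("lowercase");
    System.out.println(tokenize(factory.getTokenizer(new StringReader("Hello World"))));
  }
}
```

In this sketch `setOptions` changes the behavior of every tokenizer created afterwards, while `extraOptions` affects a single call. Buffering of the `Reader` is left to the caller, as the parameter description states.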