public class DefaultLexicalMapper extends Object implements Mapper, Serializable
| Modifier and Type | Field and Description |
|---|---|
| Pattern | arabicDigit |
| Pattern | arabicPunc |
| Pattern | latinPunc |
| Pattern | segmentationMarker |
| Constructor and Description |
|---|
| DefaultLexicalMapper() |
| Modifier and Type | Method and Description |
|---|---|
| boolean | `canChangeEncoding(String parent, String element)` Indicates whether child can be converted to another encoding. |
| static void | `main(String[] args)` |
| String | `map(String parent, String element)` Maps from one string representation to another. |
| void | `setup(File path, String... options)` Perform initialization prior to the first call to map. |
public final Pattern latinPunc
public final Pattern arabicPunc
public final Pattern arabicDigit
public final Pattern segmentationMarker
public String map(String parent, String element)
Maps from one string representation to another.
Specified by: map in interface Mapper

public void setup(File path, String... options)
Perform initialization prior to the first call to map.
Specified by: setup in interface Mapper

public boolean canChangeEncoding(String parent, String element)
Indicates whether child can be converted to another encoding. In the ATB, for example, if a punctuation character is labeled with the "PUNC" POS tag, then that character should not be converted from Buckwalter to UTF-8.
Specified by: canChangeEncoding in interface Mapper
Parameters:
parent - element's context (e.g., the parent node in a parse tree)
element - The string to be transformed.

public static void main(String[] args)
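
The calling order described above (setup first, then a canChangeEncoding check before each map call) can be sketched as follows. This is an illustration only: the nested Mapper interface is a hypothetical stand-in that copies the three documented signatures, ToyMapper is not the DefaultLexicalMapper logic, the four-letter Buckwalter table is deliberately tiny, and the convert helper is an assumed caller-side convention rather than part of the documented API.

```java
import java.io.File;
import java.util.HashMap;
import java.util.Map;

final class MapperSketch {

  /** Hypothetical stand-in that mirrors the three Mapper methods documented above. */
  interface Mapper {
    void setup(File path, String... options);
    String map(String parent, String element);
    boolean canChangeEncoding(String parent, String element);
  }

  /** Toy implementation for illustration; NOT the DefaultLexicalMapper logic. */
  static final class ToyMapper implements Mapper {
    private final Map<Character, Character> table = new HashMap<>();

    @Override
    public void setup(File path, String... options) {
      // Assumption: this sketch ignores path and options and hard-codes a tiny
      // Buckwalter-to-Unicode table instead of loading any resources.
      table.put('k', '\u0643'); // ARABIC LETTER KAF
      table.put('t', '\u062A'); // ARABIC LETTER TEH
      table.put('A', '\u0627'); // ARABIC LETTER ALEF
      table.put('b', '\u0628'); // ARABIC LETTER BEH
      table.put('*', '\u0630'); // ARABIC LETTER THAL (Buckwalter uses '*')
    }

    @Override
    public boolean canChangeEncoding(String parent, String element) {
      // Follows the documented ATB example: tokens tagged "PUNC" keep their encoding.
      return !"PUNC".equals(parent);
    }

    @Override
    public String map(String parent, String element) {
      StringBuilder sb = new StringBuilder();
      for (char c : element.toCharArray()) {
        char mapped = table.getOrDefault(c, c); // unknown characters pass through
        sb.append(mapped);
      }
      return sb.toString();
    }
  }

  /** Assumed caller-side convention: only map when the encoding may change. */
  static String convert(Mapper mapper, String parent, String element) {
    return mapper.canChangeEncoding(parent, element)
        ? mapper.map(parent, element)
        : element;
  }

  public static void main(String[] args) {
    Mapper mapper = new ToyMapper();
    mapper.setup(new File("unused.txt")); // must precede the first map call

    System.out.println(convert(mapper, "NN", "ktAb")); // noun: transliterated
    System.out.println(convert(mapper, "PUNC", "*"));  // punctuation: left as-is
  }
}
```

The "*" token shows why the check matters: in Buckwalter transliteration "*" encodes the letter thal, so a literal asterisk tagged "PUNC" would be turned into a letter if it were mapped unconditionally.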