new Tokenizer(dic)
Tokenizer
Parameters:
Name | Type | Description |
---|---|---|
dic | DynamicDictionaries | Dictionaries used by this tokenizer |
Methods
<static> splitByPunctuation(input) → {Array.<string>}
Splits the input text into sentences at punctuation marks.
Parameters:
Name | Type | Description |
---|---|---|
input | string | Input text |
Returns:
Sentences, each ending with punctuation
Type: Array.<string>
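
The behavior described above can be illustrated with a minimal standalone sketch. It is not the library's implementation: the function name `splitByPunctuationSketch` is hypothetical, and the sketch assumes that "punctuation" means the Japanese marks 、 and 。, which this page does not specify.

```javascript
// Hypothetical stand-in for Tokenizer.splitByPunctuation; not the library's code.
// Assumption: "punctuation" means the Japanese marks 、 and 。.
function splitByPunctuationSketch(input) {
  const sentences = [];
  let start = 0;
  for (let i = 0; i < input.length; i++) {
    if (input[i] === "、" || input[i] === "。") {
      // Keep the punctuation mark at the end of its sentence.
      sentences.push(input.slice(start, i + 1));
      start = i + 1;
    }
  }
  // Assumption: trailing text without punctuation is kept as a final element.
  if (start < input.length) {
    sentences.push(input.slice(start));
  }
  return sentences;
}

console.log(splitByPunctuationSketch("今日は晴れ、明日は雨。"));
// → [ '今日は晴れ、', '明日は雨。' ]
```

Each returned string keeps its closing punctuation mark, which matches the "Returns" description above.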
getLattice(text) → {ViterbiLattice}
Builds a word lattice for the input text.
Parameters:
Name | Type | Description |
---|---|---|
text | string | Input text to analyze |
Returns:
Word lattice
Type: ViterbiLattice
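
To make the idea of a word lattice concrete, here is a small self-contained sketch. It does not use `ViterbiLattice` or `DynamicDictionaries`, whose internals are not documented on this page. The toy dictionary, its costs, and the name `buildLatticeSketch` are all assumptions made for illustration.

```javascript
// Hypothetical toy dictionary: surface form -> word cost (made-up values).
const toyDictionary = new Map([
  ["す", 5],
  ["も", 3],
  ["もも", 2],
  ["すもも", 1],
]);

// Build a simple lattice: nodesEndAt[i] lists every dictionary word
// that ends at character offset i of the text.
function buildLatticeSketch(text, dictionary) {
  const nodesEndAt = Array.from({ length: text.length + 1 }, () => []);
  for (let start = 0; start < text.length; start++) {
    for (let end = start + 1; end <= text.length; end++) {
      const surface = text.slice(start, end);
      if (dictionary.has(surface)) {
        nodesEndAt[end].push({ surface, start, cost: dictionary.get(surface) });
      }
    }
  }
  return nodesEndAt;
}

const lattice = buildLatticeSketch("すもも", toyDictionary);
console.log(lattice.map((nodes) => nodes.map((n) => n.surface)));
// → [ [], [ 'す' ], [ 'も' ], [ 'すもも', 'もも', 'も' ] ]
```

The lattice records every candidate word at every position, so a later search step can compare all possible segmentations rather than committing to one greedily.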
tokenize(text) → {Array}
Tokenizes the input text.
Parameters:
Name | Type | Description |
---|---|---|
text | string | Input text to analyze |
Returns:
Tokens
Type: Array
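
The `ViterbiLattice` return type of `getLattice` suggests that tokenization selects a lowest-cost path through a word lattice. The following is a minimal sketch of that general technique, not this library's algorithm: it uses per-word costs only (no connection costs between adjacent words), returns plain strings rather than token objects, and the names `toyCosts` and `tokenizeSketch` are hypothetical.

```javascript
// Hypothetical toy dictionary: surface form -> cost (lower is preferred).
const toyCosts = new Map([
  ["す", 5],
  ["も", 3],
  ["もも", 2],
  ["すもも", 1],
  ["の", 1],
  ["うち", 2],
]);

// Lowest-cost segmentation by dynamic programming over character offsets.
function tokenizeSketch(text, costs) {
  // best[i] holds the cheapest known segmentation of text.slice(0, i),
  // or null if no sequence of dictionary words covers that prefix.
  const best = Array(text.length + 1).fill(null);
  best[0] = { cost: 0, prev: -1, surface: "" };
  for (let end = 1; end <= text.length; end++) {
    for (let start = 0; start < end; start++) {
      const surface = text.slice(start, end);
      if (best[start] === null || !costs.has(surface)) continue;
      const cost = best[start].cost + costs.get(surface);
      if (best[end] === null || cost < best[end].cost) {
        best[end] = { cost, prev: start, surface };
      }
    }
  }
  if (best[text.length] === null) return []; // nothing covers the whole text
  // Walk the back-pointers from the end to recover the chosen words.
  const tokens = [];
  for (let pos = text.length; pos > 0; pos = best[pos].prev) {
    tokens.unshift(best[pos].surface);
  }
  return tokens;
}

console.log(tokenizeSketch("すもものうち", toyCosts));
// → [ 'すもも', 'の', 'うち' ]
```

With these made-up costs, the single word すもも (cost 1) beats the split す + もも (cost 7), which is why it appears as one token in the result.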