Note: as long as we're using the current inline-python dependency, we need to use the Nightly version of Rust.

Install the Python requirements, then run the extraction:

    pip3 install -r requirements.txt # can be skipped if your language doesn't use the Python segmenter
    cargo run --release -- extract-file -l en -d. texts/ > file.en.txt

Using language rules

The following rules can be configured per language. Add a toml file in the rules directory to enable a new locale. The available rules cover:

- whether a sentence needs to start with a letter
- whether a sentence needs to start with an uppercase letter
- whether a sentence needs to end with punctuation
- an array of disallowed words
- an array of broken whitespaces; this could, for example, disallow two spaces following each other
- an array of matching configurations, where each configuration is an array of two values
- a pattern against which each character gets matched
- a rule that is only used when allowed_symbols_regex is not set or is an empty String

Prefer the blocklist approach when possible. Note that the replacements get applied before any other rules are checked.
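To make the order of the checks concrete, here is a minimal Python sketch of how such per-language rules could be applied to one sentence. It is not the tool's actual implementation. Only `allowed_symbols_regex` is a name taken from the text above; every other key in `RULES`, the `[old, new]` shape assumed for replacements, the choice of `.?!` as end punctuation, and the helper names `apply_replacements` and `is_valid` are hypothetical and exist only for illustration.

```python
import re

# Hypothetical rule set. Only `allowed_symbols_regex` comes from the text;
# the other key names and value shapes are assumptions for this sketch.
RULES = {
    "replacements": [["&", "and"]],   # assumed shape: [old, new] pairs
    "needs_letter_start": True,
    "needs_uppercase_start": True,
    "needs_punctuation_end": True,
    "broken_whitespace": ["  "],      # two spaces following each other
    "allowed_symbols_regex": r"[A-Za-z ,.'?!]",
    "disallowed_symbols": ["@", "#"],
    "disallowed_words": ["foo"],
}


def apply_replacements(sentence, rules):
    """Apply every [old, new] pair before any other rule is checked."""
    for old, new in rules.get("replacements", []):
        sentence = sentence.replace(old, new)
    return sentence


def is_valid(sentence, rules):
    """Return True if the sentence passes every configured rule."""
    sentence = apply_replacements(sentence, rules).strip()
    if not sentence:
        return False
    if rules.get("needs_letter_start") and not sentence[0].isalpha():
        return False
    if rules.get("needs_uppercase_start") and not sentence[0].isupper():
        return False
    # Assumption: "punctuation" means one of . ? ! in this sketch.
    if rules.get("needs_punctuation_end") and sentence[-1] not in ".?!":
        return False
    if any(ws in sentence for ws in rules.get("broken_whitespace", [])):
        return False
    pattern = rules.get("allowed_symbols_regex") or ""
    if pattern:
        # Each character gets matched against the pattern on its own.
        allowed = re.compile(pattern)
        if not all(allowed.fullmatch(ch) for ch in sentence):
            return False
    elif any(sym in sentence for sym in rules.get("disallowed_symbols", [])):
        # Fallback, only used when the pattern is unset or an empty string.
        return False
    words = {w.strip(".,?!'").lower() for w in sentence.split()}
    blocked = {w.lower() for w in rules.get("disallowed_words", [])}
    return not (words & blocked)
```

Calling `is_valid("Tom & Jerry went home.", RULES)` first rewrites `&` to `and` and then runs the remaining checks, which is why the ampersand never reaches the symbol check. Setting `allowed_symbols_regex` to an empty string switches the sketch over to the disallowed-symbols fallback.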