Provides the core functionality for spell-checking documents.

Package Specification

This package provides the interfaces for the notions of dictionary, edit distance, phonetic hash, spell event and spell-check iterator. For most of these interfaces, a default implementation suited to English is provided. These implementations can be reused in custom dictionaries or spell-check iterators, or replaced by more specialized algorithms for a particular group of languages.

Spell Check Engine

The central access point to the spell-checker functionality is the interface ISpellCheckEngine. Implementations of this interface provide support for life-cycle management, registering and unregistering dictionaries, changing the locale of the engine and creating a spell-checker for a specific language.

The following steps are needed to obtain a spell-checker for a specific language:

When a spell-checker for a different locale is requested via createSpellChecker(Locale), the spell-checker is reconfigured with the dictionaries for that locale. More concretely, the old dictionary is unregistered, and a dictionary registered for the desired locale is associated with the spell-checker instead. If no such dictionary is available, no spell-checker is returned and the locale of the engine is reset to its default locale.
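The fallback behavior described above can be sketched as follows. This is a minimal, simplified model and not the actual ISpellCheckEngine contract: the classes SketchEngine and SketchChecker, their method names other than createSpellChecker(Locale), and the use of a plain word set as a dictionary are all assumptions made for illustration.

```java
import java.util.HashMap;
import java.util.Locale;
import java.util.Map;
import java.util.Set;

// Hypothetical spell-checker: a locale plus a plain set of known words.
class SketchChecker {
    final Locale locale;
    private final Set<String> words;

    SketchChecker(Locale locale, Set<String> words) {
        this.locale = locale;
        this.words = words;
    }

    boolean isCorrect(String word) {
        return words.contains(word.toLowerCase(Locale.ROOT));
    }
}

// Hypothetical engine: keeps one dictionary per locale (an assumption).
class SketchEngine {
    private final Map<Locale, Set<String>> dictionaries = new HashMap<>();
    private final Locale defaultLocale;
    private Locale currentLocale;

    SketchEngine(Locale defaultLocale) {
        this.defaultLocale = defaultLocale;
        this.currentLocale = defaultLocale;
    }

    void registerDictionary(Locale locale, Set<String> words) {
        dictionaries.put(locale, words);
    }

    void unregisterDictionary(Locale locale) {
        dictionaries.remove(locale);
    }

    Locale getLocale() {
        return currentLocale;
    }

    // Returns null and resets the engine locale when no dictionary is
    // registered for the requested locale, mirroring the text above.
    SketchChecker createSpellChecker(Locale locale) {
        Set<String> words = dictionaries.get(locale);
        if (words == null) {
            currentLocale = defaultLocale;
            return null;
        }
        currentLocale = locale;
        return new SketchChecker(locale, words);
    }
}
```

Callers of such an engine would need to check the returned spell-checker for null before using it, since a missing dictionary is signalled by the absence of a spell-checker rather than by an exception.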

Dictionaries

Dictionaries are the data structures that hold word lists for a particular language. All dictionaries must implement the interface ISpellDictionary. It provides support for life-cycle management as well as the facility to query words from the list, add words to the list and get correction proposals for incorrectly spelt words.

This package provides a default implementation of a dictionary (AbstractSpellDictionary) that uses algorithms suited to English.
Every dictionary needs two kinds of algorithms to be plugged in:

By plugging in custom implementations of one or both of these algorithms, the abstract implementation AbstractSpellDictionary can be customized to specific languages and alphabets.
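As an illustration of pluggable algorithms, the following sketch wires an edit distance and a phonetic hash into a small dictionary. It is not the AbstractSpellDictionary implementation: the interfaces DistanceSketch and HashSketch, the class SketchDictionary, the choice of Levenshtein distance, the vowel-dropping hash and the bucketing of words by hash are all assumptions chosen to keep the example short.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.HashMap;
import java.util.List;
import java.util.Locale;
import java.util.Map;

// Hypothetical plug-in points for the two kinds of algorithms.
interface DistanceSketch { int distance(String from, String to); }
interface HashSketch { String hash(String word); }

// Classic Levenshtein edit distance, computed row by row.
class LevenshteinDistance implements DistanceSketch {
    public int distance(String from, String to) {
        int[] prev = new int[to.length() + 1];
        for (int j = 0; j <= to.length(); j++) prev[j] = j;
        for (int i = 1; i <= from.length(); i++) {
            int[] curr = new int[to.length() + 1];
            curr[0] = i;
            for (int j = 1; j <= to.length(); j++) {
                int cost = from.charAt(i - 1) == to.charAt(j - 1) ? 0 : 1;
                curr[j] = Math.min(Math.min(curr[j - 1] + 1, prev[j] + 1),
                                   prev[j - 1] + cost);
            }
            prev = curr;
        }
        return prev[to.length()];
    }
}

// Very crude phonetic hash: keep the first letter, drop later vowels,
// and collapse directly repeated letters.
class ConsonantHash implements HashSketch {
    public String hash(String word) {
        String lower = word.toLowerCase(Locale.ROOT);
        StringBuilder sb = new StringBuilder();
        for (int i = 0; i < lower.length(); i++) {
            char c = lower.charAt(i);
            boolean vowel = "aeiou".indexOf(c) >= 0;
            boolean repeated = sb.length() > 0 && sb.charAt(sb.length() - 1) == c;
            if ((i == 0 || !vowel) && !repeated) sb.append(c);
        }
        return sb.toString();
    }
}

// Hypothetical dictionary: words are bucketed by hash (an assumption),
// and proposals from the matching bucket are ranked by edit distance.
class SketchDictionary {
    private final DistanceSketch distance;
    private final HashSketch hasher;
    private final Map<String, List<String>> buckets = new HashMap<>();

    SketchDictionary(DistanceSketch distance, HashSketch hasher) {
        this.distance = distance;
        this.hasher = hasher;
    }

    void addWord(String word) {
        buckets.computeIfAbsent(hasher.hash(word), k -> new ArrayList<>()).add(word);
    }

    boolean isCorrect(String word) {
        List<String> bucket = buckets.get(hasher.hash(word));
        return bucket != null && bucket.contains(word);
    }

    List<String> getProposals(String word) {
        List<String> result =
            new ArrayList<>(buckets.getOrDefault(hasher.hash(word), new ArrayList<>()));
        result.sort(Comparator.comparingInt((String c) -> distance.distance(word, c)));
        return result;
    }
}
```

Swapping in a different HashSketch or DistanceSketch changes which candidates are found and how they are ranked without touching the dictionary itself, which is the kind of customization the paragraph above describes.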

Spell Check Iterators

Instances of ISpellChecker are usually language-, locale- and medium-independent implementations and therefore need an input provider. The interface ISpellCheckIterator serves this purpose by abstracting the tokenizing of text media to a simple iteration. The actual spell-check process is launched by calling ISpellChecker#execute(ISpellCheckIterator). This method uses the given spell-check iterator to determine the words that are to be spell-checked. This package provides no default implementation of a spell-check iterator.
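Since no default iterator is provided, the following sketch shows one way the tokenizing of plain text could be reduced to a simple iteration. It implements java.util.Iterator rather than ISpellCheckIterator, whose exact methods are not described here; the class name WordIteratorSketch and the rule that a token counts as a word when it starts with a letter are assumptions.

```java
import java.text.BreakIterator;
import java.util.Iterator;
import java.util.Locale;
import java.util.NoSuchElementException;

// Hypothetical word iterator over plain text, based on the locale-aware
// word boundaries of java.text.BreakIterator.
class WordIteratorSketch implements Iterator<String> {
    private final String text;
    private final BreakIterator boundaries;
    private int start;
    private String next;

    WordIteratorSketch(String text, Locale locale) {
        this.text = text;
        this.boundaries = BreakIterator.getWordInstance(locale);
        this.boundaries.setText(text);
        this.start = boundaries.first();
        advance();
    }

    // Moves to the next token that starts with a letter, skipping
    // whitespace and punctuation tokens.
    private void advance() {
        next = null;
        int end;
        while (next == null && (end = boundaries.next()) != BreakIterator.DONE) {
            String token = text.substring(start, end);
            start = end;
            if (Character.isLetter(token.charAt(0))) {
                next = token;
            }
        }
    }

    @Override
    public boolean hasNext() {
        return next != null;
    }

    @Override
    public String next() {
        if (next == null) {
            throw new NoSuchElementException();
        }
        String result = next;
        advance();
        return result;
    }
}
```

An iterator for another medium, such as source code comments, would differ mainly in how it decides which regions of the input contain words worth checking.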

Event Handling

To communicate the results of a spell-check pass, spell-checkers fire spell events that inform listeners about the status of a particular word being spell-checked. Instances that are interested in receiving spell events must implement the interface ISpellEventListener and register with the spell-checker before the spell-check process starts.

A spell event contains the following information:

Spell event listeners are free to handle the events in any way. However, listeners must not block during event handling unless the spell-checking process runs in another thread.
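One way to respect the non-blocking rule is for a listener to only record events and leave expensive processing for later. The sketch below illustrates this under stated assumptions: SpellEventSketch, SpellListenerSketch, FiringCheckerSketch and QueueingListener are hypothetical stand-ins, not the package's ISpellEvent, ISpellEventListener or ISpellChecker types, and the event carries only a word and a correctness flag for brevity.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Queue;
import java.util.Set;
import java.util.concurrent.ConcurrentLinkedQueue;

// Hypothetical event: the checked word and whether it was found.
final class SpellEventSketch {
    final String word;
    final boolean correct;

    SpellEventSketch(String word, boolean correct) {
        this.word = word;
        this.correct = correct;
    }
}

// Hypothetical listener interface.
interface SpellListenerSketch {
    void handle(SpellEventSketch event);
}

// Hypothetical checker that fires one event per input word to every
// listener registered before execute is called.
class FiringCheckerSketch {
    private final Set<String> words;
    private final List<SpellListenerSketch> listeners = new ArrayList<>();

    FiringCheckerSketch(Set<String> words) {
        this.words = words;
    }

    void addListener(SpellListenerSketch listener) {
        listeners.add(listener);
    }

    void execute(Iterable<String> input) {
        for (String word : input) {
            SpellEventSketch event = new SpellEventSketch(word, words.contains(word));
            for (SpellListenerSketch listener : listeners) {
                listener.handle(event);
            }
        }
    }
}

// Listener that returns immediately: it only enqueues misspellings so
// that slow work (UI updates, I/O) can happen after the pass.
class QueueingListener implements SpellListenerSketch {
    final Queue<SpellEventSketch> pending = new ConcurrentLinkedQueue<>();

    @Override
    public void handle(SpellEventSketch event) {
        if (!event.correct) {
            pending.add(event);
        }
    }
}
```

The queue is a concurrent one so that the same listener would remain safe if the spell-check pass were moved to a background thread while another thread drains the pending events.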