<!doctype html public "-//w3c//dtd html 4.0 transitional//en">
<meta http-equiv="Content-Type" content="text/html; charset=iso-8859-1">
<meta name="Author" content="IBM">
<meta name="GENERATOR" content="Mozilla/4.51 [en] (WinNT; I) [Netscape]">
<title>Package-level Javadoc</title>
Provides the core functionality for spell-checking documents.
<h2>
Package Specification</h2>
This package provides interfaces for the notions of dictionary, edit distance, phonetic hash,
spell event and spell-check iterator. For most of these interfaces, a default implementation
for English is provided. These implementations can be reused in custom dictionaries or
spell-check iterators, or replaced by more specialized algorithms for a particular group of languages.
<h3>
Spell Check Engine</h3>
The central point of access to the spell-checker functionality is the interface <tt>ISpellCheckEngine</tt>.
Implementations of this interface provide support for life-cycle management, registering and unregistering
dictionaries, changing the locale of the engine and creating a spell-checker for a specific language.
<p>
The following steps are needed to obtain a spell-checker for a specific language:
<ol>
<li>Create an instance of <tt>ISpellCheckEngine</tt>. This package provides no default implementation,
since the management of dictionary registration and loading is application-dependent. Usually, implementations
of <tt>ISpellCheckEngine</tt> are singletons.</li>
<li>Create the dictionaries to be used during the spell-check process. All dictionaries that
can be registered with <tt>ISpellCheckEngine</tt> must implement the interface <tt>ISpellDictionary</tt>.
An abstract implementation of this interface is provided in the class <tt>AbstractSpellDictionary</tt>.
Depending on the language of the words contained in the dictionary, custom algorithms for the phonetic hash
(<tt>IPhoneticHashProvider</tt>) and the edit distance (<tt>IPhoneticDistanceAlgorithm</tt>) should be implemented
and registered with the dictionary.</li>
<li>Create spell-checker instances by calling <tt>createSpellChecker(Locale)</tt>, where the locale
denotes the language that the spell-checker should use while executing.</li>
</ol>
When a new spell-checker with a different locale is requested via <tt>createSpellChecker(Locale)</tt>, the spell-checker is
reconfigured with the new dictionaries. More concretely, the old dictionary is unregistered, and the dictionary registered for the
desired locale is associated with the spell-checker. If no such dictionary is available, no spell-checker is returned and
the locale of the engine is reset to its default locale.
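<p>
The steps above, together with the fallback behaviour for unknown locales, can be sketched roughly as follows.
This is a minimal illustration only: <tt>SketchDictionary</tt>, <tt>SketchSpellChecker</tt> and <tt>SketchEngine</tt>
are hypothetical, heavily simplified stand-ins and do not reproduce the actual signatures of <tt>ISpellDictionary</tt>,
<tt>ISpellChecker</tt> or <tt>ISpellCheckEngine</tt>. The choice of <tt>Locale.ENGLISH</tt> as the default locale is also an assumption.

```java
import java.util.HashMap;
import java.util.Locale;
import java.util.Map;
import java.util.Set;

// Demo entry point; all types below are hypothetical stand-ins.
final class EngineDemo {
    public static void main(String[] args) {
        SketchEngine engine = SketchEngine.getInstance();           // step 1
        Set<String> words = Set.of("hello", "world");
        engine.registerDictionary(Locale.ENGLISH, words::contains); // step 2
        SketchSpellChecker checker =
                engine.createSpellChecker(Locale.ENGLISH);          // step 3
        System.out.println(checker.isCorrect("hello"));             // prints true
        // No German dictionary is registered, so no checker is returned.
        System.out.println(engine.createSpellChecker(Locale.GERMAN) == null); // prints true
    }
}

// Simplified dictionary: only answers whether a word is known.
interface SketchDictionary {
    boolean isCorrect(String word);
}

// A spell-checker bound to the dictionary of one locale.
final class SketchSpellChecker {
    private final Locale locale;
    private final SketchDictionary dictionary;

    SketchSpellChecker(Locale locale, SketchDictionary dictionary) {
        this.locale = locale;
        this.dictionary = dictionary;
    }

    Locale getLocale() {
        return locale;
    }

    boolean isCorrect(String word) {
        return dictionary.isCorrect(word);
    }
}

// Singleton engine that maps locales to registered dictionaries.
final class SketchEngine {
    private static final SketchEngine INSTANCE = new SketchEngine();
    private final Map<Locale, SketchDictionary> dictionaries = new HashMap<>();
    private final Locale defaultLocale = Locale.ENGLISH; // assumed default
    private Locale locale = defaultLocale;

    private SketchEngine() {
    }

    static SketchEngine getInstance() {
        return INSTANCE;
    }

    void registerDictionary(Locale forLocale, SketchDictionary dictionary) {
        dictionaries.put(forLocale, dictionary);
    }

    void unregisterDictionary(Locale forLocale) {
        dictionaries.remove(forLocale);
    }

    Locale getLocale() {
        return locale;
    }

    // Returns null and resets the engine locale if no dictionary is registered.
    SketchSpellChecker createSpellChecker(Locale requested) {
        SketchDictionary dictionary = dictionaries.get(requested);
        if (dictionary == null) {
            locale = defaultLocale;
            return null;
        }
        locale = requested;
        return new SketchSpellChecker(requested, dictionary);
    }
}
```

Returning <tt>null</tt> mirrors the statement that "no spell-checker is returned"; a real implementation may signal this case differently.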
<h3>
Dictionaries</h3>
Dictionaries are the data structures that hold word lists for a particular language. All dictionary implementations must
implement the interface <tt>ISpellDictionary</tt>. It provides support for life-cycle management as well as the facility to query
words from the list, add words to the list and get correction proposals for incorrectly spelt words.
<p>
This package provides a default implementation of a dictionary (<tt>AbstractSpellDictionary</tt>) that uses algorithms
suited to English. <br>
Every dictionary needs two kinds of algorithms to be plugged in:
<ul>
<li>An edit distance algorithm: Edit distance algorithms implement the interface <tt>IPhoneticDistanceAlgorithm</tt>. The algorithm
is used to determine the similarity between two words. This package provides a default implementation for languages using the Latin alphabet (<tt>DefaultPhoneticDistanceAlgorithm</tt>).
The default algorithm uses the Levenshtein text edit distance.</li>
<li>A hash algorithm: Phonetic hash providers implement the interface <tt>IPhoneticHashProvider</tt>. The purpose of
a phonetic hash is to represent a word in a form that allows it to be compared with other, similar words. This package provides a default
implementation that is suitable for Slavic and English languages. It uses the double metaphone algorithm published by
Lawrence Philips.</li>
</ul>
By plugging in custom implementations of one or both of these algorithms, the abstract implementation <tt>AbstractSpellDictionary</tt> can
be customized for specific languages and alphabets.
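<p>
To illustrate how a hash and an edit distance cooperate inside a dictionary, here is a small sketch under several assumptions.
<tt>DictionarySketch</tt> is a hypothetical class, not <tt>AbstractSpellDictionary</tt>. Its <tt>toyHash</tt> simply sorts the letters of a word
and is only a placeholder for a real phonetic hash such as double metaphone. <tt>levenshtein</tt> is the plain unit-cost Levenshtein distance;
the default implementation in this package may weight edit operations differently.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Comparator;
import java.util.HashMap;
import java.util.HashSet;
import java.util.List;
import java.util.Locale;
import java.util.Map;
import java.util.Set;
import java.util.function.Function;
import java.util.function.ToIntBiFunction;

// Hypothetical dictionary with a pluggable hash and a pluggable edit distance.
final class DictionarySketch {
    private final Map<String, Set<String>> buckets = new HashMap<>();
    private final Function<String, String> hash;
    private final ToIntBiFunction<String, String> distance;

    DictionarySketch(Function<String, String> hash,
                     ToIntBiFunction<String, String> distance) {
        this.hash = hash;
        this.distance = distance;
    }

    void addWord(String word) {
        buckets.computeIfAbsent(hash.apply(word), k -> new HashSet<>()).add(word);
    }

    boolean isCorrect(String word) {
        Set<String> bucket = buckets.get(hash.apply(word));
        return bucket != null && bucket.contains(word);
    }

    // Proposals are the words sharing the hash, closest first (ties alphabetical).
    List<String> getProposals(String word) {
        List<String> result =
                new ArrayList<>(buckets.getOrDefault(hash.apply(word), Set.of()));
        result.sort(Comparator
                .comparingInt((String c) -> distance.applyAsInt(word, c))
                .thenComparing(Comparator.naturalOrder()));
        return result;
    }

    // Toy stand-in for a phonetic hash: the lower-cased letters in sorted order.
    static String toyHash(String word) {
        char[] letters = word.toLowerCase(Locale.ROOT).toCharArray();
        Arrays.sort(letters);
        return new String(letters);
    }

    // Unit-cost Levenshtein distance computed with two rolling rows.
    static int levenshtein(String a, String b) {
        int[] prev = new int[b.length() + 1];
        int[] curr = new int[b.length() + 1];
        for (int j = 0; j <= b.length(); j++) {
            prev[j] = j;
        }
        for (int i = 1; i <= a.length(); i++) {
            curr[0] = i;
            for (int j = 1; j <= b.length(); j++) {
                int cost = a.charAt(i - 1) == b.charAt(j - 1) ? 0 : 1;
                curr[j] = Math.min(Math.min(curr[j - 1] + 1, prev[j] + 1),
                        prev[j - 1] + cost);
            }
            int[] tmp = prev;
            prev = curr;
            curr = tmp;
        }
        return prev[b.length()];
    }

    public static void main(String[] args) {
        DictionarySketch dictionary =
                new DictionarySketch(DictionarySketch::toyHash, DictionarySketch::levenshtein);
        dictionary.addWord("the");
        dictionary.addWord("then");
        System.out.println(dictionary.isCorrect("teh"));    // prints false
        System.out.println(dictionary.getProposals("teh")); // prints [the]
    }
}
```

Because both algorithms are passed in as functions, either one can be swapped without touching the lookup logic, which is the kind of customization described above.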
<h3>
Spell Check Iterators</h3>
Instances of <tt>ISpellChecker</tt> are usually language-, locale- and medium-independent implementations and therefore need an input provider. The
interface <tt>ISpellCheckIterator</tt> serves this purpose by abstracting the tokenizing of text media to a simple iteration. The actual spell-check process
is launched by calling <tt>ISpellChecker#execute(ISpellCheckIterator)</tt>. This method uses the given spell-check iterator to determine the
words that are to be spell-checked. This package provides no default implementation of a spell-check iterator.
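<p>
Since no default iterator is provided, the following sketch shows one possible way to tokenize plain text into words.
<tt>WordIteratorSketch</tt> is a hypothetical class built on <tt>java.util.Iterator</tt>; it does not implement the real
<tt>ISpellCheckIterator</tt> interface, whose methods are not shown here. Treating a word as a run of Unicode letters and
reporting an exclusive end index are both assumptions of this example.

```java
import java.util.Iterator;
import java.util.NoSuchElementException;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical word iterator over plain text.
final class WordIteratorSketch implements Iterator<String> {
    // Assumption: a word is a maximal run of Unicode letters.
    private static final Pattern WORD = Pattern.compile("\\p{L}+");

    private final Matcher matcher;
    private boolean hasNext;
    private int begin = -1; // start index of the last returned word
    private int end = -1;   // end index (exclusive) of the last returned word

    WordIteratorSketch(CharSequence text) {
        matcher = WORD.matcher(text);
        hasNext = matcher.find();
    }

    @Override
    public boolean hasNext() {
        return hasNext;
    }

    @Override
    public String next() {
        if (!hasNext) {
            throw new NoSuchElementException();
        }
        // Capture the current match before looking ahead to the next one.
        String word = matcher.group();
        begin = matcher.start();
        end = matcher.end();
        hasNext = matcher.find();
        return word;
    }

    int getBegin() {
        return begin;
    }

    int getEnd() {
        return end;
    }

    public static void main(String[] args) {
        WordIteratorSketch words = new WordIteratorSketch("Teh quick, fox");
        while (words.hasNext()) {
            String word = words.next();
            System.out.println(word + " [" + words.getBegin() + ", " + words.getEnd() + ")");
        }
    }
}
```

An iterator for another medium, such as source code comments or marked-up text, would differ only in how it locates word boundaries.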
<h3>
Spell Events</h3>
To communicate the results of a spell-check pass, spell-checkers fire spell events that inform listeners about the status
of a particular word being spell-checked. Instances that are interested in receiving spell events must implement
the interface <tt>ISpellEventListener</tt> and register with the spell-checker before the spell-check process starts.<p>
A spell event contains the following information:
<ul>
<li>The word being spell-checked</li>
<li>The begin index of the word in the text medium</li>
<li>The end index of the word in the text medium</li>
<li>A flag indicating whether the word was found in one of the registered dictionaries</li>
<li>A flag indicating whether the word starts a new sentence</li>
<li>The set of proposals if the word was not correctly spelt. This information is computed lazily.</li>
</ul>
Spell event listeners are free to handle the events in any way. However, listeners must not block during
event handling unless the spell-checking process runs in another thread.
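<p>
A spell event and a non-blocking listener might look roughly like the sketch below. All names here
(<tt>SpellEventSketch</tt>, <tt>Event</tt>, <tt>Listener</tt>, <tt>CollectingListener</tt>) are hypothetical and do not
reproduce the real spell event type or <tt>ISpellEventListener</tt>. The sketch only illustrates the listed fields and one
way to defer computing the proposals until they are first requested.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Set;
import java.util.function.Supplier;

final class SpellEventSketch {

    // Hypothetical event type carrying the information listed above.
    static final class Event {
        final String word;
        final int begin;
        final int end;
        final boolean match;           // found in a registered dictionary
        final boolean startsSentence;
        private final Supplier<Set<String>> proposalSource;
        private Set<String> proposals; // filled on first access only

        Event(String word, int begin, int end, boolean match,
              boolean startsSentence, Supplier<Set<String>> proposalSource) {
            this.word = word;
            this.begin = begin;
            this.end = end;
            this.match = match;
            this.startsSentence = startsSentence;
            this.proposalSource = proposalSource;
        }

        // Lazily computes the proposals and caches the result.
        Set<String> getProposals() {
            if (proposals == null) {
                proposals = proposalSource.get();
            }
            return proposals;
        }
    }

    interface Listener {
        void handle(Event event);
    }

    // Records misspelt words and returns immediately, so it never blocks.
    static final class CollectingListener implements Listener {
        final List<Event> misspelt = new ArrayList<>();

        @Override
        public void handle(Event event) {
            if (!event.match) {
                misspelt.add(event);
            }
        }
    }

    public static void main(String[] args) {
        CollectingListener listener = new CollectingListener();
        listener.handle(new Event("Teh", 0, 3, false, true, () -> Set.of("The")));
        listener.handle(new Event("cat", 4, 7, true, false, () -> Set.of()));
        for (Event event : listener.misspelt) {
            System.out.println(event.word + " -> " + event.getProposals()); // prints Teh -> [The]
        }
    }
}
```

Collecting events during the pass and processing them afterwards is one way to keep the handler from blocking the spell-checker.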