public class Soundex extends Object implements StringEncoder
The maxLength field is not actually used.

Modifier and Type | Field and Description |
---|---|
static char | SILENT_MARKER: The marker character used to indicate a silent (ignored) character. |
static Soundex | US_ENGLISH: An instance of Soundex using the US_ENGLISH_MAPPING mapping. |
static Soundex | US_ENGLISH_GENEALOGY: An instance of Soundex using the mapping as per the Genealogy site: http://www.genealogy.com/articles/research/00000060.html |
static String | US_ENGLISH_MAPPING_STRING: This is a default mapping of the 26 letters used in US English. |
static Soundex | US_ENGLISH_SIMPLIFIED: An instance of Soundex using the Simplified Soundex mapping, as described here: http://west-penwith.org.uk/misc/soundex.htm |
Constructor and Description |
---|
Soundex(): Creates an instance using US_ENGLISH_MAPPING. |
Soundex(char[] mapping): Creates a soundex instance using the given mapping. |
Soundex(String mapping): Creates a soundex instance using a custom mapping. |
Soundex(String mapping, boolean specialCaseHW): Creates a soundex instance using a custom mapping, with explicit control over the special treatment of H and W. |
Modifier and Type | Method and Description |
---|---|
int | difference(String s1, String s2): Encodes the Strings and returns the number of characters in the two encoded Strings that are the same. |
Object | encode(Object obj): Encodes an Object using the soundex algorithm. |
String | encode(String str): Encodes a String using the soundex algorithm. |
String | soundex(String str): Retrieves the Soundex code for a given String object. |
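The difference method summarized above counts how many characters two encoded values share. The following is a minimal sketch of that counting step only, assuming the inputs are already-encoded codes and that characters are compared position by position. The class name `DifferenceSketch` and the helper `countMatching` are hypothetical and are not part of the library.

```java
final class DifferenceSketch {

    // Counts the positions at which two already-encoded strings hold the same character.
    // Assumption: comparison is positional and stops at the end of the shorter string.
    static int countMatching(String code1, String code2) {
        if (code1 == null || code2 == null) {
            return 0;
        }
        int shared = Math.min(code1.length(), code2.length());
        int same = 0;
        for (int i = 0; i < shared; i++) {
            if (code1.charAt(i) == code2.charAt(i)) {
                same++;
            }
        }
        return same;
    }

    public static void main(String[] args) {
        System.out.println(countMatching("R163", "R163")); // prints 4
        System.out.println(countMatching("R163", "R150")); // prints 2
    }
}
```

With four-character codes, a result of 4 means the codes are identical and 0 means they agree at no position.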
public static final char SILENT_MARKER
Note: the US_ENGLISH_MAPPING_STRING does not use this mechanism because changing it might break existing code. Mappings that don't contain a silent marker code are treated as though H and W are silent. To override this, use the Soundex(String, boolean) constructor.
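The rule in the note above can be sketched as a small predicate. This is an illustration only: the marker value `'-'` is an assumption (the text names SILENT_MARKER without stating its value), and the class `SilentMarkerSketch` and method `treatsHWAsSilentByDefault` are hypothetical names.

```java
final class SilentMarkerSketch {

    // Assumed marker value; check the Constant Field Values page for the real one.
    static final char SILENT_MARKER = '-';

    // A mapping without any marker falls back to treating H and W as silent;
    // a mapping that contains the marker gives H and W no special treatment.
    static boolean treatsHWAsSilentByDefault(String mapping) {
        return mapping.indexOf(SILENT_MARKER) < 0;
    }

    public static void main(String[] args) {
        // Digits only, no marker: H and W are special-cased.
        System.out.println(treatsHWAsSilentByDefault("01230120022455012623010202")); // prints true
        // Contains markers: silent letters are spelled out in the mapping itself.
        System.out.println(treatsHWAsSilentByDefault("-123-12--22455-12623-1-2-2")); // prints false
    }
}
```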
public static final Soundex US_ENGLISH
See also: US_ENGLISH_MAPPING, US_ENGLISH_MAPPING_STRING
public static final Soundex US_ENGLISH_GENEALOGY
This treats vowels (AEIOUY), H and W as silent letters. Such letters are ignored (after the first) and do not act as separators when dropping duplicate codes. The codes for consonants are otherwise the same as for US_ENGLISH_MAPPING_STRING and US_ENGLISH_SIMPLIFIED.
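To make the "silent letters do not act as separators" rule concrete, here is a self-contained sketch, not the library's implementation. It assumes `'-'` as the silent marker and builds the mapping by taking the standard US English consonant codes and marking A, E, H, I, O, U, W and Y as silent. The class name `GenealogySketch` is hypothetical, and input cleaning is simplified to ASCII letters.

```java
final class GenealogySketch {

    static final char SILENT = '-'; // assumed marker value
    // Codes for A..Z: vowels, H and W are silent; consonants keep their usual codes.
    static final String MAPPING = "-123-12--22455-12623-1-2-2";

    static String encode(String name) {
        // Keep ASCII letters only, upper-cased (a simplification for this sketch).
        StringBuilder letters = new StringBuilder();
        for (char ch : name.toCharArray()) {
            char up = Character.toUpperCase(ch);
            if (up >= 'A' && up <= 'Z') {
                letters.append(up);
            }
        }
        if (letters.length() == 0) {
            return "";
        }
        char[] out = {'0', '0', '0', '0'};
        out[0] = letters.charAt(0); // the first letter is always kept
        int count = 1;
        char last = MAPPING.charAt(out[0] - 'A');
        for (int i = 1; i < letters.length() && count < out.length; i++) {
            char code = MAPPING.charAt(letters.charAt(i) - 'A');
            if (code == SILENT) {
                continue; // silent letter: skipped without resetting 'last'
            }
            if (code != last) {
                out[count++] = code; // store only when the code changes
            }
            last = code;
        }
        return new String(out);
    }

    public static void main(String[] args) {
        System.out.println(encode("Dodds"));   // prints D200: the O does not separate the Ds
        System.out.println(encode("Schmidt")); // prints S530
    }
}
```

Because the vowel in "Dodds" is skipped without resetting the previous code, the repeated D collapses into the leading letter.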
public static final String US_ENGLISH_MAPPING_STRING
A value of 0 for a letter position means do not encode, but treat as a separator when it occurs between consonants with the same code.
(This constant is provided as both an implementation convenience and to allow Javadoc to pick up the value for the constant values page.)
Note that letters H and W are treated specially. They are ignored (after the first letter) and don't act as separators between consonants with the same code.
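A mapping string of this kind is read by position: index 0 holds the code for A, index 1 the code for B, and so on. The sketch below illustrates that lookup. The string literal is the conventional American Soundex table and is assumed to match the constant's value, so confirm it against the Constant Field Values page. `MappingLookupSketch` and `codeFor` are hypothetical names.

```java
final class MappingLookupSketch {

    // Assumed value of the constant: one code per letter, A through Z.
    static final String US_ENGLISH_MAPPING_STRING = "01230120022455012623010202";

    // Returns the code at the letter's position; '0' means "not encoded".
    static char codeFor(char letter) {
        int index = Character.toUpperCase(letter) - 'A';
        if (index < 0 || index >= US_ENGLISH_MAPPING_STRING.length()) {
            throw new IllegalArgumentException("The character is not mapped: " + letter);
        }
        return US_ENGLISH_MAPPING_STRING.charAt(index);
    }

    public static void main(String[] args) {
        for (char letter = 'A'; letter <= 'Z'; letter++) {
            System.out.println(letter + " -> " + codeFor(letter));
        }
    }
}
```

In this table the vowels, H, W and Y all map to 0, which is why H and W need separate handling to be ignored rather than treated as separators.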
See also: US_ENGLISH_MAPPING, Constant Field Values
public static final Soundex US_ENGLISH_SIMPLIFIED
This treats H and W the same as vowels (AEIOUY). Such letters aren't encoded (after the first), but they do act as separators when dropping duplicate codes. The mapping is otherwise the same as for US_ENGLISH.
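The practical effect of treating H and W as separators shows up only when one of them sits between two consonants with the same code. The sketch below is a simplified stand-in for the encoder, not the library's code. It assumes the conventional US English mapping table and uses a boolean flag to switch between the two behaviours. `HwSeparatorSketch` is a hypothetical name.

```java
final class HwSeparatorSketch {

    // Assumed standard mapping for A..Z; '0' marks letters that are not encoded.
    static final String MAPPING = "01230120022455012623010202";

    // specialCaseHW = true: H and W are ignored and do not separate duplicates.
    // specialCaseHW = false: H and W behave like vowels and do separate duplicates.
    static String encode(String name, boolean specialCaseHW) {
        String letters = name.toUpperCase(java.util.Locale.ROOT).replaceAll("[^A-Z]", "");
        if (letters.isEmpty()) {
            return "";
        }
        char[] out = {'0', '0', '0', '0'};
        out[0] = letters.charAt(0); // the first letter is always kept
        int count = 1;
        char last = MAPPING.charAt(out[0] - 'A');
        for (int i = 1; i < letters.length() && count < out.length; i++) {
            char ch = letters.charAt(i);
            if (specialCaseHW && (ch == 'H' || ch == 'W')) {
                continue; // ignored entirely: 'last' is left untouched
            }
            char code = MAPPING.charAt(ch - 'A');
            if (code != '0' && code != last) {
                out[count++] = code; // skip unencoded letters and repeats
            }
            last = code; // a '0' here resets the duplicate check
        }
        return new String(out);
    }

    public static void main(String[] args) {
        System.out.println(encode("Ashcraft", true));  // prints A261: S and C merge across the H
        System.out.println(encode("Ashcraft", false)); // prints A226: the H separates S and C
    }
}
```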
public Soundex()
See also: Soundex(char[]), US_ENGLISH_MAPPING
public Soundex(char[] mapping)
If the mapping contains an instance of SILENT_MARKER, then H and W are not given special treatment.
Parameters:
mapping - Mapping array to use when finding the corresponding code for a given character
public Soundex(String mapping)
If the mapping contains an instance of SILENT_MARKER, then H and W are not given special treatment.
Parameters:
mapping - Mapping string to use when finding the corresponding code for a given character
public Soundex(String mapping, boolean specialCaseHW)
Parameters:
mapping - Mapping string to use when finding the corresponding code for a given character
specialCaseHW - if true, then
public int difference(String s1, String s2) throws EncoderException
Parameters:
s1 - A String that will be encoded and compared.
s2 - A String that will be encoded and compared.
Throws:
EncoderException - if an error occurs encoding one of the strings
See also: SoundexUtils.difference(StringEncoder,String,String), MS T-SQL DIFFERENCE
public Object encode(Object obj) throws EncoderException
Specified by:
encode in interface Encoder
Parameters:
obj - Object to encode
Throws:
EncoderException - if the parameter supplied is not of type java.lang.String
IllegalArgumentException - if a character is not mapped
public String encode(String str)
Specified by:
encode in interface StringEncoder
Parameters:
str - A String object to encode
Throws:
IllegalArgumentException - if a character is not mapped
public String soundex(String str)
Parameters:
str - String to encode using the Soundex algorithm
Throws:
IllegalArgumentException - if a character is not mapped