Understanding the SOUNDEX code
Introduction
The Soundex is an indexing system begun during Franklin Roosevelt's presidency.
His administration wanted to put many people to work in government
programs. One of these programs was the Works Progress Administration (WPA),
which employed people to do many things for the government, including
organizing the Soundex.
These workers took the census records, one county at a time, and wrote
out a filing card for each household, naming all people listed in
the census. Each card gives the name, age, location, birthday
and relationship to the head of household for every person in the household.
Not all of the information in the original census is given, but there is
enough to identify the members of each household. Every surname was given a
code (see Soundex code). These codes assign a numerical value to the
consonants of the surname.
The workers began with the 1900 census and listed each household
in the entire United States on a separate file card. This system enables
us to look at one surname in a whole state and find the exact locality
of all persons in the state with that surname. Then we can go to the
census records for that county, find the page listed in the Soundex
and view all of the information collected by the census taker. It is no
longer necessary to spend hours and hours looking at every entry,
county by county, to find the right one.
Census records from 1880, 1900 and 1910 have been soundexed as of 1995. Other
records will be processed in the future.
Example
Soundex algorithms are methods of encoding the letters of a name as
an alphanumeric code in order to store and retrieve similar-sounding
names. For example, the names Castellano, Kastelyano, Chasteliano and Qastelano
might conceivably all have been spelled in the Castilian way
at one time, but now have a variety of spellings because of the vagaries
of different alphabets in various countries. Typically,
the Russell Soundex code strips all vowels (except when a vowel is the first
letter of the name) and strips duplicate consonants: Castellano becomes CSTLN.
Next, similar phonemes are encoded with the same digit. For example,
M and N, both nasal letters, share a digit. The Russell Soundex code for
CSTLN is C234: the first letter is kept, and only the first three digits
are used.
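The stripping step described above can be sketched in Python. This is only a minimal illustration, not the Russell procedure itself: the function name `skeleton` is hypothetical, and the sketch assumes that W, Y and H are dropped together with the vowels.

```python
# Assumption: W, Y and H are dropped along with the vowels.
IGNORED = set("AEIOUWYH")

def skeleton(name: str) -> str:
    """Keep the first letter, collapse doubled letters, drop ignored letters."""
    letters = [ch for ch in name.upper() if ch.isalpha()]
    if not letters:
        return ""
    kept = [letters[0]]          # the first letter is always kept
    prev = letters[0]
    for ch in letters[1:]:
        # Skip a letter that repeats its neighbour, and skip ignored letters.
        if ch != prev and ch not in IGNORED:
            kept.append(ch)
        prev = ch
    return "".join(kept)

print(skeleton("Castellano"))  # CSTLN
```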
Any double letters side by side should be treated as one letter. For example,
HASSON is coded as if it were spelled HASON (H250). ALHADEFF is
coded as if it were ALHADEF (A431).
Variations in spellings or misspellings should produce the same code number:
BENSON = B525, BENSION = B525, BENZION = B525, BENCION = B525.
1 = B P F V
2 = C S K G J Q X Z
3 = D T
4 = L
5 = M N
6 = R
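As a rough illustration, the key above can be turned into a small Python encoder. The names `GROUPS`, `CODE` and `soundex` are made up for this sketch. It assumes a code of one letter plus three digits, padded with zeros, as the examples H250 and A431 suggest. It merges only letters that stand directly side by side; some published versions of the rules treat letters separated by H or W differently.

```python
# Letter groups copied from the key above; every other letter has no code.
GROUPS = {"1": "BPFV", "2": "CSKGJQXZ", "3": "DT", "4": "L", "5": "MN", "6": "R"}
CODE = {letter: digit for digit, letters in GROUPS.items() for letter in letters}

def soundex(name: str) -> str:
    """Return the first letter followed by three digits, e.g. 'C234'."""
    letters = [ch for ch in name.upper() if ch.isalpha()]
    if not letters:
        return ""
    result = letters[0]
    prev_code = CODE.get(letters[0], "")
    for ch in letters[1:]:
        code = CODE.get(ch, "")
        # Skip uncoded letters, and any letter whose code repeats that of the
        # letter immediately before it (doubled letters are one case of this).
        if code and code != prev_code:
            result += code
        prev_code = code
    # Assumption: pad with zeros and cut to one letter plus three digits.
    return (result + "000")[:4]

for name in ("Castellano", "Hasson", "Alhadeff", "Benson", "Benzion"):
    print(name, soundex(name))
# Castellano C234
# Hasson H250
# Alhadeff A431
# Benson B525
# Benzion B525
```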
Additional Rules for Understanding the Conversion
- Disregard the letters A E I O U W Y H (unless one of them is the
  first letter of the name).
- If your name has a prefix like Van, Von, De, Di, Le or La, code
  it both with and without the prefix. It might be listed
  under either code. (Mc and Mac are not considered prefixes.)
- If your surname has double letters, they should be treated as
  one letter. Example: Hasson. The second S should be
  slashed out. In the name Alhadeff, the second F should be slashed
  out.
- Your surname may have different letters side by side which have
  the same coding number. Example: Hamner. (5 is the
  number for both M and N.) These letters should be treated as one
  letter, and the N should be slashed out. Another example: Jackson.
  (2 is the number for C, K and S.) The K and S should be slashed out.
  This rule also applies when the letters are at the beginning of the
  surname. Example: Pfister. Both P and F are in the #1
  category, therefore the letter F should be slashed out.
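The prefix rule can be sketched as a helper that lists the spellings worth looking up. The function name `prefix_variants` is hypothetical, and the sketch assumes the prefix is written as a separate word (for example "Van Buren"); a prefix fused into the surname is not detected. Each spelling it returns would then be coded separately.

```python
# Prefixes named in the rules above; Mc and Mac are deliberately left out.
PREFIXES = ("VAN", "VON", "DE", "DI", "LE", "LA")

def prefix_variants(surname: str) -> list:
    """Return the surname with its prefix and, if it has one, without it."""
    parts = surname.upper().split()
    if not parts:
        return []
    variants = ["".join(parts)]               # spelling with the prefix
    if len(parts) > 1 and parts[0] in PREFIXES:
        variants.append("".join(parts[1:]))   # spelling without the prefix
    return variants

print(prefix_variants("Van Buren"))   # ['VANBUREN', 'BUREN']
print(prefix_variants("Mac Donald"))  # ['MACDONALD']
```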