Smalltalk/X Webserver
Documentation of class 'PhoneticStringUtilities::Caverphone2StringComparator':
Class: Caverphone2StringComparator (private in PhoneticStringUtilities)
This class is only visible from within PhoneticStringUtilities.
Inheritance:
Object
|
+--PhoneticStringUtilities::PhoneticStringComparator
   |
   +--PhoneticStringUtilities::SingleResultPhoneticStringComparator
      |
      +--PhoneticStringUtilities::Caverphone2StringComparator
Description:
Caverphone (2) Algorithm:
see http://caversham.otago.ac.nz/files/working/ctp150804.pdf
Caverphone 2.0 is being made available for free use for the benefit of anyone who has a use for it,
with the proviso that the Caversham Project at the University of Otago should be acknowledged as the
original source (which is hereby done ;-).
• Start with a surname or first name
• Convert to lowercase
This coding system is case sensitive; implementations should treat a and A as different characters.
• Remove anything that is not a letter a-z
The main intention of this is to remove spaces, hyphens, and apostrophes.
example: o'brian becomes obrian
• If the name starts with cough make it cou2f
2 is being used as a temporary placeholder to indicate a consonant which we are no longer interested in.
• If the name starts with rough make it rou2f
• If the name starts with tough make it tou2f
• If the name starts with enough make it enou2f
• If the name starts with gn make it 2n
• If the name ends with mb make it m2
• replace cq with 2q
• replace ci with si
• replace ce with se
• replace cy with sy
• replace tch with 2ch
• replace c with k
• replace q with k
• replace x with k
• replace v with f
• replace dg with 2g
• replace tio with sio
• replace tia with sia
• replace d with t
• replace ph with fh
• replace b with p
• replace sh with s2
• replace z with s
• replace an initial vowel with an A
• replace all other vowels with a 3
3 is a temporary placeholder marking a vowel
• replace 3gh3 with 3kh3
Exceptions are dealt with before the general case. gh between vowels is an exception to the more general gh rule.
• replace gh with 22
• replace g with k
• replace groups of the letter s with a S
Consecutive runs of s are replaced by a single S
• replace groups of the letter t with a T
• replace groups of the letter p with a P
• replace groups of the letter k with a K
• replace groups of the letter f with a F
• replace groups of the letter m with a M
• replace groups of the letter n with a N
• replace w3 with W3
• replace wy with Wy
• replace wh3 with Wh3
• replace why with Why
• replace w with 2
• replace an initial h with an A
• replace all other occurrences of h with a 2
• replace r3 with R3
• replace ry with Ry
• replace r with 2
• replace l3 with L3
• replace ly with Ly
• replace l with 2
• replace j with y
• replace y3 with Y3
• replace y with 2
• remove all 2s
• remove all 3s
• put six (Caverphone 1) or ten (Caverphone 2) 1s on the end
• take the first six (Caverphone 1) or the first ten (Caverphone 2) characters as the code
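The steps above can be sketched in Python. This is an illustration only, not the Smalltalk source of this class; the name encode_listed_steps and its length parameter are hypothetical. The sketch applies only the steps as listed, in the listed order. It therefore reproduces codes such as 'david' -> 'TFT1111111', but not those ending in A: for 'Peter' it yields 'PT11111111' rather than the documented 'PTA1111111', which suggests the class applies rules beyond this list.

```python
import re

# Prefix-only rewrites ("if the name starts with ...").
PREFIXES = [("cough", "cou2f"), ("rough", "rou2f"), ("tough", "tou2f"),
            ("enough", "enou2f"), ("gn", "2n")]

# Plain substring rewrites applied before vowels are marked.
EARLY = [("cq", "2q"), ("ci", "si"), ("ce", "se"), ("cy", "sy"),
         ("tch", "2ch"), ("c", "k"), ("q", "k"), ("x", "k"), ("v", "f"),
         ("dg", "2g"), ("tio", "sio"), ("tia", "sia"), ("d", "t"),
         ("ph", "fh"), ("b", "p"), ("sh", "s2"), ("z", "s")]

# Plain substring rewrites applied after consonant groups are collapsed.
LATE_W = [("w3", "W3"), ("wy", "Wy"), ("wh3", "Wh3"), ("why", "Why"),
          ("w", "2")]
LATE_RLJY = [("r3", "R3"), ("ry", "Ry"), ("r", "2"),
             ("l3", "L3"), ("ly", "Ly"), ("l", "2"),
             ("j", "y"), ("y3", "Y3"), ("y", "2")]


def replace_all(text, pairs):
    """Apply each (old, new) substring replacement in order."""
    for old, new in pairs:
        text = text.replace(old, new)
    return text


def encode_listed_steps(name, length=10):
    """Hypothetical helper: length=10 for Caverphone 2, length=6 for v1."""
    s = re.sub(r"[^a-z]", "", name.lower())   # lowercase, keep a-z only
    for old, new in PREFIXES:
        if s.startswith(old):
            s = new + s[len(old):]
    if s.endswith("mb"):
        s = s[:-2] + "m2"
    s = replace_all(s, EARLY)
    s = re.sub(r"^[aeiou]", "A", s)           # initial vowel
    s = re.sub(r"[aeiou]", "3", s)            # all other vowels
    s = replace_all(s, [("3gh3", "3kh3"), ("gh", "22"), ("g", "k")])
    for ch in "stpkfmn":                      # collapse runs: ss -> S, tt -> T
        s = re.sub(ch + "+", ch.upper(), s)
    s = replace_all(s, LATE_W)
    s = re.sub(r"^h", "A", s)                 # initial h
    s = s.replace("h", "2")
    s = replace_all(s, LATE_RLJY)
    s = s.replace("2", "").replace("3", "")   # drop the placeholders
    return (s + "1" * length)[:length]        # pad with 1s, then truncate


print(encode_listed_steps("david"))       # TFT1111111
print(encode_listed_steps("washington"))  # WSNKTN1111
print(encode_listed_steps("Peter"))       # PT11111111 (documented: PTA1111111)
```

Keeping the rules in ordered tables mirrors the list: the order matters, because exceptions such as tch or 3gh3 must be rewritten before the general c, gh and g rules.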
self new encode:'david' -> 'TFT1111111'
self new encode:'whittle' -> 'WTA1111111'
self new encode:'Stevenson' -> 'STFNSN1111'
self new encode:'Peter' -> 'PTA1111111'
self new encode:'washington' -> 'WSNKTN1111'
self new encode:'lee' -> 'LA11111111'
self new encode:'Gutierrez' -> 'KTRS111111'
self new encode:'Pfister' -> 'PFSTA11111'
self new encode:'Jackson' -> 'YKSN111111'
self new encode:'Tymczak' -> 'TMKSK11111'
self new encode:'add' -> 'AT11111111'
self new encode:'aid' -> 'AT11111111'
self new encode:'at' -> 'AT11111111'
self new encode:'art' -> 'AT11111111'
self new encode:'earth' -> 'AT11111111'
self new encode:'head' -> 'AT11111111'
self new encode:'old' -> 'AT11111111'
self new encode:'ready' -> 'RTA1111111'
self new encode:'rather' -> 'RTA1111111'
self new encode:'able' -> 'APA1111111'
self new encode:'appear' -> 'APA1111111'
self new encode:'Deedee' -> 'TTA1111111'
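Several of the examples above end in A ('Peter', 'lee', 'able'), which the listed steps alone do not produce. The Python sketch below reproduces all of the documented codes. It is an illustration, not the Smalltalk source of this class, and the name caverphone2_sketch is hypothetical. The rules marked "assumed" are not in the step list; they are taken from the published Caverphone 2.0 description as commonly implemented, and the class may implement them differently. This variant also treats y as a vowel early on, instead of using the wy, why, ry, ly and y rules from the list.

```python
import re

# Ordered (regex, replacement) rules. "assumed" marks rules that are NOT in
# the step list of this page; they are an assumption about Caverphone 2.0.
RULES = [
    (r"[^a-z]", ""),
    (r"e$", ""),                                   # assumed: drop a final e
    (r"^cough", "cou2f"), (r"^rough", "rou2f"), (r"^tough", "tou2f"),
    (r"^enough", "enou2f"), (r"^gn", "2n"), (r"mb$", "m2"),
    ("cq", "2q"), ("ci", "si"), ("ce", "se"), ("cy", "sy"),
    ("tch", "2ch"), ("c", "k"), ("q", "k"), ("x", "k"), ("v", "f"),
    ("dg", "2g"), ("tio", "sio"), ("tia", "sia"), ("d", "t"),
    ("ph", "fh"), ("b", "p"), ("sh", "s2"), ("z", "s"),
    (r"^[aeiou]", "A"), (r"[aeiou]", "3"),
    ("j", "y"),
    (r"^y3", "Y3"), (r"^y", "A"), ("y", "3"),      # assumed: y acts as a vowel
    ("3gh3", "3kh3"), ("gh", "22"), ("g", "k"),
    ("s+", "S"), ("t+", "T"), ("p+", "P"), ("k+", "K"),
    ("f+", "F"), ("m+", "M"), ("n+", "N"),
    ("w3", "W3"), ("wh3", "Wh3"),
    (r"w$", "3"),                                  # assumed: final w is vocalic
    ("w", "2"),
    (r"^h", "A"), ("h", "2"),
    ("r3", "R3"),
    (r"r$", "3"),                                  # assumed: final r is vocalic
    ("r", "2"),
    ("l3", "L3"),
    (r"l$", "3"),                                  # assumed: final l is vocalic
    ("l", "2"),
    ("2", ""),
    (r"3$", "A"),                                  # assumed: final vowel -> A
    ("3", ""),
]


def caverphone2_sketch(name):
    """Hypothetical helper returning a ten-character code."""
    s = name.lower()
    for pattern, replacement in RULES:
        s = re.sub(pattern, replacement, s)
    return (s + "1" * 10)[:10]


print(caverphone2_sketch("Peter"))    # PTA1111111
print(caverphone2_sketch("whittle"))  # WTA1111111
print(caverphone2_sketch("ready"))    # RTA1111111
```

Names that yield the same code, such as 'add', 'aid', 'at', 'art', 'earth', 'head' and 'old' (all 'AT11111111'), are the ones a phonetic comparison would treat as matching.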
Instance protocol:
api
ST/X 7.1.0.0; WebServer 1.663 at exept.de:8081; Wed, 17 Dec 2025 08:33:35 GMT