12Feb/113

## Damerau – Levenshtein String Comparison

Fuzzy logic is something which I dabbled into during my stint at Computime, for reasons of which you'll pardon me for not publicizing and it has been a long time since I've had the pleasure of working with Levenshtein Distance algorithms for fuzzy string comparison. What is Levenshtein Distance? The long and short of it is that it is a methodology for measuring the number of differences between two strings. This counts for deletions, insertions and substitutions. An adaptation by Damerau takes into consideration transpositions (writing 'teh' instead of 'the') as a single difference, rather than two as the original Levenshtein algorithm specifies. My adaptation of the DamerauLevenshtein algorithm:
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 |
public static class FuzzyLogic { public static int DamerauLevenshteinDistance(string str1, string str2) { if (str1 == str2) return 0; if (String.IsNullOrEmpty(str1) || String.IsNullOrEmpty(str2)) return (str1 ?? String.Empty).Length + (str2 ?? String.Empty).Length; if (str1.Length > str2.Length) { var tmp = str1; str1 = str2; str2 = tmp; } if (str2.Contains(str1)) return str2.Length - str1.Length; int m = str1.Length; int n = str2.Length; int[,] H = new int[m + 2, n + 2]; int INF = m + n; H[0, 0] = INF; for (int i = 0; i <= m; i++) { H[i + 1, 1] = i; H[i + 1, 0] = INF; } for (int j = 0; j <= n; j++) { H[1, j + 1] = j; H[0, j + 1] = INF; } SortedDictionary<char, int> sd = new SortedDictionary<char, int>(); foreach (char l in (str1 + str2)) if (!sd.ContainsKey(l)) sd.Add(l, 0); for (int i = 1; i <= m; i++) { int db = 0; for (int j = 1; j <= n; j++) { int i1 = sd[str2[j - 1]]; int j1 = db; if (str1[i - 1] == str2[j - 1]) { H[i + 1, j + 1] = H[i, j]; db = j; } else H[i + 1, j + 1] = Math.Min(H[i, j], Math.Min(H[i + 1, j], H[i, j + 1])) + 1; H[i + 1, j + 1] = Math.Min(H[i + 1, j + 1], H[i1, j1] + (i - i1 - 1) + 1 + (j - j1 - 1)); } sd[str1[i - 1]] = i; } return H[m + 1, n + 1]; } } } |

1 2 3 4 5 6 7 8 |
public static bool Like(this string arg, string comparison) { int distance = FuzzyLogic.DamerauLevenshtein(arg, comparison); int length = arg.Length > comparison.Length ? arg.Length : comparison.Length; double score = 1.0 - (double)distance / length; return score >= 0.6; } |

1 |
return score <= 0.4 |

- Behaviour
- Projection
- Transposition
- Rebellious
- Precipitation
- Precipice
- Kleptomaniac
- Serotonin
- Bicycle
- Queasy
- Flabbergasted
- Destroyers