Class: Babushka::Spell
Class Method Summary
.for(string, choices:) ⇒ Object
Return a new array containing the terms from choices that were determined to be 'similar to'
string, ordered from most to least similar.
Class Method Details
.for(string, choices:) ⇒ Object
Return a new array containing the terms from choices that were
determined to be 'similar to' string, ordered from most to least
similar. A term is considered similar to string if the Levenshtein
distance between them is at most the term's length minus two, or
one fifth of its length (rounded down) plus two, whichever is less.
word length    1  2  3  4  5  6  7  8  9  10  11  12  13  14  15
max distance  -1  0  1  2  3  3  3  3  3   4   4   4   4   4   5
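This means that:
- terms longer than 4 characters can have about one fifth of their characters, plus two, misspelt;
- terms 3 or 4 characters long can have 1 or 2 misspelt characters respectively;
- terms 2 characters long must be spelt correctly, and 1-character terms never match, because their threshold works out to -1 and a distance cannot be negative.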
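The threshold arithmetic can be checked in isolation. The following is a minimal sketch: `max_distance` is a hypothetical helper name, not part of Babushka, and it only mirrors the expression used in the method's `select` block.

```ruby
# Hypothetical helper (not part of Babushka): the largest Levenshtein
# distance at which a term of the given length still counts as similar.
def max_distance(length)
  # Integer division rounds down, so 9 / 5 == 1 and 10 / 5 == 2.
  [length - 2, (length / 5) + 2].min
end

# Print the allowance for a few sample terms of different lengths.
%w[rb git rake bundler postgresql].each do |term|
  puts "#{term} (#{term.length} chars): up to #{max_distance(term.length)} edits"
end
```

For short terms the `length - 2` branch wins; from 5 characters upward the `length / 5 + 2` branch takes over, which is why the allowance grows so slowly for long terms.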
# File 'lib/babushka/spell.rb', line 17

def self.for(string, choices:)
  choices.map {|term|
    [term, Babushka::Levenshtein.distance(term, string)]
  }.select {|(i, similarity)|
    similarity <= [i.length - 2, (i.length / 5) + 2].min
  }.sort_by {|(_, similarity)|
    similarity
  }.map {|(i, _)|
    i
  }
end
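To experiment with this behaviour outside Babushka, the pipeline can be reproduced in a standalone script. This is a sketch under assumptions: `levenshtein` is a stand-in written for illustration (the implementation of `Babushka::Levenshtein.distance` is not shown here), and `similar_terms` is a hypothetical name for a copy of the method body above.

```ruby
# Stand-in for Babushka::Levenshtein.distance (an assumption: the real
# implementation is not shown here). Row-by-row dynamic-programming
# edit distance counting insertions, deletions and substitutions.
def levenshtein(a, b)
  # previous[j] is the distance between the prefix of a processed so far
  # and the first j characters of b.
  previous = (0..b.length).to_a
  a.each_char.with_index(1) do |char_a, i|
    current = [i]
    b.each_char.with_index(1) do |char_b, j|
      cost = char_a == char_b ? 0 : 1
      current << [previous[j] + 1, current[j - 1] + 1, previous[j - 1] + cost].min
    end
    previous = current
  end
  previous.last
end

# Hypothetical copy of the Spell.for pipeline, using the stand-in above.
def similar_terms(string, choices:)
  choices
    .map { |term| [term, levenshtein(term, string)] }
    .select { |term, distance| distance <= [term.length - 2, (term.length / 5) + 2].min }
    .sort_by { |_, distance| distance }
    .map { |term, _| term }
end

p similar_terms("instal", choices: %w[install uninstall list help])
# prints ["install", "uninstall"]
```

Here "install" is 1 edit away and "uninstall" is 3 edits away, both within their thresholds of 3, while "list" (4 edits, threshold 2) and "help" are rejected. Note that the threshold depends on the length of each candidate term, not on the length of the string being looked up.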