Lazy Levenshtein: Using Abbreviations and Spellchecked Inputs in Ruby
I have been spending a lot of time writing Ruby programs that take in data through the terminal. One problem is that a misspelled input can crash the program, and I want data entry to be as quick as possible.
One of my programs asks which server environment I would like to use before I start messing with any data (development, integration, staging, production). It would be great if all of the following abbreviations or misspellings would choose the development environment, and keep the program rolling:
- dev
- development
- devel
- deevleopmnt
You get the idea: abbreviations and spellchecking against known inputs. To accomplish this I leverage the Levenshtein distance algorithm, the most common form of “edit distance”. The algorithm compares two strings and returns an integer equal to the minimum number of single-character edits (insertions, deletions, or substitutions) needed to transform the first string into the second.
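To make the scoring concrete, here is a minimal pure-Ruby sketch of Levenshtein distance. It is only an illustration: the method name `edit_distance` is hypothetical, and it is a textbook dynamic-programming version rather than the Amatch implementation.

```ruby
# Hypothetical helper for illustration: classic dynamic-programming
# Levenshtein distance that keeps only two rows of the table in memory.
def edit_distance(a, b)
  return b.length if a.empty?
  return a.length if b.empty?

  # previous[j] = distance between the first (i - 1) chars of a
  # and the first j chars of b
  previous = (0..b.length).to_a

  a.each_char.with_index(1) do |char_a, i|
    current = [i] # turning i chars into an empty string takes i deletions
    b.each_char.with_index(1) do |char_b, j|
      cost = char_a == char_b ? 0 : 1
      current << [
        previous[j] + 1,        # deletion
        current[j - 1] + 1,     # insertion
        previous[j - 1] + cost  # substitution (or a free match)
      ].min
    end
    previous = current
  end

  previous.last
end

puts edit_distance("kitten", "sitting")  # prints 3
puts edit_distance("dev", "development") # prints 8
```

Note that a plain distance treats "dev" as eight edits away from "development", which is why abbreviations need some special handling on top of the raw score.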
Here is the GitHub Gist for Lazy Levenshtein, with the sample code below so we can dig through it.
The three parameters are the input itself, an array of possible matches, and a boolean that tells the method whether or not you want to match abbreviations. The method sets up a Levenshtein comparison for each potential match (using the Ruby Amatch library), and scores the comparison. We are playing golf here, because the lowest score wins the game!
The method also reverses the array in the main loop, which gives priority to the first items in the array if there happens to be a tie between matches. Unlike a typical spellchecker, this method never returns “not found”: it always returns a match. The one exception is an empty “matches” array, in which case it simply returns the provided input.
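The behaviour described above can be sketched roughly as follows. This is not the code from the gist: `lazy_match` and `edit_distance` are hypothetical names, the distance is computed in plain Ruby instead of through Amatch so the example runs without any gems, ties are broken by array index rather than by reversing the array, and the abbreviation rule (compare the input only against an equally long prefix of each candidate) is my assumption about one way it could work.

```ruby
# Compact Levenshtein distance, included so this sketch runs on its own.
def edit_distance(a, b)
  row = (0..b.length).to_a
  a.each_char.with_index(1) do |char_a, i|
    diagonal = row[0] # distance for (i - 1, j - 1)
    row[0] = i
    b.each_char.with_index(1) do |char_b, j|
      above = row[j]  # distance for (i - 1, j)
      cost = char_a == char_b ? 0 : 1
      row[j] = [above + 1, row[j - 1] + 1, diagonal + cost].min
      diagonal = above
    end
  end
  row.last
end

# Hypothetical stand-in for the method described above. The name and the
# abbreviation rule are assumptions, not the original gist.
def lazy_match(input, candidates, allow_abbreviations = true)
  return input if candidates.empty? # nothing to match against

  typed = input.to_s.strip.downcase

  best, _index = candidates.each_with_index.min_by do |candidate, index|
    target = candidate.to_s.downcase
    # Assumed abbreviation rule: score the input against a prefix of the
    # candidate that is as long as the input itself.
    target = target[0, typed.length] if allow_abbreviations
    [edit_distance(typed, target), index] # lowest score wins; index breaks ties
  end

  best
end

environments = %w[development integration staging production]

puts lazy_match("dev", environments)         # prints development
puts lazy_match("deevleopmnt", environments) # prints development
puts lazy_match("dev", environments, false)  # prints staging
```

The last call shows why the abbreviation flag matters in this sketch: without prefix scoring, "dev" is seven edits from "staging" but eight from "development", so the shorter word wins.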
This has helped me make data entry much faster with smarter defaults, and given me the peace of mind that my misspellings will always turn into known, safe values.