The importance of math for software engineering
The other day I had the chance to attend a presentation where Peter Norvig showed everyone just how important math is for someone who wants to be a really great software engineer.
In his presentation he mentioned an example of a Word Segmentation Program. The main function of this program is to figure out “where the word boundaries are”. This kind of program is useful for spelling correction, and is most notably used in Google Search Suggestions.
Ex.: If you input “thisisatest”, the expected result of the program is “this is a test”.
Norvig showed everyone how one programmer had tackled the problem with a solution made up of over 2,000 lines of code. It included all the rules of grammar in the English language. Indeed, this solution was an extremely accurate Word Segmenter.
However, Norvig explained to everyone how he tackled the Word Segmenter problem in a much easier way, using math. To summarize: he defined a probabilistic model that considers the possible segmentations of a given text and estimates how likely each one is, based on how often its words appear in a trillion-word data set. The program enumerates the candidate segmentations and chooses the one with the highest probability.
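To make the idea concrete, here is a minimal sketch in Python. It is not Norvig's actual program: the tiny COUNTS table is a made-up stand-in for the trillion-word data set, and the penalty for unknown words is an assumption chosen only so that the toy example behaves sensibly. It tries every split of the text into a first word and a remainder, scores each candidate by the product of its word probabilities (summed as logs), and keeps the best one.

```python
from functools import lru_cache
from math import log10

# Hypothetical toy counts standing in for a real word-frequency data set.
COUNTS = {"this": 50, "is": 80, "a": 100, "test": 20,
          "at": 30, "his": 25, "i": 40, "sat": 5}
TOTAL = sum(COUNTS.values())
MAX_WORD_LEN = 20  # assumed upper bound on the length of a single word


def word_logprob(word):
    """Log10 probability of one word under a simple unigram model."""
    if word in COUNTS:
        return log10(COUNTS[word] / TOTAL)
    # Assumed penalty for unknown words: the longer, the less likely.
    return log10(10.0 / (TOTAL * 10 ** len(word)))


@lru_cache(maxsize=None)
def segment(text):
    """Return (log probability, words) for the most probable segmentation."""
    if not text:
        return 0.0, ()
    candidates = []
    for i in range(1, min(len(text), MAX_WORD_LEN) + 1):
        first, rest = text[:i], text[i:]
        rest_logprob, rest_words = segment(rest)  # memoized by lru_cache
        candidates.append((word_logprob(first) + rest_logprob,
                           (first,) + rest_words))
    return max(candidates)  # highest log probability wins


print(" ".join(segment("thisisatest")[1]))
```

With these toy counts the sketch prints “this is a test”. Swapping COUNTS for word counts from another language is the only change needed to segment text in that language, which is the point of the approach.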
Norvig’s program was written in just a little over 30 lines of code and achieved accuracy similar to that of the 2,000-line solution. And best of all, if you wanted to make it work in another language, you would just get a word data set in that language and you are good to go.
How about that? Now ask the first programmer to make his program work in German, Spanish, … [insert_language].
Get the source code and learn more about Peter Norvig’s work on Natural Language. You can also read his chapter in the book Beautiful Data.

