Big Data and making the Tehran Salami
When it comes to explaining the algorithms and implications of data, Google’s Peter Norvig is one of the most enlightening and entertaining speakers around. His 2011 lecture at the University of British Columbia, “The Unreasonable Effectiveness of Data”, is a must-watch no matter your technical level.
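The phrase “Tehran Salami” comes from a point in Norvig’s lecture where he uses the name of his colleague Mehran Sahami to show the limits of dictionary-based spell checkers: a checker that knows only dictionary words does not recognize the name and offers “Tehran Salami” in its place.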
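To make the failure concrete, here is a minimal Python sketch of a dictionary-only corrector. It is not the code from the lecture: the tiny `DICTIONARY`, the `edits1` helper, and the `correct` function are all assumptions made up for illustration. The idea is simply that a word missing from the dictionary gets swapped for any known word that is one edit away.

```python
import string

# Toy dictionary, invented for illustration; it contains no proper names.
DICTIONARY = {"tehran", "salami", "search", "engine", "data"}

def edits1(word):
    """All strings one delete, transpose, replace, or insert away from word."""
    letters = string.ascii_lowercase
    splits = [(word[:i], word[i:]) for i in range(len(word) + 1)]
    deletes = [a + b[1:] for a, b in splits if b]
    transposes = [a + b[1] + b[0] + b[2:] for a, b in splits if len(b) > 1]
    replaces = [a + c + b[1:] for a, b in splits if b for c in letters]
    inserts = [a + c + b for a, b in splits for c in letters]
    return set(deletes + transposes + replaces + inserts)

def correct(word):
    """Return word if known, else a dictionary word one edit away, else word."""
    word = word.lower()
    if word in DICTIONARY:
        return word
    candidates = sorted(edits1(word) & DICTIONARY)
    return candidates[0] if candidates else word

print(" ".join(correct(w) for w in "Mehran Sahami".split()))  # tehran salami
```

Both names are a single letter substitution away from a dictionary word, so the corrector "fixes" them with full confidence. Nothing in a plain word list tells it that the original spelling was a real name.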
The example is also a fun way to explain n-gram sequences in the context of text segmentation, and to show the power of simple counting.
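As a rough illustration of segmentation by counting, the Python sketch below splits a run of letters into the most probable sequence of words using nothing but unigram (1-gram) counts. The counts in `COUNTS` are invented for the example, and the length-based penalty for unseen words is an assumed smoothing choice, not something taken from the lecture; a real model would be built from counts over a very large corpus.

```python
from functools import lru_cache
from math import log10

# Toy unigram counts, invented for illustration only.
COUNTS = {
    "choose": 50, "spain": 30, "chooses": 5, "pain": 40, "a": 500,
    "in": 400, "the": 900, "he": 300, "his": 200, "is": 450,
}
TOTAL = sum(COUNTS.values())

def log_prob(word):
    """Log10 probability of a word under the toy unigram model."""
    if word in COUNTS:
        return log10(COUNTS[word] / TOTAL)
    # Assumed smoothing: unseen words get a small probability that
    # shrinks by a factor of 10 for every character.
    return log10(10.0 / (TOTAL * 10 ** len(word)))

@lru_cache(maxsize=None)
def segment(text):
    """Return (score, words) for the most probable split of text."""
    if not text:
        return 0.0, ()
    candidates = []
    for i in range(1, len(text) + 1):
        first, rest = text[:i], text[i:]
        rest_score, rest_words = segment(rest)
        candidates.append((log_prob(first) + rest_score, (first,) + rest_words))
    return max(candidates)

score, words = segment("choosespain")
print(" ".join(words))  # choose spain
```

With these counts, "choose spain" beats "chooses pain" only because "choose" was counted more often than "chooses"; change the counts and the answer changes with them. The memoized recursion is fine for short strings, but a long text would need an iterative version to stay within Python's recursion limit.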
Check out Norvig’s homepage for a year’s worth of readings to study, including “How to Write a Spelling Corrector” and a much fuller explanation of n-grams.