This essay is written to provide a detailed look at stochastic language models and alternatives. Firstly I intend to explain the problem language models seek to solve in the field of speech recognition then to outline the various approaches taken. Before I do this I should mention that prior to the language model becoming useful, a speech recognizer will already have endeavored to identify the phonemes (atomic sound units) present in an utterance. It will most likely have done this by matching them against Gaussian models of sound contained in Hidden Markov Models (HMMs) (see [Cassidy, 2002] at pages 63 to 68 for a fuller explanation of HMMs).
Ever since [Baker, 1975] first proposed the use of network representations for
Interesting work has come out of Cambridge, England. [Everman & Woodland, 2000] wrote on the improvement of robust estimates of confidence scores for recognizers through the use of word level posterior probabilities.
There are ways of organizing the search space that can help reduce the complexity. First, the number of possible next phonemes is much smaller than the number of possible next words. By sharing this fact across all the words the number of paths to be grown at each word end can be drastically reduced. This concept resulted in the implementation of lexically tree-based searches. For example if you think of the letters overleaf as phonemes: