edu.berkeley.nlp.lm
Class AbstractArrayEncodedNgramLanguageModel<W>

java.lang.Object
  extended by edu.berkeley.nlp.lm.AbstractNgramLanguageModel<W>
      extended by edu.berkeley.nlp.lm.AbstractArrayEncodedNgramLanguageModel<W>
Type Parameters:
W - the type of the words scored by the model
All Implemented Interfaces:
ArrayEncodedNgramLanguageModel<W>, NgramLanguageModel<W>, Serializable
Direct Known Subclasses:
ArrayEncodedCachingLmWrapper, ArrayEncodedProbBackoffLm, StupidBackoffLm

public abstract class AbstractArrayEncodedNgramLanguageModel<W>
extends AbstractNgramLanguageModel<W>
implements ArrayEncodedNgramLanguageModel<W>, Serializable

Default implementation of all NgramLanguageModel functionality except getLogProb(int[], int, int), which subclasses must implement.
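
The division of labour described above can be illustrated with a minimal sketch. The classes ArrayModelSketch and FixedCostModelSketch are hypothetical stand-ins written for this example and are not the library source; the fixed per-word cost is made up. The sketch only shows the pattern: the base class supplies the convenience overload, and a subclass provides the three-argument scorer.

```java
/** Hypothetical stand-in for the abstract base class; not the berkeleylm source. */
abstract class ArrayModelSketch {

    /** The one method a concrete model has to supply. */
    abstract float getLogProb(int[] ngram, int startPos, int endPos);

    /** Default implementation: score the whole array. */
    float getLogProb(int[] ngram) {
        return getLogProb(ngram, 0, ngram.length);
    }
}

/** Toy subclass: charges a fixed, made-up log probability per word in the range. */
final class FixedCostModelSketch extends ArrayModelSketch {

    @Override
    float getLogProb(int[] ngram, int startPos, int endPos) {
        return -1.5f * (endPos - startPos);
    }

    public static void main(String[] args) {
        int[] ngram = { 4, 8, 15 }; // hypothetical word ids
        FixedCostModelSketch lm = new FixedCostModelSketch();
        // Both calls go through the three-argument method.
        System.out.println(lm.getLogProb(ngram) == lm.getLogProb(ngram, 0, ngram.length));
    }
}
```

Keeping the abstract surface down to one method means a new model only has to decide how to score an array range.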

Author:
adampauls
See Also:
Serialized Form

Nested Class Summary
 
Nested classes/interfaces inherited from interface edu.berkeley.nlp.lm.ArrayEncodedNgramLanguageModel
ArrayEncodedNgramLanguageModel.DefaultImplementations
 
Nested classes/interfaces inherited from interface edu.berkeley.nlp.lm.NgramLanguageModel
NgramLanguageModel.StaticMethods
 
Field Summary
 
Fields inherited from class edu.berkeley.nlp.lm.AbstractNgramLanguageModel
lmOrder, oovWordLogProb
 
Constructor Summary
AbstractArrayEncodedNgramLanguageModel(int lmOrder, WordIndexer<W> wordIndexer, float oovWordLogProb)
           
 
Method Summary
 float getLogProb(int[] ngram)
          Equivalent to getLogProb(ngram, 0, ngram.length)
abstract  float getLogProb(int[] ngram, int startPos, int endPos)
          Calculate language model score of an n-gram.
 float getLogProb(List<W> phrase)
          Scores an n-gram.
 float scoreSentence(List<W> sentence)
          Scores a complete sentence, taking appropriate care with the start- and end-of-sentence symbols.
 
Methods inherited from class edu.berkeley.nlp.lm.AbstractNgramLanguageModel
getLmOrder, getWordIndexer, setOovWordLogProb
 
Methods inherited from class java.lang.Object
clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait
 
Methods inherited from interface edu.berkeley.nlp.lm.NgramLanguageModel
getLmOrder, getWordIndexer, setOovWordLogProb
 

Constructor Detail

AbstractArrayEncodedNgramLanguageModel

public AbstractArrayEncodedNgramLanguageModel(int lmOrder,
                                              WordIndexer<W> wordIndexer,
                                              float oovWordLogProb)
Method Detail

scoreSentence

public float scoreSentence(List<W> sentence)
Description copied from interface: NgramLanguageModel
Scores a complete sentence, taking appropriate care with the start- and end-of-sentence symbols. This is a convenience method and will generally be inefficient.
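
One plausible way to handle the boundary symbols is sketched below. This is an assumption about the general technique, not the library's implementation: the symbols "<s>" and "</s>", the NgramScorer interface, and the class SentenceScoringSketch are all hypothetical names introduced for the example. The sentence is padded with the two symbols, and every word after the start symbol is scored with at most lmOrder - 1 words of preceding context.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical sketch of sentence scoring; not the berkeleylm source. */
final class SentenceScoringSketch {

    /** Minimal stand-in for a model that can score a list of words. */
    interface NgramScorer {
        float getLogProb(List<String> ngram);
    }

    static float scoreSentence(List<String> sentence, int lmOrder, NgramScorer lm) {
        // Pad with assumed start- and end-of-sentence symbols.
        List<String> padded = new ArrayList<>();
        padded.add("<s>");
        padded.addAll(sentence);
        padded.add("</s>");

        float total = 0.0f;
        // 'end' is exclusive; the word being predicted sits at index end - 1.
        // The start symbol itself is never predicted, so begin at end = 2.
        for (int end = 2; end <= padded.size(); end++) {
            int start = Math.max(0, end - lmOrder);
            total += lm.getLogProb(padded.subList(start, end));
        }
        return total;
    }

    public static void main(String[] args) {
        List<String> sentence = List.of("the", "cat", "sat");
        // Toy scorer: a made-up constant log probability per n-gram.
        float score = scoreSentence(sentence, 3, ngram -> -1.0f);
        System.out.println(score);
    }
}
```

Each loop iteration builds a sub-list and makes a separate model call, which gives a sense of why a list-based convenience method can be slower than scoring integer arrays directly.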

Specified by:
scoreSentence in interface NgramLanguageModel<W>
Returns:

getLogProb

public float getLogProb(List<W> phrase)
Description copied from interface: NgramLanguageModel
Scores an n-gram. This is a convenience method and will generally be relatively inefficient. More efficient versions are available in ArrayEncodedNgramLanguageModel.getLogProb(int[], int, int) and ContextEncodedNgramLanguageModel.getLogProb(long, int, int, edu.berkeley.nlp.lm.ContextEncodedNgramLanguageModel.LmContextInfo).
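
To see where the extra cost of a list-based call may come from, consider the sketch below. It assumes that a word list has to be converted to integer ids before an array-encoded model can score it; the vocabulary map, the UNK id, and the class PhraseEncodingSketch are hypothetical and do not reflect the WordIndexer API.

```java
import java.util.HashMap;
import java.util.List;
import java.util.Map;

/** Hypothetical sketch of word-to-id encoding; not the berkeleylm source. */
final class PhraseEncodingSketch {

    /** Assumed id for words missing from the vocabulary. */
    static final int UNK = -1;

    /** Allocates a fresh array and performs one map lookup per word. */
    static int[] encode(List<String> phrase, Map<String, Integer> vocab) {
        int[] ids = new int[phrase.size()];
        for (int i = 0; i < ids.length; i++) {
            ids[i] = vocab.getOrDefault(phrase.get(i), UNK);
        }
        return ids;
    }

    public static void main(String[] args) {
        Map<String, Integer> vocab = new HashMap<>();
        vocab.put("the", 0);
        vocab.put("cat", 1);
        int[] ids = encode(List.of("the", "dog", "cat"), vocab);
        System.out.println(java.util.Arrays.toString(ids)); // prints [0, -1, 1]
    }
}
```

A caller that scores many overlapping n-grams can encode the text once and then pass array ranges to getLogProb(int[], int, int), avoiding the repeated allocation and lookups.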

Specified by:
getLogProb in interface NgramLanguageModel<W>

getLogProb

public float getLogProb(int[] ngram)
Description copied from interface: ArrayEncodedNgramLanguageModel
Equivalent to getLogProb(ngram, 0, ngram.length)

Specified by:
getLogProb in interface ArrayEncodedNgramLanguageModel<W>
See Also:
ArrayEncodedNgramLanguageModel.getLogProb(int[], int, int)

getLogProb

public abstract float getLogProb(int[] ngram,
                                 int startPos,
                                 int endPos)
Description copied from interface: ArrayEncodedNgramLanguageModel
Calculate language model score of an n-gram. Warning: if you pass in an n-gram longer than getLmOrder(), this call will silently ignore the extra words of context. For example, if you pass a 5-gram (endPos - startPos == 5) to a 3-gram model, only the last three words are scored, that is, positions startPos + 2 (inclusive) to endPos (exclusive).
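
The arithmetic behind the warning can be checked with a small sketch. The helper scoredRange and the class ContextTruncationSketch are hypothetical and exist only for this example; they compute which half-open range of the array a model of a given order would effectively use, assuming it keeps the last lmOrder words as the warning describes.

```java
import java.util.Arrays;

/** Hypothetical illustration of context truncation; not the berkeleylm source. */
final class ContextTruncationSketch {

    /**
     * Returns {from, to} such that ngram[from, to) is the part that is scored
     * when ngram[startPos, endPos) is passed to a model of order lmOrder.
     */
    static int[] scoredRange(int lmOrder, int startPos, int endPos) {
        int length = endPos - startPos;
        // Too long: keep only the last lmOrder words. Otherwise keep everything.
        int from = length > lmOrder ? endPos - lmOrder : startPos;
        return new int[] { from, endPos };
    }

    public static void main(String[] args) {
        int[] ngram = { 11, 12, 13, 14, 15 }; // hypothetical word ids
        int[] range = scoredRange(3, 0, ngram.length);
        // A 5-gram given to a 3-gram model: positions 2, 3 and 4 remain.
        System.out.println(Arrays.toString(Arrays.copyOfRange(ngram, range[0], range[1]))); // prints [13, 14, 15]
    }
}
```

Because the extra context is dropped silently, a caller that wants to catch accidental over-long inputs would have to compare endPos - startPos with getLmOrder() before calling.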

Specified by:
getLogProb in interface ArrayEncodedNgramLanguageModel<W>
Parameters:
ngram - array of words in integer representation
startPos - start of the portion of the array to be read
endPos - end (exclusive) of the portion of the array to be read
Returns:
the language model score (log probability) of the n-gram