edu.berkeley.nlp.lm
Class AbstractArrayEncodedNgramLanguageModel<W>
java.lang.Object
edu.berkeley.nlp.lm.AbstractNgramLanguageModel<W>
edu.berkeley.nlp.lm.AbstractArrayEncodedNgramLanguageModel<W>
- Type Parameters:
- W
- All Implemented Interfaces:
- ArrayEncodedNgramLanguageModel<W>, NgramLanguageModel<W>, Serializable
- Direct Known Subclasses:
- ArrayEncodedCachingLmWrapper, ArrayEncodedProbBackoffLm, StupidBackoffLm
public abstract class AbstractArrayEncodedNgramLanguageModel<W>
- extends AbstractNgramLanguageModel<W>
- implements ArrayEncodedNgramLanguageModel<W>, Serializable
Default implementation of all NgramLanguageModel functionality except
getLogProb(int[], int, int).
- Author:
- adampauls
- See Also:
- Serialized Form
Method Summary
float | getLogProb(int[] ngram) | Equivalent to getLogProb(ngram, 0, ngram.length)
abstract float | getLogProb(int[] ngram, int startPos, int endPos) | Calculate language model score of an n-gram.
float | getLogProb(List<W> phrase) | Scores an n-gram.
float | scoreSentence(List<W> sentence) | Scores a complete sentence, taking appropriate care with the start- and end-of-sentence symbols.
Methods inherited from class java.lang.Object
clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait
AbstractArrayEncodedNgramLanguageModel
public AbstractArrayEncodedNgramLanguageModel(int lmOrder,
WordIndexer<W> wordIndexer,
float oovWordLogProb)
scoreSentence
public float scoreSentence(List<W> sentence)
- Description copied from interface:
NgramLanguageModel
- Scores a complete sentence, taking appropriate care with the start- and
end-of-sentence symbols. This is a convenience method and will generally
be inefficient.
- Specified by:
scoreSentence in interface NgramLanguageModel<W>
- Returns:
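A minimal sketch of what sentence-level scoring can look like, assuming the sentence is padded with start and end symbols and then scored one position at a time with at most lmOrder - 1 words of context. ToyArrayScorer, SentenceScoringSketch, START_ID and END_ID are hypothetical names used only for illustration; this is not taken from the library's source.

```java
// Hypothetical stand-in for an array-encoded model; not the library's interface.
interface ToyArrayScorer {
    float getLogProb(int[] ngram, int startPos, int endPos);
}

class SentenceScoringSketch {
    // Assumed ids for the boundary symbols; a real model would get them from its word indexer.
    static final int START_ID = 0;
    static final int END_ID = 1;

    static float scoreSentence(int[] sentence, int lmOrder, ToyArrayScorer lm) {
        // Pad the sentence with the start and end symbols.
        int[] padded = new int[sentence.length + 2];
        padded[0] = START_ID;
        System.arraycopy(sentence, 0, padded, 1, sentence.length);
        padded[padded.length - 1] = END_ID;

        float total = 0.0f;
        // Score every position after the start symbol; end is exclusive.
        for (int end = 2; end <= padded.length; ++end) {
            int start = Math.max(0, end - lmOrder);
            total += lm.getLogProb(padded, start, end);
        }
        return total;
    }

    public static void main(String[] args) {
        // Toy scorer: every n-gram costs -1, so the total counts the scored positions
        // (the three words plus the end symbol).
        ToyArrayScorer constant = (ngram, s, e) -> -1.0f;
        System.out.println(scoreSentence(new int[] {5, 6, 7}, 3, constant));
    }
}
```

The start symbol is used only as context here and is never scored on its own; that is a design assumption of this sketch.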
getLogProb
public float getLogProb(List<W> phrase)
- Description copied from interface:
NgramLanguageModel
- Scores an n-gram. This is a convenience method and will generally be
relatively inefficient. More efficient versions are available in
ArrayEncodedNgramLanguageModel.getLogProb(int[], int, int) and
ContextEncodedNgramLanguageModel.getLogProb(long, int, int, edu.berkeley.nlp.lm.ContextEncodedNgramLanguageModel.LmContextInfo).
- Specified by:
getLogProb in interface NgramLanguageModel<W>
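One plausible reason the List-based overload is slower is that each word must first be mapped to its integer id and copied into an array before the array-based method can run. The sketch below illustrates that conversion step with a hypothetical HashMap-backed indexer; WordListScoringSketch, idOf and encode are illustrative names, not part of the library, and the real class uses the WordIndexer<W> passed to its constructor.

```java
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

class WordListScoringSketch {
    // Hypothetical word-to-id table standing in for a WordIndexer.
    private final Map<String, Integer> ids = new HashMap<>();

    // Returns the id of a word, assigning ids in order of first appearance.
    int idOf(String word) {
        Integer id = ids.get(word);
        if (id == null) {
            id = ids.size();
            ids.put(word, id);
        }
        return id;
    }

    // Converts a phrase to the int[] form expected by array-encoded scoring.
    // A fresh array is allocated on every call, which is the extra cost of the convenience path.
    int[] encode(List<String> phrase) {
        int[] ngram = new int[phrase.size()];
        for (int i = 0; i < ngram.length; ++i) {
            ngram[i] = idOf(phrase.get(i));
        }
        return ngram;
    }

    public static void main(String[] args) {
        WordListScoringSketch sketch = new WordListScoringSketch();
        int[] ngram = sketch.encode(Arrays.asList("the", "cat", "the"));
        System.out.println(Arrays.toString(ngram));
    }
}
```

Callers that score many n-grams can do this conversion once and then call the int[] overloads directly.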
getLogProb
public float getLogProb(int[] ngram)
- Description copied from interface:
ArrayEncodedNgramLanguageModel
- Equivalent to
getLogProb(ngram, 0, ngram.length)
- Specified by:
getLogProb in interface ArrayEncodedNgramLanguageModel<W>
- See Also:
ArrayEncodedNgramLanguageModel.getLogProb(int[], int, int)
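The equivalence stated above can be sketched as a one-line delegation from the single-argument overload to the abstract three-argument method. ToyArrayLm and OverloadSketch are hypothetical names for illustration; the toy subclass returns the span length instead of a real log probability.

```java
// Hypothetical abstract base class mirroring the delegation pattern described above.
abstract class ToyArrayLm {
    // The only scoring method a subclass has to supply.
    abstract float getLogProb(int[] ngram, int startPos, int endPos);

    // Convenience overload: score the whole array.
    float getLogProb(int[] ngram) {
        return getLogProb(ngram, 0, ngram.length);
    }
}

class OverloadSketch {
    public static void main(String[] args) {
        // Toy subclass that returns the number of words it was asked to score.
        ToyArrayLm lm = new ToyArrayLm() {
            @Override
            float getLogProb(int[] ngram, int startPos, int endPos) {
                return endPos - startPos;
            }
        };
        int[] ngram = {4, 8, 15};
        // Both calls cover the same span, so they return the same value.
        System.out.println(lm.getLogProb(ngram) == lm.getLogProb(ngram, 0, ngram.length));
    }
}
```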
getLogProb
public abstract float getLogProb(int[] ngram,
int startPos,
int endPos)
- Description copied from interface:
ArrayEncodedNgramLanguageModel
- Calculate language model score of an n-gram. Warning: if you
pass in an n-gram of length greater than getLmOrder(),
this call will silently ignore the extra words of context. In other
words, if you pass in a 5-gram (endPos - startPos == 5) to
a 3-gram model, it will only score the words from startPos + 2
to endPos.
- Specified by:
getLogProb in interface ArrayEncodedNgramLanguageModel<W>
- Parameters:
ngram - array of words in integer representation
startPos - start of the portion of the array to be read (inclusive)
endPos - end of the portion of the array to be read (exclusive)
- Returns:
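The truncation rule in the warning above can be expressed as simple index arithmetic. The following is a sketch, not the library's code: ContextTruncationSketch and effectiveStart are hypothetical names, and endPos is treated as exclusive, consistent with endPos - startPos being the n-gram length.

```java
import java.util.Arrays;

class ContextTruncationSketch {
    // First array index that a model of order lmOrder would actually read
    // when asked to score the span [startPos, endPos).
    static int effectiveStart(int startPos, int endPos, int lmOrder) {
        return Math.max(startPos, endPos - lmOrder);
    }

    public static void main(String[] args) {
        int[] ngram = {10, 11, 12, 13, 14};
        // A 5-gram passed to a 3-gram model: only the last three words are scored.
        int start = effectiveStart(0, ngram.length, 3);
        System.out.println(Arrays.toString(Arrays.copyOfRange(ngram, start, ngram.length)));
    }
}
```

Callers that need the full context to count should check endPos - startPos against getLmOrder() themselves, since the call does not report the truncation.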