edu.berkeley.nlp.lm
Class AbstractContextEncodedNgramLanguageModel<W>
java.lang.Object
edu.berkeley.nlp.lm.AbstractNgramLanguageModel<W>
edu.berkeley.nlp.lm.AbstractContextEncodedNgramLanguageModel<W>
- Type Parameters:
W
-
- All Implemented Interfaces:
- ContextEncodedNgramLanguageModel<W>, NgramLanguageModel<W>, Serializable
- Direct Known Subclasses:
- ContextEncodedCachingLmWrapper, ContextEncodedProbBackoffLm
public abstract class AbstractContextEncodedNgramLanguageModel<W>
- extends AbstractNgramLanguageModel<W>
- implements ContextEncodedNgramLanguageModel<W>, Serializable
Default implementation of all ContextEncodedNgramLanguageModel functionality
except ContextEncodedNgramLanguageModel.getLogProb(long, int, int, LmContextInfo),
getOffsetForNgram(int[], int, int), and getNgramForOffset(long, int, int).
- Author:
- adampauls
- See Also:
- Serialized Form
Methods inherited from class java.lang.Object
clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait
AbstractContextEncodedNgramLanguageModel
public AbstractContextEncodedNgramLanguageModel(int lmOrder,
WordIndexer<W> wordIndexer,
float oovWordLogProb)
scoreSentence
public float scoreSentence(List<W> sentence)
- Description copied from interface:
NgramLanguageModel
- Scores a complete sentence, taking appropriate care with the start- and
end-of-sentence symbols. This is a convenience method and will generally
be inefficient.
- Specified by:
scoreSentence
in interface NgramLanguageModel<W>
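The boundary handling described above can be pictured with a small, self-contained sketch. It is not the library's implementation: the class name SentenceScoringSketch, the literal "<s>" and "</s>" symbols, and the ngramLogProb callback are all assumptions made for illustration. The sketch pads the sentence with boundary symbols and sums the score of each n-gram window, truncating windows near the start of the sentence.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.function.ToDoubleFunction;

final class SentenceScoringSketch {

    /**
     * Pads the sentence with hypothetical boundary symbols and sums the
     * log-probability of every n-gram window of at most lmOrder words.
     */
    static double scoreSentence(List<String> sentence, int lmOrder,
            ToDoubleFunction<List<String>> ngramLogProb) {
        List<String> padded = new ArrayList<>();
        padded.add("<s>");   // assumed start-of-sentence symbol
        padded.addAll(sentence);
        padded.add("</s>");  // assumed end-of-sentence symbol

        double total = 0.0;
        // end starts at 2 so the start symbol only ever serves as context,
        // never as the word being predicted.
        for (int end = 2; end <= padded.size(); end++) {
            int start = Math.max(0, end - lmOrder);
            total += ngramLogProb.applyAsDouble(padded.subList(start, end));
        }
        return total;
    }

    public static void main(String[] args) {
        // Dummy scorer: each n-gram costs minus its length, which makes the
        // windows easy to check by hand.
        double score = scoreSentence(Arrays.asList("a", "b"), 2, ngram -> -ngram.size());
        System.out.println(score); // three bigram windows: prints -6.0
    }
}
```

Building a sublist per position is part of why a convenience method like this tends to be slower than the array- or context-encoded queries.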
getLogProb
public float getLogProb(List<W> phrase)
- Description copied from interface:
NgramLanguageModel
- Scores an n-gram. This is a convenience method and will generally be
relatively inefficient. More efficient versions are available in
ArrayEncodedNgramLanguageModel.getLogProb(int[], int, int) and
ContextEncodedNgramLanguageModel.getLogProb(long, int, int, edu.berkeley.nlp.lm.ContextEncodedNgramLanguageModel.LmContextInfo).
- Specified by:
getLogProb
in interface NgramLanguageModel<W>
getLogProb
public abstract float getLogProb(long contextOffset,
int contextOrder,
int word,
ContextEncodedNgramLanguageModel.LmContextInfo outputContext)
- Description copied from interface:
ContextEncodedNgramLanguageModel
- Get the score for an n-gram, and also get the context offset of the
n-gram's suffix.
- Specified by:
getLogProb
in interface ContextEncodedNgramLanguageModel<W>
- Parameters:
contextOffset - Offset of the context (prefix) of an n-gram.
contextOrder - The (0-based) length of the context (i.e. order == 0 iff the context refers to a unigram).
word - Last word of the n-gram.
outputContext - Offset of the suffix of the input n-gram. If the parameter is null it will be ignored. This can be passed to future queries for efficient access.
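To illustrate how the output context can be fed into the next query, here is a toy bigram model written from scratch. Everything in it is an assumption for illustration: the class ToyContextEncodedBigramLm, the ContextInfo holder (a stand-in for LmContextInfo, with assumed field names), the convention that order -1 denotes an empty context, and the simple back-off rule. It is a sketch of the calling pattern, not of how the library stores or scores n-grams.

```java
import java.util.HashMap;
import java.util.Map;

final class ToyContextEncodedBigramLm {

    /** Stand-in for LmContextInfo; field names and defaults are assumptions. */
    static final class ContextInfo {
        long offset = -1L;
        int order = -1; // -1 = empty context, 0 = unigram context
    }

    private final float[] unigramLogProb;
    private final float[] backoff;
    private final Map<Long, Float> bigramLogProb = new HashMap<>();

    ToyContextEncodedBigramLm(float[] unigramLogProb, float[] backoff) {
        this.unigramLogProb = unigramLogProb;
        this.backoff = backoff;
    }

    void putBigram(int first, int second, float logProb) {
        bigramLogProb.put(key(first, second), logProb);
    }

    private static long key(long contextOffset, int word) {
        return (contextOffset << 32) | (word & 0xffffffffL);
    }

    /** In this toy model a unigram context's offset is simply its word id. */
    float getLogProb(long contextOffset, int contextOrder, int word, ContextInfo outputContext) {
        if (outputContext != null) {
            // For a bigram model the reusable suffix is always the last word.
            outputContext.offset = word;
            outputContext.order = 0;
        }
        if (contextOrder < 0) {
            return unigramLogProb[word];
        }
        Float bigram = bigramLogProb.get(key(contextOffset, word));
        if (bigram != null) {
            return bigram;
        }
        // Back off: context penalty plus the unigram score.
        return backoff[(int) contextOffset] + unigramLogProb[word];
    }

    /** Threads the output context of each query into the next one. */
    float scoreSequence(int[] words) {
        ContextInfo context = new ContextInfo();
        float total = 0f;
        for (int word : words) {
            total += getLogProb(context.offset, context.order, word, context);
        }
        return total;
    }

    public static void main(String[] args) {
        ToyContextEncodedBigramLm lm = new ToyContextEncodedBigramLm(
                new float[] {-1f, -2f, -3f}, new float[] {-0.5f, 0f, 0f});
        lm.putBigram(0, 1, -0.25f);
        // -1 (unigram 0) + -0.25 (bigram 0 1) + -3 (back-off to unigram 2)
        System.out.println(lm.scoreSequence(new int[] {0, 1, 2})); // prints -4.25
    }
}
```

The point of the pattern is that each call hands back the context needed for the following word, so the caller never has to look up the context n-gram again.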
getOffsetForNgram
public abstract ContextEncodedNgramLanguageModel.LmContextInfo getOffsetForNgram(int[] ngram,
int startPos,
int endPos)
- Description copied from interface:
ContextEncodedNgramLanguageModel
- Gets the offset which refers to an n-gram. If the n-gram is not in the
model, then it returns the shortest suffix of the n-gram which is. This
operation is not necessarily fast.
- Specified by:
getOffsetForNgram
in interface ContextEncodedNgramLanguageModel<W>
getNgramForOffset
public abstract int[] getNgramForOffset(long contextOffset,
int contextOrder,
int word)
- Description copied from interface:
ContextEncodedNgramLanguageModel
- Gets the n-gram referred to by a context-encoding. This operation is not
necessarily fast.
- Specified by:
getNgramForOffset
in interface ContextEncodedNgramLanguageModel<W>
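One way to picture decoding a context-encoding is a table in which every stored n-gram records its last word and the offset of its prefix one order down. The sketch below assumes exactly that layout; the class ToyContextDecoder and its two arrays are hypothetical and say nothing about how a concrete subclass actually stores its n-grams.

```java
import java.util.Arrays;

final class ToyContextDecoder {

    // lastWord[k][i]: last word of the i-th stored n-gram of (0-based) order k.
    private final int[][] lastWord;
    // prefixOffset[k][i]: offset of that n-gram's prefix among the order k-1
    // n-grams (unused for k == 0).
    private final long[][] prefixOffset;

    ToyContextDecoder(int[][] lastWord, long[][] prefixOffset) {
        this.lastWord = lastWord;
        this.prefixOffset = prefixOffset;
    }

    /** Rebuilds the full n-gram: the words of the context followed by word. */
    int[] getNgramForOffset(long contextOffset, int contextOrder, int word) {
        int[] ngram = new int[contextOrder + 2];
        ngram[ngram.length - 1] = word;
        long offset = contextOffset;
        // Walk from the context back through its prefixes, filling right to left.
        for (int order = contextOrder; order >= 0; order--) {
            ngram[order] = lastWord[order][(int) offset];
            offset = prefixOffset[order][(int) offset];
        }
        return ngram;
    }

    public static void main(String[] args) {
        // Unigrams 7 and 8 at offsets 0 and 1; one bigram (8, 9) at offset 0.
        ToyContextDecoder decoder = new ToyContextDecoder(
                new int[][] {{7, 8}, {9}},
                new long[][] {{-1L, -1L}, {1L}});
        System.out.println(Arrays.toString(decoder.getNgramForOffset(0L, 1, 5))); // prints [8, 9, 5]
    }
}
```

The walk takes one step per word of context, which is consistent with the note that this operation is not necessarily fast.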