public interface ContextEncodedNgramLanguageModel<W>
Interface for language models which expose their internal context-encoding for
more efficient queries. (Note: language model implementations may use a
context-encoding internally without implementing this interface.) A
context-encoding encodes an n-gram as an integer representing the last word,
and an offset which serves as a logical pointer to the (n-1) prefix words.
The integers represent words of type W in the vocabulary, and the mapping
from the vocabulary to integers is managed by an instance of the WordIndexer
class.
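The idea of a context-encoding can be illustrated with a small, self-contained sketch. This is a toy structure written for illustration only, not BerkeleyLM's actual storage layout: the class `ContextEncodingSketch`, its methods `add` and `decode`, and the use of -1 as the offset of the empty context are all assumptions. Each stored n-gram is kept as a pair of its last word and the offset of its prefix, so an n-gram can be recovered by following the prefix pointers.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

/** Toy illustration of a context-encoding; hypothetical, not BerkeleyLM's data structure. */
public class ContextEncodingSketch {
    // tables.get(k) holds every stored n-gram of length k + 1 as {lastWord, prefixOffset}.
    private final List<List<long[]>> tables = new ArrayList<>();
    // index.get(k) maps "prefixOffset:lastWord" to the entry's offset in tables.get(k).
    private final List<Map<String, Long>> index = new ArrayList<>();

    /** Stores an n-gram (and all of its prefixes) and returns its offset within its order. */
    public long add(int[] ngram) {
        return add(ngram, ngram.length);
    }

    private long add(int[] ngram, int len) {
        if (len == 0) return -1L; // assumption: -1 stands for the empty context
        long prefixOffset = add(ngram, len - 1);
        int order = len - 1; // 0-based: order 0 is the unigram table
        while (tables.size() <= order) {
            tables.add(new ArrayList<>());
            index.add(new HashMap<>());
        }
        int word = ngram[len - 1];
        String key = prefixOffset + ":" + word;
        Long existing = index.get(order).get(key);
        if (existing != null) return existing;
        long offset = tables.get(order).size();
        tables.get(order).add(new long[] {word, prefixOffset});
        index.get(order).put(key, offset);
        return offset;
    }

    /** Recovers the full n-gram from a (contextOffset, contextOrder, word) triple. */
    public int[] decode(long contextOffset, int contextOrder, int word) {
        int[] ngram = new int[contextOrder + 2];
        ngram[ngram.length - 1] = word;
        long offset = contextOffset;
        // Walk the prefix pointers from the longest context down to the unigram.
        for (int order = contextOrder; order >= 0; order--) {
            long[] entry = tables.get(order).get((int) offset);
            ngram[order] = (int) entry[0];
            offset = entry[1];
        }
        return ngram;
    }

    public static void main(String[] args) {
        ContextEncodingSketch lm = new ContextEncodingSketch();
        long ctx = lm.add(new int[] {7, 3}); // offset of the bigram context (7, 3)
        // The trigram (7, 3, 9) is represented by word 9 plus the pointer (ctx, order 1).
        System.out.println(Arrays.toString(lm.decode(ctx, 1, 9))); // prints [7, 3, 9]
    }
}
```

In this sketch the caller never handles the prefix words directly: a single offset and order identify the whole context, which is what makes repeated queries with a shared prefix cheap.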
Nested Class Summary | |
---|---|
static class | ContextEncodedNgramLanguageModel.DefaultImplementations |
static class | ContextEncodedNgramLanguageModel.LmContextInfo: Simple class for returning context offsets |
Nested classes/interfaces inherited from interface edu.berkeley.nlp.lm.NgramLanguageModel |
---|
NgramLanguageModel.StaticMethods |
Method Summary | |
---|---|
float | getLogProb(long contextOffset, int contextOrder, int word, ContextEncodedNgramLanguageModel.LmContextInfo outputContext): Get the score for an n-gram, and also get the context offset of the n-gram's suffix. |
int[] | getNgramForOffset(long contextOffset, int contextOrder, int word): Gets the n-gram referred to by a context-encoding. |
ContextEncodedNgramLanguageModel.LmContextInfo | getOffsetForNgram(int[] ngram, int startPos, int endPos): Gets the offset which refers to an n-gram. |
Methods inherited from interface edu.berkeley.nlp.lm.NgramLanguageModel |
---|
getLmOrder, getLogProb, getWordIndexer, scoreSentence, setOovWordLogProb |
Method Detail |
---|
float getLogProb(long contextOffset, int contextOrder, int word, ContextEncodedNgramLanguageModel.LmContextInfo outputContext)
contextOffset - Offset of context (prefix) of an n-gram
contextOrder - The (0-based) length of context (i.e. order == 0 iff context refers to a unigram).
word - Last word of the n-gram
outputContext - Offset of the suffix of the input n-gram. If the parameter is null it will be ignored. This can be passed to future queries for efficient access.
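The note that the output context "can be passed to future queries" suggests a left-to-right scoring loop. The following is a minimal sketch of that pattern under stated assumptions: `ContextInfo` and `Scorer` are hypothetical stand-ins (not the library's `LmContextInfo` or this interface), the field names `offset` and `order` are assumed, and the starting values of -1 for the empty context are an assumption as well. The dummy scorer in `main` exists only to make the sketch runnable.

```java
/** Sketch of threading an output context through successive queries; all types are stand-ins. */
public class ContextChainingSketch {
    /** Hypothetical stand-in for a context-info holder; field names are assumptions. */
    public static class ContextInfo {
        public long offset = -1L; // assumption: -1 denotes the empty context
        public int order = -1;
    }

    /** Hypothetical stand-in with the same parameter list as getLogProb above. */
    public interface Scorer {
        float getLogProb(long contextOffset, int contextOrder, int word, ContextInfo outputContext);
    }

    /** Scores word ids left to right, feeding each query's output context into the next query. */
    public static float score(Scorer lm, int[] words) {
        ContextInfo ctx = new ContextInfo();
        float total = 0f;
        for (int word : words) {
            ContextInfo next = new ContextInfo();
            total += lm.getLogProb(ctx.offset, ctx.order, word, next);
            ctx = next; // reuse the suffix context instead of re-encoding the prefix words
        }
        return total;
    }

    public static void main(String[] args) {
        // Dummy scorer for illustration: every word costs -1, and the new context
        // is simply the word just seen (order 0, i.e. a unigram context).
        Scorer dummy = (offset, order, word, out) -> {
            if (out != null) {
                out.offset = word;
                out.order = 0;
            }
            return -1.0f;
        };
        System.out.println(score(dummy, new int[] {4, 8, 15})); // prints -3.0
    }
}
```

Using a fresh `ContextInfo` for each output avoids relying on whether an implementation tolerates the same object being read and written in one call.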
ContextEncodedNgramLanguageModel.LmContextInfo getOffsetForNgram(int[] ngram, int startPos, int endPos)
int[] getNgramForOffset(long contextOffset, int contextOrder, int word)