edu.berkeley.nlp.lm.io
Class KneserNeyFileWritingLmReaderCallback<W>

java.lang.Object
  extended by edu.berkeley.nlp.lm.io.KneserNeyFileWritingLmReaderCallback<W>
Type Parameters:
W - the word type handled by the supplied WordIndexer
All Implemented Interfaces:
ArpaLmReaderCallback<ProbBackoffPair>, LmReaderCallback<ProbBackoffPair>, NgramOrderedLmReaderCallback<ProbBackoffPair>

public class KneserNeyFileWritingLmReaderCallback<W>
extends Object
implements ArpaLmReaderCallback<ProbBackoffPair>

Class for producing a Kneser-Ney language model in ARPA format from raw text.

Author:
adampauls

Constructor Summary
KneserNeyFileWritingLmReaderCallback(File outputFile, WordIndexer<W> wordIndexer)
           
KneserNeyFileWritingLmReaderCallback(PrintWriter out, WordIndexer<W> wordIndexer)
           
 
Method Summary
 void call(int[] ngram, int startPos, int endPos, ProbBackoffPair value, String words)
          Called for each n-gram.
 void cleanup()
          Called once all reading is done.
 void handleNgramOrderFinished(int order)
          Called when all n-grams of a given order are finished.
 void handleNgramOrderStarted(int order)
          Called when n-grams of a given order are started.
 void initWithLengths(List<Long> numNGrams)
          Called initially with a list of how many n-grams will appear for each order.
 
Methods inherited from class java.lang.Object
clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait
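
The method descriptions above imply a calling sequence: initWithLengths first, then for each order handleNgramOrderStarted, one call per n-gram, and handleNgramOrderFinished, and finally cleanup. The self-contained sketch below illustrates that sequence. SimpleCallback, drive, and recordEvents are hypothetical stand-ins written for this example, not library types, and the 1-based order numbering is an assumption.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

public class CallbackLifecycleSketch {

    // Hypothetical, simplified stand-in for the library's callback interfaces.
    public interface SimpleCallback {
        void initWithLengths(List<Long> numNGrams);
        void handleNgramOrderStarted(int order);
        void call(String words);
        void handleNgramOrderFinished(int order);
        void cleanup();
    }

    // Feeds n-grams to the callback one order at a time.
    // Assumption for this sketch: orders are reported 1-based.
    public static void drive(List<List<String>> ngramsByOrder, SimpleCallback cb) {
        List<Long> counts = new ArrayList<>();
        for (List<String> ngrams : ngramsByOrder) {
            counts.add((long) ngrams.size());
        }
        cb.initWithLengths(counts);
        for (int i = 0; i < ngramsByOrder.size(); i++) {
            cb.handleNgramOrderStarted(i + 1);
            for (String words : ngramsByOrder.get(i)) {
                cb.call(words);
            }
            cb.handleNgramOrderFinished(i + 1);
        }
        cb.cleanup();
    }

    // Runs the driver with a callback that records each event as a string.
    public static List<String> recordEvents(List<List<String>> ngramsByOrder) {
        final List<String> events = new ArrayList<>();
        drive(ngramsByOrder, new SimpleCallback() {
            @Override public void initWithLengths(List<Long> numNGrams) { events.add("init " + numNGrams); }
            @Override public void handleNgramOrderStarted(int order) { events.add("start " + order); }
            @Override public void call(String words) { events.add("ngram " + words); }
            @Override public void handleNgramOrderFinished(int order) { events.add("finish " + order); }
            @Override public void cleanup() { events.add("cleanup"); }
        });
        return events;
    }

    public static void main(String[] args) {
        // Two unigrams and one bigram.
        List<List<String>> input = Arrays.asList(
            Arrays.asList("a", "b"),
            Arrays.asList("a b"));
        for (String event : recordEvents(input)) {
            System.out.println(event);
        }
    }
}
```

A real implementation would write to the output file instead of recording strings. The recording callback is used here only to make the order of events visible.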
 

Constructor Detail

KneserNeyFileWritingLmReaderCallback

public KneserNeyFileWritingLmReaderCallback(File outputFile,
                                            WordIndexer<W> wordIndexer)

KneserNeyFileWritingLmReaderCallback

public KneserNeyFileWritingLmReaderCallback(PrintWriter out,
                                            WordIndexer<W> wordIndexer)

Method Detail

handleNgramOrderFinished

public void handleNgramOrderFinished(int order)
Description copied from interface: NgramOrderedLmReaderCallback
Called when all n-grams of a given order are finished.

Specified by:
handleNgramOrderFinished in interface NgramOrderedLmReaderCallback<ProbBackoffPair>

handleNgramOrderStarted

public void handleNgramOrderStarted(int order)
Description copied from interface: NgramOrderedLmReaderCallback
Called when n-grams of a given order are started.

Specified by:
handleNgramOrderStarted in interface NgramOrderedLmReaderCallback<ProbBackoffPair>

call

public void call(int[] ngram,
                 int startPos,
                 int endPos,
                 ProbBackoffPair value,
                 String words)
Description copied from interface: LmReaderCallback
Called for each n-gram.

Specified by:
call in interface LmReaderCallback<ProbBackoffPair>
Parameters:
ngram - The integer representation of the words as given by the provided WordIndexer
startPos - Start of the portion of ngram to be read
endPos - End of the portion of ngram to be read
value - The value of the n-gram
words - The string representation of the n-gram (space separated)
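
For orientation, an entry in an ARPA file consists of a base-10 log probability, the space-separated words, and an optional backoff weight, separated by tabs. The sketch below formats one such entry from the kind of data passed to call. ArpaLineSketch and formatLine are hypothetical names for this example, plain float values stand in for ProbBackoffPair, and the four-decimal formatting is an assumption, so the output of this class may differ.

```java
import java.util.Locale;

// Hypothetical helper, not part of edu.berkeley.nlp.lm.io.
public class ArpaLineSketch {

    // Formats one ARPA entry: log10 probability, words, optional backoff weight.
    // Pass null for backoff when the entry has none (e.g. highest-order n-grams).
    public static String formatLine(float logProb, String words, Float backoff) {
        String line = String.format(Locale.ROOT, "%.4f\t%s", logProb, words);
        if (backoff != null) {
            line += String.format(Locale.ROOT, "\t%.4f", backoff);
        }
        return line;
    }

    public static void main(String[] args) {
        System.out.println(formatLine(-1.5f, "the cat", -0.25f));
        System.out.println(formatLine(-0.75f, "the cat sat", null));
    }
}
```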

cleanup

public void cleanup()
Description copied from interface: LmReaderCallback
Called once all reading is done.

Specified by:
cleanup in interface LmReaderCallback<ProbBackoffPair>

initWithLengths

public void initWithLengths(List<Long> numNGrams)
Description copied from interface: ArpaLmReaderCallback
Called initially with a list of how many n-grams will appear for each order.

Specified by:
initWithLengths in interface ArpaLmReaderCallback<ProbBackoffPair>
Parameters:
numNGrams - maps n-gram orders to number of n-grams (i.e. numNGrams.get(0) is the number of unigrams)
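
To illustrate the index convention, the sketch below turns such a list into the \data\ header of an ARPA file, where index i corresponds to order i + 1. ArpaHeaderSketch and formatHeader are hypothetical names for this example, not part of the library, and the class's own header-writing code may differ.

```java
import java.util.Arrays;
import java.util.List;

// Hypothetical helper, not part of edu.berkeley.nlp.lm.io.
public class ArpaHeaderSketch {

    // Builds the "\data\" section of an ARPA file from per-order counts.
    // numNGrams.get(0) is the unigram count, so index i maps to order i + 1.
    public static String formatHeader(List<Long> numNGrams) {
        StringBuilder sb = new StringBuilder("\\data\\\n");
        for (int i = 0; i < numNGrams.size(); i++) {
            sb.append("ngram ").append(i + 1).append('=').append(numNGrams.get(i)).append('\n');
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        // 5 unigrams, 12 bigrams, 7 trigrams.
        System.out.print(formatHeader(Arrays.asList(5L, 12L, 7L)));
    }
}
```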