edu.berkeley.nlp.lm.io
Class TextReader<W>

java.lang.Object
  extended by edu.berkeley.nlp.lm.io.TextReader<W>
Type Parameters:
W - the type of word handled by the supplied WordIndexer
All Implemented Interfaces:
LmReader<LongRef,LmReaderCallback<LongRef>>

public class TextReader<W>
extends Object
implements LmReader<LongRef,LmReaderCallback<LongRef>>

Class for reading raw text files.

Author:
adampauls

Constructor Summary
TextReader(Iterable<String> lineIterator, WordIndexer<W> wordIndexer)
           
TextReader(List<String> inputFiles, WordIndexer<W> wordIndexer)
           
 
Method Summary
 void parse(LmReaderCallback<LongRef> callback)
          Reads newline-separated plain text from the input given to the constructor and reports what it reads to callback.
 
Methods inherited from class java.lang.Object
clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait
 

Constructor Detail

TextReader

public TextReader(List<String> inputFiles,
                  WordIndexer<W> wordIndexer)

TextReader

public TextReader(Iterable<String> lineIterator,
                  WordIndexer<W> wordIndexer)
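
The constructors above pair a source of text lines with a word indexer, and parse later reports to a callback. The sketch below illustrates that reader-and-callback pattern in isolation. It is only an illustration under stated assumptions: TokenCallback, SimpleIndexer, and the static parse method are hypothetical stand-ins written for this example, not the berkeleylm LmReaderCallback, WordIndexer, or TextReader, whose method signatures are not shown on this page. Whitespace tokenization and skipping of blank lines are also assumptions.

```java
import java.util.LinkedHashMap;
import java.util.Map;

public class CallbackReaderSketch {

    // Hypothetical stand-in for a reader callback; NOT the berkeleylm LmReaderCallback.
    public interface TokenCallback {
        void call(int[] tokenIds); // invoked once per non-empty input line
        void cleanup();            // invoked once after the last line
    }

    // Hypothetical stand-in for a word indexer: maps each distinct word to a dense int id.
    public static final class SimpleIndexer {
        private final Map<String, Integer> ids = new LinkedHashMap<>();

        public int getOrAddIndex(String word) {
            Integer id = ids.get(word);
            if (id == null) {
                id = ids.size();
                ids.put(word, id);
            }
            return id;
        }

        public int size() {
            return ids.size();
        }
    }

    // Splits each line on whitespace (an assumption), converts the words to ids,
    // and hands the ids to the callback. Blank lines are skipped.
    public static void parse(Iterable<String> lines, SimpleIndexer indexer, TokenCallback callback) {
        for (String line : lines) {
            String trimmed = line.trim();
            if (trimmed.isEmpty()) {
                continue;
            }
            String[] words = trimmed.split("\\s+");
            int[] tokenIds = new int[words.length];
            for (int i = 0; i < words.length; i++) {
                tokenIds[i] = indexer.getOrAddIndex(words[i]);
            }
            callback.call(tokenIds);
        }
        callback.cleanup();
    }
}
```

Because the line source is a plain Iterable<String>, the same loop works for text held in memory and for lines streamed from files, which is presumably why the class offers both a List<String> of file names and an Iterable<String> of lines.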

Method Detail

parse

public void parse(LmReaderCallback<LongRef> callback)
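Reads newline-separated plain text from the input given to the constructor (inputFiles or lineIterator) and reports what it reads to callback. This method takes no output file; any output, such as an ARPA LM file, is left to the callback. Input files with a .gz suffix are unzipped as they are read.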

Specified by:
parse in interface LmReader<LongRef,LmReaderCallback<LongRef>>
Parameters:
callback - the callback that receives the data read from the input
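
The .gz handling described for parse can be sketched with the standard library alone. This is a minimal illustration, not the berkeleylm implementation: the class GzAwareLineReader and its methods open and readLines are hypothetical names, and UTF-8 decoding is an assumption, since this page does not state which encoding the library uses.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.InputStream;
import java.io.InputStreamReader;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Paths;
import java.util.ArrayList;
import java.util.List;
import java.util.zip.GZIPInputStream;

public class GzAwareLineReader {

    // Hypothetical helper: opens a file for line reading and decompresses it
    // on the fly when the file name ends with ".gz".
    public static BufferedReader open(String fileName) throws IOException {
        InputStream raw = Files.newInputStream(Paths.get(fileName));
        try {
            InputStream in = fileName.endsWith(".gz") ? new GZIPInputStream(raw) : raw;
            // UTF-8 is an assumption made for this sketch.
            return new BufferedReader(new InputStreamReader(in, StandardCharsets.UTF_8));
        } catch (IOException e) {
            raw.close(); // do not leak the file handle if the gzip header is invalid
            throw e;
        }
    }

    // Reads every line of every input file, in order, into one list.
    public static List<String> readLines(List<String> inputFiles) throws IOException {
        List<String> lines = new ArrayList<>();
        for (String fileName : inputFiles) {
            try (BufferedReader reader = open(fileName)) {
                String line;
                while ((line = reader.readLine()) != null) {
                    lines.add(line);
                }
            }
        }
        return lines;
    }
}
```

Deciding by file suffix keeps the caller's interface to a plain list of names, at the cost of misreading a compressed file that lacks the .gz extension.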