edu.berkeley.nlp.lm.io
Class TextReader<W>
java.lang.Object
edu.berkeley.nlp.lm.io.TextReader<W>
- Type Parameters:
W
- the word type handled by the WordIndexer<W> passed to the constructor
- All Implemented Interfaces:
- LmReader<LongRef,LmReaderCallback<LongRef>>
public class TextReader<W>
- extends Object
- implements LmReader<LongRef,LmReaderCallback<LongRef>>
Class for reading raw text files.
- Author:
- adampauls
Method Summary
void parse(LmReaderCallback<LongRef> callback)
Reads newline-separated plain text from the input supplied to the constructor and passes it to callback.
Methods inherited from class java.lang.Object
clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait
TextReader
public TextReader(List<String> inputFiles,
WordIndexer<W> wordIndexer)
TextReader
public TextReader(Iterable<String> lineIterator,
WordIndexer<W> wordIndexer)
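The two constructors accept either a list of file names or an iterable of lines. As a rough illustration of how file names could be turned into a stream of lines, including the .gz handling mentioned under parse, here is a minimal sketch. It is not the library's code: LineSourceSketch, open, and readLines are hypothetical names, and UTF-8 decoding is an assumption.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.InputStream;
import java.io.InputStreamReader;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Paths;
import java.util.ArrayList;
import java.util.List;
import java.util.zip.GZIPInputStream;

// Hypothetical helper for illustration; not part of edu.berkeley.nlp.lm.io.
class LineSourceSketch {

    // Opens a file for reading, decompressing on the fly when the name ends in ".gz".
    // Assumption: text is UTF-8 encoded.
    static BufferedReader open(String path) throws IOException {
        InputStream in = Files.newInputStream(Paths.get(path));
        if (path.endsWith(".gz")) {
            in = new GZIPInputStream(in);
        }
        return new BufferedReader(new InputStreamReader(in, StandardCharsets.UTF_8));
    }

    // Reads every line of every file, in the order given, into one list.
    static List<String> readLines(List<String> inputFiles) throws IOException {
        List<String> lines = new ArrayList<>();
        for (String path : inputFiles) {
            try (BufferedReader reader = open(path)) {
                String line;
                while ((line = reader.readLine()) != null) {
                    lines.add(line);
                }
            }
        }
        return lines;
    }
}
```

The list returned by readLines is an Iterable<String>, so under these assumptions it has the shape the lineIterator constructor expects.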
parse
public void parse(LmReaderCallback<LongRef> callback)
- Reads newline-separated plain text from the input supplied to the constructor
(inputFiles or lineIterator) and passes it to callback. Input files with a .gz
suffix are unzipped as necessary.
- Specified by:
parse
in interface LmReader<LongRef,LmReaderCallback<LongRef>>
- Parameters:
callback
- the callback that receives the text read from the input
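The reader-plus-callback pattern behind parse can be sketched as follows. This is a self-contained illustration under stated assumptions, not the library's implementation: NgramCallbackSketch and TextParseSketch are hypothetical stand-ins for LmReaderCallback and TextReader, tokens are split on whitespace, no sentence-boundary symbols are added, and the maxOrder parameter is invented for the example. Whether the real reader hands over whole lines or individual n-grams is not stated on this page; the sketch emits n-grams.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical stand-in for a reader callback; the real LmReaderCallback may differ.
interface NgramCallbackSketch {
    void call(List<String> ngram); // invoked once per n-gram occurrence
    void cleanup();                // invoked once after all input is consumed
}

// Hypothetical stand-in for the reader side of the pattern.
class TextParseSketch {

    // For each non-empty line, emits every n-gram of order 1..maxOrder to the
    // callback, then signals completion with cleanup().
    static void parse(Iterable<String> lines, int maxOrder, NgramCallbackSketch callback) {
        for (String line : lines) {
            String trimmed = line.trim();
            if (trimmed.isEmpty()) {
                continue; // skip blank lines
            }
            List<String> words = Arrays.asList(trimmed.split("\\s+"));
            for (int start = 0; start < words.size(); start++) {
                int maxEnd = Math.min(words.size(), start + maxOrder);
                for (int end = start + 1; end <= maxEnd; end++) {
                    callback.call(new ArrayList<>(words.subList(start, end)));
                }
            }
        }
        callback.cleanup();
    }
}
```

Keeping the reader unaware of what happens to the text is the point of the design: the same parse call can feed a counter, a model builder, or a file writer, depending on which callback is supplied.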