edu.berkeley.nlp.lm.io
Class MakeLmBinaryFromGoogle
java.lang.Object
edu.berkeley.nlp.lm.io.MakeLmBinaryFromGoogle
public class MakeLmBinaryFromGoogle
extends Object
Given a directory in Google n-grams format, builds a binary representation of
a stupid-backoff language model and writes it to disk.
Language model binaries are significantly smaller and faster to load. Note:
running this code on the full Google n-grams corpus can be very slow and
memory intensive; on our machines, it takes about 32GB of memory and 15
hours.
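For readers unfamiliar with the term, the following is a minimal sketch of how stupid-backoff scoring works over raw n-gram counts. It is not this library's implementation: the class name StupidBackoffSketch, its methods, and the in-memory map are hypothetical, and the back-off multiplier 0.4 is the value commonly quoted in the literature, assumed here for illustration.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Illustrative sketch only; NOT the berkeleylm implementation.
class StupidBackoffSketch {
    // Back-off multiplier. 0.4 is an assumption taken from common usage.
    static final double ALPHA = 0.4;

    // Raw counts keyed by the n-gram's words, in order.
    private final Map<List<String>, Long> counts = new HashMap<>();
    private long totalUnigrams = 0;

    // Record an n-gram and its count, as one line of a count file would supply.
    void addCount(List<String> ngram, long count) {
        counts.merge(new ArrayList<>(ngram), count, Long::sum);
        if (ngram.size() == 1) {
            totalUnigrams += count;
        }
    }

    // Score of the last word of ngram given the words before it.
    // Scores are relative frequencies, not normalized probabilities.
    double score(List<String> ngram) {
        if (ngram.isEmpty()) {
            throw new IllegalArgumentException("ngram must contain at least one word");
        }
        if (ngram.size() == 1) {
            // Base case: unigram relative frequency.
            return totalUnigrams == 0
                    ? 0.0
                    : (double) counts.getOrDefault(ngram, 0L) / totalUnigrams;
        }
        long full = counts.getOrDefault(ngram, 0L);
        long context = counts.getOrDefault(ngram.subList(0, ngram.size() - 1), 0L);
        if (full > 0 && context > 0) {
            return (double) full / context;
        }
        // Unseen n-gram: drop the earliest word and apply the fixed discount.
        return ALPHA * score(ngram.subList(1, ngram.size()));
    }
}
```

Because the scores need no smoothing parameters beyond the raw counts, a model of this kind can be built directly from a directory of count files, which is what makes it a natural fit for very large corpora.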
Note that if the input/output files have a .gz suffix, they will be
unzipped/zipped as necessary.
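The suffix-based behaviour described above can be sketched with the standard java.util.zip classes. This is an assumption about the general technique, not a copy of this library's I/O code: the class GzipAwareIo and its two methods are hypothetical names, and UTF-8 is assumed as the text encoding.

```java
import java.io.BufferedReader;
import java.io.BufferedWriter;
import java.io.File;
import java.io.FileInputStream;
import java.io.FileOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.InputStreamReader;
import java.io.OutputStream;
import java.io.OutputStreamWriter;
import java.nio.charset.StandardCharsets;
import java.util.zip.GZIPInputStream;
import java.util.zip.GZIPOutputStream;

// Illustrative sketch only; NOT the berkeleylm implementation.
class GzipAwareIo {
    // Opens a text reader, decompressing on the fly when the name ends in ".gz".
    static BufferedReader openReader(File file) throws IOException {
        InputStream in = new FileInputStream(file);
        if (file.getName().endsWith(".gz")) {
            try {
                in = new GZIPInputStream(in);
            } catch (IOException e) {
                in.close(); // do not leak the file handle on a bad gzip header
                throw e;
            }
        }
        return new BufferedReader(new InputStreamReader(in, StandardCharsets.UTF_8));
    }

    // Opens a text writer, compressing on the fly when the name ends in ".gz".
    static BufferedWriter openWriter(File file) throws IOException {
        OutputStream out = new FileOutputStream(file);
        if (file.getName().endsWith(".gz")) {
            out = new GZIPOutputStream(out);
        }
        return new BufferedWriter(new OutputStreamWriter(out, StandardCharsets.UTF_8));
    }
}
```

Callers treat the returned reader or writer like any other; closing it also finishes the gzip stream, so try-with-resources is the simplest way to use it.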
- Author:
- adampauls
Methods inherited from class java.lang.Object
clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait
MakeLmBinaryFromGoogle
public MakeLmBinaryFromGoogle()
main
public static void main(String[] argv)