java.lang.Object
  jml.topics.TopicModel
    jml.topics.LDA
public class LDA extends TopicModel
Field Summary | |
---|---|
(package private) LdaGibbsSampler | gibbsSampler |
Fields inherited from class jml.topics.TopicModel |
---|
dataMatrix, indicatorMatrix, nTopic, topicMatrix |
Constructor Summary | |
---|---|
LDA() | |
LDA(int nTopic) | |
LDA(LDAOptions LDAOptions) | |
Method Summary | |
---|---|
static void | main(java.lang.String[] args) |
void | readCorpus(java.util.ArrayList<java.util.TreeMap<java.lang.Integer,java.lang.Integer>> docTermCountArray) - Load corpus and documents from an ArrayList<TreeMap<Integer, Integer>> instance. |
void | readCorpus(int[][] documents) - Feed documents from a 2D integer array. |
void | readCorpus(org.apache.commons.math.linear.RealMatrix X) - Load corpus and documents from a RealMatrix instance. |
void | readCorpus(java.lang.String LDAInputDataFilePath) - Load corpus and documents from an LDAInput file. |
void | readCorpusFromDocTermCountFile(java.lang.String docTermCountFilePath) - Load corpus and documents from a text file located at String docTermCountFilePath. |
void | train() - Train this topic model to fit the given corpus. |
Methods inherited from class jml.topics.TopicModel |
---|
getIndicatorMatrix, getTopicMatrix |
Methods inherited from class java.lang.Object |
---|
clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait |
Field Detail |
---|
LdaGibbsSampler gibbsSampler
Constructor Detail |
---|
public LDA(LDAOptions LDAOptions)
public LDA()
public LDA(int nTopic)
Method Detail |
---|
public static void main(java.lang.String[] args)
Parameters:
args
public void train()
Description copied from class: TopicModel
Train this topic model to fit the given corpus.
train in class TopicModel
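The gibbsSampler field suggests that train() fits the model by Gibbs sampling. As a rough illustration of what such training involves, here is a minimal collapsed Gibbs sampler for LDA. It is a sketch under stated assumptions, not jml's implementation: the class GibbsLdaSketch, its method names, the symmetric hyperparameters alpha and beta, and the fixed iteration count are all hypothetical. The input uses the documents[m][n] term-index layout described for readCorpus(int[][] documents).

```java
import java.util.Arrays;
import java.util.Random;

/** Hypothetical sketch of collapsed Gibbs sampling for LDA; not jml's LdaGibbsSampler. */
class GibbsLdaSketch {
    final int[][] docs;        // docs[m][n]: term index of the n-th word of document m
    final int K, V;            // number of topics, vocabulary size
    final double alpha, beta;  // symmetric Dirichlet hyperparameters (assumed)
    final int[][] z;           // z[m][n]: topic currently assigned to that word
    final int[][] docTopic;    // docTopic[m][k]: words of document m assigned to topic k
    final int[][] topicTerm;   // topicTerm[k][v]: times term v is assigned to topic k
    final int[] topicTotal;    // topicTotal[k]: all words assigned to topic k
    final Random rng;

    GibbsLdaSketch(int[][] docs, int V, int K, double alpha, double beta, long seed) {
        this.docs = docs; this.V = V; this.K = K; this.alpha = alpha; this.beta = beta;
        rng = new Random(seed);
        z = new int[docs.length][];
        docTopic = new int[docs.length][K];
        topicTerm = new int[K][V];
        topicTotal = new int[K];
        // Start from a uniformly random topic assignment for every word.
        for (int m = 0; m < docs.length; m++) {
            z[m] = new int[docs[m].length];
            for (int n = 0; n < docs[m].length; n++) {
                int k = rng.nextInt(K);
                z[m][n] = k;
                docTopic[m][k]++; topicTerm[k][docs[m][n]]++; topicTotal[k]++;
            }
        }
    }

    /** Runs the given number of full sweeps over all words. */
    void train(int iterations) {
        double[] p = new double[K];
        for (int it = 0; it < iterations; it++) {
            for (int m = 0; m < docs.length; m++) {
                for (int n = 0; n < docs[m].length; n++) {
                    int v = docs[m][n], old = z[m][n];
                    // Remove the current word from the counts.
                    docTopic[m][old]--; topicTerm[old][v]--; topicTotal[old]--;
                    // Unnormalized conditional probability of each topic for this word.
                    double sum = 0;
                    for (int k = 0; k < K; k++) {
                        p[k] = (docTopic[m][k] + alpha) * (topicTerm[k][v] + beta)
                                / (topicTotal[k] + V * beta);
                        sum += p[k];
                    }
                    // Draw a new topic proportionally to p.
                    double u = rng.nextDouble() * sum;
                    int k = 0;
                    double acc = p[0];
                    while (k < K - 1 && u >= acc) { k++; acc += p[k]; }
                    z[m][n] = k;
                    docTopic[m][k]++; topicTerm[k][v]++; topicTotal[k]++;
                }
            }
        }
    }

    /** Point estimate of the K x V topic-term probabilities; each row sums to 1. */
    double[][] topicTermProbabilities() {
        double[][] phi = new double[K][V];
        for (int k = 0; k < K; k++)
            for (int v = 0; v < V; v++)
                phi[k][v] = (topicTerm[k][v] + beta) / (topicTotal[k] + V * beta);
        return phi;
    }

    public static void main(String[] args) {
        int[][] docs = { {0, 1, 0, 1}, {2, 3, 2, 3}, {0, 1, 2} };
        GibbsLdaSketch lda = new GibbsLdaSketch(docs, 4, 2, 0.5, 0.1, 42L);
        lda.train(200);
        System.out.println(Arrays.deepToString(lda.topicTermProbabilities()));
    }
}
```

The K x V orientation of the returned array is a choice made for this sketch; it says nothing about how jml lays out topicMatrix.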
public void readCorpus(java.util.ArrayList<java.util.TreeMap<java.lang.Integer,java.lang.Integer>> docTermCountArray)
Load corpus and documents from an ArrayList<TreeMap<Integer, Integer>> instance. Each element of the ArrayList is a doc-term count mapping.
Parameters:
docTermCountArray - An ArrayList<TreeMap<Integer, Integer>> instance; each element of the ArrayList records the doc-term count mapping for the corresponding document.
public void readCorpus(java.lang.String LDAInputDataFilePath)
Load corpus and documents from an LDAInput file. Term indices must start from 0.
Parameters:
LDAInputDataFilePath - The path of the LDAInput file.
public void readCorpusFromDocTermCountFile(java.lang.String docTermCountFilePath)
Load corpus and documents from a text file located at String docTermCountFilePath.
Parameters:
docTermCountFilePath - A String specifying the location of the text file holding doc-term-count matrix data.
public void readCorpus(int[][] documents)
Feed documents from a 2D integer array.
Parameters:
documents - a 2D integer array where documents[m][n] is the term index in the vocabulary for the n-th word of the m-th document. Indices always start from 0.
public void readCorpus(org.apache.commons.math.linear.RealMatrix X)
Load corpus and documents from a RealMatrix instance.
readCorpus in class TopicModel
Parameters:
X - a matrix in which each column is the term count vector of a document, with X(i, j) being the number of occurrences of the i-th vocabulary term in the j-th document
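The in-memory inputs accepted by the readCorpus overloads carry the same information in different shapes. The sketch below shows how one might derive the other two forms from a documents[m][n] array. It assumes each TreeMap maps a term index to that term's count in the document, which the parameter description implies but does not state outright. The class CorpusFormats and its methods are hypothetical helpers, not part of jml, and a plain double[][] stands in for the RealMatrix so the example has no dependencies; with Commons Math available it could be wrapped in, for example, an Array2DRowRealMatrix.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.TreeMap;

/** Hypothetical helpers (not part of jml) relating the corpus input formats. */
class CorpusFormats {

    /** Converts word-index documents into one term-index to count map per document. */
    static ArrayList<TreeMap<Integer, Integer>> toDocTermCounts(int[][] documents) {
        ArrayList<TreeMap<Integer, Integer>> result = new ArrayList<>();
        for (int[] doc : documents) {
            TreeMap<Integer, Integer> counts = new TreeMap<>();
            for (int term : doc) {
                counts.merge(term, 1, Integer::sum); // increment this term's count
            }
            result.add(counts);
        }
        return result;
    }

    /** Builds a vocabSize x nDocs matrix where x[i][j] counts term i in document j. */
    static double[][] toTermDocMatrix(int[][] documents, int vocabSize) {
        double[][] x = new double[vocabSize][documents.length];
        for (int j = 0; j < documents.length; j++) {
            for (int term : documents[j]) {
                x[term][j] += 1.0; // term indices start from 0
            }
        }
        return x;
    }

    public static void main(String[] args) {
        int[][] documents = { {0, 2, 2, 1}, {1, 1, 3} };
        System.out.println(toDocTermCounts(documents));
        System.out.println(Arrays.deepToString(toTermDocMatrix(documents, 4)));
        // With jml on the classpath, any of these forms could then be passed to the
        // matching readCorpus overload before calling train(); omitted here.
    }
}
```

Note that the count-based forms discard word order within a document, so the conversion from documents[m][n] cannot be reversed exactly.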