Package weka.classifiers.bayes
Class ComplementNaiveBayes
java.lang.Object
  weka.classifiers.Classifier
    weka.classifiers.bayes.ComplementNaiveBayes
- All Implemented Interfaces:
- java.io.Serializable, java.lang.Cloneable, CapabilitiesHandler, OptionHandler, RevisionHandler, TechnicalInformationHandler, WeightedInstancesHandler
public class ComplementNaiveBayes extends Classifier implements OptionHandler, WeightedInstancesHandler, TechnicalInformationHandler
Class for building and using a Complement class Naive Bayes classifier.
For more information, see:
Jason D. Rennie, Lawrence Shih, Jaime Teevan, David R. Karger: Tackling the Poor Assumptions of Naive Bayes Text Classifiers. In: ICML, 616-623, 2003.
P.S.: TF, IDF and length normalization transforms, as described in the paper, can be performed through weka.filters.unsupervised.StringToWordVector.
BibTeX:
@inproceedings{Rennie2003,
   author = {Jason D. Rennie and Lawrence Shih and Jaime Teevan and David R. Karger},
   booktitle = {ICML},
   pages = {616-623},
   publisher = {AAAI Press},
   title = {Tackling the Poor Assumptions of Naive Bayes Text Classifiers},
   year = {2003}
}
Valid options are:
-N Normalize the word weights for each class
-S Smoothing value to avoid zero WordGivenClass probabilities (default=1.0).
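The -N and -S options both act on the per-class word weights. As a rough illustration, the stand-alone sketch below computes complement-class weights along the lines of the Rennie et al. paper cited above: for each class, the weight of a word is the log of its smoothed relative frequency in all the other classes, optionally divided by the sum of absolute weights for that class. It is not taken from the Weka source. The class name ComplementWeightsSketch, the method complementWeights, the counts[c][i] layout, and the choice to add the smoothing value once per vocabulary word in the denominator are all assumptions made for illustration.

```java
import java.util.Arrays;

/** Illustrative sketch only; hypothetical names, not the Weka implementation. */
final class ComplementWeightsSketch {

    /**
     * counts[c][i] is the total frequency of word i in training documents of class c.
     * Returns w[c][i] = log((count of i outside c + s) / (all words outside c + s * V)),
     * where s is the smoothing value and V the vocabulary size (assumed form).
     */
    static double[][] complementWeights(double[][] counts, double smoothing, boolean normalize) {
        int numClasses = counts.length;
        int vocab = counts[0].length;
        double[] wordTotals = new double[vocab];        // frequency of each word over all classes
        double[] classTotals = new double[numClasses];  // total word count per class
        double grandTotal = 0.0;
        for (int c = 0; c < numClasses; c++) {
            for (int i = 0; i < vocab; i++) {
                wordTotals[i] += counts[c][i];
                classTotals[c] += counts[c][i];
                grandTotal += counts[c][i];
            }
        }
        double[][] weights = new double[numClasses][vocab];
        for (int c = 0; c < numClasses; c++) {
            double complementTotal = grandTotal - classTotals[c];
            double absSum = 0.0;
            for (int i = 0; i < vocab; i++) {
                double complementCount = wordTotals[i] - counts[c][i];
                weights[c][i] = Math.log((complementCount + smoothing)
                        / (complementTotal + smoothing * vocab));
                absSum += Math.abs(weights[c][i]);
            }
            // Counterpart of -N: scale each class's weights so their absolute values sum to 1.
            if (normalize && absSum > 0.0) {
                for (int i = 0; i < vocab; i++) {
                    weights[c][i] /= absSum;
                }
            }
        }
        return weights;
    }

    public static void main(String[] args) {
        double[][] counts = { {3, 0}, {1, 2} }; // two classes, two words
        System.out.println(Arrays.deepToString(complementWeights(counts, 1.0, false)));
    }
}
```

With a positive smoothing value, the argument of the logarithm is never zero, which is the purpose the -S option describes.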
- Version:
- $Revision: 5516 $
- Author:
- Ashraf M. Kibriya (amk14@cs.waikato.ac.nz)
- See Also:
- Serialized Form
-
Constructor Summary
ComplementNaiveBayes()
-
Method Summary
Modifier and Type, Method, Description:
void buildClassifier(Instances instances)
    Generates the classifier.
double classifyInstance(Instance instance)
    Classifies a given instance.
Capabilities getCapabilities()
    Returns default capabilities of the classifier.
boolean getNormalizeWordWeights()
    Returns true if the word weights for each class are to be normalized.
java.lang.String[] getOptions()
    Gets the current settings of the classifier.
java.lang.String getRevision()
    Returns the revision string.
double getSmoothingParameter()
    Gets the smoothing value to be used to avoid zero WordGivenClass probabilities.
TechnicalInformation getTechnicalInformation()
    Returns an instance of a TechnicalInformation object, containing detailed information about the technical background of this class, e.g., paper reference or book this class is based on.
java.lang.String globalInfo()
    Returns a string describing this classifier.
java.util.Enumeration listOptions()
    Returns an enumeration describing the available options.
static void main(java.lang.String[] argv)
    Main method for testing this class.
java.lang.String normalizeWordWeightsTipText()
    Returns the tip text for this property.
void setNormalizeWordWeights(boolean doNormalize)
    Sets whether the word weights for each class should be normalized.
void setOptions(java.lang.String[] options)
    Parses a given list of options.
void setSmoothingParameter(double val)
    Sets the smoothing value used to avoid zero WordGivenClass probabilities.
java.lang.String smoothingParameterTipText()
    Returns the tip text for this property.
java.lang.String toString()
    Prints out the internal model built by the classifier.
Methods inherited from class weka.classifiers.Classifier
debugTipText, distributionForInstance, forName, getDebug, makeCopies, makeCopy, setDebug
-
Method Detail
-
listOptions
public java.util.Enumeration listOptions()
Returns an enumeration describing the available options.
- Specified by:
- listOptions in interface OptionHandler
- Overrides:
- listOptions in class Classifier
- Returns:
- an enumeration of all the available options.
-
getOptions
public java.lang.String[] getOptions()
Gets the current settings of the classifier.
- Specified by:
- getOptions in interface OptionHandler
- Overrides:
- getOptions in class Classifier
- Returns:
- an array of strings suitable for passing to setOptions
-
setOptions
public void setOptions(java.lang.String[] options) throws java.lang.Exception
Parses a given list of options. Valid options are:
-N Normalize the word weights for each class
-S Smoothing value to avoid zero WordGivenClass probabilities (default=1.0).
- Specified by:
- setOptions in interface OptionHandler
- Overrides:
- setOptions in class Classifier
- Parameters:
- options - the list of options as an array of strings
- Throws:
- java.lang.Exception - if an option is not supported
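The two documented options, and the round trip between getOptions and setOptions, can be illustrated with a small stand-alone parser. This is a sketch only: CnbOptionsSketch, parse and toOptions are hypothetical names, and the code is not Weka's option handling. It assumes that the value for -S follows the flag as the next array element, and it rejects everything except -N and -S, so any options handled by the superclass are out of scope here.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical holder for the two documented options; not part of Weka. */
final class CnbOptionsSketch {
    boolean normalizeWordWeights = false; // set by -N
    double smoothingParameter = 1.0;      // set by -S, documented default is 1.0

    /** Parses an option array such as {"-N", "-S", "0.5"}. */
    static CnbOptionsSketch parse(String[] options) {
        CnbOptionsSketch parsed = new CnbOptionsSketch();
        for (int i = 0; i < options.length; i++) {
            switch (options[i]) {
                case "-N":
                    parsed.normalizeWordWeights = true;
                    break;
                case "-S":
                    if (i + 1 >= options.length) {
                        throw new IllegalArgumentException("-S requires a value");
                    }
                    parsed.smoothingParameter = Double.parseDouble(options[++i]);
                    break;
                default:
                    throw new IllegalArgumentException("Unsupported option: " + options[i]);
            }
        }
        return parsed;
    }

    /** Produces an array that parse() accepts, mirroring the getOptions contract. */
    String[] toOptions() {
        List<String> out = new ArrayList<>();
        if (normalizeWordWeights) {
            out.add("-N");
        }
        out.add("-S");
        out.add(Double.toString(smoothingParameter));
        return out.toArray(new String[0]);
    }

    public static void main(String[] args) {
        CnbOptionsSketch o = parse(new String[] {"-N", "-S", "0.5"});
        System.out.println(String.join(" ", o.toOptions()));
    }
}
```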
-
getNormalizeWordWeights
public boolean getNormalizeWordWeights()
Returns true if the word weights for each class are to be normalized.
- Returns:
- true if the word weights are normalized
-
setNormalizeWordWeights
public void setNormalizeWordWeights(boolean doNormalize)
Sets whether the word weights for each class should be normalized.
- Parameters:
- doNormalize - whether the word weights are to be normalized
-
normalizeWordWeightsTipText
public java.lang.String normalizeWordWeightsTipText()
Returns the tip text for this property.
- Returns:
- tip text for this property suitable for displaying in the explorer/experimenter gui
-
getSmoothingParameter
public double getSmoothingParameter()
Gets the smoothing value to be used to avoid zero WordGivenClass probabilities.
- Returns:
- the smoothing value
-
setSmoothingParameter
public void setSmoothingParameter(double val)
Sets the smoothing value used to avoid zero WordGivenClass probabilities.
- Parameters:
- val - the new smoothing value
-
smoothingParameterTipText
public java.lang.String smoothingParameterTipText()
Returns the tip text for this property.
- Returns:
- tip text for this property suitable for displaying in the explorer/experimenter gui
-
globalInfo
public java.lang.String globalInfo()
Returns a string describing this classifier.
- Returns:
- a description of the classifier suitable for displaying in the explorer/experimenter gui
-
getTechnicalInformation
public TechnicalInformation getTechnicalInformation()
Returns an instance of a TechnicalInformation object, containing detailed information about the technical background of this class, e.g., paper reference or book this class is based on.
- Specified by:
- getTechnicalInformation in interface TechnicalInformationHandler
- Returns:
- the technical information about this class
-
getCapabilities
public Capabilities getCapabilities()
Returns default capabilities of the classifier.
- Specified by:
- getCapabilities in interface CapabilitiesHandler
- Overrides:
- getCapabilities in class Classifier
- Returns:
- the capabilities of this classifier
- See Also:
- Capabilities
-
buildClassifier
public void buildClassifier(Instances instances) throws java.lang.Exception
Generates the classifier.
- Specified by:
- buildClassifier in class Classifier
- Parameters:
- instances - set of instances serving as training data
- Throws:
- java.lang.Exception - if the classifier has not been built successfully
-
classifyInstance
public double classifyInstance(Instance instance) throws java.lang.Exception
Classifies a given instance. The classification rule is:
MinC(forAllWords(ti*Wci))
where
ti is the frequency of word i in the given instance
Wci is the weight of word i in Class c.
That is, the predicted class is the class c for which the sum of ti*Wci over all words i is smallest. For more information, see section 4.4 of the paper mentioned above in the classifier's description.
- Overrides:
- classifyInstance in class Classifier
- Parameters:
- instance - the instance to classify
- Returns:
- the index of the class the instance is most likely to belong to.
- Throws:
- java.lang.Exception - if the classifier has not been built yet.
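The rule MinC(forAllWords(ti*Wci)) can be sketched in a few lines of plain Java. The example is illustrative and is not the Weka implementation: ComplementRuleSketch and classify are hypothetical names, and the weights array is assumed to be indexed as weights[c][i] for class c and word i.

```java
/** Illustrative sketch of the documented decision rule; hypothetical names, not Weka code. */
final class ComplementRuleSketch {

    /**
     * Returns the index of the class c that minimizes the sum over words i of
     * termFrequencies[i] * weights[c][i], or -1 if weights has no rows.
     */
    static int classify(double[] termFrequencies, double[][] weights) {
        int best = -1;
        double bestScore = Double.POSITIVE_INFINITY;
        for (int c = 0; c < weights.length; c++) {
            double score = 0.0;
            for (int i = 0; i < termFrequencies.length; i++) {
                score += termFrequencies[i] * weights[c][i];
            }
            // A low score means the instance fits the complement of class c poorly,
            // so class c itself is the preferred label.
            if (score < bestScore) {
                bestScore = score;
                best = c;
            }
        }
        return best;
    }

    public static void main(String[] args) {
        // Example weights for two classes and two words (log of complement frequencies).
        double[][] weights = {
            {Math.log(0.4), Math.log(0.6)},
            {Math.log(0.8), Math.log(0.2)}
        };
        System.out.println(classify(new double[] {2, 0}, weights));
        System.out.println(classify(new double[] {0, 2}, weights));
    }
}
```

The minimum is taken because the weights describe how well a word fits every class other than c, so the smallest total points to the class whose complement matches the instance least.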
-
toString
public java.lang.String toString()
Prints out the internal model built by the classifier. In this case it prints out the word weights calculated when building the classifier.
- Overrides:
- toString in class java.lang.Object
-
getRevision
public java.lang.String getRevision()
Returns the revision string.
- Specified by:
- getRevision in interface RevisionHandler
- Overrides:
- getRevision in class Classifier
- Returns:
- the revision
-
main
public static void main(java.lang.String[] argv)
Main method for testing this class.
- Parameters:
- argv - the options
-