Class RemoveFrequentValues
- java.lang.Object
-
- weka.filters.Filter
-
- weka.filters.unsupervised.instance.RemoveFrequentValues
-
- All Implemented Interfaces:
- java.io.Serializable, CapabilitiesHandler, OptionHandler, RevisionHandler, UnsupervisedFilter
public class RemoveFrequentValues extends Filter implements OptionHandler, UnsupervisedFilter
Determines which values (frequent or infrequent ones) of a nominal attribute are retained and filters the instances accordingly. Values with the same frequency are kept in the order in which they appear in the original Instances object. For example, given the values "1,2,3,4" with the frequencies "10,5,5,3", choosing to keep the 2 most common values returns "1,2": the value "2" comes before "3", even though both have the same frequency.
Valid options are:
-C <num> Choose attribute to be used for selection.
-N <num> Number of values to retain for the specified attribute, i.e. the ones with the most instances (default 2).
-L Instead of values with the most instances the ones with the least are retained.
-H When selecting on nominal attributes, removes header references to excluded values.
-V Invert matching sense.
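The tie-breaking rule described above can be sketched in plain Java. This is only an illustration under stated assumptions, not Weka's implementation: the class RetainSketch and its retain method are hypothetical names, and the sketch relies on a stable sort so that values with equal counts keep their original order.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;

/** Hypothetical helper, not part of Weka: picks the n most (or least) frequent values. */
class RetainSketch {

    /**
     * Returns the labels to retain, in selection order.
     * Ties keep their original order because List.sort is a stable sort.
     */
    static List<String> retain(List<String> labels, int[] counts, int n, boolean least) {
        // Sort label positions rather than labels so each count stays attached to its label.
        List<Integer> order = new ArrayList<>();
        for (int i = 0; i < labels.size(); i++) {
            order.add(i);
        }
        Comparator<Integer> byCount = Comparator.comparingInt(i -> counts[i]);
        order.sort(least ? byCount : byCount.reversed());

        // Never ask for more values than exist.
        int keep = Math.min(n, order.size());
        List<String> kept = new ArrayList<>();
        for (int i = 0; i < keep; i++) {
            kept.add(labels.get(order.get(i)));
        }
        return kept;
    }

    public static void main(String[] args) {
        List<String> labels = List.of("1", "2", "3", "4");
        int[] counts = {10, 5, 5, 3};
        System.out.println(retain(labels, counts, 2, false)); // prints [1, 2]
        System.out.println(retain(labels, counts, 2, true));  // prints [4, 2]
    }
}
```

The first call mirrors the example in the description: "2" and "3" share the frequency 5, and "2" wins because it comes first. The second call shows the same rule with the least frequent values selected, which is what the -L option asks for.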
- Version:
- $Revision: 8972 $
- Author:
- FracPete (fracpete at waikato dot ac dot nz)
- See Also:
- Serialized Form
-
-
Constructor Summary
RemoveFrequentValues()
-
Method Summary
- java.lang.String attributeIndexTipText(): Returns the tip text for this property.
- boolean batchFinished(): Signifies that this batch of input to the filter is finished.
- void determineValues(Instances inst): Determines the values to retain; the number retained is always at least 1 and at most the number of distinct values.
- java.lang.String getAttributeIndex(): Gets the index of the attribute used.
- Capabilities getCapabilities(): Returns the Capabilities of this filter.
- boolean getInvertSelection(): Gets whether the selected values are to be removed or kept.
- boolean getModifyHeader(): Gets whether the header will be modified when selecting on nominal attributes.
- int getNumValues(): Gets how many values are retained.
- java.lang.String[] getOptions(): Gets the current settings of the filter.
- java.lang.String getRevision(): Returns the revision string.
- boolean getUseLeastValues(): Gets whether to use values with least or most instances.
- java.lang.String globalInfo(): Returns a string describing this filter.
- boolean input(Instance instance): Input an instance for filtering.
- java.lang.String invertSelectionTipText(): Returns the tip text for this property.
- boolean isNominal(): Returns true if selection attribute is nominal.
- java.util.Enumeration listOptions(): Returns an enumeration describing the available options.
- static void main(java.lang.String[] argv): Main method for testing this class.
- java.lang.String modifyHeaderTipText(): Returns the tip text for this property.
- java.lang.String numValuesTipText(): Returns the tip text for this property.
- void setAttributeIndex(java.lang.String attIndex): Sets index of the attribute used.
- boolean setInputFormat(Instances instanceInfo): Sets the format of the input instances.
- void setInvertSelection(boolean invert): Sets whether selected values should be removed or kept.
- void setModifyHeader(boolean newModifyHeader): Sets whether the header will be modified when selecting on nominal attributes.
- void setNumValues(int numValues): Sets how many values are retained.
- void setOptions(java.lang.String[] options): Parses a given list of options.
- void setUseLeastValues(boolean leastValues): Sets whether to use values with least or most instances.
- java.lang.String useLeastValuesTipText(): Returns the tip text for this property.
-
Methods inherited from class weka.filters.Filter
batchFilterFile, filterFile, getCapabilities, getOutputFormat, isFirstBatchDone, isNewBatch, isOutputFormatDefined, makeCopies, makeCopy, numPendingOutput, output, outputPeek, toString, useFilter, wekaStaticWrapper
-
-
-
-
Method Detail
-
globalInfo
public java.lang.String globalInfo()
Returns a string describing this filter.
- Returns:
- a description of the filter suitable for displaying in the explorer/experimenter GUI
-
listOptions
public java.util.Enumeration listOptions()
Returns an enumeration describing the available options.
- Specified by:
- listOptions in interface OptionHandler
- Returns:
- an enumeration of all the available options.
-
setOptions
public void setOptions(java.lang.String[] options) throws java.lang.Exception
Parses a given list of options. Valid options are:
-C <num> Choose attribute to be used for selection.
-N <num> Number of values to retain for the specified attribute, i.e. the ones with the most instances (default 2).
-L Instead of values with the most instances the ones with the least are retained.
-H When selecting on nominal attributes, removes header references to excluded values.
-V Invert matching sense.
- Specified by:
- setOptions in interface OptionHandler
- Parameters:
- options - the list of options as an array of strings
- Throws:
- java.lang.Exception - if an option is not supported
-
getOptions
public java.lang.String[] getOptions()
Gets the current settings of the filter.
- Specified by:
- getOptions in interface OptionHandler
- Returns:
- an array of strings suitable for passing to setOptions
-
attributeIndexTipText
public java.lang.String attributeIndexTipText()
Returns the tip text for this property.
- Returns:
- tip text for this property suitable for displaying in the explorer/experimenter GUI
-
getAttributeIndex
public java.lang.String getAttributeIndex()
Gets the index of the attribute used.
- Returns:
- the index of the attribute
-
setAttributeIndex
public void setAttributeIndex(java.lang.String attIndex)
Sets the index of the attribute used.
- Parameters:
- attIndex - the index of the attribute
-
numValuesTipText
public java.lang.String numValuesTipText()
Returns the tip text for this property.
- Returns:
- tip text for this property suitable for displaying in the explorer/experimenter GUI
-
getNumValues
public int getNumValues()
Gets how many values are retained.
- Returns:
- how many values are retained
-
setNumValues
public void setNumValues(int numValues)
Sets how many values are retained.
- Parameters:
- numValues - the number of values to retain
-
useLeastValuesTipText
public java.lang.String useLeastValuesTipText()
Returns the tip text for this property.
- Returns:
- tip text for this property suitable for displaying in the explorer/experimenter GUI
-
getUseLeastValues
public boolean getUseLeastValues()
Gets whether to use values with least or most instances.
- Returns:
- true if values with least instances are retained
-
setUseLeastValues
public void setUseLeastValues(boolean leastValues)
Sets whether to use values with least or most instances.
- Parameters:
- leastValues - whether values with least or most instances are retained
-
modifyHeaderTipText
public java.lang.String modifyHeaderTipText()
Returns the tip text for this property.
- Returns:
- tip text for this property suitable for displaying in the explorer/experimenter GUI
-
getModifyHeader
public boolean getModifyHeader()
Gets whether the header will be modified when selecting on nominal attributes.
- Returns:
- true if so.
-
setModifyHeader
public void setModifyHeader(boolean newModifyHeader)
Sets whether the header will be modified when selecting on nominal attributes.
- Parameters:
- newModifyHeader - true if so.
-
invertSelectionTipText
public java.lang.String invertSelectionTipText()
Returns the tip text for this property.
- Returns:
- tip text for this property suitable for displaying in the explorer/experimenter GUI
-
getInvertSelection
public boolean getInvertSelection()
Gets whether the selected values are to be removed or kept.
- Returns:
- true if the selected values will be kept
-
setInvertSelection
public void setInvertSelection(boolean invert)
Sets whether selected values should be removed or kept. If true, the selected values are kept and unselected values are deleted.
- Parameters:
- invert - the new invert setting
-
isNominal
public boolean isNominal()
Returns true if selection attribute is nominal.
- Returns:
- true if selection attribute is nominal
-
determineValues
public void determineValues(Instances inst)
Determines the values to retain. The number of values retained is always at least 1 and at most the number of distinct values.
- Parameters:
- inst - the Instances from which the values to keep are determined
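The bounds stated above ("at least 1 and up to the maximum number of distinct values") amount to clamping the requested number of values. A minimal sketch, assuming the clamp is applied to the value set through setNumValues; ClampSketch and effectiveNumValues are hypothetical names and are not taken from the Weka source.

```java
/** Hypothetical helper, not Weka code: bounds the requested number of values. */
class ClampSketch {

    /** Returns a count that is at least 1 and at most distinctValues (when that is 1 or more). */
    static int effectiveNumValues(int requested, int distinctValues) {
        return Math.max(1, Math.min(requested, distinctValues));
    }

    public static void main(String[] args) {
        System.out.println(effectiveNumValues(0, 4)); // prints 1
        System.out.println(effectiveNumValues(2, 4)); // prints 2
        System.out.println(effectiveNumValues(9, 4)); // prints 4
    }
}
```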
-
getCapabilities
public Capabilities getCapabilities()
Returns the Capabilities of this filter.
- Specified by:
- getCapabilities in interface CapabilitiesHandler
- Overrides:
- getCapabilities in class Filter
- Returns:
- the capabilities of this object
- See Also:
- Capabilities
-
setInputFormat
public boolean setInputFormat(Instances instanceInfo) throws java.lang.Exception
Sets the format of the input instances.
- Overrides:
- setInputFormat in class Filter
- Parameters:
- instanceInfo - an Instances object containing the input instance structure (any instances contained in the object are ignored; only the structure is required).
- Returns:
- true if the outputFormat can be collected immediately
- Throws:
- UnsupportedAttributeTypeException - if the specified attribute is not nominal.
- java.lang.Exception - if the inputFormat can't be set successfully
-
input
public boolean input(Instance instance)
Input an instance for filtering. Ordinarily the instance is processed and made available for output immediately. Some filters require all instances be read before producing output.
-
batchFinished
public boolean batchFinished()
Signifies that this batch of input to the filter is finished. If the filter requires all instances prior to filtering, output() may now be called to retrieve the filtered instances.
- Overrides:
- batchFinished in class Filter
- Returns:
- true if there are instances pending output
- Throws:
- java.lang.IllegalStateException - if no input structure has been defined
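The input / batchFinished / output sequence described above can be illustrated with a small stand-in written in plain Java. This is a sketch under stated assumptions and does not use or mirror Weka's classes: BatchFilterSketch is a hypothetical name, rows are plain strings rather than Instance objects, and ties are broken by first appearance in the data, which is a simplification.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.Queue;

/** Hypothetical stand-in for a filter that must see the whole batch before it can output. */
class BatchFilterSketch {
    private final int numValues;
    private final List<String> buffer = new ArrayList<>();
    private final Queue<String> pending = new ArrayDeque<>();

    BatchFilterSketch(int numValues) {
        this.numValues = numValues;
    }

    /** Buffers one row; returns false because no output is available yet. */
    boolean input(String row) {
        buffer.add(row);
        return false;
    }

    /** Counts the buffered rows, keeps the most frequent values, and queues the matching rows. */
    boolean batchFinished() {
        // LinkedHashMap remembers first appearance, which serves as the tie-break order here.
        Map<String, Integer> counts = new LinkedHashMap<>();
        for (String row : buffer) {
            counts.merge(row, 1, Integer::sum);
        }
        List<String> values = new ArrayList<>(counts.keySet());
        values.sort((a, b) -> Integer.compare(counts.get(b), counts.get(a))); // stable sort
        List<String> kept = values.subList(0, Math.min(numValues, values.size()));
        for (String row : buffer) {
            if (kept.contains(row)) {
                pending.add(row);
            }
        }
        buffer.clear();
        return !pending.isEmpty();
    }

    /** Returns the next filtered row, or null when nothing is pending. */
    String output() {
        return pending.poll();
    }

    public static void main(String[] args) {
        BatchFilterSketch filter = new BatchFilterSketch(2);
        for (String row : new String[] {"a", "b", "a", "c", "b", "a"}) {
            filter.input(row);
        }
        if (filter.batchFinished()) {
            for (String row = filter.output(); row != null; row = filter.output()) {
                System.out.print(row + " "); // prints: a b a b a
            }
        }
    }
}
```

The value frequencies are only known once every row has been seen, which is why input buffers instead of emitting and why output is drained after batchFinished returns true.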
-
getRevision
public java.lang.String getRevision()
Returns the revision string.
- Specified by:
- getRevision in interface RevisionHandler
- Overrides:
- getRevision in class Filter
- Returns:
- the revision
-
main
public static void main(java.lang.String[] argv)
Main method for testing this class.
- Parameters:
- argv - should contain arguments to the filter: use -h for help
-
-