dragon.nlp.extract
Class AbstractConceptExtractor

java.lang.Object
  |
  +--dragon.nlp.extract.AbstractConceptExtractor
All Implemented Interfaces:
ConceptExtractor
Direct Known Subclasses:
AbstractPhraseExtractor, AbstractTermExtractor, AbstractTokenExtractor

public abstract class AbstractConceptExtractor
extends java.lang.Object
implements ConceptExtractor

Abstract class for concept extraction. It is the superclass of AbstractPhraseExtractor, AbstractTermExtractor, AbstractTokenExtractor, and AbstractTripleExtractor.

Copyright: Copyright (c) 2005

Company: IST, Drexel University

Version:
1.0
Author:
Davis Zhou

Field Summary
protected  ConceptFilter cf
           
protected  boolean conceptFilter_enabled
           
protected  java.util.ArrayList conceptList
           
protected  DocumentParser parser
           
protected  boolean subconcept_enabled
           
 
Constructor Summary
AbstractConceptExtractor()
           
 
Method Summary
 java.util.ArrayList extractFromDoc(Document doc)
          Extracts concepts from a parsed document
 java.util.ArrayList extractFromDoc(java.lang.String doc)
          Extracts concepts from a raw document
 ConceptFilter getConceptFilter()
          Gets the concept filter used for this extractor.
 java.util.ArrayList getConceptList()
           
 DocumentParser getDocumentParser()
          Gets document parser.
 boolean getFilteringOption()
          Tests if the extractor applies concept filtering.
 boolean getSubConceptOption()
           
 SortedArray mergeConceptByEntryID(java.util.ArrayList termList)
          The concepts with identical entry id will be merged.
 SortedArray mergeConceptByName(java.util.ArrayList termList)
          The concepts with identical names will be merged.
 void print(java.io.PrintWriter out)
          Prints the extracted concepts to the specified print writer.
 void print(java.io.PrintWriter out, java.util.ArrayList conceptList)
          Prints the given list of concepts to the specified print writer.
 void setConceptFilter(ConceptFilter cf)
          Sets the concept filter for the concept extractor.
 void setDocumentParser(DocumentParser parser)
          Sets the document parser for the concept extractor.
 void setFilteringOption(boolean option)
          Sets the option of concept filtering
 void setSubConceptOption(boolean option)
           
 
Methods inherited from class java.lang.Object
clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait
 
Methods inherited from interface dragon.nlp.extract.ConceptExtractor
extractFromSentence, getLemmatiser, initDocExtraction, setLemmatiser, supportConceptEntry, supportConceptName
 

Field Detail

conceptList

protected java.util.ArrayList conceptList

conceptFilter_enabled

protected boolean conceptFilter_enabled

subconcept_enabled

protected boolean subconcept_enabled

cf

protected ConceptFilter cf

parser

protected DocumentParser parser
Constructor Detail

AbstractConceptExtractor

public AbstractConceptExtractor()
Method Detail

setSubConceptOption

public void setSubConceptOption(boolean option)
Specified by:
setSubConceptOption in interface ConceptExtractor

getSubConceptOption

public boolean getSubConceptOption()
Specified by:
getSubConceptOption in interface ConceptExtractor

getFilteringOption

public boolean getFilteringOption()
Description copied from interface: ConceptExtractor
Tests if the extractor applies concept filtering.

Specified by:
getFilteringOption in interface ConceptExtractor
Returns:
true if the extractor applies concept filtering

setFilteringOption

public void setFilteringOption(boolean option)
Description copied from interface: ConceptExtractor
Sets the option of concept filtering

Specified by:
setFilteringOption in interface ConceptExtractor
Parameters:
option - the option of concept filtering

setConceptFilter

public void setConceptFilter(ConceptFilter cf)
Description copied from interface: ConceptExtractor
Sets the concept filter for the concept extractor.

Specified by:
setConceptFilter in interface ConceptExtractor
Parameters:
cf - the concept filter

getConceptFilter

public ConceptFilter getConceptFilter()
Description copied from interface: ConceptExtractor
Gets the concept filter used for this extractor.

Specified by:
getConceptFilter in interface ConceptExtractor
Returns:
the concept filter used.
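
The filtering option and the concept filter are documented separately, so a short sketch may help show how the two could interact. This is an illustration only, not Dragon's implementation: `SketchFilter`, its `keep` method, `FilteringSketch`, and the rule that a filter is applied only when the option is on and a filter is set are all assumptions, and concepts are reduced to plain strings.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical stand-in for ConceptFilter; the real interface may differ.
interface SketchFilter {
    boolean keep(String concept);
}

// Hypothetical holder for the two settings described above.
class FilteringSketch {
    private SketchFilter cf;
    private boolean conceptFilterEnabled;

    void setConceptFilter(SketchFilter cf) { this.cf = cf; }

    SketchFilter getConceptFilter() { return cf; }

    void setFilteringOption(boolean option) { this.conceptFilterEnabled = option; }

    boolean getFilteringOption() { return conceptFilterEnabled; }

    // Assumption: candidates pass through untouched unless filtering is
    // switched on AND a filter has been supplied.
    List<String> applyFilter(List<String> candidates) {
        if (!conceptFilterEnabled || cf == null) {
            return new ArrayList<>(candidates);
        }
        List<String> kept = new ArrayList<>();
        for (String candidate : candidates) {
            if (cf.keep(candidate)) {
                kept.add(candidate);
            }
        }
        return kept;
    }
}
```

In this sketch, supplying a filter while the option is off leaves the candidate list unchanged, which keeps the two setters independent of each other.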

getConceptList

public java.util.ArrayList getConceptList()
Specified by:
getConceptList in interface ConceptExtractor
Returns:
a list of extracted concepts

print

public void print(java.io.PrintWriter out)
Description copied from interface: ConceptExtractor
Prints the extracted concepts to the specified print writer.

Specified by:
print in interface ConceptExtractor
Parameters:
out - the print writer

print

public void print(java.io.PrintWriter out,
                  java.util.ArrayList conceptList)
Description copied from interface: ConceptExtractor
Prints the given list of concepts to the specified print writer.

Specified by:
print in interface ConceptExtractor
Parameters:
out - the print writer
conceptList - a list of concepts for output

mergeConceptByEntryID

public SortedArray mergeConceptByEntryID(java.util.ArrayList termList)
Description copied from interface: ConceptExtractor
The concepts with identical entry id will be merged.

Specified by:
mergeConceptByEntryID in interface ConceptExtractor
Parameters:
termList - a list of concepts
Returns:
a list of merged and sorted unique concepts

mergeConceptByName

public SortedArray mergeConceptByName(java.util.ArrayList termList)
Description copied from interface: ConceptExtractor
The concepts with identical names will be merged.

Specified by:
mergeConceptByName in interface ConceptExtractor
Parameters:
termList - a list of concepts
Returns:
a list of merged and sorted unique concepts
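
The two merge methods differ only in the key used to decide which concepts are identical. A minimal sketch of that idea follows; it is not Dragon's implementation. `SimpleConcept`, its `name`, `entryId`, and `frequency` fields, `ConceptMergeSketch`, and the choice to sum frequencies when merging are assumptions made for illustration, and a `TreeMap` stands in for `SortedArray`.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.TreeMap;
import java.util.function.Function;

// Hypothetical stand-in for a Dragon concept; not part of the toolkit.
class SimpleConcept {
    final String name;
    final String entryId;
    int frequency;

    SimpleConcept(String name, String entryId, int frequency) {
        this.name = name;
        this.entryId = entryId;
        this.frequency = frequency;
    }
}

class ConceptMergeSketch {
    // Groups concepts by the given key and sums the frequencies of duplicates.
    // The TreeMap keeps the merged concepts sorted by key.
    static List<SimpleConcept> mergeBy(List<SimpleConcept> concepts,
                                       Function<SimpleConcept, String> key) {
        TreeMap<String, SimpleConcept> merged = new TreeMap<>();
        for (SimpleConcept c : concepts) {
            String k = key.apply(c);
            SimpleConcept seen = merged.get(k);
            if (seen == null) {
                // Copy so the input list is left untouched.
                merged.put(k, new SimpleConcept(c.name, c.entryId, c.frequency));
            } else {
                seen.frequency += c.frequency;
            }
        }
        return new ArrayList<>(merged.values());
    }
}
```

Passing `c -> c.name` mimics a merge by name, and `c -> c.entryId` mimics a merge by entry ID; in both cases the result holds one concept per distinct key, in key order.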

extractFromDoc

public java.util.ArrayList extractFromDoc(java.lang.String doc)
Description copied from interface: ConceptExtractor
Extracts concepts from a raw document

Specified by:
extractFromDoc in interface ConceptExtractor
Parameters:
doc - the content of the document
Returns:
a list of extracted concepts

extractFromDoc

public java.util.ArrayList extractFromDoc(Document doc)
Description copied from interface: ConceptExtractor
Extracts concepts from a parsed document

Specified by:
extractFromDoc in interface ConceptExtractor
Parameters:
doc - a parsed document
Returns:
a list of extracted concepts
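
One plausible way the two extractFromDoc overloads relate to the document parser and to the inherited extractFromSentence method is sketched below. This is a guess at the structure, not the toolkit's code: `SketchDocument`, `SketchParser`, `SketchExtractor`, and the delegation from the raw-text overload to the parsed-document overload are all assumptions, and sentences and concepts are reduced to plain strings.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical stand-ins; the real Document and DocumentParser types differ.
class SketchDocument {
    final List<String> sentences = new ArrayList<>();
}

interface SketchParser {
    SketchDocument parse(String raw);
}

abstract class SketchExtractor {
    protected SketchParser parser;
    protected List<String> conceptList = new ArrayList<>();

    void setDocumentParser(SketchParser parser) { this.parser = parser; }

    SketchParser getDocumentParser() { return parser; }

    // Raw text: parse first, then reuse the parsed-document overload.
    List<String> extractFromDoc(String raw) {
        return extractFromDoc(parser.parse(raw));
    }

    // Parsed document: reset state, then collect concepts sentence by sentence.
    List<String> extractFromDoc(SketchDocument doc) {
        conceptList = new ArrayList<>();
        for (String sentence : doc.sentences) {
            conceptList.addAll(extractFromSentence(sentence));
        }
        return conceptList;
    }

    // Subclasses decide what counts as a concept in a sentence.
    abstract List<String> extractFromSentence(String sentence);
}
```

Under these assumptions a concrete subclass only has to supply the per-sentence logic, which would match the split between this abstract class and its phrase, term, and token subclasses.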

getDocumentParser

public DocumentParser getDocumentParser()
Description copied from interface: ConceptExtractor
Gets document parser.

Specified by:
getDocumentParser in interface ConceptExtractor
Returns:
the document parser.

setDocumentParser

public void setDocumentParser(DocumentParser parser)
Description copied from interface: ConceptExtractor
Sets the document parser for the concept extractor.

Specified by:
setDocumentParser in interface ConceptExtractor
Parameters:
parser - document parser