dragon.nlp.extract
Class AbstractTermExtractor

java.lang.Object
  |
  +--dragon.nlp.extract.AbstractConceptExtractor
        |
        +--dragon.nlp.extract.AbstractTermExtractor
All Implemented Interfaces:
ConceptExtractor, TermExtractor
Direct Known Subclasses:
BasicTermExtractor

public abstract class AbstractTermExtractor
extends AbstractConceptExtractor
implements TermExtractor

Abstract class for UMLS term (CUI) extraction

Copyright: Copyright (c) 2005

Company: IST, Drexel University

Version:
1.0
Author:
Davis Zhou

Field Summary
protected  Abbreviation abbrChecker
           
protected  boolean abbreviation_enabled
           
protected  AttributeChecker attrChecker
           
protected  boolean attributeCheck_enabled
           
protected  boolean compoundTermPredict_enabled
           
protected  CompoundTermFinder compTermFinder
           
protected  boolean coordinatingCheck_enabled
           
protected  boolean coordinatingTermPredict_enabled
           
protected  Lemmatiser lemmatiser
           
protected  Ontology ontology
           
protected  CoordinatingChecker paraChecker
           
protected  boolean semanticCheck_enabled
           
protected  Tagger tagger
           
 
Fields inherited from class dragon.nlp.extract.AbstractConceptExtractor
cf, conceptFilter_enabled, conceptList, parser, subconcept_enabled
 
Constructor Summary
AbstractTermExtractor(Ontology ontology, Tagger tagger, Lemmatiser lemmatiser)
           
 
Method Summary
 boolean enableAttributeCheckOption(AttributeChecker checker)
           
 boolean enableCompoundTermPredictOption(java.lang.String suffixList)
          Enables the compound term prediction option.
 void extractTermFromFile(java.lang.String filename)
           
protected  java.util.ArrayList filter(java.util.ArrayList termList)
           
 boolean getAbbreviationOption()
          Gets the option of checking terms in abbreviation.
 boolean getAttributeCheckOption()
           
 boolean getCompoundTermPredictOption()
          Gets the option of predicting compound terms.
 boolean getCoordinatingCheckOption()
          Gets the option of checking the coordinating terms
 boolean getCoordinatingTermPredictOption()
          Gets the option of predicting terms according to coordinating relationship.
 Lemmatiser getLemmatiser()
          Gets the lemmatiser used for this extractor.
 Ontology getOntology()
          Gets the ontology used for the term extractor.
 Tagger getPOSTagger()
          Gets the part of speech tagger used for the term extractor
 boolean getSemanticCheckOption()
          Gets the option of checking the semantic type of the extracted term
 void initDocExtraction()
          This method must be called before calling the extractFromDoc method.
 boolean isExtractionMerged()
           
 void print(java.io.PrintWriter out, java.util.ArrayList list)
          Prints the given list of concepts to the specified print writer.
 void setAbbreviationOption(boolean option)
          Sets the option of checking terms in abbreviation.
 void setAttributeCheckOption(boolean option)
           
 void setCompoundTermPredictOption(boolean option)
          Sets the option of predicting compound terms.
 void setCoordinatingCheckOption(boolean option)
          Sets the option of checking the coordinating terms.
 void setCoordinatingTermPredictOption(boolean option)
          Sets the option of predicting terms according to coordinating relationship.
 void setLemmatiser(Lemmatiser lemmatiser)
          Sets lemmatiser for this extractor.
 void setSemanticCheckOption(boolean option)
          Sets the option of checking the semantic type of the extracted term
 void setSubConceptOption(boolean option)
           
 boolean supportConceptEntry()
          Tests if the extracted concept has an entry ID.
 boolean supportConceptName()
          Tests if the extracted concept has a name.
 
Methods inherited from class dragon.nlp.extract.AbstractConceptExtractor
extractFromDoc, extractFromDoc, getConceptFilter, getConceptList, getDocumentParser, getFilteringOption, getSubConceptOption, mergeConceptByEntryID, mergeConceptByName, print, setConceptFilter, setDocumentParser, setFilteringOption
 
Methods inherited from class java.lang.Object
clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait
 
Methods inherited from interface dragon.nlp.extract.ConceptExtractor
extractFromDoc, extractFromDoc, extractFromSentence, getConceptFilter, getConceptList, getDocumentParser, getFilteringOption, getSubConceptOption, mergeConceptByEntryID, mergeConceptByName, print, setConceptFilter, setDocumentParser, setFilteringOption
 

Field Detail

ontology

protected Ontology ontology

tagger

protected Tagger tagger

lemmatiser

protected Lemmatiser lemmatiser

semanticCheck_enabled

protected boolean semanticCheck_enabled

coordinatingTermPredict_enabled

protected boolean coordinatingTermPredict_enabled

compoundTermPredict_enabled

protected boolean compoundTermPredict_enabled

attributeCheck_enabled

protected boolean attributeCheck_enabled

coordinatingCheck_enabled

protected boolean coordinatingCheck_enabled

abbreviation_enabled

protected boolean abbreviation_enabled

attrChecker

protected AttributeChecker attrChecker

paraChecker

protected CoordinatingChecker paraChecker

abbrChecker

protected Abbreviation abbrChecker

compTermFinder

protected CompoundTermFinder compTermFinder
Constructor Detail

AbstractTermExtractor

public AbstractTermExtractor(Ontology ontology,
                             Tagger tagger,
                             Lemmatiser lemmatiser)
Method Detail

isExtractionMerged

public boolean isExtractionMerged()

supportConceptName

public boolean supportConceptName()
Description copied from interface: ConceptExtractor
Tests if the extracted concept has a name.

Specified by:
supportConceptName in interface ConceptExtractor
Returns:
true or false

supportConceptEntry

public boolean supportConceptEntry()
Description copied from interface: ConceptExtractor
Tests if the extracted concept has an entry ID.

Specified by:
supportConceptEntry in interface ConceptExtractor
Returns:
true or false

getOntology

public Ontology getOntology()
Description copied from interface: TermExtractor
Gets the ontology used for the term extractor.

Specified by:
getOntology in interface TermExtractor
Returns:
the ontology used

getPOSTagger

public Tagger getPOSTagger()
Description copied from interface: TermExtractor
Gets the part of speech tagger used for the term extractor

Specified by:
getPOSTagger in interface TermExtractor
Returns:
the part of speech tagger used

getLemmatiser

public Lemmatiser getLemmatiser()
Description copied from interface: ConceptExtractor
Gets the lemmatiser used for this extractor.

Specified by:
getLemmatiser in interface ConceptExtractor
Returns:
the lemmatiser used

setLemmatiser

public void setLemmatiser(Lemmatiser lemmatiser)
Description copied from interface: ConceptExtractor
Sets lemmatiser for this extractor.

Specified by:
setLemmatiser in interface ConceptExtractor
Parameters:
lemmatiser - the lemmatiser

setSubConceptOption

public void setSubConceptOption(boolean option)
Specified by:
setSubConceptOption in interface ConceptExtractor
Overrides:
setSubConceptOption in class AbstractConceptExtractor

setCoordinatingCheckOption

public void setCoordinatingCheckOption(boolean option)
Description copied from interface: TermExtractor
Sets the option of checking the coordinating terms.

Specified by:
setCoordinatingCheckOption in interface TermExtractor
Parameters:
option - the option of checking the coordinating terms

getCoordinatingCheckOption

public boolean getCoordinatingCheckOption()
Description copied from interface: TermExtractor
Gets the option of checking the coordinating terms

Specified by:
getCoordinatingCheckOption in interface TermExtractor
Returns:
true if the extractor checks the coordinating terms

setAbbreviationOption

public void setAbbreviationOption(boolean option)
Description copied from interface: TermExtractor
Sets the option of checking terms in abbreviation.

Specified by:
setAbbreviationOption in interface TermExtractor
Parameters:
option - the option of checking terms in abbreviation.

getAbbreviationOption

public boolean getAbbreviationOption()
Description copied from interface: TermExtractor
Gets the option of checking terms in abbreviation.

Specified by:
getAbbreviationOption in interface TermExtractor
Returns:
true if the extractor checks terms in abbreviation.

setAttributeCheckOption

public void setAttributeCheckOption(boolean option)
Specified by:
setAttributeCheckOption in interface TermExtractor

getAttributeCheckOption

public boolean getAttributeCheckOption()
Specified by:
getAttributeCheckOption in interface TermExtractor

enableAttributeCheckOption

public boolean enableAttributeCheckOption(AttributeChecker checker)
Specified by:
enableAttributeCheckOption in interface TermExtractor

getSemanticCheckOption

public boolean getSemanticCheckOption()
Description copied from interface: TermExtractor
Gets the option of checking the semantic type of the extracted term

Specified by:
getSemanticCheckOption in interface TermExtractor
Returns:
true if the extractor checks the semantic type of the term

setSemanticCheckOption

public void setSemanticCheckOption(boolean option)
Description copied from interface: TermExtractor
Sets the option of checking the semantic type of the extracted term

Specified by:
setSemanticCheckOption in interface TermExtractor
Parameters:
option - the option of checking the semantic type of the extracted term

getCoordinatingTermPredictOption

public boolean getCoordinatingTermPredictOption()
Description copied from interface: TermExtractor
Gets the option of predicting terms according to coordinating relationship. Predicted terms have no entry ID.

Specified by:
getCoordinatingTermPredictOption in interface TermExtractor
Returns:
true if the extractor predicts terms according to coordinating relationship.

setCoordinatingTermPredictOption

public void setCoordinatingTermPredictOption(boolean option)
Description copied from interface: TermExtractor
Sets the option of predicting terms according to coordinating relationship.

Specified by:
setCoordinatingTermPredictOption in interface TermExtractor
Parameters:
option - the option of predicting terms according to coordinating relationship.

getCompoundTermPredictOption

public boolean getCompoundTermPredictOption()
Description copied from interface: TermExtractor
Gets the option of predicting compound terms. Predicted terms have no entry ID. Components of predicted compound terms will be removed from the extracted term list.

Specified by:
getCompoundTermPredictOption in interface TermExtractor
Returns:
true if the extractor predicts compound terms
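
The removal of components described above can be pictured with a small sketch. Everything in it is a hypothetical stand-in rather than toolkit code: the `Span` class, the word-offset representation, and the `mergeCompound` helper are assumptions made for illustration, and the toolkit's own term objects and matching rules may differ.

```java
import java.util.ArrayList;
import java.util.List;

// Illustrative only: not part of dragon.nlp.extract.
final class CompoundPredictSketch {

    /** Hypothetical stand-in for an extracted term: its text plus inclusive word offsets. */
    static final class Span {
        final String text;
        final int start;
        final int end;

        Span(String text, int start, int end) {
            this.text = text;
            this.start = start;
            this.end = end;
        }
    }

    /**
     * Drops every term whose span lies entirely inside the compound's span,
     * then appends the compound itself to the surviving terms.
     */
    static List<Span> mergeCompound(List<Span> terms, Span compound) {
        List<Span> kept = new ArrayList<>();
        for (Span t : terms) {
            boolean inside = t.start >= compound.start && t.end <= compound.end;
            if (!inside) {
                kept.add(t);
            }
        }
        kept.add(compound);
        return kept;
    }

    public static void main(String[] args) {
        List<Span> terms = new ArrayList<>();
        terms.add(new Span("xyz", 3, 3));
        terms.add(new Span("receptor", 4, 4));
        terms.add(new Span("aspirin", 9, 9));

        // The two components at offsets 3..4 are replaced by the single compound.
        List<Span> merged = mergeCompound(terms, new Span("xyz receptor", 3, 4));
        for (Span s : merged) {
            System.out.println(s.text);
        }
    }
}
```

In this sketch the compound carries no identifier, which mirrors the statement that predicted terms have no entry ID.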

setCompoundTermPredictOption

public void setCompoundTermPredictOption(boolean option)
Description copied from interface: TermExtractor
Sets the option of predicting compound terms.

Specified by:
setCompoundTermPredictOption in interface TermExtractor
Parameters:
option - the option of predicting compound terms.

enableCompoundTermPredictOption

public boolean enableCompoundTermPredictOption(java.lang.String suffixList)
Description copied from interface: TermExtractor
Enables the compound term prediction option.

Specified by:
enableCompoundTermPredictOption in interface TermExtractor
Parameters:
suffixList - the name of the file containing a list of suffixes for compound terms
Returns:
true if no error occurs
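
A minimal sketch of the contract above (take a file name, return true if no error occurs) is shown below. It is not the toolkit's implementation: the `SuffixListSketch` class and its methods are hypothetical, and the file layout of one suffix per line is an assumption, since the format of the suffix file is not specified here.

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.Arrays;
import java.util.LinkedHashSet;
import java.util.Locale;
import java.util.Set;

// Illustrative only: not part of dragon.nlp.extract.
final class SuffixListSketch {
    private final Set<String> suffixes = new LinkedHashSet<>();
    private boolean enabled = false;

    /**
     * Loads suffixes from the named file (assumed: one suffix per line)
     * and switches the option on. Returns false, leaving the option off,
     * if the file cannot be read.
     */
    boolean enable(String suffixList) {
        try {
            Set<String> loaded = new LinkedHashSet<>();
            for (String line : Files.readAllLines(Paths.get(suffixList), StandardCharsets.UTF_8)) {
                String s = line.trim();
                if (!s.isEmpty()) {
                    loaded.add(s.toLowerCase(Locale.ROOT));
                }
            }
            suffixes.clear();
            suffixes.addAll(loaded);
            enabled = true;
            return true;
        } catch (IOException e) {
            enabled = false;
            return false;
        }
    }

    boolean isEnabled() {
        return enabled;
    }

    Set<String> getSuffixes() {
        return suffixes;
    }

    public static void main(String[] args) throws IOException {
        Path tmp = Files.createTempFile("suffix", ".list");
        Files.write(tmp, Arrays.asList("cell", "", " Protein "), StandardCharsets.UTF_8);

        SuffixListSketch sketch = new SuffixListSketch();
        System.out.println(sketch.enable(tmp.toString()));
        System.out.println(sketch.getSuffixes());
        Files.delete(tmp);
    }
}
```

Returning a boolean instead of throwing lets the caller leave the option disabled and continue when the suffix file is missing.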

initDocExtraction

public void initDocExtraction()
Description copied from interface: ConceptExtractor
This method must be called before calling the extractFromDoc method.

Specified by:
initDocExtraction in interface ConceptExtractor
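
The call order required above can be illustrated with a hypothetical stand-in. The `DocExtractionOrderSketch` class below is not toolkit code: what the real method prepares, and how the real extractor behaves when the call is skipped, is not stated on this page, so the guard and the whitespace tokenizer are assumptions made only to show the sequence.

```java
import java.util.ArrayList;
import java.util.List;

// Illustrative only: not part of dragon.nlp.extract.
final class DocExtractionOrderSketch {
    private boolean initialized = false;

    /** Stand-in for initDocExtraction(): marks the extractor as ready for a document. */
    void initDocExtraction() {
        initialized = true;
    }

    /**
     * Stand-in for extractFromDoc(): refuses to run until initDocExtraction()
     * has been called, then returns the whitespace-separated tokens as "terms".
     */
    List<String> extractFromDoc(String doc) {
        if (!initialized) {
            throw new IllegalStateException("call initDocExtraction() first");
        }
        List<String> terms = new ArrayList<>();
        for (String token : doc.trim().split("\\s+")) {
            if (!token.isEmpty()) {
                terms.add(token);
            }
        }
        return terms;
    }

    public static void main(String[] args) {
        DocExtractionOrderSketch extractor = new DocExtractionOrderSketch();
        extractor.initDocExtraction();                  // step 1: prepare
        List<String> terms = extractor.extractFromDoc("aspirin reduces fever"); // step 2: extract
        System.out.println(terms);
    }
}
```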

print

public void print(java.io.PrintWriter out,
                  java.util.ArrayList list)
Description copied from interface: ConceptExtractor
Prints the given list of concepts to the specified print writer.

Specified by:
print in interface ConceptExtractor
Overrides:
print in class AbstractConceptExtractor
Parameters:
out - the print writer
list - a list of concepts to output

extractTermFromFile

public void extractTermFromFile(java.lang.String filename)

filter

protected java.util.ArrayList filter(java.util.ArrayList termList)