dragon.nlp.ontology
Class AbstractVocabulary

java.lang.Object
  |
  +--dragon.nlp.ontology.AbstractVocabulary
All Implemented Interfaces:
Vocabulary
Direct Known Subclasses:
BasicVocabulary

public abstract class AbstractVocabulary
extends java.lang.Object
implements Vocabulary

This class implements all the basic functions related to a vocabulary.

Copyright: Copyright (c) 2005

Company: IST, Drexel University

Version:
1.0
Author:
Davis Zhou

Field Summary
protected  boolean enable_adjterm_option
           
protected  boolean enable_coordinate_option
           
protected  boolean enable_lemma_option
           
protected  boolean enable_npp_option
           
protected  Lemmatiser lemmatiser
           
protected  SimpleElementList list
           
protected  int maxPhraseLength
           
protected  int minPhraseLength
           
protected  java.lang.String nonboundaryPunctuations
           
 
Constructor Summary
AbstractVocabulary(java.lang.String termFilename)
           
AbstractVocabulary(java.lang.String termFilename, Lemmatiser lemmatiser)
           
 
Method Summary
protected  java.lang.String buildString(Word start, Word end, boolean useLemma)
           
 boolean getAdjectivePhraseOption()
          Gets the option whether adjective phrases are allowed.
 boolean getCoordinateOption()
          Gets the option whether a phrase can contain a conjunction.
protected  java.lang.String getLemma(Word word)
           
 boolean getLemmaOption()
          Gets the option of using the base form of the word when matching a phrase.
 java.lang.String getNonBoundaryPunctuation()
           
 boolean getNPPOption()
          Gets the option whether NPP phrases are allowed.
 java.lang.String getPhrase(int index)
          Gets the index-th phrase in the vocabulary.
 int getPhraseNum()
          Gets the number of phrases in the vocabulary.
protected  boolean isBoundaryWord(Word curWord)
           
 boolean isStartingWord(Word cur)
          Tests if the specified word could be the starting word of a phrase.
protected  boolean isUsefulForPhrase(Word word)
           
 int maxPhraseLength()
          Gets the maximum number of words a phrase can contain.
 int minPhraseLength()
          Gets the minimum number of words a phrase can contain.
protected  void readVocabularyMeta(java.lang.String termFilename)
           
 void setAdjectivePhraseOption(boolean enabled)
          Sets the option whether adjective phrases are allowed.
 void setCoordinateOption(boolean enabled)
          Sets the option whether a phrase can contain a conjunction.
 void setLemmaOption(boolean enabled)
          Sets the option of using the base form of the word when matching a phrase.
 void setNonBoundaryPunctuation(java.lang.String punctuations)
           
 void setNPPOption(boolean enabled)
          Sets the option whether NPP phrases are allowed.
 
Methods inherited from class java.lang.Object
clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait
 
Methods inherited from interface dragon.nlp.ontology.Vocabulary
findPhrase, isPhrase, isPhrase
 

Field Detail

lemmatiser

protected Lemmatiser lemmatiser

enable_npp_option

protected boolean enable_npp_option

enable_coordinate_option

protected boolean enable_coordinate_option

enable_adjterm_option

protected boolean enable_adjterm_option

enable_lemma_option

protected boolean enable_lemma_option

nonboundaryPunctuations

protected java.lang.String nonboundaryPunctuations

list

protected SimpleElementList list

maxPhraseLength

protected int maxPhraseLength

minPhraseLength

protected int minPhraseLength

Constructor Detail

AbstractVocabulary

public AbstractVocabulary(java.lang.String termFilename)

AbstractVocabulary

public AbstractVocabulary(java.lang.String termFilename,
                          Lemmatiser lemmatiser)

Method Detail

getPhraseNum

public int getPhraseNum()
Description copied from interface: Vocabulary
Gets the number of phrases in the vocabulary.

Specified by:
getPhraseNum in interface Vocabulary
Returns:
the number of phrases in the vocabulary.

getPhrase

public java.lang.String getPhrase(int index)
Description copied from interface: Vocabulary
Gets the index-th phrase in the vocabulary.

Specified by:
getPhrase in interface Vocabulary
Parameters:
index - the position of the phrase in the vocabulary
Returns:
the phrase at the given position in the vocabulary.

maxPhraseLength

public int maxPhraseLength()
Description copied from interface: Vocabulary
Gets the maximum number of words a phrase can contain.

Specified by:
maxPhraseLength in interface Vocabulary
Returns:
the maximum number of words a phrase can contain.

minPhraseLength

public int minPhraseLength()
Description copied from interface: Vocabulary
Gets the minimum number of words a phrase can contain.

Specified by:
minPhraseLength in interface Vocabulary
Returns:
the minimum number of words a phrase can contain.
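
The two length bounds above are typically useful for limiting how many words a phrase lookup has to consider. The following is a minimal, self-contained sketch of that idea, not the toolkit's implementation: PhraseWindowSketch and findLongest are hypothetical names, plain strings stand in for Word objects, and the assumption that the bounds are derived from the loaded phrases is ours.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

// Hypothetical stand-in for illustration; not part of dragon.nlp.ontology.
class PhraseWindowSketch {
    private final Set<String> phrases; // phrases with words separated by single spaces
    private final int minLen;
    private final int maxLen;

    PhraseWindowSketch(Set<String> phrases) {
        this.phrases = phrases;
        int min = Integer.MAX_VALUE;
        int max = 0;
        // Assumption: the bounds are computed from the phrases themselves.
        for (String phrase : phrases) {
            int words = phrase.trim().split("\\s+").length;
            min = Math.min(min, words);
            max = Math.max(max, words);
        }
        this.minLen = phrases.isEmpty() ? 0 : min;
        this.maxLen = max;
    }

    int minPhraseLength() { return minLen; }

    int maxPhraseLength() { return maxLen; }

    // Returns the longest vocabulary phrase starting at tokens[start], or null if none.
    String findLongest(List<String> tokens, int start) {
        int longest = Math.min(maxLen, tokens.size() - start);
        // Only window sizes between minLen and maxLen can possibly match.
        for (int len = longest; len >= minLen && len >= 1; len--) {
            String candidate = String.join(" ", tokens.subList(start, start + len));
            if (phrases.contains(candidate)) {
                return candidate;
            }
        }
        return null;
    }

    public static void main(String[] args) {
        PhraseWindowSketch vocabulary = new PhraseWindowSketch(
                new HashSet<>(Arrays.asList("lung cancer", "small cell lung cancer")));
        List<String> tokens = Arrays.asList("small", "cell", "lung", "cancer", "is", "common");
        System.out.println(vocabulary.findLongest(tokens, 0));
        System.out.println(vocabulary.findLongest(tokens, 2));
    }
}
```

Trying the longest window first means a longer phrase is preferred over a shorter phrase that shares the same starting word.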

setNonBoundaryPunctuation

public void setNonBoundaryPunctuation(java.lang.String punctuations)

getNonBoundaryPunctuation

public java.lang.String getNonBoundaryPunctuation()

setLemmaOption

public void setLemmaOption(boolean enabled)
Description copied from interface: Vocabulary
Sets the option of using the base form of the word when matching a phrase.

Specified by:
setLemmaOption in interface Vocabulary
Parameters:
enabled - the option of using the base form of the word when matching a phrase.

getLemmaOption

public boolean getLemmaOption()
Description copied from interface: Vocabulary
Gets the option of using the base form of the word when matching a phrase.

Specified by:
getLemmaOption in interface Vocabulary
Returns:
true if the base form of words is used when matching a phrase.
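
To illustrate what matching on base forms can mean in practice, here is a small sketch under stated assumptions. LemmaMatchSketch, addPhrase, and addLemma are hypothetical names, and a lookup table stands in for a real Lemmatiser; the actual matching logic of this class may differ.

```java
import java.util.HashMap;
import java.util.HashSet;
import java.util.Locale;
import java.util.Map;
import java.util.Set;

// Hypothetical stand-in for illustration; not part of dragon.nlp.ontology.
class LemmaMatchSketch {
    private final Set<String> phrases = new HashSet<>();
    private final Map<String, String> lemmas = new HashMap<>(); // toy lemmatiser (assumption)
    private boolean lemmaOption;

    void addPhrase(String phrase) { phrases.add(phrase); }

    void addLemma(String surfaceForm, String baseForm) { lemmas.put(surfaceForm, baseForm); }

    void setLemmaOption(boolean enabled) { lemmaOption = enabled; }

    boolean getLemmaOption() { return lemmaOption; }

    // Builds a lookup key from the words, using base forms only when the option is on.
    boolean isPhrase(String... words) {
        StringBuilder key = new StringBuilder();
        for (String word : words) {
            String form = word.toLowerCase(Locale.ROOT);
            if (lemmaOption) {
                form = lemmas.getOrDefault(form, form);
            }
            if (key.length() > 0) {
                key.append(' ');
            }
            key.append(form);
        }
        return phrases.contains(key.toString());
    }

    public static void main(String[] args) {
        LemmaMatchSketch vocabulary = new LemmaMatchSketch();
        vocabulary.addPhrase("blood cell");
        vocabulary.addLemma("cells", "cell");
        System.out.println(vocabulary.isPhrase("blood", "cells")); // lemma option off
        vocabulary.setLemmaOption(true);
        System.out.println(vocabulary.isPhrase("blood", "cells")); // lemma option on
    }
}
```

In this sketch the plural "blood cells" matches the stored phrase "blood cell" only after the lemma option is enabled.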

setAdjectivePhraseOption

public void setAdjectivePhraseOption(boolean enabled)
Description copied from interface: Vocabulary
Sets the option whether adjective phrases are allowed.

Specified by:
setAdjectivePhraseOption in interface Vocabulary
Parameters:
enabled - whether adjective phrases are allowed.

getAdjectivePhraseOption

public boolean getAdjectivePhraseOption()
Description copied from interface: Vocabulary
Gets the option whether adjective phrases are allowed.

Specified by:
getAdjectivePhraseOption in interface Vocabulary
Returns:
true if adjective phrases are allowed.

setNPPOption

public void setNPPOption(boolean enabled)
Description copied from interface: Vocabulary
Sets the option whether NPP phrases are allowed. For example, "bank of america" is an NPP phrase.

Specified by:
setNPPOption in interface Vocabulary
Parameters:
enabled - the option whether NPP phrases are allowed.

getNPPOption

public boolean getNPPOption()
Description copied from interface: Vocabulary
Gets the option whether NPP phrases are allowed.

Specified by:
getNPPOption in interface Vocabulary
Returns:
true if NPP phrases are allowed.

setCoordinateOption

public void setCoordinateOption(boolean enabled)
Description copied from interface: Vocabulary
Sets the option whether a phrase can contain a conjunction. For example, the phrase "the cancer of neck and hand" contains a conjunction.

Specified by:
setCoordinateOption in interface Vocabulary
Parameters:
enabled - the option whether a phrase can contain a conjunction

getCoordinateOption

public boolean getCoordinateOption()
Description copied from interface: Vocabulary
Gets the option whether a phrase can contain a conjunction.

Specified by:
getCoordinateOption in interface Vocabulary
Returns:
true if a phrase can contain a conjunction.
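
The NPP and coordinate options can be pictured as filters on candidate phrases. The sketch below is illustrative only: PhraseOptionSketch and isAllowed are hypothetical names, and small fixed word lists stand in for the part-of-speech information that a Word would presumably carry. It does not reproduce how this class decides.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

// Hypothetical stand-in for illustration; not part of dragon.nlp.ontology.
class PhraseOptionSketch {
    // Toy word lists standing in for part-of-speech tags (assumption).
    private static final Set<String> PREPOSITIONS =
            new HashSet<>(Arrays.asList("of", "in", "for"));
    private static final Set<String> CONJUNCTIONS =
            new HashSet<>(Arrays.asList("and", "or"));

    private boolean nppOption;
    private boolean coordinateOption;

    void setNPPOption(boolean enabled) { nppOption = enabled; }

    boolean getNPPOption() { return nppOption; }

    void setCoordinateOption(boolean enabled) { coordinateOption = enabled; }

    boolean getCoordinateOption() { return coordinateOption; }

    // A candidate is rejected if it needs an option that is currently disabled.
    boolean isAllowed(List<String> tokens) {
        for (String token : tokens) {
            if (PREPOSITIONS.contains(token) && !nppOption) {
                return false;
            }
            if (CONJUNCTIONS.contains(token) && !coordinateOption) {
                return false;
            }
        }
        return true;
    }

    public static void main(String[] args) {
        PhraseOptionSketch options = new PhraseOptionSketch();
        List<String> candidate = Arrays.asList("cancer", "of", "neck", "and", "hand");
        options.setNPPOption(true);
        System.out.println(options.isAllowed(candidate)); // conjunction not yet allowed
        options.setCoordinateOption(true);
        System.out.println(options.isAllowed(candidate)); // both options enabled
    }
}
```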

isStartingWord

public boolean isStartingWord(Word cur)
Description copied from interface: Vocabulary
Tests if the specified word could be the starting word of a phrase.

Specified by:
isStartingWord in interface Vocabulary
Parameters:
cur - the current word for testing
Returns:
true if the current word could be the starting word of a phrase
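
One plausible way to answer a starting-word test cheaply is to index the first word of every phrase. The sketch below shows that idea only; StartingWordSketch is a hypothetical name, plain strings replace Word objects, and the real method may also consult other information about the word.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Locale;
import java.util.Set;

// Hypothetical stand-in for illustration; not part of dragon.nlp.ontology.
class StartingWordSketch {
    private final Set<String> firstWords = new HashSet<>();

    StartingWordSketch(List<String> phrases) {
        // Record the lower-cased first word of each non-empty phrase.
        for (String phrase : phrases) {
            String trimmed = phrase.trim();
            if (trimmed.isEmpty()) {
                continue;
            }
            firstWords.add(trimmed.split("\\s+")[0].toLowerCase(Locale.ROOT));
        }
    }

    // True if some phrase in the vocabulary begins with this word.
    boolean isStartingWord(String word) {
        return firstWords.contains(word.toLowerCase(Locale.ROOT));
    }

    public static void main(String[] args) {
        StartingWordSketch vocabulary =
                new StartingWordSketch(Arrays.asList("lung cancer", "bank of america"));
        System.out.println(vocabulary.isStartingWord("lung"));
        System.out.println(vocabulary.isStartingWord("cancer"));
    }
}
```

A check like this lets a caller skip the more expensive phrase lookup for words that cannot begin any phrase.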

isBoundaryWord

protected boolean isBoundaryWord(Word curWord)

getLemma

protected java.lang.String getLemma(Word word)

buildString

protected java.lang.String buildString(Word start,
                                       Word end,
                                       boolean useLemma)

isUsefulForPhrase

protected boolean isUsefulForPhrase(Word word)

readVocabularyMeta

protected void readVocabularyMeta(java.lang.String termFilename)