Package dragon.onlinedb

A collection of Java classes for textual corpus preparation.

Interface Summary
Article             Interface for an article, the basic unit of a collection
ArticleParser       Interface for an article parser
ArticleQuery        Interface for online document retrieval
CollectionPreparer  Interface for a collection preparer
CollectionReader    Interface for a collection reader, which reads articles from a collection one by one
CollectionWriter    Interface for a collection writer

Class Summary
AbstractQuery            Abstract class for querying articles from a data source
ArrayCollectionReader    Collection reader that reads from multiple collections
BasicArticle             Basic article implementation providing the basic functions for article operations
BasicArticleIndex        Basic class for handling article index information
BasicArticleKey          Data structure for an article key
BasicArticleParser       Basic parser for parsing and assembling a given article
BasicCollectionPreparer  Basic collection preparer that writes articles to disk in "collection" format, which can later be processed by BasicIndexer and similar classes
BasicCollectionReader    Basic collection reader (supporting class for indexing)
BasicCollectionWriter    Writes a collection to disk
SimpleArticleParser      Simple article parser that treats the content of a file as the body of an article
SimpleCollectionReader   A lightweight collection reader

Package dragon.onlinedb Description

A collection of Java classes for textual corpus preparation.

Package Specification

The toolkit provides convenient ways to read articles from text collections in various formats. The CollectionReader interface defines the methods for extracting articles from a collection. The ArticleParser interface has a parse method, which parses a sequence of text into an article.
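
The relationship between a reader and a parser can be sketched as follows. This is a minimal illustration, not the toolkit's actual code: the nested Article, ArticleParser and CollectionReader types are simplified, hypothetical stand-ins declared locally so that the example compiles without the toolkit on the classpath. Only the existence of a parse method is taken from the description above; the method name getNextArticle, the use of null to signal the end of a collection, and the helper classes PlainArticle, FirstLineTitleParser and ListCollectionReader are assumptions made for illustration.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Iterator;
import java.util.List;

public class OnlinedbSketch {

    /** Hypothetical stand-in for an article; only two accessors are modeled. */
    interface Article {
        String getTitle();
        String getBody();
    }

    /** Stand-in parser: turns a sequence of text into an article. */
    interface ArticleParser {
        Article parse(String content);
    }

    /** Stand-in reader; the method name and null end marker are assumptions. */
    interface CollectionReader {
        Article getNextArticle();
    }

    /** Immutable article holding a title and a body. */
    static final class PlainArticle implements Article {
        private final String title;
        private final String body;

        PlainArticle(String title, String body) {
            this.title = title;
            this.body = body;
        }

        public String getTitle() { return title; }
        public String getBody() { return body; }
    }

    /** Illustrative parser: the first line is the title, the rest is the body. */
    static final class FirstLineTitleParser implements ArticleParser {
        public Article parse(String content) {
            int newline = content.indexOf('\n');
            if (newline < 0) {
                return new PlainArticle(content.trim(), "");
            }
            return new PlainArticle(content.substring(0, newline).trim(),
                                    content.substring(newline + 1).trim());
        }
    }

    /** Illustrative reader over an in-memory list of raw article texts. */
    static final class ListCollectionReader implements CollectionReader {
        private final Iterator<String> rawTexts;
        private final ArticleParser parser;

        ListCollectionReader(List<String> rawTexts, ArticleParser parser) {
            this.rawTexts = rawTexts.iterator();
            this.parser = parser;
        }

        public Article getNextArticle() {
            // Return null once every raw text has been consumed.
            return rawTexts.hasNext() ? parser.parse(rawTexts.next()) : null;
        }
    }

    /** Reads articles one by one until the reader is exhausted. */
    static List<String> collectTitles(CollectionReader reader) {
        List<String> titles = new ArrayList<>();
        for (Article a = reader.getNextArticle(); a != null; a = reader.getNextArticle()) {
            titles.add(a.getTitle());
        }
        return titles;
    }

    public static void main(String[] args) {
        List<String> raw = Arrays.asList(
                "First article\nBody of the first article.",
                "Second article\nBody of the second article.");
        CollectionReader reader = new ListCollectionReader(raw, new FirstLineTitleParser());
        System.out.println(collectTitles(reader));
    }
}
```

Keeping the parser separate from the reader means the same reading loop can serve collections in different formats by swapping in a different parser.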