The Dragon Toolkit

Examples using the Dragon Toolkit

Suggestions

The simplest way to run the Dragon Toolkit on your computer is to create an Eclipse Java project and add the Dragon Toolkit jar file to the project's build path.
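Once the jar is on the build path, a quick way to confirm the setup is to try loading one of its classes by name. The following is only a minimal sketch using the standard Java reflection API; the class `ClasspathCheck` and its method are hypothetical helpers, not part of the toolkit, and the class name to look up must be supplied by you (for example, a fully qualified class name taken from the toolkit's documentation).

```java
public class ClasspathCheck {

    /**
     * Returns true if the named class can be located on the current classpath.
     * The class is looked up without being initialized.
     */
    public static boolean isOnClasspath(String className) {
        try {
            Class.forName(className, false, ClasspathCheck.class.getClassLoader());
            return true;
        } catch (ClassNotFoundException | LinkageError e) {
            // Class is missing, or one of its dependencies could not be linked.
            return false;
        }
    }

    public static void main(String[] args) {
        // Pass a fully qualified class name from the jar as the first argument.
        // The default below is only a placeholder that is always present in the JDK.
        String name = args.length > 0 ? args[0] : "java.lang.String";
        String status = isOnClasspath(name) ? "found" : "NOT found";
        System.out.println(name + " " + status + " on the classpath");
    }
}
```

Outside Eclipse, the same check can be run from the command line by adding the jar to the classpath with the `-cp` option of `javac` and `java`. The classpath separator is `:` on Unix-like systems and `;` on Windows.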

XML Configuration Examples
  • MaxMatcher: Biological Term Extraction

  • Text Retrieval Using Ontology: Topic Signature Language Models for text retrieval. UMLS-based concepts and concept pairs are used as topic signatures.

  • Xtract: Build a multiword phrase dictionary from a collection.

  • Text Retrieval Using Multiword Phrase: Topic Signature Language Models for text retrieval. Multiword phrases are used as topic signatures.

  • Text Clustering: Agglomerative clustering, spherical k-means, and model-based k-means with four smoothing approaches (Laplacian smoothing, background smoothing, context-sensitive semantic smoothing, and context-insensitive semantic smoothing).

  • Link K-Means: Use both content and hyperlinks between documents for clustering.

  • Text Classification: SVM classifier, Nigam active learning classifier, and Bayesian classifiers with four smoothing approaches (Laplacian smoothing, background smoothing, context-sensitive semantic smoothing, and context-insensitive semantic smoothing).

  • Text Summarization: LexRank Generic Multi-Document Summarization

  • Topic Modeling: LDA, Aspect Model, and Simple Mixture Model.

Sample Code
  • MaxMatcher: Biological Term Extraction

  • Indexing: Index a collection using Basic Token Indexer.

  • Text Process: Tokenize, lemmatise and part-of-speech tag.