The TreeTagger is a tool for annotating text with part-of-speech and lemma information which has been developed within the TC project at the Institute for Computational Linguistics of the University of Stuttgart. The TreeTagger has been successfully used to tag German, English, French, Italian, Spanish, Greek and old French texts and is easily adaptable to other languages if a lexicon and a manually tagged training corpus are available.
Sample output:
word lemma
The DT
TreeTagger NP
is VBZ – be
easy JJ
to TO
use VB
Tags: text data mining