Part of speech tagging
Wikipedia defines PoS tagging as follows: "In corpus linguistics, part-of-speech tagging (POS tagging or POST), also called grammatical tagging or word-category disambiguation, is the process of marking up a word in a text (corpus) as corresponding to a particular part of speech, based on both its definition, as well as its context, i.e. relationship with adjacent and related words in a phrase, sentence, or paragraph." In this corpus, we applied PoS tagging to the German, French and Italian parts using Helmut Schmid's TreeTagger. For Romansh, unfortunately, no parameter file is available for TreeTagger, and in fact no other tools are available for this language either. In our corpus, the PoS annotation is stored in the layer pos in ANNIS.
German (both dialectal and non-dialectal)
French
Italian
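To work with the tagger output outside ANNIS, it helps to know that TreeTagger writes one token per line, with tab-separated columns for the word form, the PoS tag and, when lemma output is enabled, the lemma. The following is a minimal sketch of reading that format in Python. The names `TaggedToken` and `parse_treetagger_output` are hypothetical helpers introduced here, and the sample lines are a hand-written illustration using STTS-style tags, not actual output from this corpus.

```python
from typing import List, NamedTuple


class TaggedToken(NamedTuple):
    """One line of tagger output: word form, PoS tag, lemma."""
    word: str
    pos: str
    lemma: str


def parse_treetagger_output(text: str) -> List[TaggedToken]:
    """Parse tab-separated word/tag/lemma lines into TaggedToken records.

    Lines that do not have exactly three columns (for example blank
    lines or markup passed through unchanged) are skipped.
    """
    tokens = []
    for line in text.splitlines():
        parts = line.split("\t")
        if len(parts) != 3:
            continue
        tokens.append(TaggedToken(*parts))
    return tokens


# Hand-written sample in the three-column format (illustrative only).
sample = (
    "Das\tART\tdie\n"
    "ist\tVAFIN\tsein\n"
    "ein\tART\teine\n"
    "Test\tNN\tTest\n"
    ".\t$.\t.\n"
)

tokens = parse_treetagger_output(sample)
nouns = [t.word for t in tokens if t.pos == "NN"]
```

Keeping each token as a small named record makes it straightforward to filter by tag or to compare the tags against a manually corrected sample.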
Precision
Our test gave the following precision for the respective sub-corpora:
Please don't forget to cite the corpus in your work.