02_browsing:01_sub_corpora

Differences

This shows you the differences between two versions of the page.

02_browsing:01_sub_corpora [2022/01/05 17:30] Simone Ueberwasser
02_browsing:01_sub_corpora [2022/06/27 09:21] (current) – external edit 127.0.0.1
Line 1:
 ====== Sub-corpora ======
-The following sub-corpora are available:
+The corpus all-tagged contains all SMS in all languages. Data for all languages except Romansh are tagged with TreeTagger.
+
+Next to that, the following sub-corpora per language are available:
   * deu-rftagged: non-dialectal German data tagged with RF-Tagger
   * deu-tagged: non-dialectal German data tagged with TreeTagger
02_browsing/01_sub_corpora.1641400249.txt.gz · Last modified: 2022/06/27 09:21 (external edit)
