Differences
This shows you the differences between two versions of the page.
02_browsing:01_sub_corpora [2022/01/05 17:30] – created Simone Ueberwasser | 02_browsing:01_sub_corpora [2022/06/27 09:21] (current) – external edit 127.0.0.1 | ||
---|---|---|---|
Line 1: | Line 1: | ||
====== Sub-corpora ====== | ====== Sub-corpora ====== | ||
- | The following sub-corpora are available: | + | The corpus all-tagged contains all SMS in all languages. Data for all languages except Romansh are tagged with TreeTagger. |
+ | |||
+ | In addition, the following sub-corpora are available: | ||
* deu-rftagged: | * deu-rftagged: | ||
* deu-tagged: non-dialectal German data tagged with TreeTagger | * deu-tagged: non-dialectal German data tagged with TreeTagger | ||
Line 25: | Line 27: | ||
* If you need specific information about an individual chat, you can select the SMS instead of the sub-corpus in the top left to get information such as languages contained, demographic information, | * If you need specific information about an individual chat, you can select the SMS instead of the sub-corpus in the top left to get information such as languages contained, demographic information, | ||
- | Figure 1: Information about a (sub-)corpus | ||
On the right-hand side of the information window, you see which annotations are available to be queried for the selected sub-corpus. | On the right-hand side of the information window, you see which annotations are available to be queried for the selected sub-corpus. |
02_browsing/01_sub_corpora.txt · Last modified: 2022/06/27 09:21 (external edit)