02_browsing:01_sub_corpora
Differences
This shows you the differences between two versions of the page.
| Next revision | Previous revision | ||
| 02_browsing:01_sub_corpora [2022/01/05 16:30] – created simone.ueberwasser.ds.uzh.ch | 02_browsing:01_sub_corpora [2022/06/27 07:21] (current) – external edit 127.0.0.1 | ||
|---|---|---|---|
| Line 1: | Line 1: | ||
| ====== Sub-corpora ====== | ====== Sub-corpora ====== | ||
| - | The following sub-corpora are available: | + | The corpus all-tagged contains all SMS in all languages. Data for all languages except Romansh are tagged with TreeTagger. |
| + | |||
| + | Next to that, the following sub-corpora are available: | ||
| * deu-rftagged: | * deu-rftagged: | ||
| * deu-tagged: non-dialectal German data tagged with TreeTagger | * deu-tagged: non-dialectal German data tagged with TreeTagger | ||
| Line 25: | Line 27: | ||
| * If you need specific information about an individual chat, you can select the SMS instead of the sub-corpus in the top left to get information such as languages contained, demographic information, | * If you need specific information about an individual chat, you can select the SMS instead of the sub-corpus in the top left to get information such as languages contained, demographic information, | ||
| - | {{ : | ||
| - | Figure 1: Information about a (sub-)corpus | ||
| On the right-hand side of the information window, you see which annotations are available to be queried for the selected sub-corpus. | On the right-hand side of the information window, you see which annotations are available to be queried for the selected sub-corpus. | ||
