start
Differences
This shows you the differences between two versions of the page.
start [2022/01/04 12:50] – [Using the corpus] Simone Ueberwasser | start [2022/01/26 14:18] – Simone Ueberwasser | ||
---|---|---|---|
Line 4: | Line 4: | ||
===== The corpus ===== | ===== The corpus ===== | ||
- | The Swiss SMS corpus consists of 25'947 SMS (~650' | + | The Swiss SMS corpus consists of 25'947 SMS (~650' |
===== Using the corpus ===== | ===== Using the corpus ===== | ||
Line 11: | Line 11: | ||
* Quote the source of the data as "Swiss SMS corpus" | * Quote the source of the data as "Swiss SMS corpus" | ||
- | Since the corpus is available on the same platform as the data from the sister-project [[https:// | + | If you need help browsing the corpus, please check the chapter [[02_browsing|Browsing]]. |
+ | |||
+ | Since the corpus is available on the same platform as the data from the sister-project [[https:// | ||
+ | * deu-rftagged: | ||
+ | * deu-tagged: non-dialectal German data tagged with TreeTagger | ||
+ | * fra-tagged: French data tagged with TreeTagger | ||
+ | * gsw-rftagged: | ||
+ | * gsw-tagged: Swiss German data where the normalized data was tagged with TreeTagger | ||
+ | * ita-tagged: Italian data tagged with TreeTagger | ||
+ | * roh: Romansh data | ||
+ | |||
+ | |||
+ | For more information about the WhatsApp | ||
===== How to quote ===== | ===== How to quote ===== | ||
====Quoting the corpus==== | ====Quoting the corpus==== | ||
- | Stark, Elisabeth; Ueberwasser, | + | Stark, Elisabeth; Ueberwasser, |
====Quoting the corpus documentation==== | ====Quoting the corpus documentation==== | ||
- | Ueberwasser, | + | Ueberwasser, |
More resources that document the creation of the corpus: | More resources that document the creation of the corpus: |
start.txt · Last modified: 2022/09/12 19:18 by Stefan Bircher