Legal information

Two stages of processing, two querying tools

The Swiss SMS corpus is available in two versions based on the same data. Processing stopped at a different stage for each version, so different querying tools are needed to investigate them. The first processing steps focussed on making the corpus available in a form that guarantees the anonymity of the informants and is easily accessible to researchers. This version is available in the SMS navigator. Taking this version as a starting point, further processing created the version available in ANNIS. These further steps include tokenization, normalization and PoS tagging.

Please keep in mind that you only have full access to the two corpora if you are a registered user.

In the following texts, we explain the processing of the corpora in detail. Please keep in mind that this is a continuous description, i.e. all the steps explained as the processing for the SMS navigator were then used as a basis for the processing of the ANNIS corpus.

Please don't forget to cite the corpus in your work.
Topic revision: r10 - 10 May 2015, SimoneUeberwasser
 

This site is powered by Foswiki. Copyright by the contributing authors. All material on this collaboration platform is the property of the contributing authors.


The corpora and documentation are licensed under the Creative Commons license: Attribution + Noncommercial:
- Licensees may copy, distribute, display, and perform the work and make derivative works based on it only for noncommercial purposes.
- Licensees may copy, distribute, display and publish the work and make derivative works based on it only if they give the author or licensor the credits as follows:
Stark, Elisabeth; Ueberwasser, Simone; Ruef, Beni (2009-2014). Swiss SMS Corpus. University of Zurich. https://sms.linguistik.uzh.ch
