
Results View

When you start your first search from the start page, you get an image similar to this:

Please keep in mind: the results do not show the number of SMS in which a certain token occurs, but the number of tokens found. One SMS may therefore appear two or more times, which means that the token was found more than once in that SMS. Each token is highlighted as it is counted.
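The difference between counting tokens and counting SMS can be illustrated with a short sketch. It is not the corpus tool's own code: the sample messages, the function name count_hits and the use of Python's re module are assumptions made purely for illustration.

```python
import re

# Hypothetical sample messages, invented for this illustration
# (not taken from the corpus).
messages = [
    "hoi hoi, bisch scho dehei?",
    "hoi zäme",
    "bis spöter",
]

def count_hits(messages, pattern):
    """Return (token_hits, sms_with_hits) for a regular expression."""
    regex = re.compile(pattern)
    # Number of matches in each individual SMS.
    per_sms = [len(regex.findall(text)) for text in messages]
    token_hits = sum(per_sms)                      # what the results view counts
    sms_with_hits = sum(1 for n in per_sms if n)   # distinct SMS containing the token
    return token_hits, sms_with_hits

token_hits, sms_with_hits = count_hits(messages, r"\bhoi\b")
print(token_hits, sms_with_hits)
```

In this sketch the first message contains the token twice, so the token total is higher than the number of SMS that contain it.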

The different parts can be interpreted as follows:

Total sms found (1)

In the lilac area at the top, you see the total number of tokens found for your search. For a simple search, only the total is given. For a search that derives from a selection of personal data, you also see the regular expressions you searched for or information about the subset you are presented with. In either case, you see the sub-corpus you searched in: sms_extended_all, as seen in the example above, means that you searched the whole corpus, while sms_extended_all_known means that you searched only SMS for which demographic information is available.

Just below, you can navigate through your selection. In the options, you chose how many SMS are shown on one page. You can now move between pages with the following keys:

  • >> moves one page forward
  • << moves one page backward
  • |< moves to the beginning of the selection
  • >| moves to the end of the selection
  • Typing a number into the entry field and then pressing Show Page jumps directly to a specific page.

The navigation information just to the right of the navigation keys shows your current position within the results. It tells you, for example, that you are seeing solutions 1-20 of the total mentioned above, and which page is active out of the total number of pages.
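The arithmetic behind such a position display can be sketched as follows. This is only an assumption about how a paged result list is typically computed: the function name page_info is hypothetical, and clamping an out-of-range page number to the nearest valid page is a design choice of this sketch, not documented behaviour of the corpus interface.

```python
import math

def page_info(total_hits, per_page, page):
    """Return (first, last, page, total_pages) for a 1-based page number."""
    if total_hits < 0 or per_page < 1:
        raise ValueError("total_hits must be >= 0 and per_page >= 1")
    total_pages = max(1, math.ceil(total_hits / per_page))
    # Assumption: an out-of-range page is clamped to the nearest valid page.
    page = min(max(page, 1), total_pages)
    first = (page - 1) * per_page + 1 if total_hits else 0
    last = min(page * per_page, total_hits)
    return first, last, page, total_pages

first, last, page, total_pages = page_info(total_hits=45, per_page=20, page=1)
print(f"solutions {first}-{last} of 45, page {page} of {total_pages}")
```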

View types

You can view your selection of SMS in two ways and switch between them by pressing the KWIC View button. The views are the following:

The default, shown above, is the Sentence view. It shows the whole text of each SMS in your selection, from beginning to end. The token you are looking for is shown in bold, unless you searched for all SMS; in that case no bold highlighting is used, because otherwise the whole SMS would be highlighted.

The KWIC (keyword in context) view, on the other hand, literally puts your search token in the center. In this view, all the tokens you searched for are listed one beneath the other, with part of the surrounding text to the left and right of each token. This is an example of a KWIC view:
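As a rough illustration of how a KWIC layout is built (not the output or the code of the corpus interface), the following sketch lines up every match in the same column with a fixed amount of context on each side. The function name kwic_lines, the width parameter and the sample text are assumptions.

```python
import re

def kwic_lines(text, pattern, width=15):
    """Return one line per match, with the match aligned in a fixed column."""
    lines = []
    for m in re.finditer(pattern, text):
        left = text[max(0, m.start() - width):m.start()]
        right = text[m.end():m.end() + width]
        # Right-align the left context so every keyword starts at column width + 1.
        lines.append(f"{left:>{width}} {m.group(0)} {right}".rstrip())
    return lines

# Hypothetical sample text, invented for this illustration.
for line in kwic_lines("hoi zäme, hoi du", r"hoi", width=5):
    print(line)
```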

Main window (4)

In the main window, your search results are shown. You get the following information:

  • No: This is a running number for your query. You can use it as a reference when talking to somebody who sees the same view. However, you cannot use this number as a reference in papers, since it will change for every search.
  • Sender ID: This column identifies the person who wrote the SMS. If the number is lower than 2,000, the person filled in the questionnaire and you can see the personal data by clicking on the number. If the person did not fill in the questionnaire, clicking on the number gives you a list of all SMS written by this person.
  • Timestamp: this is the time and date when this SMS was sent to us.
  • Language: the main language of the SMS.
  • The SMS itself, represented in the view you selected above.

Information about you (5)

The Information Processed for [person] at [location] is needed by the technical support in case of problems.

Further information (6)

In the top right, you can further process your results. You have the following options:

  • New Query: takes you back to the start page, where you can define a new query.
  • Edit Query: also takes you back to the start page, but this time the selection you made for this query is still entered in the individual fields. This is the best option if you defined a RegEx query and, upon seeing the results, find that the query should have been slightly different.
  • Frequency Distribution: gives an overview of the informants who contributed to your query, i.e. it shows you data for all the users whose SMS you are looking at. You can search for factors such as age, sex or education.
  • Regex List: When you run a RegEx query, you are often looking for several different forms at once. You might, e.g., be looking for occurrences of ;-) and :-) by typing in the regular expression <//[;:]-\)//>. This query would result in two groups of tokens, ;-) and :-). With the RegEx List function, you can see how often each of those expressions appears in the corpus.
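The grouping that the RegEx List performs can be approximated outside the interface with a few lines of code. This is a sketch under stated assumptions: it uses Python's re module, it drops the corpus-specific <// ... //> delimiters and keeps only the inner pattern [;:]-\), and the sample messages and the function name regex_list are invented for illustration.

```python
import re
from collections import Counter

def regex_list(messages, pattern):
    """Count how often each distinct matched form occurs across all messages."""
    regex = re.compile(pattern)
    counts = Counter()
    for text in messages:
        counts.update(m.group(0) for m in regex.finditer(text))
    return counts

# Hypothetical sample messages, not taken from the corpus.
sample = ["bis morn :-)", "haha ;-) ;-)", "ok"]
counts = regex_list(sample, r"[;:]-\)")
for form, n in counts.most_common():
    print(form, n)
```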


In this column you see the three types of language tagging that are applied to the SMS, separated by semicolons. The first value (M) designates the main language, the second one (B) borrowings and the third one (N) nonce-borrowings. The semicolons are always present. If you have the value "deu;;eng", for example, this specific SMS has Standard German as its main language, contains no borrowings (because the field between the first and the second semicolon is empty) but contains at least one English nonce-borrowing (because of the "eng" after the second semicolon).
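If you export such values and want to process them yourself, the three fields can be split apart as in the following sketch. The names LanguageTags and parse_language_tags are hypothetical, and each field is kept as a raw string because the format of a field that holds more than one language is not described here.

```python
from typing import NamedTuple

class LanguageTags(NamedTuple):
    main: str              # M: main language
    borrowings: str        # B: borrowings (empty string if none)
    nonce_borrowings: str  # N: nonce-borrowings (empty string if none)

def parse_language_tags(value):
    """Split a value such as "deu;;eng" into its three fields."""
    parts = value.split(";")
    if len(parts) != 3:
        raise ValueError(f"expected two semicolons, got: {value!r}")
    return LanguageTags(*(part.strip() for part in parts))

tags = parse_language_tags("deu;;eng")
print(tags.main, repr(tags.borrowings), tags.nonce_borrowings)
```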

In the RegEx List you can see how often your search term can be found in which language.

Please don't forget to cite the corpus in your work.
Topic revision: r1 - 29 Apr 2015, SimoneUeberwasser
This site is powered by Foswiki. Copyright © by the contributing authors. All material on this collaboration platform is the property of the contributing authors.

The corpora and documentation are licensed under the Creative Commons license: Attribution + Noncommercial:
- Licensees may copy, distribute, display, and perform the work and make derivative works based on it only for noncommercial purposes.
- Licensees may copy, distribute, display and publish the work and make derivative works based on it only if they give the author or licensor the credits as follows:
Stark, Elisabeth; Ueberwasser, Simone; Ruef, Beni (2009-2014). Swiss SMS Corpus. University of Zurich.
