Facts and figures

The data in this corpus were collected and first released to the research community in 2009. Further processing continued until 2015 to create the corpus as it stands now. Papers and presentations from that period describe the corpus as it was available at the time. If you publish with data from this corpus now, please do not quote those older publications; use and quote the up-to-date information in this section instead.

We provide information and statistics about:

The corpus
The SMS in the corpus
The participants
Mother tongues
Languages in the SMS