03_processing:05_normalization
Differences
This shows you the differences between two versions of the page.
03_processing:04_normalization [2022/01/04 13:30] – Simone Ueberwasser | 03_processing:04_normalization [2022/01/04 13:32] – Simone Ueberwasser | ||
---|---|---|---|
Line 56: | Line 56: | ||
In German, nouns are capitalized. Regardless of the capitalization in the SMS layer, nouns are capitalized in the normalized layer to help a PoS tagger recognize them. | In German, nouns are capitalized. Regardless of the capitalization in the SMS layer, nouns are capitalized in the normalized layer to help a PoS tagger recognize them. | ||
- | //Spelling// | + | ====Spelling==== |
Assuming that spelling in SMS is unorthodox throughout, we decided to adjust spelling to the dictionary form of the lemma that fits the syntactic position. In German, for example, there is a definite neuter article das and a conjunction dass (e.g. //er sagte, dass er komme// ('he said **that** he would come' | Assuming that spelling in SMS is unorthodox throughout, we decided to adjust spelling to the dictionary form of the lemma that fits the syntactic position. In German, for example, there is a definite neuter article das and a conjunction dass (e.g. //er sagte, dass er komme// ('he said **that** he would come' |
03_processing/05_normalization.txt · Last modified: 2022/06/27 09:21 by 127.0.0.1