Sunday, February 12, 2012

Lexical Categories at the Edge of the Word (Onnis, Christiansen) Cognitive Science 32 (pp184-221)

Today I read this article about the use of bound morphemes and their role in helping children who are learning language correctly identify parts of speech and the relationships between words. The goal of the study was to show that language learners may learn just as much from the affixes of words as from their stems when trying to figure out how words work together to compose sentences.

Experiment 1 consisted of a database analysis of previously collected child utterances. It analysed how well the category of a word, or its usage in a sentence, could be identified from its affixes alone. This was paired with a control to show how the rate of correctly identified words would compare if the affix groupings were randomly assigned.

Even though the statistics of Experiment 1 showed that nouns were the part of speech most often correctly identified by affix, the first grouping of affixes, which the researchers consider the most important, is related to verbs.

"the following morphological affixes (in parentheses is the class that they most often predicted, N = Noun; V = Verb; O = Other). The cues are in decreasing order of importance: -ing (V), -ed (V), -y (O), -er (N), -or (N), -(o)ry (N), -ite (N), -id (V), -ant (N), e- (N), -ite (O), -ate (N), un- (N), -ble (O), -ive (O), an- (N), pre- (N), out- (N), -s (unvoiced; N), bi- (N), -ine (N)." (193)

The first five, also the most important for identifying the part of speech, are verb related: -ing (V), -ed (V), -y (O), -er (N), -or (N). Even the affixes which are not directly used to identify verbs are used to distinguish verbs from other parts of speech, and thus they do relate to them. -y is common to adjectives even when they cannot be made from verbs, so there seem to be multiple factors which contribute to -y being associated with this specific part of speech. In any case, there are verbs which take each of these affixes to form different parts of speech (dust (V): dusting (V), dusted (V), dusty (adj), duster (N)). One question I have about this is how some of these -y adjectives, which have no corresponding verb form, noun form, or either, came to be. Do their etymologies reveal this? I can be "lazing on a Sunday afternoon", but was there ever a time when this would have made me a "lazer"? Likewise, why can't we describe an actor, who acts, as "acty"?
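To make the idea concrete for myself, here is a minimal sketch of how a handful of the quoted cues could be used as a plain lookup on the "dust" family. This is only a toy: the function name `guess_category`, the spelling-based matching, the default label, and the choice of which cues to include are all my own assumptions, and it is not the statistical analysis the paper actually ran.

```python
# Toy lookup over a SUBSET of the quoted cues, kept in the quoted order of importance.
# Assumption: cues are matched on spelling, which is only a rough stand-in for morphology.
CUES = [
    ("ing", "suffix", "V"),
    ("ed", "suffix", "V"),
    ("y", "suffix", "O"),
    ("er", "suffix", "N"),
    ("or", "suffix", "N"),
    ("un", "prefix", "N"),
    ("pre", "prefix", "N"),
    ("out", "prefix", "N"),
]


def guess_category(word, default="O"):
    """Return the class of the first (most important) cue that matches the word."""
    w = word.lower()
    for affix, kind, label in CUES:
        # Require at least one character left over to act as a stem.
        if len(w) <= len(affix):
            continue
        if kind == "suffix" and w.endswith(affix):
            return label
        if kind == "prefix" and w.startswith(affix):
            return label
    # No cue matched: fall back to a default label (hypothetical choice).
    return default


for w in ["dusting", "dusted", "dusty", "duster", "dust"]:
    print(w, guess_category(w))
```

A lookup like this will happily mislabel words such as "red" or "butter", which is part of why a control with randomly assigned groupings matters.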

The second part of the study addressed the fact that Experiment 1 assumed that children at this stage are able to separate words into Stem + Affix and thus are able to identify the affix. The identification of the affix is key to understanding the results of Experiment 1. A different study in 2001 with 18- to 21-month-old children demonstrated that children were able to identify the correct usage of affixes by speakers even if they could not use these affixes themselves, which demonstrates a degree of understanding of the affixes. The study goes on to suggest that this may come from a language learning mechanism in the brain which is biased toward the beginnings and endings of words.

This phenomenon of affixes is common in almost every language worldwide. This is something I can tie in with the behavioral/economic psychology that Kahneman talked about in "Thinking, Fast and Slow". The cognitive process that he identified as "System 1" weights certain parts of our experience more heavily than others. One story that he told in a TED talk involves someone whose experience of listening to a symphony was effectively ruined by the very end of the recording, which was damaged. Kahneman goes on to explain that it was not the experience, but the memory of the experience, which was ruined by the "end-bias". Is it possible that a similar cognitive process is happening here, where the memory of the language experience is biased toward the end and beginning of the word? The only difference is the size of the experience: the frame has shifted from a long, attention-draining symphony to an incredibly short moment.

Experiment 2 looked at how the same tool may function on the phoneme level instead of the morpheme level. This is another device used by the researchers to remove any assumptions about the language learner's understanding of the Stem + Affix structure of the given language. The researchers essentially wanted to look at whether the same cognitive device can describe a phonological process instead of a morphological one. The result was that 66% of nouns were identified correctly versus a baseline of 34% (80% were identified correctly in the morpheme-centric Experiment 1), 56% of verbs were identified against a baseline of 32% (54% were identified in Experiment 1), and 47% of other words were identified versus a baseline of 33% (about 30% were identified in Experiment 1). Overall, the difference between this word-edge phoneme study and the morpheme study was fairly small, though it varies by category. How much of this comes from the fact that many of the affix morphemes also stand alone as phonemes or short phoneme sequences (-ing, -s, -ed (-/d/), -ch or -tch)? It is important to note that it can be difficult to establish causation here. Are morpheme-affixes tools for language learners because they also often break down along the same lines as phonemes? Or were speakers of the language more likely to add the phonemes as morphemes because of the ease that this shared trait brings? (I'm not sure these are the right questions to reflect this idea, but surely it is not clear whether one influences the other, or which influences which.)
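To keep these numbers straight, here is a small sketch of the arithmetic, using only the percentages quoted in this post. The dictionary layout and the `summarize` helper are my own, and the Experiment 1 figure for "other" is approximate ("about 30%").

```python
# Percentages as quoted in this post; "other" in Experiment 1 is approximate.
RESULTS = {
    "nouns": {"exp2": 66, "baseline": 34, "exp1": 80},
    "verbs": {"exp2": 56, "baseline": 32, "exp1": 54},
    "other": {"exp2": 47, "baseline": 33, "exp1": 30},
}


def summarize(results):
    """Percentage-point gaps: Experiment 2 vs. its baseline, and vs. Experiment 1."""
    rows = {}
    for category, r in results.items():
        rows[category] = {
            "gain_over_baseline": r["exp2"] - r["baseline"],
            "exp2_minus_exp1": r["exp2"] - r["exp1"],
        }
    return rows


for category, row in summarize(RESULTS).items():
    print(
        f"{category}: {row['gain_over_baseline']:+d} points over baseline, "
        f"{row['exp2_minus_exp1']:+d} points relative to Experiment 1"
    )
```

These are simple differences in percentage points and say nothing about statistical significance, which would need the paper's own tests.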

The third experiment tested the use of word-edges to identify unknown words which lacked a syntactic context. The idea here is that the only tool available is the word endings, whereas the other experiments assumed, or at least left unclear, the role of the stem. This experiment specifically tries to avoid stems which may be easily identified on their own, looking only at the role that the affixes play in the identification process. The study looked at three groups. All groups were trained on 500 word types and tested on 4,230 words. This method worked best to identify verbs overall.
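The train-then-test setup can be pictured with a toy version. This is only a sketch under heavy assumptions: first and last letters stand in for first and last phonemes, the five training words are made up by me, and a simple majority vote replaces whatever model the paper actually used. The names `edges`, `train`, and `predict` are hypothetical.

```python
from collections import Counter, defaultdict


def edges(word):
    """First and last character, as a crude stand-in for first and last phoneme."""
    w = word.lower()
    return (w[0], w[-1])  # assumes a non-empty word


def train(labeled_words):
    """Count how often each edge pair occurs with each category."""
    counts = defaultdict(Counter)
    for word, label in labeled_words:
        counts[edges(word)][label] += 1
    return counts


def predict(counts, word, default="O"):
    """Majority category for the word's edge pair, or a default if unseen."""
    seen = counts.get(edges(word))
    if not seen:
        return default
    return seen.most_common(1)[0][0]


# Made-up toy training set (not data from the paper).
TRAIN = [("dogs", "N"), ("days", "N"), ("walked", "V"), ("wanted", "V"), ("happy", "O")]
model = train(TRAIN)

# None of these test words appear in the training set; only their edges are used.
for w in ["ducks", "waited", "zebra"]:
    print(w, predict(model, w))
```

The point of the toy is just that the test words are never seen in training, so any correct guess has to come from the edges rather than from a known stem.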
