Code-switching and types of multilingual communities

Developed the principles that a corpus of texts containing code-mixing should have and built a working prototype of Udmurt/Russian Code-Mixing Corpus. Discussed different approaches to studying code-mixing and various classifications of code-mixing.

Category: Programming, computers and cybernetics
Type: diploma thesis
Language: English
Date added: 30.12.2015
File size: 1.7 M


Additionally, when a word is in L1 but carries L2 morphological markers within an L2 context, it should also be considered a borrowing (one can also add a special marker for such cases). However, if the context is mixed (the preceding word in L1 and the following one in L2), it should be considered a code-switch in the middle of the word and therefore a violation of the free-morpheme constraint. Such an example, although judged ungrammatical, is given in (Budzhak-Jones 1995), a study of English/Ukrainian code-mixing:

(2) *So you go to a storu des' iskupytysja

-M.Gen somewhere to-shop-Refl
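The decision rule described before example (2) can be sketched in a few lines of Python. This is only an illustration: the function name, the language codes and the label strings are hypothetical, and the languages of the stem, of the morphology and of the neighbouring words are assumed to be known already.

```python
from typing import Optional


def classify_mixed_word(stem_lang: str, morph_lang: str,
                        prev_lang: Optional[str],
                        next_lang: Optional[str]) -> str:
    """Classify a word whose stem and inflection may come from different languages.

    All names and labels here are illustrative assumptions, not corpus tags.
    """
    if stem_lang == morph_lang:
        return "plain"
    neighbours = [lang for lang in (prev_lang, next_lang) if lang is not None]
    # L1 stem with L2 morphology inside a purely L2 context: borrowing.
    if neighbours and all(lang == morph_lang for lang in neighbours):
        return "borrowing"
    # Preceding word in L1, following word in L2: switch inside the word,
    # i.e. a candidate violation of the free-morpheme constraint.
    if prev_lang == stem_lang and next_lang == morph_lang:
        return "word-internal switch"
    return "unclear"
```

For `storu` in (2), with an English stem, Ukrainian case morphology, an English word before it and a Ukrainian word after it, the sketch returns `"word-internal switch"`.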

Alternation is much less controversial and easier to mark. Each word of L2 should get an `alternation' marker, and the first word of the segment should also get the `first' marker. If the switch occurs on a word that is homonymous in the two languages, that word should be marked as a trigger.

When marking congruent lexicalization, we check whether the sentence contains several insertions from the other language and, if so, annotate every word with the `congruent lexicalization' marker (in our case, cong.lex). As with the other types of code-mixing, the first word of the sentence is additionally marked as first. This is essential for future search queries: if, for instance, the user wants to find all occurrences of congruent lexicalization, the engine will search for and show only the first word of each occurrence, to avoid repetition.
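The role of the `first' marker in search can be illustrated with a minimal sketch. The token representation (a dictionary with a `marks` set) and the function name are assumptions made for this example only; the actual search platform is not reproduced here.

```python
def find_occurrences(sentences, marker):
    """Return (sentence index, token index) pairs, one per marked segment.

    Only tokens carrying both the requested marker and 'first' are reported,
    so a segment of several marked words yields a single hit.
    """
    hits = []
    for s_idx, sentence in enumerate(sentences):
        for t_idx, token in enumerate(sentence):
            if marker in token["marks"] and "first" in token["marks"]:
                hits.append((s_idx, t_idx))
    return hits
```

A sentence whose three words all carry `cong.lex`, with `first` on the first of them, is returned once rather than three times.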

In addition, someone might want to study code-mixing with regard to its syntactic constraints. To be able to do so, some annotation of syntax is necessary. As the equivalence constraint is still a rather controversial topic, it is something that a corpus might help with (as we discussed through the notion of government). If both languages have syntactic chunkers (shallow parsers, i.e. tools that identify NPs and other kinds of constituents), two annotations can be offered, so that not only the observance of the equivalence constraint in general can be checked, but more specific hypotheses as well, for instance the potential examples offered in (Muysken 2000):

a. V(Eng) NP(Sp) - 2

b. V(SP) a NP(Eng) - 1

c. *V(Eng) a NP(Sp) - 3

d. *V(Sp) NP(Eng) - 0

However, a chunker is not always available or easy to build. This does not mean that no syntactic constraints or theories can be checked without one. A simple solution is to look at the part of speech and grammatical characteristics of adjacent words:

Verb (Sp) + Noun (Eng + Acc).

This type of search can also reveal what code-mixing patterns are like when different languages have different variants within one grammatical category (for instance, 9 Russian cases vs 15 Udmurt cases).
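A query such as `Verb (Sp) + Noun (Eng + Acc)` amounts to matching two adjacent tokens against language, part-of-speech and grammatical tags. The following is a minimal sketch under the assumption that every token is a dictionary with `lang`, `pos` and a set of tags in `gramm`; these field names and the tag values are hypothetical.

```python
def matches(token, spec):
    """Check language, part of speech and required grammatical tags."""
    return (token["lang"] == spec["lang"]
            and token["pos"] == spec["pos"]
            and spec.get("gramm", set()) <= token["gramm"])


def find_bigrams(tokens, first, second):
    """Return the positions where `first` is immediately followed by `second`."""
    return [i for i in range(len(tokens) - 1)
            if matches(tokens[i], first) and matches(tokens[i + 1], second)]


# Hypothetical query corresponding to: Verb (Sp) + Noun (Eng + Acc)
VERB_SP = {"lang": "sp", "pos": "V"}
NOUN_ENG_ACC = {"lang": "eng", "pos": "N", "gramm": {"acc"}}
```

Changing the case tag in the second specification is enough to compare how the different case inventories of two languages behave at switch points.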

7. Udmurt/Russian Code-Mixing Corpus

7.1 Why Udmurt/Russian?

After working out the principles, I started building a prototype of a corpus to test them. This research initially grew out of work on an online annotated corpus of the Udmurt language that Timofey Arkhangelskiy and I have been building. While collecting texts and annotating them morphologically, we constantly came across Russian words and phrases inserted into Udmurt sentences, which could not be annotated with an Udmurt parser alone. In fact, in some texts the mixing occurred so often that we had to remove them from the corpus, as they only polluted the data by remaining unannotated. That project was aimed at researching standard Udmurt and its dialects, so we did not try to work around code-mixing. However, we realized that this is a relatively common situation, especially when annotating blogs and social network pages, as people tend to use the constructions, phrases and grammar of spoken and more informal language in general, which includes switching between languages. All research on this topic, however, turned out to have been conducted with the help of manually collected (mostly recorded) and manually annotated corpora. Therefore, having all this invaluable internet data, we decided that there was an urgent need for a way to annotate switches. We wanted to be able both to mark foreign words in relatively `clean' data and to work on code-mixing as a separate area of linguistics. So, out of pure enthusiasm for solving a problem, combined with the absence of any research on the topic for these particular languages, we started working on a corpus of Udmurt/Russian code-switching.

7.2 General Remarks

Udmurt is a Finno-Ugric language of the Uralic family. It is spoken by around 340,000 people in Russia according to the 2010 census. Udmurt, together with Russian, is an official language of the Republic of Udmurtia. It is also rather widespread in the Republics of Tatarstan, Bashkortostan and Mari El, as well as in the Kirov, Perm and Sverdlovsk Regions of Russia. Udmurt has four major dialects: Northern, Beserman, Southern, and South-central. Furthermore, there are other transitional and mixed dialects. The differences between the dialects are not very substantial and are mostly phonological rather than morphological (Alatyrev 1983).

Udmurt writing was created in the 18th century on the basis of the Russian script. However, the Udmurt alphabet took its final shape only at the beginning of the 20th century (therefore there are not many texts written before the middle of the 20th century). The alphabet consists of 38 letters, 33 of which are Russian, while the remaining 5 have diacritics: Ӵӵ, Ӟӟ, Ӝӝ, Ӧӧ, Ӥӥ.

As Udmurt is spoken mostly on the territory of Russia, it has been under a great amount of Russian influence due to language contact, lack of prestige and a history of suppression. As a result, most people who speak Udmurt are bilingual. There has been a very recent rise in support for the Udmurt language. A few electronic dictionaries have been compiled and a few Udmurt books reissued, but there are still not many resources for studying Udmurt from a linguistic perspective.

There are a few grammar books describing the Udmurt language, as well as printed and electronic dictionaries (the latter based on the printed ones). There is also a Corpus of the Udmurt language created at the University of Helsinki (Suihkonen 1998); it, however, has restricted access. Much of the work on the code-mixing corpus presented here is based on the work that we did on our Udmurt corpus.

Both our Udmurt corpus and the one created in the course of this work were built by adapting the search system of the Eastern Armenian National Corpus (EANC, http://www.eanc.net/).

Udmurt, unlike Russian, is an agglutinative language; it can therefore attach many suffixes to a stem, sometimes even recursively:

(21) лыдӟылытэмъяськылыны

лыдӟ - ыл - ыт - эмъяськ - ыл - ыны

read FREQ CAUS FICT FREQ INF

`to often pretend that somebody is making someone read often'

Udmurt syntax is similar to that of Russian, but not identical; for instance, Udmurt has many postpositions, whereas in Russian the same function is normally performed by prepositions. As for the lexicon, the dictionary that I had to work with (Kirillova 2008) contains an enormous number of Russian words, most of which are indeed established loans. This, for obvious reasons, causes trouble for automatically identifying the language of each word.

7.3 What Can We Expect?

There are a few things that we can suppose about Udmurt/Russian code-switching just by analyzing the grammar of the two languages, their historical and geographical situation, and what is known about code-mixing so far. First of all, we can assume that different authors prefer different code-mixing patterns, although within the limits of what this particular language pair can offer. Insertion is usually more common in general, but we might have problems distinguishing between loans and code-mixing due to the long language contact. Russian and Udmurt have very similar word order, and therefore congruent lexicalization is possible. As we have collected blogs that are mostly in Udmurt, we expect most alternations to be switches from Udmurt to Russian, but a few cases the other way around should also occur. We will be checking both the free-morpheme and the equivalence constraints; we expect both to be upheld.

7.4 Corpus Contents

The texts used for the corpus are those available online, mostly blogs and posts from social networks. Such texts are usually informal by nature. They represent a practically unique type of data, merging characteristics of both written and oral speech. As they are already written down, they offer a perfect opportunity for automatic language processing. In terms of code-mixing, they can advance research in sociolinguistics and psycholinguistics, as well as synchronic and diachronic studies. A few studies of code-mixing have already been conducted on Dutch-Moroccan and Dutch-Turkish internet sites (Dorleijn, Nortier 2009), and they have shown how much the patterns differ between language pairs. The authors state, however, that those differences are mostly due to the sociolinguistic situation rather than to typological considerations.

Nevertheless, there is no real evidence that typological differences between language pairings play no significant role. It is clear that some such differences exist, but nobody has been able to compile a typological sample big and diverse enough to analyze them properly.

The overall framework, pipeline and tools that can be used to build corpora and produce morphological annotation were developed in the course of the `Corpus linguistics' fundamental research programme of the Russian Academy of Sciences. Within this programme, corpora of Buryat, Kalmyk, Lezgian, Ossetic, Tatar and other languages of Russia have been developed (see (Arkhangelskiy 2012)). Common standards developed for these corpora include automated full morphological tagging, annotating the texts with metadata (including author, title, genre, year of creation, etc.), and providing the corpus with a publicly available online interface. The ideology and some of the tools used while building the corpora in focus originated in the project of the Eastern Armenian National Corpus (Daniel 2009). For a detailed description of the methods and tools, using the Ossetic National Corpus as an example, see (Arkhangelskiy et al. 2012).

The Udmurt/Russian corpus uses the search platform initially designed for the Eastern Armenian National Corpus and then adapted for use with numerous other corpora. The platform allows the users to make complex queries, including searching for certain lemmata, grammatical tags, punctuation surrounding the words, combinations of words and more. The results can be sorted in random order or according to parameters like text title, token (in alphabetic order), etc.

For each instance found in the corpus, the user can see a single sentence where that instance was found. The result may be expanded to a maximum of a 3-sentence context. This constraint solves the dilemma of copyrighted materials in an open access corpus: while any particular sentence can be found in any text, there is no way the user can read or extract the whole text or any significant part of it from the corpus.

After collecting the texts we had to create a grammatical dictionary, which is complicated by the fact that Udmurt parts of speech are often hard to distinguish. As I had to use a dictionary in which parts of speech were not marked, the decision on what part of speech a word belongs to was often made on the basis of its Russian translation. As the complexity of automatic annotation increased drastically, so did the number of mistakes. Some words, however, had their parts of speech indicated in the dictionary (most particles and many conjunctions), and the verbs were easy to identify due to their unique infinitive suffix -ны. Some lexemes, though, had to be removed from the dictionary completely: it contained many participles and gerunds, which are derived from verbs by means of very productive suffixes, so these forms can be generated automatically.

The morphological parser needs the dictionary and the grammar in a particular form. Automatic morphological annotation for Udmurt was carried out using data from the grammars (Alatyrev 1983), (Perevoshchikov 1962), (Winkler 2001) and the dictionaries (Butolina 1942), (Kirillova 2008), converted into the form required by UniParser, as we had done previously for the Corpus of Udmurt Language (http://web-corpora.net/UdmurtCorpus), built in 2014.

Each lexeme in the Udmurt grammar dictionary has a full form, a stem, its grammatical characteristics, an inflectional paradigm and a translation. The English translation is given in this paper for illustrative reasons only; the corpus itself contains only translations into Russian.

-lexeme

lex: веднаськыны

stem: веднаськ.

gramm: V,I

paradigm: connect_verbs-1

trans_ru: заниматься колдовством `to practice witchcraft'

lang: udm
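Entries of this shape are easy to read programmatically. The sketch below assumes only what the sample shows, namely a `-lexeme` header followed by `key: value` lines; it is not UniParser's own loader, and the function name is hypothetical.

```python
def parse_lexemes(text):
    """Parse '-lexeme' blocks of 'key: value' lines into dictionaries."""
    entries = []
    for raw in text.splitlines():
        line = raw.strip()
        if not line:
            continue  # blank lines between fields are ignored
        if line == "-lexeme":
            entries.append({})  # start a new entry
        elif entries and ":" in line:
            key, value = line.split(":", 1)
            entries[-1][key.strip()] = value.strip()
    return entries


SAMPLE = """
-lexeme
 lex: веднаськыны
 stem: веднаськ.
 gramm: V,I
 paradigm: connect_verbs-1
 trans_ru: заниматься колдовством
 lang: udm
"""
```

Calling `parse_lexemes(SAMPLE)` yields a list with one dictionary whose keys are the field names of the entry.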

The grammar dictionary, whatever it is based on, should be compiled in such a way that it contains as few forms that can be generated as possible. Therefore, entries such as the following should be eliminated:

-lexeme

lex: веднаськытыны

stem: веднаськыт.

gramm: V,I

paradigm: connect_verbs-1

trans_ru: заставить заниматься колдовством `to make someone practice witchcraft'

lang: udm
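One possible way to find such entries automatically is to look for stems that consist of another dictionary stem plus a productive suffix. The sketch below is only an assumption about how this could be done: the suffix list contains just the causative -ыт- seen in the entry above and would have to be filled in for real use, and the function name is hypothetical.

```python
# Hypothetical list of productive derivational suffixes; only the causative
# suffix from the example entry is included here.
PRODUCTIVE_SUFFIXES = ("ыт",)


def drop_derivable(lexemes):
    """Remove lexemes whose stem is another stem plus a productive suffix."""
    stems = {lx["stem"].rstrip(".") for lx in lexemes}
    kept = []
    for lx in lexemes:
        stem = lx["stem"].rstrip(".")
        derivable = any(
            stem.endswith(suffix) and stem[:-len(suffix)] in stems
            for suffix in PRODUCTIVE_SUFFIXES
        )
        if not derivable:
            kept.append(lx)
    return kept
```

A purely string-based check like this can produce false positives when a non-derived stem happens to end in the same letters, so its output would still need manual review.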

The translations should be as short and as accurate as possible, so that they fit on the screen and the user can glance through them quickly. If a translation cannot be shortened properly by automatic means (e.g. because the word has many different meanings), even a simple limit on the number of characters is better than long, bulky strings of words.
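A simple character limit of this kind might look as follows. The limit of 40 characters and the decision to cut at the last space are arbitrary choices made for this illustration, not values taken from the corpus.

```python
def shorten(translation, limit=40):
    """Cut a translation at the last space before `limit` and mark the cut."""
    if len(translation) <= limit:
        return translation
    # Keep whole words only; fall back to a hard cut if there is no space.
    cut = translation[:limit].rsplit(" ", 1)[0]
    return cut.rstrip(",; ") + "..."
```

Translations that already fit within the limit are returned unchanged.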

While working on the dictionary, we also have to create the inflectional paradigms. By combining stems with the corresponding paradigms we generate all possible forms that a word can have. Each cell of a paradigm has the following form:

-paradigm: Verb-pres-I-positive

-flex: .э

gramm: 3,sg,pres,I

These generated forms are then looked up in the texts, and when a form is found, the word gets the respective annotation. This means that if, in order to simplify the paradigms, we generate some surplus forms that do not actually exist, they simply will not be found in the texts. It is important, however, to check that such forms are not homonymous with anything else, because in that case the paradigm has to be changed.
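The generate-and-look-up procedure can be sketched as follows. The sketch assumes that the dot in a stem or inflection simply marks the place where the two are joined, which simplifies the real paradigm format; the stem and the inflection are taken from the samples above only to show the mechanics, and the source does not state that this particular paradigm applies to this lexeme.

```python
def generate_forms(stem, paradigm):
    """Join a stem with every inflection of a paradigm.

    `paradigm` is a list of (flex, gramm) pairs; dots mark the junction.
    """
    base = stem.rstrip(".")
    return {base + flex.lstrip("."): gramm for flex, gramm in paradigm}


def annotate(tokens, forms):
    """Attach the grammatical tags of every token found among the forms."""
    return [(token, forms.get(token.lower())) for token in tokens]


# Illustrative data only.
forms = generate_forms("веднаськ.", [(".э", "3,sg,pres,I")])
```

Forms that are generated but never occur in the texts stay in the lookup table without any effect on the annotation.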

Homonymy is very common: many Russian forms of the same word are homonymous with each other, Udmurt forms have homonyms within Udmurt, and, most unfortunately, many Udmurt words are homonymous with Russian words. Thus, when the homonymy cannot easily be resolved automatically through syntax, a word gets several annotations and, in a code-mixing corpus, naturally two language markers. In code-mixing, however, the homonymy of a word in two different languages may itself become a trigger for a switch.

Here I list some basic Udmurt and Russian linguistic characteristics, as some of them might be useful with regard to equivalence. Udmurt, as opposed to Russian, is an agglutinative language with mostly suffixal agglutination; there are, however, some fusional elements. Udmurt verbs take one of 4 tenses and 4 aspects, are singular or plural, and appear in one of 3 persons; they can be transitive or intransitive and belong to one of 2 conjugation types. Verbs have negative and positive forms. Nouns can be singular or plural and take one of 15 cases, as opposed to 10 in Russian (National Corpus of Russian Language). As in Russian, it is possible to generate gerunds and adverbial participles.

7.5 Annotation of Code-Mixing in the Corpus

7.5.1 Additional Principles

Following the principles discussed above, I have annotated insertion, congruent lexicalization and alternation. Some decisions apply to such an annotation for any language pair, such as the distinction between insertions and borrowings and whether a one-word switch at the end of the sentence should be considered alternation. But there are also some language-specific decisions that have to be made. First of all, there is a lot of homonymy; when it is significantly unbalanced in frequency (for instance, in one language the word is a very common pronoun, and in the other it is a noun that is rare both in general usage and in this particular corpus), removing the latter from the dictionary actually improves the accuracy and functionality of the corpus. Another modification that I had to make specifically for Udmurt/Russian concerns the construction `Russian infinitive + карыны (udm. to_do)', which is fairly common and certainly productive; it is therefore annotated as a construction borrowed into Udmurt.
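The `Russian infinitive + карыны' rule can be illustrated with a short sketch. The token fields (`lang`, `lemma`, `gramm`, `marks`), the tag `inf` and the decision to mark both words of the construction are assumptions made for this example.

```python
def mark_karyny_constructions(tokens):
    """Mark 'Russian infinitive + form of карыны' as a borrowed construction."""
    for prev, cur in zip(tokens, tokens[1:]):
        if (prev["lang"] == "rus" and "inf" in prev["gramm"]
                and cur["lang"] == "udm" and cur["lemma"] == "карыны"):
            # Assumption: both members of the construction get the mark.
            prev["marks"].add("borrowing")
            cur["marks"].add("borrowing")
    return tokens
```

Run before the general code-mixing annotation, such a rule would keep these Russian infinitives from being counted as insertions.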

7.5.2 Insertion

The corpus contains a very large number of insertions. One of the most frequently inserted elements is the Russian conjunction и `and' between clauses.

(22) Вуэ но тани со дорам куное и кутске ни аслаз мудрон кылъёсыныз мыным мадьыны.

There are also many interjections (23) and Russian idioms (24). One might argue that the latter are used for lack of a similar expression in Udmurt. Not being a native Udmurt speaker, I cannot make this statement myself; nevertheless, various psycholinguistic studies suggest that this is often the case.

(23) Пыдйылам султыса гинэ мон шоди - туннэ мӥ вордиськом... аххааа, ну ти монэ валады, может ӧд но валалэ…

(24) Атае третий десяток пошёл шуыса шоккетӥз.

My original hypothesis was that insertional code-mixing would be the most common type. And although I did come across it rather often, it is sometimes very hard to distinguish between borrowings and insertions. The Udmurt lexicon is so full of Russian words that it can be hard to tell whether a loan is established or whether it has been brought into the sentence for just this particular occasion. The strategy that I chose for working around this problem was to check whether the word exists in both dictionaries. Some of the Russian loans that are included in the Udmurt dictionary, however, have Udmurt equivalents that are much more widely used, which means that some of them should probably not be used for the annotation. The other major problem is that some Russian and Udmurt words are homonymous in forms that are not grammatically identical, which might lead to wrong output.
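The dictionary check can be sketched as below. The mapping from dictionary membership to a label is an assumption made for this illustration (a word found in both dictionaries is treated as an established loan, a word found only in the Russian one as an insertion), and the dictionaries are reduced to plain sets of lemmas.

```python
def classify_russian_word(lemma, udm_dict, rus_dict):
    """Label a word by its presence in the Udmurt and Russian dictionaries."""
    in_udm = lemma in udm_dict
    in_rus = lemma in rus_dict
    if in_udm and in_rus:
        return "borrowing"   # assumed: established loan
    if in_rus:
        return "insertion"   # assumed: brought in for this occasion
    if in_udm:
        return "udmurt"
    return "unknown"
```

Removing from `udm_dict` those loans that have more widely used Udmurt equivalents would move the corresponding words from the `borrowing` label to the `insertion` label.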

I have tried to solve some of these cases, but not all of them can be solved automatically through verb government or word order. Consequently, some mistakes may still occur; they should, however, be relatively easy to recognize manually.

7.5.3 Alternation

If a sentence starts in one language, switches into another at some point (once) and then finishes in that other language, it should get the alternational annotation. This also applies to switches of just one word, if that word is at the beginning or at the very end of the sentence. The first word after the switch gets a mark indicating that it is the first one, together with the number of words in this language in this sentence. If the first word is homonymous in Udmurt and Russian, it gets a trigger mark as well. This allows the user to find all the alternational switches and to narrow the search down to longer or shorter ones if needed. As I have discussed before, there are two types of alternation: central and peripheral. My annotation does not allow searching for them separately (at least not as of today), although a restriction on length may narrow the results down a little. Below are examples of both types from the corpus (chosen manually). Interestingly, although central alternation is generally more common in code-mixing, peripheral alternations seem to predominate overwhelmingly in Udmurt/Russian.
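The alternation rule can be sketched as follows. The sketch assumes that every token already has a single resolved language and that homonymous forms are supplied as a set; the field names, including `segment_len`, are hypothetical.

```python
def annotate_alternation(tokens, homonyms=frozenset()):
    """Annotate a sentence that switches language exactly once.

    Returns True if the sentence qualified and was annotated.
    """
    langs = [token["lang"] for token in tokens]
    switches = [i for i in range(1, len(langs)) if langs[i] != langs[i - 1]]
    if len(switches) != 1:
        return False  # no switch, or more than one
    segment = tokens[switches[0]:]
    for token in segment:
        token["marks"].add("alternation")
    first = segment[0]
    first["marks"].add("first")
    first["segment_len"] = len(segment)  # enables searching by length
    if first["form"].lower() in homonyms:
        first["marks"].add("trigger")
    return True
```

With the segment length stored on the first word, a query can be restricted to short or long alternations, which is the only handle on the central/peripheral distinction mentioned above.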

Peripheral alternation in the corpus:

(25) Нырысетӥ 200 страница вал напряжённой, интригующий женский роман.

`The first 200 pages were an intense, intriguing women's novel.'

Although `напряженной' exists in both Russian and Udmurt, grammatically it is in Udmurt here, as this is one of the cases where phonetically homonymous words in the two languages do not have the same morphological characteristics; I believe, however, that in this case `напряженной' could still be a trigger for the switch.

Central alternation in the corpus:

(26) А мы вообще не парились, но чай сектам.

`And we didn't bother at all and treated them with tea.'

As in the previous example, there is a word, чай `tea', that exists in both languages, but grammatically we can assume that it is in Udmurt here. Но also exists in both languages, and in both it can mean `but'. Even if we decide that it is an Udmurt word, we should still consider it a trigger due to its homonymy with the Russian conjunction.

7.5.4 Congruent lexicalization

Much more common than alternation in the Udmurt/Russian corpus is congruent lexicalization.

(27) И мыным тунсыко потылэ вал котькуд гужем, день военно-морского флота соос эшъёсыныз люкаськыса празновать карыло вал шуыса и вообще.

`I was interested to go out that summer, on the day of the Navy forces, they came together with their friends to celebrate and everything.'

The example (27) qualifies for congruent lexicalization, however note that `праздновать карыло' as I have discussed before is marked as borrowing due to the regularity of such formation (Russian infinitive + Udmurt `to do') in Udmurt.

A very interesting example is (28), here the reader may see how often the author switches back and forth.

(28) Окно - со стекло прозрачное, адӟиськод, мар луэ со сьӧрын, а чтобы лэсьтыны сое зеркало и чтобы адӟыны астэ гинэ и не замечать, мар луэ вокруг стеклоез покрытьтоно сереброен.

`A window is a transparent glass, you can see through it what's going on, and if you want to make a mirror out of it, to see just yourself, and not what's around, the glass has to be covered with silver.'

Покрытьтоно is an interesting borrowing; causative suffix -тоно is attached to a Russian infinitive `to cover'.

7.6 Checking the Constraints

7.6.1 Equivalence Constraint

I have checked every sentence from the examples for violations of the equivalence constraint, and none of them seems to show any deviation. The only exception is the relatively unusual word order in (28), which is not typical for either Russian or Udmurt, although it is grammatical in both: `стекло прозрачное' has Noun + Adj order, in contrast to the usual Adj + Noun.

Most of the alternations in the corpus happen on a trigger word, making a violation of the equivalence constraint less probable. The similar syntax of the two languages does not leave many opportunities for one either. Accordingly, all the examples that I managed to check turned out to be equivalent.

This, however, does not mean that the equivalence constraint is never violated in Russian/Udmurt code-mixing discourse, but rather that more precise research is needed to establish whether it is.

7.6.2 Free-Morpheme Constraint

Analyzing Udmurt/Russian code-mixing, it is often hard to distinguish between code-mixing and nonce-loans when the languages have been in such close contact for such a long time.

(29) В то время адямиос сыыче богатствоен нокинэ но не замечают солэн совесть не позволит бомжэн вераськыны таиз дась вераськыны но.

Interestingly, (29) might be an example of a violation according to the principles that we have worked out for annotation on the basis of (Budzhak-Jones 1995). The sentence starts in Russian, then one of the words gets Udmurt morphology, becoming a nonce-loan; but as the sentence continues in Udmurt, this amounts to a code-switch within one word.

7.7 Future Improvements

Although we have built a corpus according to the suggested annotation principles, there is always room for improvement. The priority for this Udmurt/Russian code-mixing corpus should be cleaning and expanding the Udmurt grammar dictionary.

As discussed above, checking the equivalence constraint might be enough for Udmurt/Russian code-switching due to the similar syntax of the two languages, but creating a chunker for Udmurt (a few Russian chunkers are already available) could potentially unify the process of code-mixing analysis.

There are ways to resolve homonymy through syntax, some of which we have implemented; however, there is still work to be done in this direction, and a chunker might also be helpful with this problem.

In addition, one of the all-time tasks is of course expanding the corpus and adding the texts to it.

8. Further Work

One of the main goals of this work was creating a unified system of code-mixing annotation, so that after combining it with morphological annotation a whole set of corpora could be built.

Building multiple corpora of language pairs that differ in morphology, syntax and word order, as well as of languages from the same family as opposed to languages from different ones, will make it possible to see a much wider picture of why and where code-mixing occurs in speech, to analyze existing constraint hypotheses and suggest new ones, and thus to work towards an extensive description of code-mixing in general. It will allow researchers to study code-mixing under different circumstances. For instance, Spanish/English (both SVO languages) versus Irish/English (VSO/SVO) versus Turkish/Dutch (SOV/V2); or Basque/Spanish (ergative/accusative) versus Udmurt/Russian (both accusative); or Ngen/French (agglutinative/fusional) versus Baoule/Ngen (agglutinative/agglutinative), etc.

Therefore, creating a whole set of corpora will make it possible to work on the typology of code-mixing all around the world.

Conclusion

In the course of this work I have developed the principles that a corpus of texts containing code-mixing should follow and built a working prototype of the Udmurt/Russian Code-Mixing Corpus on the basis of an Online Annotated Corpus of Udmurt Language (), created in 2014.

I have discussed different approaches to studying code-mixing and various classifications of code-mixing by different scholars, eventually choosing the types that are more generally accepted, namely insertion, alternation and congruent lexicalization. I have analysed most of the constraints that have been proposed for code-mixing in the last 65 years, both the ones that are claimed to be universal and the language-specific ones. I have described a way to annotate multilingual texts so as to ease the verification of the equivalence constraint, the government constraint and the free-morpheme constraint, as well as some of the language-specific constraints, although their implementation depends on each language pair.

I have tried to create the most flexible rules for annotation, so that they could be adapted for various language pairs. Although the most traditional theories were preferred throughout designing these methods, my main goal is to find the balance between the existing theories and what can be done automatically in order to create the best functional system possible.

This work is, hopefully, the first step towards creating a whole set of corpora based on such rules, which would increase the speed and accuracy of research on code-mixing, help check the existing theories and offer new ones, and give an opportunity to work with more specific examples and conduct more subtle research.

Creating a set of corpora with both morphological and code-mixing annotation has the potential to give a great impetus to typological studies of this phenomenon, as a result of significantly easier access to data for analysis. It will make it possible to move forward in answering questions regularly raised by linguists researching the reasons for code-mixing, such as whether there are some sorts of constituents in discourse which can be switched and others which cannot, whether there are some constituents which tend to be switched into one language rather than the other, in what ways incorporated items combine with the rest of the discourse, and many others. I have outlined some possible characteristics of such a set and built the first automatically annotated corpus of Udmurt/Russian, which has both morphological and code-mixing annotation. All files related to it, including the formatted dictionary, grammatical paradigms and other links, can be found at: https://github.com/masha-medvedeva/UdmurtRussianCorpus

Acknowledgments

Special thanks to my supervisor Timofey Arkhangelskiy for helping me with this work every step of the way and to Nikolay Vakhtin for inspiring us both to work on this topic. Huge thanks to Michael Daniel for supplying me with endless materials on bilingualism and code-mixing. I am endlessly grateful to all our Udmurt informants, who helped us search for texts when we first started working on the Udmurt Corpus and supported our project with the greatest enthusiasm, and to everyone who has been so kind as to provide their feedback.

References

1. Alvarez-Caccamo, Celso (1998). From “switching code” to “code-switching”: Towards a reconceptualisation of communicative codes. In P. Auer (ed.), Code-switching in conversation: Language, interaction and identity, pp. 29-50. London and New York: Routledge.

2. Arkhangelskiy, Timofey (2012). Electronic Corpora of the Albanian, Kalmyk, Lezgian, and Ossetic Languages // Automatic Documentation and Mathematical Linguistics, Vol. 46, No. 2, pp. 118-123. Allerton Press.

3. Auer, Peter (1984). Bilingual conversation. Amsterdam and Philadelphia: John Benjamins.

4. Auer, Peter (1988). A conversation analytic approach to code-switching and transfer. In M. Heller (ed.), Codeswitching: Anthropological and sociolinguistic perspectives, pp. 187-214. Berlin and New York: Mouton de Gruyter.

5. Auer, Peter (1995). The pragmatics of code-switching: A sequential approach. In L. Milroy and P. Muysken (eds.), One speaker, two languages: Cross-disciplinary perspectives on code-switching, pp. 115-135. Cambridge, UK and New York: Cambridge University Press.

6. Auer, Peter (ed.) (1998). Code-switching in conversation: Language, interaction and identity. London and New York: Routledge.

7. Auer, Peter (1999). From codeswitching via language mixing to fused lects: Toward a dynamic typology of bilingual speech. International Journal of Bilingualism, 3 (4), 309-332.

8. Auer, Peter (2000). Why should we and how can we determine the “base language” of a bilingual conversation? Estudios de Sociolingüística, 1 (1), 129-144.

9. Auer, Peter (2005). A postscript: Code-switching and social identity. Journal of Pragmatics. Special Issue: Conversational Code-Switching, 37 (3), 403-410.

10. Backus, Ad (1992). Patterns of language mixing: A study in Turkish-Dutch bilingualism. Wiesbaden: Harrassowitz.

11. Backus, Ad (2003). Units in code switching: Evidence for multimorphemic elements in the lexicon. Linguistics, 41 (1), 83-132.

12. Backus, Ad (2005). Codeswitching and language change: One thing leads to another? International Journal of Bilingualism, 9 (3-4), 307-340.

13. Bentahila, A. (1983a). Language attitudes among Arabic-French bilinguals in Morocco, Clevedon, Avon: Multilingual Matters

14. Bentahila, A. (1983b). Motivations for code-switching among Arabic-French bilinguals in Morocco. Language and Communication 3: 233-43

15. Bentahila, A., and Davies, Eileen D. (1983). The syntax of Arabic-French code-switching. Lingua 59: 301-30

16. Bentahila, A., and Davies, Eileen D. (1991). Constraints on code-switching: a look beyond grammar. In Papers for the Symposium on Code-Switching and Bilingual Studies: Theory, Significance and Perspective, Barcelona, pp. 396-404. Strasbourg: ESF

17. Berk-Seligson, S. (1986). Linguistic constraints on intra-sentential code-switching: a study of Spanish/Hebrew bilingualism. Language in Society 15: 313-48

18. Berruto, G. (2005). Italiano parlato e comunicazione mediata dal computer. In Hölker, K. and Maaß, Ch. (eds.), Aspetti dell'italiano parlato.

19. Clyne, Michael. (1967) Transference and Triggering. The Hague: Nijhoff.

20. Clyne, Michael. (1972) Perspectives on Language Contact. Melbourne: Hawthorn Press.

21. Clyne, Michael. (1980) Triggering and language processing. Canadian Journal of Psychology. 34: 400-6.

22. Clyne, Michael (1987). Constraints on code switching: How universal are they? Linguistics, 25 (4), 739-764.

23. Clyne, Michael G (2003). Dynamics of language contact: English and immigrant languages. Cambridge, UK and New York: Cambridge University Press.

24. Di Sciullo, A., Muysken, P., and Singh, R. (1986). Government and code-mixing. Journal of Linguistics 22: 1-24

25. Dorleijn, Margreet and Jacomine Nortier. 2009. Code-switching and the internet. In Barbara Bullock and Almeida Jacqueline Toribio (eds.) 2009. The Cambridge handbook of linguistic code-switching. 127-141. New York: Cambridge University Press.

26. Eliasson, Stig (1989). English-Maori language contact: Code-switching and the free-morpheme constraint. Reports from Uppsala University Department of Linguistics, 18, 1-28.

27. Fano, R. M. (1950) The information theory point of view in speech communication. Journal of the Acoustical Society of America 22.6, 1950

28. Finlayson, R., Calteaux, K., & Myers-Scotton, C. (1998). Orderly mixing and accommodation in South African codeswitching. Journal of Sociolinguistics, 2(3).

29. Gardner-Chloros, Penelope (1991). Language selection and switching in Strasbourg. Oxford and New York: Oxford University Press.

30. Golovko 2001 - Головко Е.В. Переключение кодов или новый код? // Европейский университет в Санкт-Петербурге. Труды факультета этнологии. Вып.1, СПб., 2001. - С.298 - 316.

31. Grosjean, F. (1982), Life with two languages: an introduction to bilingualism, Cambridge, Harvard University Press.

32. Gullberg, M., Indefrey, P., and Muysken, P. (2009). Research techniques for the study of code-switching. In Bullock, B. E. and Toribio, A. J. (eds.), The Cambridge Handbook of Linguistic Code-Switching. Cambridge University Press.

33. Gumperz, John J. (1962). Types of linguistic communities. Anthropological Linguistics 4.1, 28-40.

34. Gumperz, John J. (1964). Hindi-Punjabi code-switching in Delhi. In H. Lunt (ed.), Proceedings of the Ninth International Congress of Linguistics, Cambridge, Massachusetts, 1962, pp. 1115-1124. The Hague: Mouton.

35. Gumperz, J. J. (1982). Discourse strategies. Cambridge: Cambridge University Press.

36. Haugen, Einar (1950a). The analysis of linguistic borrowing. Language 26.2, 210-231.

37. Haugen, Einar (1950b). Problems of bilingualism. Lingua 2.3, 271-290.

38. Heller, Monica (ed.) (1988). Codeswitching: Anthropological and sociolinguistic perspectives. Berlin, New York and Amsterdam: Mouton de Gruyter.

39. Heller, M. (1992). The politics of code-switching and language choice. Journal of Multilingual and Multicultural Development 13, 123-42.

40. van Hout, Roeland and Muysken, Pieter (1994). Modelling lexical borrowability. Language Variation and Change 6.

41. Jakobson, Roman, C. Gunnar M. Fant, and Morris Halle (1952). Preliminaries to speech analysis: The distinctive features and their correlates. Cambridge, Mass.: The M.I.T. Press.

42. Jakobson, Roman (1961). Linguistics and communication theory. In Roman Jakobson (ed.), On the structure of language and its mathematical aspects. Proceedings of the XIIth Symposium of Applied Mathematics [New York, 14-15 April 1960], pp. 245-252. Providence, R.I.: American Mathematical Society.

43. Joshi, Aravind K. (1985a). How much context-sensitivity is necessary for assigning structural descriptions? Tree adjoining grammars. In D. R. Dowty, L. Karttunen and A. M. Zwicky (eds.), Natural language parsing: Psychological, computational, and theoretical perspectives, pp. 206-250. Cambridge, UK and New York: Cambridge University Press.

44. Joshi, Aravind K. (1985b). Processing of sentences with intrasentential code switching. In D.R. Dowty, L. Karttunen and A.M. Zwicky (eds.), Natural language parsing: Psychological, computational, and theoretical perspectives, pp. 190-205. Cambridge, UK and New York: Cambridge University Press.

45. Kolers, Paul A. (1966). Reading and talking bilingually. American Journal of Psychology, 79 (3), 357-376.

46. Lehtinen M. K. T. (1966) An analysis of a Finnish-English bilingual corpus. Doctoral dissertation. Indiana University. Bloomington, 1966.

47. Lipski, J. M. (1978). Code-switching and bilingual competence. In M. Paradis (ed.), Fourth LACUS Forum, pp. 263-277. Columbia, S.C.: Hornbeam Press

48. MacSwan, Jeff (1999a). A minimalist approach to intrasentential code switching. New York: Garland.

49. MacSwan, Jeff (1999b). A minimalist approach to intrasentential code switching: Spanish-Nahuatl bilingualism in Central Mexico. London and New York: Routledge.

50. Maschler, Yael. (1998). On the transition from code-switching to a mixed code. In P. Auer (ed.), Code-Switching in Conversation. London: Routledge. 125-149.

51. Milroy, Lesley (1980). Language and social networks. Baltimore: University Park Press.

52. Milroy, L., & Muysken, P. (Eds.). (1994). One speaker, two languages: Cross-disciplinary perspectives on code-switching. Cambridge, England: Cambridge University Press.

53. Muysken, P. (2000). Bilingual speech: A typology of code-mixing. Cambridge: Cambridge University Press.

54. Muysken, Pieter (1981). Halfway between Quechua and Spanish: The case for relexification. In A. R. Highfield and A. Valdman (eds.), Historicity and variation in creole studies, pp. 52-78. Ann Arbor: Karoma.

55. Muysken, Pieter (1988). Media Lengua and linguistic theory. The Canadian Journal of Linguistics/La Revue canadienne de Linguistique, 33 (4), 409-422.

56. Muysken, Pieter (1996). Media Lengua. In S. G. Thomason (ed.), Contact languages: A wider perspective, pp. 365-426. Amsterdam and Philadelphia: John Benjamins.

57. Muysken, Pieter (2000). Bilingual speech: A typology of code-mixing. Cambridge, UK and New York: Cambridge University Press.

58. Muysken, Pieter (2005). Two languages in two countries: The use of Spanish and Quechua in songs and poems from Peru and Ecuador. In G. Delgado and J. M. Schechter (eds.), Quechua verbal artistry: The inscription of Andean voices [Arte expresivo Quechua: la inscripción de voces andinas], pp. 35-60. Bonn: Bonner Amerikanistische Studien.

59. Muysken, Pieter; Kook, Hetty and Vedder, Paul (1996). Papiamento/Dutch code-switching in bilingual parent-child reading. Applied Psycholinguistics, 17 (4), 485-505.

60. Myers-Scotton, C. (1988). Code-switching and types of multilingual communities. In Language Spread and Language Policy, P. Lowenberg (ed.), pp. 61-82. Washington, D.C.: Georgetown University Press

61. Myers-Scotton, C. (1989). Code-Switching with English: Types of switching, types of communities. World Englishes, 8:333-46

62. Myers-Scotton, C. (1993). Duelling Languages: Grammatical structure in codeswitching. Oxford: Clarendon Press.

63. Myers-Scotton, C., Jake, Janice L. and Okasha, M. (1996). Arabic and constraints on codeswitching. In Perspectives on Arabic Linguistics IX, Mushira Eid and Dilworth Parkinson (eds.), pp. 9-43. Amsterdam: Benjamins

64. Nait M'Barek, M., and Sankoff, D. (1988). Le discours mixte arabe/français: emprunts ou alternances de langue? Canadian Journal of Linguistics 33(2): 143-154

65. Nortier, J. (1989). Dutch and Moroccan Arabic in contact: code-switching among Moroccans in the Netherlands. Unpublished Ph.D. thesis, University of Amsterdam

66. Nortier, J. (1990a). Dutch-Moroccan Arabic Code-Switching among Moroccans in the Netherlands. Dordrecht: Foris

67. Nortier, J. (1990b). Code-switching and borrowing. Paper presented at the Workshop on Ethnic Minority Languages, Gilze-Rijen

68. Nortier, J. (1995). Code-switching in Moroccan Arabic/Dutch versus Moroccan Arabic/French language contact, International Journal of the Sociology of Language, Vol. 112: 81-95

69. Nortier, J., and Schatz, H. (1992). From one-word switch to loan: a comparison of between-language pairs. Multilingua 11: 173-94

70. Pfaff, Carol W. (1976). Functional and structural constraints on syntactic variation in code-switching. In Papers from the Parasession on Diachronic Syntax, B. Steever et al. (eds.), pp. 248-59. Chicago: Chicago Linguistic Society

71. Pfaff, Carol W. (1979). Constraints on language mixing. Language 55: 291-318

72. Poplack, S. (1980). Sometimes I'll start a sentence in Spanish y termino en español. Linguistics 18: 581-618

73. Poplack, S. (1981). Syntactic structure and social function. In Latino language and communicative behavior, R. P. Duran (ed.), pp. 169-84. Norwood, N.J.: Ablex

75. Said, J. (1988). Codemixing and multilingual competence in Morocco. Paper presented at the Second Dutch-Moroccan Symposium, Leiden-Amsterdam, April.

76. Sankoff, D., and Poplack, S. (1981). A formal grammar for code-switching. Papers in Linguistics: International Journal of Human Communication 14(1): 3-45

77. Suihkonen, Pirkko (1998). Documentation of the Computer Corpora of Uralic Languages at the University of Helsinki. Technical Reports TR-2. Department of General Linguistics, University of Helsinki, 1998.

78. Timm, L. (1975). Spanish-English code-switching: el porqué y how-not-to. Romance Philology 28: 473-482

79. Treffers-Daller, Jeanine (1994). Mixing two languages: French-Dutch contact in a comparative perspective. Berlin: Mouton de Gruyter.

80. Trudgill, P. (1986). Dialects in contact. Oxford: Blackwell

81. Sridhar, S.N., Sridhar, K. K. (1980). The syntax and psycholinguistics of bilingual code-mixing. In Studies in the Linguistic Sciences. 10, 203-215.

82. Swigart, Leigh. (1992) Two codes or one? The insiders' view and the description of codeswitching in Dakar. In Eastman 1992, 83-102

83. Vogt, Hans. (1954) Language contacts. Word 10.2-3, 365-374

84. Vakhtin, Golovko (2004) - Н. Б. Вахтин, Е.В. Головко. Социолингвистика и социология языка. (СПб., 2004. - 336 c.)

85. Wentz, James and McClure, Erica (1977). Aspects of the syntax of the code-switched discourse of bilingual children. In F. Ingemann (ed.), 1975 Mid-America Linguistics Conference papers, Lawrence, KS: University of Kansas.

86. Winkler, E. (2001). Udmurt. Languages of the World/Materials 212. München: LINCOM EUROPA.

87. Woolford, Ellen (1983). Bilingual code-switching and syntactic theory. Linguistic Inquiry, 14 (3), 520-536.

88. Алатырев (1983) Краткий грамматический очерк удмуртского языка. Ижевск, 1983

89. Бутолина (1942) Русско-удмуртский словарь. Ижевск, 1942. Перевощиков (1962) Грамматика современного удмуртского языка. Фонетика и морфология. Ижевск, 1962.

90. Кириллова (2008) Удмуртско-русский словарь. Ижевск, 1962

91. The Corpus of Udmurt Language - http://web-corpora.net/UdmurtCorpus

92. Eastern Armenian National Corpus (EANC) - http://www.eanc.net/
