This reading is the introductory chapter from a large book on indigenous languages of North America. The author, Marianne Mithun, is a well-known specialist of Native American languages.
The term genetic diversity, first on page 1, is being used in the specialized sense of historical linguistics. Related languages are said to be genetically related; this has nothing to do with biological genetics. Genetic diversity in linguistics simply means diversity of language families.
The terms structuralist tradition and generative tradition (first on pages 9-10) refer to theoretical models for understanding and interpreting language structure. Their details need not concern us in this course.
An argument (first on page 12) in this context is a grammatical term referring to nouns and pronouns and their relationship to various parts of a sentence. For example, in the sentence The flycatcher ate the damselfly, both flycatcher and damselfly are arguments of the verb ate.
What are some factors that contribute to the difficulty in knowing linguistic details of the languages in North America around the time of the first arrival of Europeans?
What is the difference between Edward Sapir’s and Joseph Greenberg’s classification of North American languages? How are they both different from modern classifications?
What is meant by the term polysynthesis? Is polysynthesis universal within the North American languages?
What are some general ways in which North American languages differ from each other (and from English)?
An Overview of North American Languages
One of the pervasive myths about North American culture groups among non-Indigenous Americans is that the cultures—and thereby the languages—form a cohesive group. A common belief is that there is a single “Native American culture,” maybe with minor variants from place to place. This notion could not be further from the truth. Throughout history there have been hundreds, perhaps thousands, of distinct cultural groups residing in North America, each with its own traditions, customs, and languages.
A map showing the locations of language groups in central North America prior to the arrival of the Europeans. Image from Mithun (1999:iii-iv). Used here by Fair Use guidelines for educational purposes.
Many modern Americans can name a small handful of indigenous North American languages. Among the most likely to be mentioned are Navajo, Ojibwe, Sioux, Iroquois, and Eskimo. Using the label language for these names is partially correct, but also misleading. In fact, the only member of this list that is a single language is Navajo.
Diagram of the internal anatomy of a spider with all parts labeled in Navajo. Image by Wikipedia user Seb az86556. Used here by a Creative Commons license.
Ojibwe (also spelled Ojibway or Ojibwa) is really a large group of about nine speech varieties whose intelligibility with one another varies by area. Much of the Ojibwe-speaking region forms one or more dialect chains (see Lesson 1). The English names for these varieties are sometimes quite distinct, as in Chippewa and Ottawa, while at other times they employ geographic designators, as in Central Ojibwa or Western Ojibwa.
A partial tree of the Algic language family, emphasizing the Algonquian branch. Data from Mithun (1999), image generated by MultiTree. Used here by Fair Use guidelines for educational purposes.
The remaining three names in the list above, Sioux, Iroquois, and Eskimo, are whole branches of different language families, or entire families themselves. There is no single Sioux language. Rather, Siouan is a large branch of the Siouan-Catawban family, which contains some 14 languages, including Dakota, Lakota, and Crow. (Lakota, Eastern Dakota, and Western Dakota are sometimes referred to collectively by the label “Sioux.”) Similarly, there is no single “Iroquois” language, but instead a nine-language family called Iroquoian. The Iroquoian family contains Mohawk and Cherokee. Eskimo is a ten-language branch of the Eskimo-Aleut family. The Eskimo branch includes at least five languages collectively referred to as Yupik, plus another five languages in the Inuit-Inupiaq sub-branch.
A dual-language sign in English and Cherokee. The Cherokee writing system was invented by Sequoya, a brilliant Cherokee innovator, in the early 1800s. This kind of script is called a syllabary. Unlike English spelling, where characters represent individual speech sounds, characters in a syllabary represent whole syllables, such as ha, he, hi, and so on. For a good overview of the Cherokee syllabary, check out its Wikipedia page. A downloadable Cherokee keyboard is also available. This image is in the public domain.
A three-language sign in French, English, and the Eskimo-Aleut language Inuktitut. Inuktitut, like Cherokee, uses a syllabary (but the symbols are entirely different from the Cherokee ones). Image by Wikipedia user VIGNERON. Used here by a Creative Commons license.
Significant here is the point that the common English labels used to refer to indigenous North American languages often obscure important differences between the languages themselves. This is compounded by the practice of naming languages according to the names for cultural and ethnic groups, which do not always line up with the names for languages in a one-to-one fashion. Many English names for languages and culture groups also reveal the historical practice of adopting inappropriate, inaccurate, and sometimes outright derogatory names for various languages and communities. Part of being a socially responsible linguist means using language names that are culturally relevant and that accurately reflect linguistic relatedness.
Non-Indigenous perspectives on North American languages have also been tainted by a historical legacy of racism. In the early days of European interaction with Native Americans, indigenous languages were often characterized as nonsensical and inferior to European ones. While there were genuine attempts by European missionaries to learn indigenous American languages with the aim of converting those speakers to Christianity, the impressions adopted by most colonists, and the ones relayed back to Europe, were of a negative and dismissive nature. The historian Bernard Bailyn, in discussing the views on Native American culture and language by early Europeans in the American colonies, describes the sentiment of the Dutch clergyman Jonas Michaëlius (born 1577) in the following audio excerpt.
All linguists today consider the views expressed by Michaëlius to be utter hogwash. It is taken as a matter of fact that all human languages are rich, full-fledged, creative, expressive, and complex systems. That Michaëlius could not make sense of the indigenous languages he encountered likely speaks to his own ineptitude and to the presence of speech sounds that simply were not used in European languages.
Because of the exceptionally large number of languages and language families represented in North America, it will be worthwhile spending a little time giving a brief overview of the language membership in Canada, the United States, and Mexico/Central America in order to highlight the rich language diversity on this continent.
Canada is home to around seventy-seven indigenous languages. The best-represented language families are Algic, Eyak-Athabaskan and Salish. Almost two-dozen languages from the large Algic language family are spoken in Canada. Algic contains around forty-two languages total, with a fairly even division between Canada and the United States. The two primary Canadian branches of Algic are Cree-Montagnais, which includes at least six regional variants of Cree, and Ojibwa-Potawatomi, which contains at least six regional variants of Ojibwe. Other Algic languages in Canada include Blackfoot, from the Algonquian branch, and Micmac, Abenaki, and Munsee from the Eastern Algonquian branch.
A photograph from 1916 of Blackfoot chief Mountain Chief in a recording session with ethnomusicologist Frances Densmore. This image is in the public domain.
The Eyak-Athabaskan family, with around forty-four languages altogether, has about seventeen in Canada, all in the Northern Athabaskan branch of the family. Specific languages include Beaver, Dogrib, Sekani, Slavey, Tahltan, and Tutchone.
Around ten languages from the Salish family are spoken in Canada, including Bella Coola, Halkomelem, Straits Salish, Squamish, Lillooet, and Okanagan. Other Canadian languages include Assiniboine and Stoney, in the Dakota branch of the Siouan-Catawban family; Cayuga, in the Iroquoian family; and Ditidaht and Haisla, in the Wakashan family. Canada is also home to several language isolates, including Tlingit, Tsimshian, Haida, and Kutenai.
Map of languages in Canada. Asher & Moseley (2007:41). Used here under Fair Use guidelines for educational purposes.
In the United States, the major language families are Eskimo-Aleut, Algic (and principally its Algonquian branch), Eyak-Athabaskan, Siouan-Catawban, Iroquoian, and Uto-Aztecan.
Most of the languages in the Eskimo-Aleut family are spoken in and around Alaska, including Aleut (which is a single language making up the entire Aleut branch of this family) and several varieties of Yupik. At least two Yupik languages, Naukan and Sirenik, are spoken across the Bering Sea in Russia. The Eskimo language Inuktitut is the native language of Greenland. The geographic spread of Eskimo-Aleut, from Eastern Russia all the way to Greenland, makes it one of the most widely dispersed language families, even though it contains fewer than a dozen individual languages.
The Plains region of the United States consists approximately of Montana, the Dakotas, Nebraska, Kansas, Oklahoma, Minnesota, Iowa, Wisconsin, Illinois, Michigan, and parts of Missouri, Texas, Colorado, and Idaho. This large area mainly contains languages from the Siouan, Algic, and Caddoan families. Of these, Siouan is the best represented in terms of number of languages and geographic spread. Siouan languages from this region include Assiniboine, Crow, Lakota and Dakota, Omaha, Osage, and Kansa (which is the source name for the US states Kansas and Arkansas).
The Central Plains and Great Plains occupy most of the interior of the continental US. Image from the United States Geological Survey (1980). Used here under Fair Use guidelines for educational purposes.
Other languages in the Plains region include Cheyenne, Arapaho, Ojibwe, Chippewa, and Miami, all in the Algic family; Pawnee and Wichita, in Caddoan; Kiowa in Eyak-Athabaskan, and the language isolate Tonkawa.
The Eastern region of the United States contains languages mostly from the Iroquoian, Algic, and Muskogean families; many of these were the languages first encountered by Europeans and were subsequently extinguished due to genocide and conformist pressure from English. The languages Tuscarora, Mohawk, and Cherokee are in the Iroquoian family, with Cherokee being the sole member of the Southern Iroquoian branch and now having over 10,000 first-language speakers in North Carolina and the greater Smoky Mountains. Listen to the following short sample of Cherokee words.
The languages Delaware (a collective term comprising Munsee and Unami) and Shawnee are both in the Algonquian branch of Algic. The Muskogean family contains languages almost exclusively in the Southeast; these include Creek, Alabama, Choctaw, and Chickasaw.
Marcus Amerman, a Choctaw artist, preparing food on a grill. This image is in the public domain.
The West Coast of the United States at one time had an extremely dense population of indigenous languages and was easily the most linguistically diverse region of North America. Counting the West Coast of Canada, there are well over twenty distinct language families from this area; there are over 50 individual languages from California alone. A handful of languages from this region includes Siuslaw, Coos, Yurok, Tillamook, Wiyot, and Wintu.
Languages on the West Coast of the United States at the time of European arrival. Image from Asher & Moseley (2007:38). Used here under Fair Use guidelines for educational purposes.
Many languages of the American Southwest are known to the American population in general from film and television, in particular those of the “Western” genre (which has contributed heavily to the stereotyping of native peoples from that region), as well as tourism connected to the Grand Canyon and Las Vegas. The two major language families represented there are Eyak-Athabaskan, which contains Navajo and Apache, and Uto-Aztecan. Uto-Aztecan is a large family spanning several countries, and includes Ute, Paiute, Shoshone, Comanche, Pima, Serrano, Mojave, and Hopi in the United States.
The geography of the land south of the Rio Grande sometimes takes on a variety of names, depending on which specific areas are being referred to. The term Central America includes the seven nations that link the North American and South American landmasses: Guatemala, Belize, Honduras, El Salvador, Nicaragua, Costa Rica, and Panama. It does not include Mexico. Mesoamerica is a term referring to a historical region that included all of modern Central America except for Panama, but also included the southern half of Mexico. Mesoamerica is known to have been a linguistic area, where linguistic features diffused (were transferred) between unrelated languages because their speakers were in close contact with one another over prolonged periods of time (hundreds or thousands of years).
Mexico, like the rest of North America, is incredibly diverse linguistically. There are at least a dozen language families represented there, the largest being Mayan, Uto-Aztecan, Otomanguean, and Mixe-Zoquean.
There are over thirty languages in the Mayan family, the principal one being Yucatec Maya with over 700,000 speakers. The Mayan languages have dominated the Yucatan Peninsula and surrounding area for at least 5000 years. The Maya civilization throughout this time was one of the most powerful in North America, developing major social systems with rich branches of politics, science, and architecture. In addition to Yucatec Maya, other modern Mayan languages include Chol, Tzotzil, K’iche’, and Mam.
The ancient peoples of Mesoamerica are one of the few groups known to have invented writing. (Most writing systems are not invented from scratch, but are based on pre-existing writing systems used by neighboring cultures.) Ancient Mayan writing is known to have been used since 300 BCE and continued into the 1600s and 1700s.
Four isolated words written in Mayan hieroglyphs. This type of writing system is called logographic. The drawings represent entire syllables; when several of them are put together in the same grouping, they form words. In most cases, the glyphs are not literal drawings of the things they signify; otherwise, it would only be possible to write nouns and perhaps some verbs. These glyphs represent all aspects of the spoken language, including grammatical morphemes. Glyph (a) contains four syllables, ta-ja-moʔ-ʔo, which is interpreted as the word tajamoʔ 'torch macaw,' a type of bird; glyph (b) contains four syllables, ti-ʔajaw-le-le, which is interpreted as ti ʔajawlel 'in lordship'; glyph (c) is a repeat of the same syllable, k’a-k’a, which is interpreted as the word k’ahk’ 'fire'; glyph (d) contains three syllables, ʔu-ts’i-‘i, which is interpreted as ʔu ts’i’ 'his dog.' Images from Mora-Marin (2010). Used here under Fair Use guidelines for educational purposes.
The Uto-Aztecan family, whose members from the modern United States were discussed earlier, is represented much more prominently in Mexico than anywhere else in North America. While fourteen Uto-Aztecan languages are from the United States, fully forty-six are from Mexico. Of these forty-six, twenty-eight are modern regional variations of Nahuatl. Classical Nahuatl was the language of the Aztec Empire, which dominated Central Mexico for several hundred years prior to the arrival of the Spanish in 1519.
It often surprises Americans to learn that there is a handful of English words of Nahuatl origin, sometimes having entered English via Spanish. The tl sequence is a give-away of Nahuatl-sourced words. The Nahuatl pronunciation of tl is very difficult for English and Spanish speakers, so non-Nahuatl speakers have modified that sequence so that it conforms to the sound pattern rules of their own languages. In English, the modification is to pronounce the t and the l in different syllables, like one would for the words bottle or little. This produces the English pronunciation of the animal name axolotl and the spear-throwing implement called an atlatl. Represented in the International Phonetic Alphabet, the Nahuatl pronunciations of these words would be [aːʃóːloːtɬ] and [aʔtɬatɬ], respectively. Even the name of the language, Nahuatl, falls into this category, and would be pronounced [nawaːtɬ] in that language. Other English words modified in similar ways from Nahuatl sources include chipotle, from [ʧilpoːktɬi]; avocado from [aːwakatɬ]; coyote from [cojotɬ]; jicama from [ʃikamatɬ], mesquite from [miskitɬ], ocelot from [oːseːloːtɬ], peyote from [pejoːtɬ], and tomato from [tomatɬ].
An early Aztec pyramid in the Acatitlan archeological zone; the name is a Nahuatl word meaning 'place among the reeds.' This image is in the public domain.
There are 177 languages in the Otomanguean family. All of these hail from Mexico, with one exception (Subtiaba, spoken in Nicaragua). The large number of languages from this family is taken up mostly by the fifty-two regional variants of Mixtec and the fifty-seven regional variants of Zapotec. Other major Otomanguean languages include Otomi, Chinantec, Chatino, Popoloca, and Mazatec.
The Mixe-Zoquean family, containing about seventeen languages, is another major group. This family is divided into two main branches: Mixe, which contains ten languages, and Zoquean, which contains seven. It is likely that in ancient times, Mixe-Zoquean languages played a more prominent role in the region than they do today. A significant archaeological artifact, La Mojarra Stela 1, is thought to be written in an early Mixe-Zoquean language.
A drawing of La Mojarra Stela 1. The text describes various rituals and activities of warfare leading to the establishment of the warrior-king Harvester Mountain Lord. Image from Justeson & Kaufman (1993). Used here under Fair Use guidelines for educational purposes.
Watch the following short animated film in the Yaqui language, which is a Uto-Aztecan language from Mexico and the southwestern United States.
The distribution company of the above video, 68 Voces, has a YouTube channel with a variety of short films in indigenous Mexican languages.
The languages of Mesoamerica at the time of European arrival. Image from Asher & Moseley (2007:56). Used here under Fair Use guidelines for educational purposes.
The geographic locations identified in this course for indigenous American languages reflect the traditional homelands of those speakers, prior to the forced relocation enacted by European Americans. Today, Native American languages, when used in a community setting (as opposed to university classrooms, for example), are often spoken on reservations, which are frequently far removed from the original homeland and always occupy a significantly smaller land area.
The contemporary distribution of indigenous languages of the United States and Mexico is significantly smaller than the original distribution. Image from Asher & Moseley (2007:41). Used here under Fair Use guidelines for educational purposes.
In many ways, it is artificial to use the boundaries of modern nations (i.e. Canada, United States, Mexico) to describe the locations of indigenous North American languages. Prior to the creation of those countries, speakers of the languages had their own systems for determining regional distribution. For most of history, there was no such thing as “the United States” or “Canada” or “Mexico.” It is common for non-indigenous people to attribute inherent validity or naturalness to boundaries demarcated by those of nations, but it is important to recognize that such boundaries are recent constructs imposed on indigenous communities and that there is nothing inherently “true” about them. This is important for us because although we may talk about the “languages of the southwest United States” differently from the “languages of Mexico,” we may in some cases be talking about the same languages.
Below is a video of a man speaking K’iche’ Maya.
For the remainder of this lesson, we will investigate three linguistic properties found in a variety of North American languages: rare speech sounds, polysynthesis, and evidentiality.
Rare Speech Sounds in North American Languages
Recall from Lesson 2 that we explored some unique speech sounds, clicks, found in the Khoisan languages of southern Africa. Then in Lesson 3, we looked at speech sounds from Afro-Asiatic languages produced in the far back of the oral cavity. Here again we have an opportunity to investigate some not-too-common speech sounds found in indigenous North American languages.
The category of speech sound that we will explore in this lesson is called glottalized consonants. Glottalized consonants are those that involve the glottis, as well as some other articulator in the oral cavity.
The image below shows a section of the vocal tract. The small circle in the throat represents a muscular organ called the glottis. The glottis is extremely important in human speech and is involved in producing speech sounds in every vocal human language. The glottis consists of two bands of loosely overlapping muscles called vocal folds (sometimes called vocal cords, though this is not their technical name). There is a small space between the vocal folds for air to pass through when it moves into or out of the lungs.
During swallowing, the glottis must be closed to prevent food from passing into the lungs. The vocal folds close completely and the epiglottis retracts in order to allow food and liquids to move into the esophagus. This is why breathing is not compatible with swallowing.
Image by Caleb Hicks. Used by permission.
When the vocal folds are kept lax, air coming up from the lungs causes them to vibrate. This vibration produces a voiced sound. When the vocal folds are tightened and held taut, air from the lungs is not able to vibrate them, which produces a voiceless sound. You can feel the difference in vibration between voiced and voiceless speech sounds by holding your hand to your throat and alternating between zzzzzzzzzz and ssssssssss. The z sound is voiced; you can feel the vibration of your vocal folds. The s sound is voiceless; you cannot feel any vibration.
All vocal languages make use of the voiced/voiceless distinction made possible by vibration of the vocal folds. Some languages do other things with the glottis, as well. Because all of these speech sounds involve manipulations of the glottis, they are collectively called glottalized speech sounds.
Ejectives are speech sounds that are very similar to regular consonants like k, t, and s. But ejectives sound like they are forced out of the oral cavity with a quick, sudden burst (hence the name ejective). In the International Phonetic Alphabet, ejectives are symbolically indicated by following the regular consonant symbol with an apostrophe: the regular k sound is written [k], but the ejective k sound is written [k’]. By convention, linguists write speech sounds by placing the symbols inside square brackets to indicate their status as speech sounds rather than regular letters. Below are some examples of ejective speech sounds from various North American languages.
The sound files are as follows. Huixtán Tzotzil, a Mayan language from Mexico: (1) [ʧ’eʧ’] 'he will pass by' (2) [ʧ’ul] 'it will dry up' (3) [ʧ’iʧ’] 'blood'; Shuswap, a Salishan language from British Columbia: (4) [p’iχm] 'to fry' (5) [t’upns] 'he wrings it' (6) [t’əkʷilx] 'doctor'; Montana Salish from Montana: (7) [ʧ’atəɬq] 'horsefly' (8) [sts’om] 'bone' (9) [laq’i] 'sweatbath.' Sound files from UCLA Phonetics Lab Archive (2007). Used here via a Creative Commons license.
The physical production of ejectives involves at least two primary mechanisms working in concert to produce coordinated articulatory gestures. Let’s take [k’] as an example. First, the various parts of the oral cavity (e.g. tongue, lips, and so on) have to make the regular consonant [k]: the back of the tongue presses up against the velum (also called the soft palate), creating a closure in the oral cavity. For a normal k sound, air coming up from the lungs builds up just a little bit behind that closure. When the tongue is moved away from the velum, the air passes through and produces the k sound. But for an ejective k, the vocal folds remain closed in addition to the closure at the velum, leaving two points of closure in the vocal tract. Then, the larynx (which contains the vocal folds and glottis) moves up, decreasing the volume between the closed glottis and the closed velum. Since the volume is decreasing between two closed points in the vocal tract, the air pressure increases dramatically. When the two closures open (simultaneously or nearly so), the built-up air pressure is released in a sudden, forceful burst. This burst produces the characteristic sound of an ejective. Note that the articulatory gestures described here for the production of ejectives occur over milliseconds and are not perceptible to people speaking normally.
The distinction between ejectives and non-ejective consonants can alter the meanings of words in some languages. For instance, in Shuswap, one of the languages sampled above, the word [p’iχm] contains an ejective p; this word means 'to fry.' But if a plain p is used, instead of an ejective, the result is [piχm], which means 'to hunt.'
Other types of glottalized speech sounds are possible, too, but they occur less frequently in North American languages. For example, implosives are produced by using air pressure within the oral cavity to draw air in, rather than let it out. Creaky voice is the name for unique vibrations of the vocal folds that cause vowels to sound bumpy or creaky.
The map below shows the worldwide distribution of some glottalized speech sounds, based on a sample of 541 languages. Blue circles, which predominate in the Americas, represent languages with ejectives, but without other glottalized speech sounds. The red circles represent languages with implosives, but without ejectives. Purple circles, of which there are few in North America, represent languages with both ejectives and implosives.
Map of glottalized speech sounds. Maddieson (2013). Used here by a Creative Commons license.
Leaving aside unique speech sounds, we turn now to another property found in many North American languages: polysynthesis. Recall from Lesson 7 that we characterized languages in the Sinitic branch of Sino-Tibetan as having isolating morphology, meaning that they tend to have very few morphemes per word; that is, they isolate their morphemes into single words, rather than having words containing multiple morphemes. Then again in Lesson 9, we noted that many languages in Southeast Asia are also strongly isolating.
In those lessons, we practiced calculating the synthetic index for a text in a given language by counting up the total number of morphemes (m) and the total number of words (w) in the text. Using the formula m/w, we arrived at a synthetic index, which represents the morpheme-to-word ratio for that text. The synthetic indexes we calculated for Sinitic and Southeast Asian languages were very close to 1 because those languages typically have one morpheme per word.
In this lesson, we’ll see that many North American languages are just the opposite: they have a very large number of morphemes per word. Such languages are termed polysynthetic.
The word polysynthetic needs some explanation. In common usage, synthetic has different meanings in different contexts. It can mean 'artificial' or 'human-made,' as in synthetic sweeteners or synthetic rubber, which can also have the connotation of 'fake' or 'not genuine.' There is also a category of musical instruments called synthesizers, which reproduce sounds electronically and allow the user to specify their combinations and sequences. At the heart of these various meanings is the notion of putting things together or combining things. That is the sense in which synthetic is used in linguistics: synthetic morphology combines multiple morphemes into a single word. When this is done to an extreme degree, the prefix poly-, meaning 'many,' is added.
Polysynthetic languages have calculable synthetic indexes just like isolating languages do. Using the same formula, m/w = si, we can determine the synthetic index for any language based on a representative text with a 3-line gloss. While isolating languages have synthetic indexes at or very close to 1, polysynthetic languages have higher indexes because they tend to have many more morphemes per word.
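The formula m/w = si can be written as a very small function. The following Python sketch is only an illustration: the function name and the morpheme and word counts are invented for the example and do not come from any real text.

```python
def synthetic_index(morphemes, words):
    """Return the synthetic index m/w for a text.

    morphemes: total number of morphemes (m) counted in the text
    words:     total number of words (w) counted in the text
    """
    if words <= 0:
        raise ValueError("a text must contain at least one word")
    if morphemes < words:
        # Every word contains at least one morpheme, so m can never be below w.
        raise ValueError("morpheme count cannot be smaller than word count")
    return morphemes / words


# Hypothetical counts for a strongly isolating text: index very close to 1.
print(synthetic_index(52, 50))   # 1.04

# Hypothetical counts for a polysynthetic text: a much higher index.
print(synthetic_index(180, 50))  # 3.6
```

The check that m is at least w reflects the counting rule used in this course: a word with no internal divisions still counts as one morpheme, so the index can never fall below 1.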
As an example, consider Dene, whose English name is Chipewyan, spoken in western Canada. Dene is a member of the Eyak-Athabaskan family and is distantly related to Navajo and Apache. The following word in Dene contains seven morphemes.
'I have given her to him.' (Mithun 1990:36)
Using the formula m/w = si, we would have 7/1 = 7. Based on that single word, the synthetic index of Dene would be 7. Remember that when calculating synthetic indexes, we are using the number of morphemes and words in the object language, not the number of morphemes and words in the English translation. Although the Dene construction above requires an entire sentence in English, it is genuinely a single word in Dene.
It is usually not very accurate to make a claim about a language’s synthetic index based only on a single word or a single sentence. All languages show variation in their morpheme-to-word ratios, so it is important to calculate synthetic indexes from texts containing as many words as possible. The larger the text, the more likely it is that the resulting synthetic index will represent that of the language as a whole.
Consider the following text from Musqueam, spoken in southwestern Canada. Musqueam, also called Halkomelem, is in the Salish language family, which contains about 26 languages; it is related to Bella Coola and Squamish. This Musqueam text is a narrative, which means it is a reproduction of a lengthy story or conversation by a native speaker of the language. Examine the text carefully and calculate its synthetic index. You do not need to worry about the pronunciations of non-English symbols or the meanings of the various grammatical morphemes (you only need to be able to count them). As you inspect the narrative, remember that words are separated by spaces and morphemes within a word are separated by dashes. A few of the symbols sometimes confuse beginning students into thinking they signal a new morpheme, but they really just represent speech sounds (for instance, the apostrophe [‘] means that the immediately previous sound is glottalized; the little triangles [ˑ] and [ː] mean that the immediately previous sound has a longer duration). Only a dash (-) signals a new morpheme within a single word. Recall also that words without multiple morphemes count as one morpheme.
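The counting rule just described (spaces separate words, dashes separate morphemes, and no other symbol marks a boundary) can be sketched in Python as follows. The sample line is an invented string of placeholder syllables, not Musqueam, and the sketch assumes that the dashes are typed as plain hyphens and that punctuation has already been removed.

```python
def count_words_and_morphemes(segmented_text):
    """Count words and morphemes in a line of segmented text.

    Words are separated by whitespace; morphemes inside a word are
    separated by hyphens. Apostrophes and length marks are ordinary
    characters here, so they never start a new morpheme.
    """
    words = segmented_text.split()
    morphemes = 0
    for word in words:
        # Ignore empty pieces that a stray leading or trailing hyphen would create.
        pieces = [piece for piece in word.split("-") if piece]
        morphemes += len(pieces)
    return len(words), morphemes


# Invented placeholder line (NOT real Musqueam), with a glottalization
# apostrophe and a length mark to show that they are not boundaries.
sample = "ka-ta-mi so pa’-liː-ta na"
words, morphemes = count_words_and_morphemes(sample)
print(words, morphemes, morphemes / words)  # 4 8 2.0
```

Counting by hand works the same way: tally the spaces to get the words, tally the dashes to get the extra morphemes, and divide.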
Practice Activity 1
Using the glosses below, can you figure out the synthetic index of Musqueam?
'Someone came to get me to go night hunting with him.
So we left and crossed over to Active Pass.
It had just become night when we got there.
Then we cooked our supper.
Then Gabe said, "We'll go this way.
It's where I always used to go."
Then we saw a mink.
Then he shot it.
Well, then we went off again.
Then he saw two raccoons.
So he shot them, too.
Then we were all aboard [the canoe].
Then we left and we were still leisurely paddling along when we heard something whistling.
When the whistling stopped, we heard something chopping "pow…pow…swoosh!"
Then Gabe said, "It's those little choppers.
We can't get anything.
We just heard a big Douglas-fir fall.
We'd better go back.
We'll go to the little island."
Then we got there.
Then he [Gabe] saw a deer.
And he shot it.
Then the deer rolled down right into the canoe so that he was looking right into its face.' (Suttles 2004:528-532)
3PL = third-person plural, 3POSS = third-person possessive suffix, 3TR = third-person transitive subject, ACT = activity suffix, ART = article, AUX = auxiliary verb, EST = “established” aspectual prefix, EXP = expectable particle, FUT = future particle, INTR = intransitive suffix, NOM = nominalizer, OBL = oblique particle, PAST = past particle, TR = transitive suffix.
As you can tell from the Musqueam text, some words have quite a lot of morphemes, such as wə-xʷə-k’ʷék’əc-ás-t-əm, which has six. Other words are monomorphemic. This is an important observation because it means that characterizing a language as polysynthetic is really a claim about what morpheme-to-word ratios are possible in the language, not about ratios that hold for every word. While a polysynthetic language may have words with six morphemes, it also has words with only one morpheme. In an isolating language, on the other hand, the average is close to one morpheme per word: a given word is very unlikely to have more than one morpheme and almost never has more than two.
Having spent some time dwelling on unique speech sounds and polysynthesis, we turn now to a semantic property shared by several North American languages. These languages have a grammatical requirement to include a morpheme that specifies the type of evidence a speaker has when making a claim. This property is called evidentiality.
Before we get into the details of evidentiality, consider the following dialogue in English:
A: What happened last night?
B: It rained.
When Speaker B responds, “It rained,” what does Speaker A know about the kind of evidence Speaker B has for claiming that it rained? Did B actually see the rain? Did B hear the raindrops on the roof? Did B infer that it rained based on circumstantial evidence, such as everything being wet? Is B just reporting what someone else said? English speakers cannot answer these questions from the short dialogue given because the type of evidence is not obligatorily encoded in English morphology. But in some languages, it is.
In Central Pomo, a nearly extinct language from California and a member of the small Pomoan language family, there are suffixes that specify a statement’s type of evidence, its evidentiality. Different suffixes are used for different types of evidence. In Central Pomo, the phrase čʰé mul- has the basic meaning 'it rained,' but this phrase is not grammatical all by itself. It needs an evidential suffix in order to satisfy the grammatical requirements of the language. The suffix -nme encodes auditory evidentiality, so čʰé mul-nme could be paraphrased 'it rained (I heard the drops falling).' The suffix -ʔka is an inferential morpheme, so čʰé mul-ʔka could be paraphrased 'it rained (I infer, since everything is wet).'
Rain in California. Image by Wikipedia user Brocken Inaglory. Used here under a Creative Commons license.
Of course, English speakers can encode evidentiality, too. But in English, it must be done by using new words to modify a sentence, rather than by adding morphemes whose only job is to encode evidentiality. For instance, sentences like Isabel is exhausted by the looks of her or Everybody says that Bojangles is amazing or It smells like someone farted each have evidential interpretations. Their evidentiality is accomplished phrasally, rather than morphemically. In English, furthermore, evidentiality is never grammatically required. In evidential languages, these kinds of morphemes are obligatory: if they are absent, the resulting sentence will be ungrammatical.
Languages that use evidential morphemes differ in the number of such morphemes they have and in the kinds of evidence they encode. Some languages have only two evidential morphemes, perhaps one that encodes direct knowledge, to be used when the speaker witnessed an event first-hand, and another that encodes secondary knowledge, to be used when the speaker has heard about the event from someone else or relies on circumstantial evidence when reporting it.
Eastern Pomo, a California language closely related to Central Pomo, has four evidential morphemes. The word pʰabekʰ- has the basic meaning 'it burned.' Adding the non-visual sensory evidence suffix, -ink’e, produces pʰabekʰ-ink’e 'it burned (I know because I felt it).' The direct knowledge suffix, -a, produces pʰabekʰ-a 'it burned, I saw it happen.' Adding the circumstantial suffix, -ine, produces pʰabekʰ-ine, 'it must have burned' or 'it burned, I imagine.' Finally, the hearsay suffix, -le, would be added to get pʰabekʰ-le 'it burned, they say,' or 'everyone says it burned.'
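The four-way contrast can be summarized as a small lookup table. The following Python sketch is only an illustration: the dictionary keys and the function name add_evidential are hypothetical labels introduced here, while the stem and the suffix forms are copied from the paragraph above. Raising an error for a missing or unknown evidence type is a rough way of modeling the idea that the evidential is obligatory.

```python
# Eastern Pomo evidential suffixes, as listed in the paragraph above.
# The English keys are labels chosen for this sketch only.
EVIDENTIAL_SUFFIXES = {
    "nonvisual_sensory": "-ink’e",
    "direct": "-a",
    "circumstantial": "-ine",
    "hearsay": "-le",
}


def add_evidential(stem: str, evidence: str) -> str:
    """Attach the suffix for the given evidence type to a bare verb stem."""
    if evidence not in EVIDENTIAL_SUFFIXES:
        raise ValueError(f"an evidential is required; unknown type: {evidence!r}")
    # Drop the trailing dash that marks the stem as incomplete.
    return stem.rstrip("-") + EVIDENTIAL_SUFFIXES[evidence]


# Print the four forms of the stem meaning 'it burned'.
for evidence in EVIDENTIAL_SUFFIXES:
    print(evidence, add_evidential("pʰabekʰ-", evidence))
```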
Practice Activity 2
As an exercise, take a look at the four Eastern Pomo sentences below. A three-line gloss is provided for each one, but the evidential suffixes are not specified. For each sentence, which evidential suffix would be used in place of the abbreviation EVID? View the answer in the Summary section of this page.
'(I judge from his behavior) that he is afraid of the dark.'
'(I can feel that) that mosquito is biting me repeatedly.'
'(According to the traditional stories,) Coyote Old Man was staying home in Big Valley, at Nonapoti.'
'I am going east (I know for a fact which way I am going).' (McLendon 2003:101-7)
3 = third person, AGENT = agentive subject, CONTROL = controlled action, EVID = evidential, LOC = locative, M = masculine, SG = singular
While most languages that treat evidentiality as a grammatical category encode that property using designated suffixes (like Central Pomo and Eastern Pomo), some languages have evidential morphemes that are separate words. One such language is Western Apache, which is one of six languages in the Apachean branch of the Eyak-Athabaskan family. In Western Apache, there are six evidential morphemes, but none of them is a suffix.
The evidential morpheme hiłs’ad covers any kind of evidence that is directly experienced but not seen, as in train hilwoł hiłs’ad 'I hear the train' or dinshniih hiłs’ad 'I am not feeling well.'
A moonrise on the San Carlos Apache Indian Reservation in Arizona. Image by Wikipedia user John Fowler. Used here under a Creative Commons license.
There are three distinct evidential morphemes which apply to inferred statements: lą̄ą̄ encodes inferred evidence as well as surprise (Shash isdzán oyinłshōōd lą̄ą̄ 'A bear dragged the woman', inferred from a description of wounds), golnīī encodes an inference about someone else’s mental state (Chaghą́shé doo ákū nádabini’ da golnīī 'I think the children do not want to go back there'), and nolįh encodes an inference based on visual evidence (Mízhaazhé míł na’iłbąąs nolį dak’eh ałdó’ áí 'Her daughter seems to drive her at times also', inferred by seeing the daughter’s car in transit).
The remaining two evidentials, ch’inīī and lę́k’eh, encode evidence quoted from another source or known through hearsay or traditional storytelling (Ma’ hanaazhį’ sitį̄į̄ ch’inīī/lę́k’eh 'Coyote was lying on the other side, it is said').
Notice that in Western Apache, unlike in the Pomoan languages described earlier, there is no evidential morpheme designated for direct visual evidence. Instead, visual evidence is the default or assumed reading of a sentence that otherwise lacks an evidential morpheme. In the sentence below, you can tell from the three-line gloss that no evidential morpheme is present. The typical interpretation of this sentence would be that the speaker saw the road and personally judged it to be sandy.
In fact, evidentiality in Western Apache is a bit more complicated than presented here. Unlike most languages with evidentiality, this category in Western Apache is optional rather than required. Its optional nature might be related to the fact that the evidential morphemes are separate words instead of suffixes. This means that the sentence below, 'The road is sandy,' does not necessarily mean that the speaker saw the road directly. It could simply be a statement that does not specify evidentiality and leaves the interpretation open. The fact that evidentiality in Western Apache is optional rather than required, and that it is signaled by full words rather than attached morphemes, highlights the diversity of ways that grammatical properties can be encoded in the world’s languages.
This lesson has been an overview of the indigenous languages of North America. We began the lesson with a broad tour across the continent, with emphasis on the languages from Canada, the United States, and Mexico. Languages of Greenland and of Central America were mentioned incidentally. The major language families we delineated included Algic, Eskimo-Aleut, Eyak-Athabaskan, Iroquoian, Mayan, Mixe-Zoquean, Muskogean, Otomanguean, Siouan-Catawban, and Uto-Aztecan (learning objective 1).
We noted that the history of indigenous North American languages following the arrival of the Europeans is fraught with racism, extermination, and ultimately, a severe decline in linguistic diversity. Maps of North America showing language distribution around the year 1500 and the contemporary distribution drive home the loss of languages on this continent (learning objective 2).
We characterized ejectives as a distinctive type of speech sound present in many North American languages and identified the apostrophe following a regular character as the International Phonetic Alphabet symbol for ejectives. The ejective sounds, produced with small bursts of air created by increased pressure in the oral cavity, are not exclusive to North America, but are especially common there (learning objective 3).
We returned to the notion of synthetic indexes that we had built up in previous lessons, this time observing that some North American languages have particularly high synthetic indexes because they can combine many morphemes into a single word. This trait contrasts sharply with the morphology of East and Southeast Asian languages, whose isolating tendencies permit very few morphemes per word (learning objective 4).
Finally, we explored the property of evidentiality as a strategy for encoding the source of information into the grammatical structure of a language. Languages with evidentiality tend to have morphemes designated for various types of evidence (e.g. visual, hearsay, inferential, and so on), and to require that these morphemes be attached to verbs any time a claim is made (learning objective 5).
1. The synthetic index of Musqueam, based on the narrative text provided, is about 1.84. There are 239 morphemes and 130 words. Using the formula m/w, we would have 239/130 ≈ 1.84.
2. The Eastern Pomo words would include evidential suffixes as follows:
afraid-CONTROL-EVID (circumstantial evidence)