Hi, I'm Taylor and welcome to Crash Course Linguistics! According to the word count feature in a document, a word is a thing with spaces around it. That’s a useful definition if we're just trying to figure out how long an essay should be, but it's not a very good guide to defining what “counts” as a word. For example, "doghouse" is generally written without a space, while "rabbit hole" is written with one. But they feel like they should both be words. After all, sometimes people write "dog house" with the space, and we could totally start writing "rabbit-hole" or even "rabbithole" completely smushed together. Also, just because a word like “hangry” isn’t in your dictionary doesn’t mean it’s not a word, or that I’m not feeling it right now. Man, I should’ve eaten a snack before this shoot. Anyway, today, we’re going to talk about how a linguist would answer the question, “What even is a word?”

[THEME MUSIC]

To a linguist, the word "word" has a big meaning and a small meaning. The big meaning of "word" is what we expect when we're looking something up in a dictionary. We'd expect to find a dictionary entry for "rabbit hole" because this phrase has a meaning that we can't figure out from the definitions of its individual parts. After looking up “rabbit” and “hole,” we wouldn't necessarily guess that “rabbit hole” means a place where a rabbit lives, or a complicated or absorbing situation like finding yourself down a Wikipedia rabbit hole at 2am after Googling what languages are spoken on the International Space Station. Its meaning is relatively unpredictable from its parts.

Dictionary-makers define one entry or unit as the largest unpredictable combination of form and meaning. They call each of these units lexemes or lexical items, because they're the parts of a lexicon, which is another word for dictionary. In contrast, we wouldn't expect to find a dictionary entry for "deep hole" because if we look up “deep” and “hole,” we can figure out the meaning of the two combined.
It's predictable. So "deep" and "hole" are both lexemes, while “deep hole” is not.

When we think about a phrase like "falling down rabbit holes", this is where the small meaning of “word” comes in. Here, we can break the phrase into parts: fall, -ing, down, rabbit, hole, and -s. Even though we don't say "-ing" or "-s" by themselves, they have distinct meanings. For example, -s indicates that there's more than one rabbit hole, and we can predict this from the meaning of "rabbit hole" and "-s" together. But we can't separate "rabbit" into rabb and it, even though "it" is a word, because “rabb” doesn’t mean anything on its own. "Rabb" and "it" don't each have their own meanings that they're contributing to "rabbit". The meaning of "rabbit" is unpredictable. Rabbit and -s are examples of the smallest unpredictable combinations of form and meaning. Linguists call these units morphemes, and the study of them is morphology. That's morph as in “metamorphosis” or “Animorphs.” It’s from a Greek word meaning shape or form, because morphemes can stick to each other to change the shape of a word.

One reason it’s helpful to divide language into morphemes is that it helps us see patterns across languages. A separate word in one language might be a part of a word in another language. For example, the phrase “I washed my feet” is a sentence with several words in Mandarin. The same idea is a single word with many morphemes in Murrinhpatha and lots of other Australian languages. If we just think of words, rather than the morphemes that build words, we miss this and a lot of other interesting potential patterns. If we look at morphemes instead, we can see differences and similarities between languages in the information they convey, not just the number of words they use!

There are a couple different kinds of relationships that morphemes can have with each other. When we have a morpheme that can stand by itself, that's a free morpheme, like "rabbit" or "hole."
When we have two or more free morphemes combined together, that's a compound, whether it's written with a space, a hyphen, or all joined together, such as doghouse, rabbit hole or even rabbit hole fence sign. In American Sign Language, there are signs like “teacher” and “student” that are compounds, composed of “teach” and “learn” plus a variant of the sign “person”.

Recognizing compounds allows us to see similarities between languages that we might have missed. In other languages, nouns might be linked by other words, like “the sign of the fence of the hole of the rabbit,” but English and German just put them all together into long compound nouns. The only difference is that English keeps spaces when writing long strings of nouns, while German doesn't write the spaces. So while it looks like English and German have very different ways of creating words, they actually often use the same compound nouns! Perhaps we could call this the Deutschewörterübersetzungsproblem or “Word in German translation problem”.

Meanwhile, when we have a morpheme that can’t stand by itself, like the “-s” in “rabbits,” that’s a bound morpheme. Let's head over to the Thought Bubble to see more about how morphemes fit together.

We can visualize morphemes as fitting together like the parts of a plant. In this metaphor, the most central part of a word is the root, and the other morphemes that are stuck (or fixed) onto it are affixes. So “rabbits” is made from the root “rabbit” and the affix “-s.” Since the “-s” affix in “rabbits” comes after the root, we call it a suffix. If a word has an affix stuck on before the root, it’s called a prefix. To extend our plant metaphor, when we add a morpheme to a root, this new unit becomes the stem for the next morpheme.

And here’s where it gets interesting: We can also have a word with several affixes at once, like untwistable, which has the prefix "un-," the root "twist," and the suffix "-able."
It sounds simple enough, but this word's meaning depends on whether "untwist" is a stem for "-able" or whether "twistable" is a stem for "un-." It could mean: able to be untwisted. That's untwist plus able. Or, it could mean: not able to be twisted. That's un plus twistable.

Not every word with multiple affixes has more than one meaning, though. It all depends on how the word builds. At each stage, the stem has to work as a word by itself. So untwistable is ambiguous because "untwist" is a word but "twistable" is also a word. In contrast, with a word like "un-rabbit-y", rabbit-y is a word, but "un-rabbit"? That's not a word, so un-rabbit-y only has one meaning. Rabbit, rabbity, unrabbity, unrabbitiness… This can go all the way up to lots and lots of affixes. That was the most thought-bubble-y of Thought Bubbles!

The root is often a free morpheme, like rabbit. But the root isn't always free. Think of words like: receive, deceive, perceive, and conceive. You can receive. And you can deceive. But can you just...ceive? It's the same part in all these words, but it doesn't have its own independent meaning. It's a bound morpheme, just like -s, but it's also the root. It’s one of a handful of examples of bound roots in English.

And in addition to prefixes and suffixes, there are some other kinds of affixes that can be attached to a root. Affixes can sometimes go inside a word. This is called an infix, and in English, it primarily happens with swear words or pseudo-swears: fan-hecking-tastic. And for the completionists out there, there are also circumfixes, which have information attached to both the beginning and end of a word. English doesn't really do circumfixes, but Malay has eight different ones. The meaning of the word changes only with the addition of both parts of the circumfix.

So far, morphology is looking very neat and defined: you can make words by stacking morphemes on roots to make longer stems. But morphology isn’t always neat little packages of affixes.
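
For anyone reading along who likes to see things spelled out, here is a rough Python sketch of the Thought Bubble's rule that, at each stage, the stem has to work as a word by itself. It is only an illustration: the function name `derivations`, the tiny `WORDS` set, and the bracket notation are all made up for this example, and treating affixes as simple string concatenation is an assumption that happens to work for these particular words.

```python
# Hypothetical mini-lexicon: the stems this sketch accepts as "words".
WORDS = {
    "twist", "untwist", "twistable", "untwistable",
    "rabbit", "rabbity", "unrabbity",
}


def derivations(prefixes, root, suffixes, words):
    """List every order of attachment in which each stem is a word.

    prefixes and suffixes are given left to right, as they appear
    in the finished word.
    """
    found = []

    def attach(stem, bracket, pre, suf):
        if not pre and not suf:
            found.append(bracket)
            return
        if pre:
            # Try adding the prefix closest to the current stem.
            new = pre[-1] + stem
            if new in words:
                attach(new, f"[{pre[-1]}- {bracket}]", pre[:-1], suf)
        if suf:
            # Try adding the suffix closest to the current stem.
            new = stem + suf[0]
            if new in words:
                attach(new, f"[{bracket} -{suf[0]}]", pre, suf[1:])

    if root in words:
        attach(root, root, list(prefixes), list(suffixes))
    return found


for reading in derivations(["un"], "twist", ["able"], WORDS):
    print(reading)
print(derivations(["un"], "rabbit", ["y"], WORDS))
```

With this toy lexicon, "untwistable" comes back with two bracketings, one built on "untwist" and one built on "twistable", while "unrabbity" comes back with only one, because "unrabbit" is not in the set.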
Sometimes one affix can hold more than one piece of information. This is known as fusional morphology, because it’s hard to tease out how each morpheme relates to a specific part of the meaning. It’s all fused together. For example, as languages change over time, they often smush smaller words together, making free morphemes into bound morphemes. The English words "not", "none", "never", and "nothing" all contained "ne", the Old English word for "not". And "not" itself gets smushed together in Modern English into words like "didn't" or "dunno".

In French, when a word ends in -al, like animal or journal, the suffix -al indicates that it’s masculine and that it’s singular. You may not be familiar with the idea of words being masculine, but don't worry! For now, just focus on how this suffix tells us two things about the word. To make it plural, you need to change the whole ending into -aux, like "animaux" or "journaux", to indicate both of these things. There were once two suffixes, one for masculine and one for plural, which we can still kind of see in the spelling. But -aux is now simply pronounced “o” and indicates both.

To further broaden our idea of morphology, we should mention that there are ways of building meaning in words that go beyond adding affixes all in a single row. For example, some words in English change their vowels instead of adding an affix, such as foot and feet or sing, sang and sung. In Arabic, Hebrew, and other Semitic languages, the root of a word is just the consonants, and then vowels are added in different configurations to create different related words. For example, this Arabic root means things having to do with books or writing, and from it we get "kitaab" meaning "book", "kutub" meaning "books", "kaatib" meaning "writer", "maktab" meaning “office", and more.

In American Sign Language, nouns and their related verbs sometimes have the same handshape and location, but different movement.
For example: “chair” and “sit.”

And occasionally, a language will change the word completely, rather than adding a morpheme. Think about the English verb ‘go’, which is ‘went’ in the past tense, rather than "goed", which would follow the regular patterns of English morphology. This process of completely replacing a word is called suppletion, and languages mostly use it with a handful of common words rather than as a systematic process. Thank goodness for that! If you thought conjugating verbs with different suffixes was hard, imagine having to learn a completely different word each time!

So, to get back to this tricky question of what a word is... linguists don't really know, and that's actually fine. There are so many edge cases and exceptions about the word "word" that when linguists need to be really precise, we use completely different terminology instead. We talk about morphemes. But when we're not zoomed in quite so closely, it's still totally okay to talk in terms of words. Like when we're talking about combining words into longer phrases and sentences, like in our next video! See you next time!

Thanks for watching this episode of Crash Course Linguistics, which is produced by Complexly & PBS.

So 2020 has been... bad. PBS has a new show called Self-Evident that explores how we've been persevering in this supremely weird year. It's hosted by historian Danielle Bainbridge from Origin of Everything and therapist Ali Mattu, who you might know from The Psych Show. Because who better than a historian and a therapist to help guide us through ALL of this? Self-Evident is part of PBS American Portrait, a massive storytelling project involving thousands of people around the country. Subscribe to PBS Voices for Self-Evident and other great shows, and tell them Crash Course sent you.