The Sami languages


Tor sent me the following e-mail this morning:


We were just chatting about languages and stuff over lunch, and a friend of mine said he would like to know where the sami language connects on the tree of languages. Could I possibly talk you into writing an article about that?


I began writing an e-mail back explaining that, while I could write an article, I'd feel like a bit of a fraud since what I know myself could be summed up in one sentence – and all the rest of the information would come directly from Wikipedia. I was going to suggest that Tor refer his friend to the Wikipedia articles on the Sami language family and the wider Finno-Ugric language family.

However, I then started writing an "in a nutshell" summary, which ended up being long enough for an article after all (although most of the information about the languages and the families into which they're classified does still come from Wikipedia). I then realised I ought to include a short explanation of what language families actually are and what it means to say that two languages are related. This doesn't all come from Wikipedia. So here, despite my original intentions, is an actual article about Sami and its linguistic relations.

What is Sami?

Sami is not, as is commonly supposed, a single language, but a family of 11* known languages. By far the largest (in terms of speakers) is Northern Sami at 20,700. The next largest (Lule Sami) has 2000. Three of them have fewer than 20 speakers and two are extinct.

*Be aware that the figures come from Ethnologue, which is a notorious "splitter" when it comes to questions of whether two speech varieties are separate languages or dialects of one language (e.g. German is apparently 19 different languages). So it may be that some linguists would group some of the 11 together and give a smaller number.

Historical and comparative linguistics

Before saying any more about language family trees, I should probably talk a little bit about what they mean. Linguists talk about "genetic relationships" between languages, but this has nothing to do with DNA. It means that the languages are descended from a single ancestor language (as Norwegian, Swedish, Danish, Icelandic and Faroese are from Old Norse), but not that their speakers are descended from a common human ancestor. Of course, DNA and languages usually go together, but the link can be broken (e.g. by the language of a conquering group becoming dominant in a region and replacing the language of the conquered, as in South America).

When the ancestor language isn't attested (known from historical written records), a hypothetical approximation of it can often be reconstructed using similarities between the more recent attested languages. This is known as a proto-language. Proving (or trying to prove) that two languages are related usually involves identifying correspondences and reconstructing a proto-language, which might look very different from any of the attested languages.

The tree model assumes that at some point in time, one older language split into two or more newer languages. This allows you to produce something that looks very like a human family tree, and we refer to parents, daughters and sisters (apparently in this metaphor languages are all female and procreate by parthenogenesis). This idea is somewhat artificial as language change is always gradual, and unless one group of speakers sends a colony across the ocean and doesn't hear from it again for centuries, the two sister languages will continue to influence each other all the time that they're supposedly splitting. The tree model works well enough if you're thinking in very long time periods, but things become extremely fuzzy if you look more closely. When talking about language families it's important to remember that the whole thing is an abstraction, and any number of linguists could each produce different results from the same set of hard data.

There is also a "wave" model of language change, which looks more like ripples spreading out from a place where a stone was dropped into a pond. Nonetheless, the tree model is the most commonly used.

Sami's ancestry and relations

The closest relative of the Sami family is the Baltic-Finnic family, which comprises Finnish, Estonian and some minority languages in Russia such as Karelian. These two families together make up the Finno-Samic or Finno-Lappic family. (Wikipedia uses the term "Finno-Lappic", but since "Lapp" is sometimes considered pejorative the former name would seem more politically correct.) More distantly, this is part of the Finno-Ugric family, which includes Hungarian (much to the chagrin of some Hungarian nationalists, who for some inscrutable reason spend inordinate amounts of time trying to prove that their language is related to Turkish instead of Finnish) and a bunch of languages spoken in Siberia. This family is in turn related to the Samoyedic language family, spoken even further north and east in Siberia.

All of the aforementioned families together make up the Uralic language family, named after the Ural mountains and totalling about 25 million speakers between its 37 languages. The Uralic languages are not related to the Indo-European languages (unless you believe in Nostratic, which not very many linguists do, or in Proto-World**, which no-one except crackpots does).

**Before anyone gets too excited about Proto-World, it's an inherently unprovable hypothesis. The further back in time you go to find the point of origin, the fewer correspondences you would expect to see between related languages (and thus the shakier any theories are likely to be). Beyond about 5,000-10,000 years, the level of expected correspondence becomes the same as what you'd expect to see by chance in two unrelated languages, so it's impossible to rely on those correspondences to establish a relationship.
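For readers who find a data structure easier to follow than a paragraph, the groupings described above can be sketched as a small tree in Python. This is only an illustration of the tree model, not a linguistic resource: the nesting is a simplification of the description in this article, only a handful of the languages are listed, and the names `URALIC`, `path_to` and `closest_common_family` are made up for the example.

```python
# Simplified tree based on the groupings described in this article.
# Each key is a family or a language; nodes with no listed members
# map to an empty dict. Many languages are deliberately left out.
URALIC = {
    "Uralic": {
        "Samoyedic": {},
        "Finno-Ugric": {
            "Hungarian": {},
            "Finno-Samic": {
                "Sami": {"Northern Sami": {}, "Lule Sami": {}},
                "Baltic-Finnic": {"Finnish": {}, "Estonian": {}, "Karelian": {}},
            },
        },
    }
}


def path_to(tree, name, trail=()):
    """Return the chain of nodes from the root down to `name`, or None."""
    for node, children in tree.items():
        here = trail + (node,)
        if node == name:
            return here
        found = path_to(children, name, here)
        if found:
            return found
    return None


def closest_common_family(tree, a, b):
    """Return the lowest node that contains both `a` and `b`, or None."""
    path_a, path_b = path_to(tree, a), path_to(tree, b)
    if path_a is None or path_b is None:
        return None  # at least one of the two is not in this tree
    common = None
    for x, y in zip(path_a, path_b):
        if x != y:
            break
        common = x
    return common


print(closest_common_family(URALIC, "Northern Sami", "Finnish"))    # Finno-Samic
print(closest_common_family(URALIC, "Northern Sami", "Hungarian"))  # Finno-Ugric
print(closest_common_family(URALIC, "Northern Sami", "Norwegian"))  # None
```

The deeper in the tree the shared node sits, the closer the relationship: Northern Sami and Finnish meet at Finno-Samic, Northern Sami and Hungarian only at Finno-Ugric, and Norwegian isn't in this tree at all. Bear in mind the caveat from earlier: a tree like this is an abstraction, and a different linguist might nest things differently.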


So now you know. Sami is not one language but nine (plus two extinct ones), and is related to Finnish, Estonian, and more distantly Hungarian, as well as some minority languages spoken in Russia, but not to Norwegian or English (as far as anyone has been able to prove). Now you can impress any Sami friends with your knowledge of their ancestral language. Just be prepared for a lot of hard work if you actually want to learn it, as you won't find many grammatical similarities with Norwegian or English to latch on to.


Lena,  28.05.10 16:14

Yay for articles on linguistics!

It is also worth mentioning that the different Sami languages are not mutually intelligible, at least not Northern Sami and Southern Sami, the two largest Sami languages in Norway. I think the two can be compared to Norwegian and German.

So if some languages can be considered to be completely unrelated, does that mean there is more than one language tree?

Camilla,  28.05.10 16:24

I agree with Lena: more of this please.

I am happy to say I knew the gist of this already, but I did learn some new stuff. Are the Hungarian nationalists having any luck in tying their language to Turkish? And is it an attempt to connect it to Altaic in general, or are they trying to make their own little group? And why? What is wrong with Finland? Or is there some indication of an overarching tree of Altaic and Finno-Ugric languages (I know they sound equally incomprehensible to me...).

I also didn't know Sami varied so widely. It is always hard to tell when people divide things into "languages" what that actually entails. I thought the difference would be less than that.

Camilla,  28.05.10 17:00

I think there should be a book about language history and how different languages are related somewhere on our bookshelf. I can't remember what it's called, but it has a white cover and colourful letters, I think. Hmm. It might be called Ordenes historie.

Tim,  29.05.10 00:58

@Lena: I should probably have mentioned that mutual (un)intelligibility is supposed to be the defining criterion for when two speech varieties are different languages or dialects of the same language. So if I read that a family consists of 11 "languages", I expect them to be mutually unintelligible; otherwise the number should be smaller. However, as I mentioned, Ethnologue is often quite generous about what it calls a "language" and so it may be that some of the languages are similar enough to be mutually intelligible with each other. But as I don't know any of the Sami languages, I can't say either way with any certainty. I'm sure, however, that there is more than one of them.

The other thing to bear in mind is that the designation of speech varieties as languages or dialects often gets muddled by politics. As the saying goes, "A language is a dialect with an army and a navy". Norwegian and Swedish are a perfect example, as their respective speakers readily understand each other – they should really be called one language, but they aren't because they're spoken in different countries. Also there can be partial mutual intelligibility (where speaker A understands about half of what speaker B is saying) and one-way intelligibility (e.g. Danes understanding Norwegians but not vice versa).

@Tor: What I really mean when I say two languages are completely unrelated is that nobody has managed to convincingly show that they are related. It's certainly possible that all languages are related to each other, but as I said near the end of the article, relationships going that far back in time are impossible to prove. I should probably qualify what I said there: it's not crackpottery to think that there was probably a single ancestor language, but it is to think you've managed to reconstruct it.

There is certainly more than one language tree: there are over 120 independent language families and almost as many language isolates (languages which do not appear to be related to any other language at all). Many of the 100-odd known isolates are now extinct, but by no means all are extinct or even endangered: Korean is an isolate, and Basque seems to be doing fine. Let the linguistics department know if you think you can reliably connect one supposedly independent tree, or a language isolate, to another tree.

@Camilla: I don't think many respectable linguists side with the Hungarian/Turkish hypothesis. I have no idea why they think the Turks are better relations than the Finns, unless it's simple incredulity at the distances involved. (Which are easily explained by a large Uralic area being mostly taken over by Indo-Europeans migrating westwards.) I don't know the details of their claims (indeed there are probably many different versions), but I get the impression they simply want to be in the Turkic family, as joining Turkic to Finno-Ugric in a larger family wouldn't do anything to undermine the Finno-Ugric grouping itself or dissociate them from the Finns. I came across one such person on a linguistics forum, and he was very against the Finno-Ugric hypothesis, so I don't imagine he'd've been happy with a Finno-Ugric-Turkic hypothesis.