Historical sources of the rhotic vowel and rhotic syllables in Chinese

Modern Standard Chinese syllables are composed of an initial (聲母) and a final (韻母). A tone can be realized on the final as a pitch contour.

One of these finals is rather peculiar: the rhotic “er”, or [ɚ] in IPA. This final can only be combined with a zero initial, i.e. no initial consonant, so it always appears as “er” by itself. Note that there are rhotic syllables (see below) in which the “er” vowel is preceded by a consonant initial, e.g. shùgēr 樹根兒 (“root of a tree”), but these are not basic morpheme-level syllables, since they result from a morphological process called rhoticization. In fact, such syllables are still written with a separate character for the suffix, as in the example above. On a purely morphemic level, therefore, the “er” vowel cannot be preceded by a consonant initial.

In terms of tone, “er” can only carry the 2nd, 3rd, and 4th tones, but not the 1st tone. Thus there are words like ér, ěr and èr, but no words like ēr. Here again, note that we are talking about basic morpheme-level units. As the example of shùgēr shows, it is possible to get a 1st tone syllable with the “er” vowel, but in such cases the 1st tone belongs to the preceding syllable, i.e. “gēn” here, before the morphological fusion and phonological reduction of the suffix “ér”.

Now let’s call this final “er” the rhotic vowel (i.e. 捲舌元音 juǎnshé yuányīn “the retroflex vowel” in Chinese). This vowel has two features: (1) no initial consonant allowed; (2) no 1st tone with this vowel. These features need to be explained.

One of the words with this rhotic vowel is 兒 ér. This word can be used as a suffix, with a diminutive meaning among many other uses. When the suffix is used, it obligatorily fuses with the previous syllable. For example: xiǎo.hái+ér > xiǎo.hár (“little kid”). In these cases, the rhotic vowel is reduced to a rhotic gesture of the tip of the tongue following the previous vowel. Let’s call such fused syllables rhotic syllables, or 兒化韻 érhuà yùn (“rhoticized rimes”) in Chinese. A rhotic syllable is thus still just one syllable, rather than two syllables with the -er as a separate syllable. To illustrate this point, xiǎo.hár would be realized as [ɕiao211.xaɻ35], rather than as [ɕiao211.xai3535].

If the preceding syllable has the coda -n, then the -n is deleted so that the two syllables can fuse. For example, shàngbān + ér > shàngbār (“go to work”). If the preceding syllable has the coda -ng, then the -ng is first realized as nasalization of the vowel, which can then fuse with the -er. For example: diànyǐng + ér > [tian51.ĩɻ214].
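The fusion rules just described can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function name `rhoticize`, the use of toneless romanized syllables, and the ad hoc `~` marker for a nasalized vowel are all hypothetical. The rule that drops a final -i is inferred from the example xiǎo.hái > xiǎo.hár rather than stated explicitly above. The output follows the spelling used in this article (hár, bār), not standard Hanyu Pinyin orthography.

```python
VOWELS = set("aeiouü")

def rhoticize(syllable: str) -> str:
    """Fuse the suffix -er onto a toneless romanized syllable (rough sketch)."""
    if syllable.endswith("ng"):
        # Coda -ng: the nasal survives only as nasalization of the vowel.
        # "~" is an ad hoc marker for that nasalization (assumption).
        return syllable[:-2] + "~r"
    if syllable.endswith("n"):
        # Coda -n: deleted before fusion, e.g. ban + er > bar.
        return syllable[:-1] + "r"
    if len(syllable) >= 2 and syllable[-1] == "i" and syllable[-2] in VOWELS:
        # Offglide -i: dropped, inferred from hai + er > har (assumption).
        return syllable[:-1] + "r"
    # Otherwise the rhotic gesture is simply added to the final vowel.
    return syllable + "r"

examples = {s: rhoticize(s) for s in ("hai", "ban", "gen", "ying", "hua")}
print(examples)
```

A fuller treatment would operate on phonetic segments and tones rather than on spelling; the string version is only meant to make the three conditions easy to compare.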

Such rhotic syllables puzzle many people, since they are largely absent from the southern dialects of China. A common misunderstanding is that they were influenced by, or even borrowed from, the Manchu language during the Qing Dynasty because the Manchu people could not speak Chinese correctly. Such a view does not reflect the true origins of these syllables. Quite a few scholars have studied the rhotic syllables in Chinese. Without comprehensively reviewing these studies here, I will go back to the Qieyun-Guangyun records of Middle Chinese, explore the historical sources of the rhotic vowel “er” and the rhotic syllables, and see whether they are the results of natural sound change.

First let’s look at the rhotic vowel “er”. One interesting thing about this vowel is that there are just a few words that can be formed with such a vowel. The most commonly used ones are listed below with their Middle Chinese initial and rime, and the reconstructed sounds in Wang Li’s system.

Character | MC Initial | MC Rime | Reconstructed pronunciation
兒 | 日 | 支A 開 | ȵʑĭe

It is clear that all these characters have the same initial 日 in Middle Chinese, and that they have very similar vowels as well. In each case the initial is followed by an [i] sound, although sometimes it is a short [i], as indicated by the breve in ĭ. In traditional terminology, they all belong to the 止攝開口三等 with an initial 日.

In Early Modern Chinese as recorded in the Zhongyuan Yinyun 《中原音韻》, these characters are listed in the rime 支思開, and according to Ning Jifu’s reconstruction the vowel here is [ï], which would later split into the apical [ɿ] and the retroflex [ʅ]. The initial 日 had evolved into a retroflex approximant [ɻ]. Thus all of these characters would have the same initial and final, i.e. [ɻï], although their tones are different.

Based on the above data, we can say that both ȵʑĭe and ȵʑi in Middle Chinese developed into ɻï in Early Modern Chinese and subsequently into [ɻʅ], and that this syllable was then further fused into the rhotic vowel [ɚ]. In this whole process, the crucial step is the development of the initials of the 章 category in Middle Chinese into retroflex initials in Early Modern Chinese. The initial 日 had the same place of articulation as the 章 initials, and they underwent similar changes. The retroflexization of the initial 日 is therefore part of an internal sound change from Middle Chinese to Early Modern Chinese. It should not be considered the result of outside influence from non-Sinitic languages such as Manchu. Furthermore, the retroflexization took place well before the Qing Dynasty, when Manchu was used by the nobility in Beijing before they acquired Beijing Mandarin.

To sum up the phonological change, these “er” words in Modern Standard Chinese originated from characters in the 止攝開口三等 with an initial 日 in Middle Chinese. Two separate developments are very important here. First, the general trend that brought about a set of retroflex initials also saw the initial 日 develop into a retroflex approximant in Early Modern Chinese. Second, the merger of the formerly different finals of these characters into [i] and then into [ï] created the uniform environment for them to develop further into the retroflex vowel. Then the whole syllable [ɻʅ] fused into the retroflex vowel [ɚ]. These changes have something in common: assimilation of place of articulation triggered by the retroflex initials.

We are now ready to explain the two features noted at the beginning of this article, repeated here: (1) no initial consonant allowed; (2) no 1st tone with this vowel. No initial consonant is allowed because the original initial consonant has already been “absorbed” into the rhotic vowel in the form of the retroflexion. There is no other historical source for any other initial consonant at all. As for the second feature, since the initial 日 is a voiced consonant, all ping-tone characters in Middle Chinese with this initial developed into the yángpíng tone, i.e. the 2nd tone. Because this was the only initial involved and there has never been a voiceless initial in these cases, there is no possible “ēr” word with the 1st tone in Modern Standard Chinese.
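The tone argument can be made concrete with a small lookup. In the minimal sketch below, only the ping-tone rule (voiced initial > 2nd tone) comes from the discussion above. The shang and qu mappings are the usual textbook correspondences, added here as assumptions for illustration, and the entering tone is deliberately left out. The function name `modern_tone`, the three initial classes, and the treatment of 日 as a sonorant are likewise hypothetical labels, not part of the sources cited in this article.

```python
INITIAL_CLASSES = {"voiceless", "sonorant", "voiced_obstruent"}

def modern_tone(mc_tone: str, initial_class: str) -> int:
    """Map a Middle Chinese tone plus initial class to a modern tone number."""
    if initial_class not in INITIAL_CLASSES:
        raise ValueError(f"unknown initial class: {initial_class}")
    if mc_tone == "ping":
        # Level tone splits by voicing: voiceless > 1st, voiced > 2nd.
        return 1 if initial_class == "voiceless" else 2
    if mc_tone == "shang":
        # Rising tone stays 3rd, except after voiced obstruents (> 4th).
        return 4 if initial_class == "voiced_obstruent" else 3
    if mc_tone == "qu":
        # Departing tone > 4th regardless of the initial.
        return 4
    raise ValueError("the entering tone is not modeled in this sketch")

# All "er" words go back to the initial 日, treated here as a sonorant.
er_tones = {modern_tone(t, "sonorant") for t in ("ping", "shang", "qu")}
print(sorted(er_tones))
```

Under these assumptions the only way to reach the 1st tone is a voiceless initial in the ping tone, which is exactly the combination that never existed for these words.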

As for the fusion of suffixal 兒, early examples such as 魚兒 and 雁兒 already appeared in the Tang Dynasty (Norman 1988). There is evidence that 兒 then still maintained its status as a separate syllable. This is only to be expected, since the pronunciation of the word at that time was still something like ȵʑĭe, and there was no easy path to fusing with the preceding syllable. Another likely factor is that the word 兒 at that time retained some of its original meaning of “son” or “infant”. The grammaticalization of this word into a diminutive suffix was accompanied by phonological reduction, as is often the case in grammaticalization. Such phonological reduction can include loss of stress, neutralization of vowels, fusion of syllables, etc. Here the development of this word from ȵʑĭe to ɚ is what makes the fusion possible. In terms of feature geometry, this is very easy to explain. The retroflex feature of ɚ is simply relinked to the preceding vowel, while the syllabic nature of ɚ is deleted.

Now we see that the rhotic syllables in Modern Standard Chinese are just the result of the fusion of the suffixal rhotic vowel with the preceding syllable, and everything in this whole process can be system internal, meaning that there need be no external factors such as a possible Manchu influence.

It is therefore fair to say that the rhotic vowel and the rhotic syllables in Modern Standard Chinese are purely Chinese features, and that they did not arise in the Qing Dynasty through contact with the Manchu language. In fact, the retroflex initials, the rhotic vowel and the rhotic syllables are quite common in northern dialects. Even if the Manchu language could have influenced Beijing Mandarin in this respect, it would be much more difficult for such an influence to have spread across so many northern dialects.

There is one remaining issue. The word 日 itself developed differently. The character 日 belonged to 臻攝 質A 開 三 之韻 in Middle Chinese, and its reconstructed pronunciation (again in Wang Li’s system) is ȵʑĭĕt. In the Zhongyuan Yinyun, it belonged to the 齊微 rime with the [i] vowel, and its reconstructed pronunciation in Ning Jifu’s system is [ɻi]. It looks as though the word 日 could also have developed into a rhotic vowel, but it did not. Why?

There are two possible explanations. First, 日 is in the entering tone category, and the earlier sound change might have applied only to non-entering-tone characters. Second, the coda consonant -t might have prevented the main vowel from being assimilated to the place of articulation of the initial consonant. Suppose that the earlier sound change that brought words such as 兒 to [ɻï] or [ɻʅ] took place before, or at the same time as, the loss of -t in the word 日. Then by the time the -t had disappeared and the main vowel [i] in 日 had been assimilated to the retroflex initial, the change involving words like 兒 was already one step ahead. The word 日 was therefore left behind, not catching the wave of rhoticization. This may be best illustrated as follows:

Character | Stage 1 | Stage 2 | Stage 3 | Stage 4
而 etc. | ȵʑi | ɳʐi > ʐi | ɻʅ | ɚ
日 | ȵʑĭĕt | ɳʐit > ʐit | ɻi | ɻʅ

Here we see that at Stage 3 and Stage 4, the sound change of the word 日 is one step behind that of 而. At Stage 4, the fusion of ɻʅ into the rhotic vowel had already been completed, and although the word 日 [ɻʅ] could in theory also have developed into a rhotic vowel at this stage, it missed the train of the first round of rhotic fusion. The rhotic fusion simply did not take place a second time.
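The ordering argument above can be checked with a small simulation. This is a rough sketch under several assumptions: the function names are hypothetical, the intermediate ɳʐ step of Stage 2 is collapsed into its result ʐ, and each stage is modeled as a one-off string rewrite rather than as a real phonological rule. The key point it illustrates is that the loss of -t comes too late to feed the vowel assimilation at Stage 3, and that the fusion at Stage 4 applies only to forms that are already ɻʅ.

```python
import unicodedata

def nfc(text: str) -> str:
    """Normalize so that precomposed and combining diacritics compare equal."""
    return unicodedata.normalize("NFC", text)

def to_stage2(form: str) -> str:
    # Initial becomes retroflex (ɳʐ step collapsed); ĭĕ simplifies to i.
    return nfc(form).replace("ȵʑ", "ʐ").replace(nfc("ĭĕ"), "i")

def to_stage3(form: str) -> str:
    form = form.replace("ʐ", "ɻ")
    if form.endswith("i"):
        # Assimilation of i to the retroflex initial, open syllables only.
        return form[:-1] + "ʅ"
    if form.endswith("t"):
        # Loss of -t happens too late to feed the assimilation above.
        return form[:-1]
    return form

def to_stage4(form: str) -> str:
    if form == "ɻʅ":
        # Rhotic fusion: applies once, to forms that are already ɻʅ.
        return "ɚ"
    if form.startswith("ɻ") and form.endswith("i"):
        # Late assimilation for 日; the fusion does not apply again.
        return form[:-1] + "ʅ"
    return form

def derive(form: str) -> list:
    stages = [nfc(form)]
    for step in (to_stage2, to_stage3, to_stage4):
        stages.append(step(stages[-1]))
    return stages

print(derive("ȵʑi"))    # forms for 而 etc.
print(derive("ȵʑĭĕt"))  # forms for 日
```

Running the two derivations side by side reproduces the table: 而 ends as ɚ, while 日 ends as ɻʅ, one step behind.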

We therefore see that the rhotic fusion sound change is quite regular, both in the form of the change and in its phonological conditioning. Such regularity also lends support to our theory that natural sound change brought about the rhotic vowel and consequently the rhotic syllables. These were all internal sound change processes without external contact influence.


① Note that if the [ï] sound is taken to be the apical [ɿ], there is some conflict when it is combined with a retroflex initial such as [ɽ]. For one thing, the apical vowel and the retroflex vowel are results of place of articulation assimilations due to the influence of the preceding consonant. It is phonetically unnatural to assume that an apical vowel could follow a retroflex initial. Thus here I am not going to commit to how [ï] should be pronounced in Ning’s system, but just maintain that it is a distinct final in Early Modern Chinese.
② In Ning’s notation, it is a retroflex flap [ɽ]. However it seems that the retroflex approximant is more appropriate here.
③ I will update this section with a more detailed treatment of this process with nonlinear phonology.


  1. A

    I find it unlikely that retroflexion happened as a “process” across wide swathes of dialects from a non-retroflex MC. More likely, as some have suggested, MC rhyme books should be taken as what they are: a taxonomy covering multiple phonological systems for the purpose of acceptable rhyming in poetry. It is far more likely that even before MC, retroflexion was already a feature in Northern Chinese languages, and probably was always a feature in certain areas since OC, tracing back to g-/r- features in Proto Sino-Tibetan. 二 *g/s-ni-s (Proto Tibeto-Burman), 耳 *r/g-na (Proto Tibeto-Burman), for example; cf. 七 *s-ni-s (Proto Tibeto-Burman), which did not become retroflex. And characters like 恥 indicate that some kind of retroflex had to have been part of the OC phonology. Northern China, being closer to the Sino-Tibetan migration path, preserved more of these features. When OC got filtered by the 苗-dominated 楚 state on its way to southern China, that’s probably when prefixes, along with the retroflex tendencies they caused, were lost.

  2. A

    “And in fact, the retroflex initial, the rhotic vowel and the rhotic syllables are quite common in northern dialects. Even if it could be possible that the Manchu language influenced Beijing Mandarin in this respect, it would be much more difficult for this to be more widespread across many northern dialects.”

    This is also not entirely true, for the following reason:

    If, on the other hand, the retroflex did not predate MC within Chinese itself, then its unlikely appearance between MC and early modern Chinese still needs to be explained. The demonstration here is only that the changes from commonly accepted MC reconstructions are internally consistent, not necessarily that they are internally driven. The Manchu thesis should not be so easily discarded for that reason, because the same linguistic group, i.e. Jurchens, ruled northern China under 金 from 1115–1234, just before 中原音韵. This fact should not be ignored.

