English phonology

English phonology is the study of the phonology (i.e., the sound system) of the English language. Like all other languages, spoken English has wide variation in its pronunciation both diachronically and synchronically from dialect to dialect. This variation is especially salient in English, because the language is spoken over such a wide territory, being the predominant language in Australia, Canada, the Commonwealth Caribbean, Ireland, New Zealand, the United Kingdom and the United States, in addition to being spoken as a first or second language by people in countries on every continent, notably South Africa and India. In general, the regional dialects of English are mutually intelligible.
Although there are many dialects of English, the following are usually used as prestige or standard accents: Received Pronunciation for the United Kingdom, General American for the United States and General Australian for Australia.




See IPA chart for English dialects for concise charts of the English phonemes.
The number of speech sounds in English varies from dialect to dialect, and any actual tally depends greatly on the interpretation of the researcher doing the counting. The Longman Pronunciation Dictionary by John C. Wells, for example, using symbols of the International Phonetic Alphabet, denotes 24 consonants and 23 vowels used in Received Pronunciation, plus two additional consonants and four additional vowels used in foreign words only. For General American, it provides for 25 consonants and 19 vowels, with one additional consonant and three additional vowels for foreign words. The American Heritage Dictionary, on the other hand, suggests 25 consonants and 18 vowels (including r-colored vowels) for American English, plus one consonant and five vowels for non-English terms [1].


The following table shows the consonant phonemes found in most dialects of English. When consonants appear in pairs, fortis consonants (i.e., aspirated or voiceless) appear on the left and lenis consonants (i.e., lightly voiced or voiced) appear on the right:
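Consonant phonemes of English (by manner, then place of articulation)
Nasal (note 1): bilabial m; alveolar n; velar ŋ
Plosive: bilabial p b; alveolar t d; velar k ɡ
Affricate: postalveolar tʃ dʒ
Fricative: labiodental f v; dental θ ð; alveolar s z; postalveolar ʃ ʒ; velar (x) (note 3); glottal h
Approximant: alveolar ɹ (notes 1, 2, 5); palatal j; labial-velar w (note 4)
Lateral approximant: alveolar l (notes 1, 6)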

  1. Nasals and liquids may be syllabic in unstressed syllables, though these may be analyzed phonemically as /əC/.
  2. Postalveolar consonants are usually labialized (e.g., [ʃʷ]), as is word-initial or pre-tonic /r/, though this is rarely transcribed.
  3. The voiceless velar fricative /x/ is dialectal, occurring largely in Scottish English. In other dialects, words with this sound are pronounced with /k/.
  4. The sequence /hw/, a voiceless labiovelar approximant [hw̥], is sometimes considered an additional phoneme. For most speakers, words that historically used to have these sounds are now pronounced with /w/; the phoneme /hw/ is retained, for example, in much of the American South and in Scotland.
  5. Depending on dialect, /r/ may be an alveolar approximant [ɹ], a postalveolar approximant, or a labiodental approximant [ʋ].
  6. Many dialects have two allophones of /l/—the "clear" L and the "dark" or velarized L. In some dialects, /l/ may be always clear or always dark.
/p/ pit /b/ bit
/t/ tin /d/ din
/k/ cut /ɡ/ gut
/tʃ/ cheap /dʒ/ jeep
/f/ fat /v/ vat
/θ/ thin /ð/ then
/s/ sap /z/ zap
/ʃ/ she /ʒ/ measure
/x/ loch
/w/ we /m/ map
/l/ left /n/ nap
/ɹ/ run (also /r/, /ɻ/) /j/ yes
/h/ ham /ŋ/ bang


Although regional variation is very great across English dialects, some generalizations can be made about pronunciation in all (or at least the vast majority) of English accents:
  • The voiceless stops /p t k/ are aspirated at the beginnings of words (for example tomato) and at the beginnings of word-internal stressed syllables (for example potato). They are unaspirated after /s/ (stan, span, scan) and at the ends of syllables.
  • For many people, /r/ is somewhat labialized in some environments, as in reed [ɹʷiːd] and tree [tɹʷiː]. In the latter case, the [t] may be slightly labialized as well.[1]
  • In many dialects, /h/ becomes [ç] before [j], as in human [ˈçjuːmən].
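The aspiration pattern in the first generalization above can be sketched as a small rule. This is a minimal illustration only: the input format (a word as a list of syllables, each with a `stressed` flag and a list of phoneme strings) and the function name `aspirate` are assumptions made for the example, not an established notation or tool.

```python
VOICELESS_STOPS = {"p", "t", "k"}

def aspirate(syllables):
    """Return a flat list of phones with [ʰ] added to aspirated stops.

    Rule sketched from the text: /p t k/ are aspirated when they begin
    a word or begin a stressed syllable. A stop after /s/ or at the end
    of a syllable is never syllable-initial, so it stays plain.
    """
    phones = []
    for index, syllable in enumerate(syllables):
        for position, segment in enumerate(syllable["phonemes"]):
            starts_syllable = position == 0
            word_initial = index == 0 and starts_syllable
            stressed_onset = syllable["stressed"] and starts_syllable
            if segment in VOICELESS_STOPS and (word_initial or stressed_onset):
                phones.append(segment + "ʰ")
            else:
                phones.append(segment)
    return phones

# "potato": aspirated word-initial /p/ and stressed-syllable /t/, plain final /t/
potato = [
    {"stressed": False, "phonemes": ["p", "ə"]},
    {"stressed": True, "phonemes": ["t", "eɪ"]},
    {"stressed": False, "phonemes": ["t", "oʊ"]},
]
print(aspirate(potato))
```

Because the rule only looks at syllable-initial position, a word such as span, where /p/ follows /s/, comes out unaspirated without needing a separate clause.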


The vowels of English differ considerably between dialects. Because of this, corresponding vowels may be transcribed with various symbols depending on the dialect under consideration. When considering English as a whole, no specific phonemic symbols are chosen over others; instead, lexical sets are used, each named by a word containing the vowel in question. For example, the vowel of the LOT set ("short o") is transcribed /ɒ/ in Received Pronunciation, /ɔ/ in Australian English, and /ɑ/ in General American. For an overview of the correspondences, see IPA chart for English dialects.
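The idea of a lexical set as a dialect-independent key can be illustrated with a small lookup table. The sketch below uses only the LOT values given in the paragraph above; the dictionary layout and the function name `transcribe` are illustrative assumptions.

```python
# Values for LOT come from the text; the structure is an assumption.
LEXICAL_SETS = {
    "LOT": {
        "Received Pronunciation": "ɒ",
        "Australian English": "ɔ",
        "General American": "ɑ",
    },
}

def transcribe(lexical_set, dialect):
    """Look up the phonemic symbol a dialect uses for a lexical set."""
    return "/" + LEXICAL_SETS[lexical_set][dialect] + "/"

print(transcribe("LOT", "General American"))
```

Keying on the set name rather than on a symbol means a statement about "the LOT vowel" stays valid for every dialect, whichever symbol that dialect uses.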
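Monophthongs of Received Pronunciation[2]
Close: front iː (long), ɪ (short); back uː (long), ʊ (short)
Mid: front ɛ (short); central ɜː (long), ə (short); back ɔː (long)
Open: front æ (short); central ʌ (short); back ɑː (long), ɒ (short)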
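Monophthongs of Australian English
Close: front iː (long), ɪ (short); central ʉː (long); back ʊ (short)
Mid: front e (short); central ɜː (long), ə (short); back oː (long), ɔ (short)
Open: front æː (long), æ (short); central aː (long), a (short)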

The monophthong phonemes of General American differ in a number of ways from Received Pronunciation:
  1. Vowels are more equal in length, differing mainly in quality.
  2. The central vowel of nurse is rhotic /ɝ/ or a syllabic /ɹ̩/.
  3. Speakers make a phonemic distinction between rhotic /ɚ/ and non-rhotic /ə/.
  4. No distinction is made between /ɒ/ and /ɑː/, nor for many speakers between these vowels and /ɔː/.
Reduced vowels occur in some unstressed syllables. (Other unstressed syllables may have full vowels, which some dictionaries mark as secondary stress.) The number of distinctions made among reduced vowels varies by dialect. In some dialects vowels are centralized but otherwise kept mostly distinct, while in Australia and many US dialects all reduced vowels collapse to a schwa [ə]. In Received Pronunciation, there is a distinct high reduced vowel, which the OED writes ɪ.
  • [ɪ]: roses (merged with [ə] in Australian English)
  • [ə]: Rosa’s, runner
  • [l̩]: bottle
  • [n̩]: button
  • [m̩]: rhythm
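English diphthongs
low: RP /əʊ/, Australian /əʉ/, American (GA and Canadian) /oʊ/
loud: RP /aʊ/, Australian /æɔ/, GA /aʊ/, Canadian /aʊ/
lout: Canadian [əʊ] (note 1)
lied: RP /aɪ/, Australian /ɑe/, GA /aɪ/, Canadian /aɪ/
light: Canadian [əɪ] (note 1)
lane: RP /eɪ/, Australian /æɪ/, American /eɪ/
loin: RP /ɔɪ/, Australian /oɪ/, American /ɔɪ/
leer: RP /ɪə/, Australian /ɪə/, American /ɪɚ/ (note 3)
lair: RP /ɛə/ (note 2), Australian /eː/ (note 2), American /ɛɚ/ (note 3)
lure: RP /ʊə/ (note 2), Australian /ʊə/, American /ʊɚ/ (note 3)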
  1. Canadian English exhibits allophony of /aʊ/ and /aɪ/ called Canadian raising. This phenomenon is also found (especially for /aɪ/) among many US speakers, notably in the Northeast, as well as in South Atlantic English and the Fens of eastern England. In some areas, especially the Northeast US, /aɪ/ actually shifts to /ʌɪ/.
  2. In Received Pronunciation, the vowels in lair and lure may be monophthongized to [ɛː] and [oː] respectively.[3] Australian English speakers more readily monophthongize the former but it is listed here anyway.
  3. In rhotic dialects, words like pair, poor, and peer can be analyzed as diphthongs, although other descriptions analyze them as vowels with /r/ in the coda.[4]

Reduced vowels

Linguists such as Ladefoged[5] and Bolinger[6] argue that vowel reduction is phonemic in English, and that there are two "tiers" of vowels in English, full and reduced; traditionally many English dictionaries have attempted to mark the distinction by transcribing unstressed full vowels as having "secondary" stress, though this was later abandoned by the Oxford English Dictionary. Though full unstressed vowels may derive historically from stressed vowels, either because stress shifted over time (such as stress shifting away from the final syllable of French loan words in British English) or because of loss or shift of stress in compound words or phrases (óverseas vóyage from overséas or óverséas plus vóyage), the distinction is not one of stress but of vowel quality (Bolinger 1989:351), and over time, if the word is frequent enough, the vowel will tend to reduce.
English has up to five reduced vowels, though this varies with dialect and speaker. Schwa /ə/ is found in all dialects, and a rhotic schwa ("schwer") /ɚ/ is found in rhotic dialects. Less common is a high reduced vowel ("schwi") /ɪ̈/ (also "/ɪ/"); the two are distinguished by many people in Rosa's /ˈroʊzəz/ vs roses /ˈroʊzɪ̈z/. More unstable is a rounded schwa, /ö/ (also /ɵ/); this contrasts for some speakers in a mission /əˈmɪʃən/, emission /ɪ̈ˈmɪʃən/, and omission /ɵˈmɪʃən/. In words like following, the following vowel is preceded by a [w] even in dialects which do not otherwise have a rounded schwa: [ˈfɒlɵwɪŋ, ˈfɒləwɪŋ]. A high rounded schwa /ʊ̈/ (also "/ʊ/") may be found in words such as into /ˈɪntʊ̈/, though in many dialects this is not distinguished from /ɵ/.
Though speakers vary, full and reduced unstressed vowels may contrast in pairs of words like Shogun /ˈʃoʊɡʌn/ and slogan /ˈsloʊɡən/, chickaree /ˈtʃɪkəriː/ and chicory /ˈtʃɪkərɪ̈/, Pharaoh /ˈfɛəroʊ/ and farrow /ˈfæroʊ/ (Bolinger 1989:348), Bantu /ˈbæntuː/ and into /ˈɪntʊ̈/ (OED).


  • A distinction is made between tense and lax vowels in pairs like beet/bit and bait/bet, although the exact phonetic implementation of the distinction varies from accent to accent. However, this distinction collapses before [ŋ].
  • Wherever /r/ originally followed a tense vowel or diphthong (in Early Modern English) a schwa offglide was inserted, resulting in centering diphthongs like [iə] in beer [biəɹ], [uə] in poor [puəɹ], [aɪə] in fire [faɪəɹ], [aʊə] in sour [saʊəɹ], and so forth. This phenomenon is known as breaking. The subsequent history depends on whether the accent in question is rhotic or not: In non-rhotic accents like RP the postvocalic [ɹ] was dropped, leaving [biə, puə, faɪə, saʊə] and the like (now usually transcribed [bɪə, pʊə] and so forth). In rhotic accents like General American, on the other hand, the [əɹ] sequence was coalesced into a single sound, a non-syllabic [ɚ], giving [biɚ, puɚ, faɪɚ, saʊɚ] and the like (now usually transcribed [bɪɹ, pʊɹ, faɪɹ, saʊɹ] and so forth). As a result, originally monosyllabic words like those just mentioned came to rhyme with originally disyllabic words like seer, doer, higher, power.
  • In many (but not all) accents of English, a similar breaking happens to tense vowels before /l/, resulting in pronunciations like [piəɫ] for peel, [puəɫ] for pool, [peəɫ] for pail, and [poəɫ] for pole.

Transcription variants

The choice of which symbols to use for phonemic transcriptions may reveal theoretical assumptions or claims on the part of the transcriber. English "lax" and "tense" vowels are distinguished by a synergy of features, such as height, length, and contour (monophthong vs. diphthong); different traditions in the linguistic literature emphasize different features. For example, if the primary feature is thought to be vowel height, then the non-reduced vowels of General American English may be represented as follows:
General American full vowels, vowel height distinctive
e ɚ, ə o
ɛ ɔ
If, on the other hand, vowel length is considered to be the deciding factor, the following symbols may be chosen:
General American full vowels, vowel length distinctive

e ʌ o
(This convention has sometimes been used because the publisher did not have IPA fonts available, though that is seldom an issue any longer.)
If vowel transition is taken to be paramount, then the chart may look like one of these:
General American full vowels, vowel contour distinctive
ej ər ow
e ə o
General American full vowels, vowel contour distinctive
ɛɪ̯ ɚɹ ɔʊ̯
ɛ ʌ ɔ
(The first of these two charts assumes that there is no phonemic distinction between semivowels and approximants, so that /ej/ is equivalent to /eɪ̯/.)
Many linguists combine more than one of these features in their transcriptions, suggesting they consider the phonemic differences to be more complex than a single feature.
General American full vowels, height & length distinctive

ɛ ʌ ɔ


Stress is phonemic in English. For example, the words desert and dessert are distinguished by stress, as are the noun a record and the verb to record. Stressed syllables in English are louder than non-stressed syllables, as well as being longer and having a higher pitch. They also tend to have a fuller realization than unstressed syllables.
Examples of stress in English words, using capitals to represent stressed syllables, are HOLiday, aLONE, ADmiRAtion, CONfiDENtial, deGREE, and WEAKer. Ordinarily, grammatical words (auxiliary verbs, prepositions, pronouns, and the like) do not receive stress, whereas lexical words (nouns, verbs, adjectives, etc.) must have at least one stressed syllable.
English is a stress-timed language. That is, stressed syllables appear at a roughly steady tempo, and non-stressed syllables are shortened to accommodate this.
Traditional approaches describe English as having three degrees of stress: primary, secondary, and unstressed. However, if stress is defined as relative respiratory force (that is, it involves greater pressure from the lungs than unstressed syllables), as most phoneticians argue, and is inherent in the word rather than the sentence (that is, it is lexical rather than prosodic), then these traditional approaches conflate two distinct processes: stress on the one hand, and vowel reduction on the other. In this case, primary stress is actually prosodic stress, whereas secondary stress is simple stress in some positions, and an unstressed but not reduced vowel in others. Either way, there is a three-way phonemic distinction: either three degrees of stress, or else stressed, unstressed, and reduced. The two approaches are sometimes conflated into a four-way 'stress' classification: primary (tonic stress), secondary (lexical stress), tertiary (unstressed full vowel), and quaternary (reduced vowel). See secondary stress for details.
In many English words, stress placement distinguishes the noun sense from the verb sense; such nouns are known as initial-stress-derived nouns. For example, a rebel [ˈɹɛb.ɫ̩] (stress on the first syllable) is inclined to rebel [ɹɨ.ˈbɛɫ] (stress on the second syllable) against the powers that be. The number of words following this pattern, rather than stressing the second syllable in all circumstances, has doubled every century or so and now includes object, convict, and addict.


Prosodic stress is extra stress given to words when they appear in certain positions in an utterance, or when they receive special emphasis. It normally appears on the final stressed syllable in an intonation unit. So, for example, when the word admiration is said in isolation, or at the end of a sentence, the syllable ra is pronounced with greater force than the syllable ad. (This is traditionally transcribed as /ˌædmɨˈreɪʃən/.) This is the origin of the primary stress-secondary stress distinction. However, the difference disappears when the word is not pronounced with this final intonation.
Prosodic stress can shift for various pragmatic functions, such as focus or contrast. For instance, consider the dialogue
"Is it brunch tomorrow?"
"No, it's dinner tomorrow."
In this case, the extra stress shifts from the last stressed syllable of the sentence, tomorrow, to the last stressed syllable of the emphasized word, dinner. Compare
"I'm going tomorrow." /aɪm ˌɡoʊɪŋ təˈmɒroʊ/
"I'm going tomorrow." /aɪm ˌɡoʊɪŋ təˈˈmɒroʊ/
"It's dinner tomorrow." /ɪts ˈˈdɪnɚ təˌmɒroʊ/
Although grammatical words generally do not have lexical stress, they do acquire prosodic stress when emphasized. Compare ordinary
"Come in!" /ˈkʌm ɪn/
with more emphatic
"Oh, do come in!" /oʊ ˈˈduː kʌm ˌɪn/


Most languages of the world syllabify CVCV and CVCCV sequences as /CV.CV/ and /CVC.CV/ or /CV.CCV/, with consonants preferentially acting as the onset of a syllable containing the following vowel. According to one view, English is unusual in this regard, in that stressed syllables attract following consonants,[citation needed] so that ˈCVCV and ˈCVCCV syllabify as /ˈCVC.V/ and /ˈCVCC.V/, as long as the consonant cluster CC is a possible syllable coda.[7] In addition, according to this view, /r/ preferentially syllabifies with the preceding vowel even when both syllables are unstressed, so that CVrV occurs as /CVr.V/.[7] However, many scholars do not agree with this view.[7]

Syllable structure

The syllable structure in English is (C)(C)(C)V(C)(C)(C)(C), with a maximal example being strengths (/strɛŋkθs/, although it can be pronounced /strɛŋθs/). Because of an extensive pattern of articulatory overlap, English speakers rarely produce an audible release in consonant clusters.[8] This can lead to cross-articulations that seem very much like deletions or complete assimilations. For example, hundred pounds may sound like [hʌndɹɛb pʰaʊndz], but X-ray[9] and electropalatographic[10][11] studies demonstrate that inaudible and possibly weakened contacts may still be made, so that the second /d/ in hundred pounds does not entirely assimilate to a labial place of articulation; rather, the labial articulation co-occurs with the alveolar one.
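The template (C)(C)(C)V(C)(C)(C)(C) can be checked mechanically against a consonant/vowel skeleton. The sketch below assumes the syllable has already been reduced to a string of the letters C and V (for instance, /strɛŋkθs/ becomes CCCVCCCC); the name `fits_template` is hypothetical.

```python
import re

# (C)(C)(C)V(C)(C)(C)(C): up to three onset consonants, exactly one
# vowel nucleus, and up to four coda consonants.
SYLLABLE_TEMPLATE = re.compile(r"C{0,3}VC{0,4}")

def fits_template(skeleton):
    """Return True if a C/V skeleton fits the maximal syllable template."""
    return SYLLABLE_TEMPLATE.fullmatch(skeleton) is not None

print(fits_template("CCCVCCCC"))  # skeleton of strengths /strɛŋkθs/
print(fits_template("CCCCV"))     # four onset consonants
```

Fitting the template is a necessary condition only: it says nothing about which particular consonants may combine, which the onset and coda inventories restrict much further.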
When a stressed syllable contains a pure vowel (rather than a diphthong), followed by a single consonant and then another vowel, as in holiday, many native speakers feel that the consonant belongs to the preceding stressed syllable, /ˈhɒl.ɨ.deɪ/. However, when the stressed vowel is a long vowel or diphthong, as in admiration or pekoe, speakers agree that the consonant belongs to the following syllable: /ˈæd.mɨ.ˈreɪ.ʃən/, /ˈpiː.koʊ/. Wells (1990)[7] notes that consonants syllabify with the preceding rather than following vowel when the preceding vowel is the nucleus of a more salient syllable, with stressed syllables being the most salient, reduced syllables the least, and secondary stress / full unstressed vowels intermediate. There are lexical differences as well, frequently but not exclusively in compound words. For example, in dolphin and selfish, he argues that the stressed syllable ends in /lf/; in shellfish, the /f/ belongs with the following syllable: /ˈdɒlf.ɪn/ [ˈdɒlfɨn], /ˈsɛlf.ɪʃ/ [ˈsɛlfɨʃ] vs /ˈʃɛl.fɪʃ/ [ˈʃɛlˑfɪʃ], where the /l/ is a little longer and the /ɪ/ not reduced. Similarly, in toe-strap the /t/ is a full plosive, as usual in syllable onset, whereas in toast-rack the /t/ is in many dialects reduced to the unreleased allophone it takes in syllable codas, or even elided: /ˈtoʊ.stræp/ [ˈtʰoˑʊstɹæp], /ˈtoʊst.ræk/ [ˈtoʊs(t̚)ɹʷæk]; likewise nitrate /ˈnaɪ.treɪt/ [ˈnʌɪtɹ̥ʷeɪt] with a voiceless /r/, vs night-rate /ˈnaɪt.reɪt/ [ˈnʌɪt̚ɹʷeɪt] with a voiced /r/. Cues of syllable boundaries include aspiration of syllable onsets and (in the US) flapping of coda /t, d/ (a tease /ə.ˈtiːz/ [əˈtʰiːz] vs. at ease /æt.ˈiːz/ [æɾˈiːz]), epenthetic plosives like [t] in syllable codas (fence /ˈfɛns/ [ˈfɛnts] but inside /ɪn.ˈsaɪd/ [ɪnˈsaɪd]), and r-colored vowels when the /r/ is in the coda vs. labialization when it is in the onset (key-ring /ˈkiː.rɪŋ/ [ˈkʰiːɹʷɪŋ] but fearing /ˈfiːr.ɪŋ/ [ˈfɪəɹɪŋ]).


There is an ongoing sound change (yod-dropping) by which /j/ as the final consonant in a cluster is being lost. In RP, words with /sj/ and /lj/ can usually be pronounced with or without this sound, e.g., [suːt] or [sjuːt]. For some speakers of English, including some British speakers, the sound change is more advanced; in General American, for example, /j/ is also absent after /n/, /l/, /s/, /z/, /θ/, /t/ and /d/. In Welsh English it can occur in more combinations, for example in /tʃj/.
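The General American pattern described above can be sketched as a rule over a single syllable. The list of triggering consonants is taken from the paragraph; the function name `drop_yod` and the assumption that the input is one syllable given as a list of phoneme strings are illustrative.

```python
# Consonants after which General American lacks /j/, as listed in the text.
GA_YOD_DROPPING_CONTEXTS = {"n", "l", "s", "z", "θ", "t", "d"}

def drop_yod(phonemes):
    """Remove /j/ when it directly follows one of the listed consonants.

    Assumes the input is a single syllable, so that the consonant and
    the /j/ belong to the same onset cluster.
    """
    result = []
    for segment in phonemes:
        if segment == "j" and result and result[-1] in GA_YOD_DROPPING_CONTEXTS:
            continue  # yod is lost in this cluster
        result.append(segment)
    return result

print(drop_yod(["s", "j", "uː", "t"]))  # suit
print(drop_yod(["k", "j", "uː", "t"]))  # cute keeps its /j/
```

The single-syllable assumption matters: applied across a syllable boundary, the same check would remove a /j/ that belongs to the next syllable, which the text does not claim.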
The following can occur as the onset:
All single consonant phonemes except /ŋ/
Plosive plus approximant other than /j/: /pl/, /bl/, /kl/, /ɡl/, /pr/, /br/, /tr/[1], /dr/[1], /kr/, /ɡr/, /tw/, /dw/, /ɡw/, /kw/, as in play, blood, clean, glove, prize, bring, tree[1], dream[1], crowd, green, twin, dwarf, guacamole, quick
Voiceless fricative plus approximant other than /j/:[2] /fl/, /sl/, /fr/, /θr/, /ʃr/, /sw/, /θw/, /hw/, as in floor, sleep, friend, three, shrimp, swing, thwart, which
Consonant plus /j/: /pj/, /bj/, /tj/, /dj/, /kj/, /ɡj/, /mj/, /nj/, /fj/, /vj/, /θj/, /sj/, /zj/, /hj/, /lj/, as in pure, beautiful, tube, during, cute, argue, music, new, few, view, thew, suit, Zeus, huge, lurid
/s/ plus voiceless plosive:[2] /sp/, /st/, /sk/, as in speak, stop, skill
/s/ plus nasal:[2] /sm/, /sn/, as in smile, snow
/s/ plus voiceless fricative: /sf/, as in sphere
/s/ plus voiceless plosive plus approximant:[2][3] /spl/, /spr/, /str/, /skr/, /skw/, /smj/, /spj/, /stj/, /skj/, as in split, spring, street, scream, square, smew, spew, student, skewer
  1. In some American dialects (especially as spoken by children), /tr/ and /dr/ tend to affricate, so that tree resembles "chree", and dream resembles "jream".[12][13][14] This is sometimes transcribed as [tʃr] and [dʒr] respectively, but the pronunciation varies and may, for example, be closer to [tʂ] and [dʐ][15] or with a fricative release similar in quality to the rhotic, i.e. [tɹ̝̊ɹ̥], [dɹ̝ɹ], or [tʂɻ], [dʐɻ].
  2. Many clusters beginning with /ʃ/ and paralleling native clusters beginning with /s/ are found initially in German and Yiddish loanwords, such as /ʃl/, /ʃp/, /ʃt/, /ʃm/, /ʃn/, /ʃpr/ (in words such as schlep, spiel, shtick, schmuck, schnapps, Shprintzen's). /ʃw/ is found initially in the Hebrew loanword schwa. Before /r/ however, the native cluster is /ʃr/. The opposite cluster /sr/ is found in loanwords such as Sri Lanka, but this can be nativized by changing it to /ʃr/.
  3. /skl/ occurs in the Greek loanword sclerosis; there is also /sfr/ (sphragistics).
Other onsets
Certain English onsets appear only in contractions: e.g., /zbl/ ('sblood), /zd/ (sdein), and /zw/ or /dzw/ (swounds or dswounds). Some, such as /pʃ/ (pshaw) or /fw/ (fwoosh), can occur in interjections. An archaic voiceless fricative plus nasal exists, /fn/ (fnese).
A few other onsets occur in further (anglicized) loan words, including /bw/ (bwana), /mw/ (moiré), /nw/ (noire), /pw/ (pueblo); /kv/ (kvetch), /ʃv/ (schvartze), /sθ/ (sthenics), /θl/ (thlipsis), /tv/ (Tver), /zl/ (zloty), and /zw/ (zwieback)
Some clusters of this type can be converted to regular English phonotactics by simplifying the cluster: e.g. /(d)z/ (dziggetai), /(h)r/ (Hrolf), /kr(w)/ (croissant), /(p)f/ (pfennig), /(f)θ/ (phthalic), /(t)s/ (tsunami), /vw/ (voilà).
Others can be substituted by native clusters differing only in voice: /sv ~ sf/ (svelte), /zb ~ sp/ (sbirro), /zgr ~ skr/ (sgraffito).


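The following can occur as the nucleus:
All vowel sounds
/m/, /n/, /l/ and, in rhotic varieties, /r/ in certain situations (see the word-level rules)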

Most, and in theory all, of the following except those which end with /s/, /z/, /ʃ/, /ʒ/, /tʃ/ or /dʒ/ can be extended with /s/ or /z/ representing the morpheme -s/z-. Similarly most, and in theory all, of the following except those which end with /t/ or /d/ can be extended with /t/ or /d/ representing the morpheme -t/d-.
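The two extension restrictions stated above can be sketched as a single function. The blocking conditions come from the text; the choice between /s/ and /z/ (or /t/ and /d/) by voicing agreement with the preceding segment is an added assumption, as are the function name `extend_coda` and the representation of a coda as a list of phoneme strings.

```python
SIBILANTS = {"s", "z", "ʃ", "ʒ", "tʃ", "dʒ"}
ALVEOLAR_STOPS = {"t", "d"}
# Assumption not stated in the text: the suffix agrees in voicing with
# the final segment of the coda it attaches to.
VOICELESS = {"p", "t", "k", "f", "θ", "s", "ʃ", "tʃ"}

def extend_coda(coda, morpheme):
    """Extend a coda with the -s/z- or -t/d- morpheme.

    Returns the extended coda, or None when the text's restriction
    blocks the extension (sibilant-final for -s/z-, /t/ or /d/-final
    for -t/d-).
    """
    final = coda[-1]
    if morpheme == "s/z":
        if final in SIBILANTS:
            return None
        suffix = "s" if final in VOICELESS else "z"
    elif morpheme == "t/d":
        if final in ALVEOLAR_STOPS:
            return None
        suffix = "t" if final in VOICELESS else "d"
    else:
        raise ValueError("morpheme must be 's/z' or 't/d'")
    return coda + [suffix]

print(extend_coda(["l", "p"], "s/z"))   # as in helps
print(extend_coda(["n", "tʃ"], "s/z"))  # blocked after a sibilant
```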
Wells (1990) argues that a variety of syllable codas are possible in English, even /ntr, ndr/ in words like entry /ˈɛntr.ɪ/ and sundry /ˈsʌndr.ɪ/, with /tr, dr/ being treated as affricates along the lines of /tʃ, dʒ/. He argues that the traditional assumption that pre-vocalic consonants form a syllable with the following vowel is due to the influence of languages like French and Latin, where syllable structure is CVC.CVC regardless of stress placement. Disregarding such contentious cases, which do not occur at the ends of words, the following sequences can occur as the coda:
The single consonant phonemes except /h/, /w/, /j/ and, in non-rhotic varieties, /r/
Lateral approximant + plosive or affricate: /lp/, /lb/, /lt/, /ld/, /ltʃ/, /ldʒ/, /lk/, as in help, bulb, belt, hold, belch, indulge, milk
In rhotic varieties, /r/ + plosive or affricate: /rp/, /rb/, /rt/, /rd/, /rtʃ/, /rdʒ/, /rk/, /rɡ/, as in harp, orb, fort, beard, arch, large, mark, morgue
Lateral approximant + fricative: /lf/, /lv/, /lθ/, /ls/, /lʃ/, as in golf, solve, wealth, else, Welsh
In rhotic varieties, /r/ + fricative: /rf/, /rv/, /rθ/, /rs/, /rʃ/, as in dwarf, carve, north, force, marsh
Lateral approximant + nasal: /lm/, /ln/, as in film, kiln
In rhotic varieties, /r/ + nasal or lateral: /rm/, /rn/, /rl/, as in arm, born, snarl
Nasal + homorganic plosive or affricate: /mp/, /nt/, /nd/, /ntʃ/, /ndʒ/, /ŋk/, as in jump, tent, end, lunch, lounge, pink
Nasal + fricative: /mf/, /mθ/ (in non-rhotic varieties), /nθ/, /ns/, /nz/, /ŋθ/ (in some varieties), as in triumph, warmth, month, prince, bronze, length
Voiceless fricative + voiceless plosive: /ft/, /sp/, /st/, /sk/, as in left, crisp, lost, ask
Two voiceless fricatives: /fθ/, as in fifth
Two voiceless plosives: /pt/, /kt/, as in opt, act
Plosive + voiceless fricative: /pθ/, /ps/, /tθ/, /ts/, /dθ/, /dz/, /ks/, as in depth, lapse, eighth, klutz, width, adze, box
Lateral approximant + two consonants: /lpt/, /lfθ/, /lts/, /lst/, /lkt/, /lks/, as in sculpt, twelfth, waltz, whilst, mulct, calx
In rhotic varieties, /r/ + two consonants: /rmθ/, /rpt/, /rps/, /rts/, /rst/, /rkt/, as in warmth, excerpt, corpse, quartz, horst, infarct
Nasal + homorganic plosive + plosive or fricative: /mpt/, /mps/, /ndθ/, /ŋkt/, /ŋks/, /ŋkθ/ (in some varieties), as in prompt, glimpse, thousandth, distinct, jinx, length
Three obstruents: /ksθ/, /kst/, as in sixth, next
Note: For some speakers, a fricative before /θ/ is elided so that these never appear phonetically: /ˈfɪfθ/ becomes [ˈfɪθ], /ˈsɪksθ/ becomes [ˈsɪkθ], /ˈtwɛlfθ/ becomes [ˈtwɛlθ].

Syllable-level rules

  • Both the onset and the coda are optional
  • /j/ at the end of an onset cluster (/pj/, /bj/, /tj/, /dj/, /kj/, /fj/, /vj/, /θj/, /sj/, /zj/, /hj/, /mj/, /nj/, /lj/, /spj/, /stj/, /skj/) must be followed by /uː/ or /ʊə/
  • Long vowels and diphthongs are not found before /ŋ/ except for the mimetic word boing![16]
  • /ʊ/ is rare in syllable-initial position[17]
  • Stop + /w/ before /uː, ʊ, ʌ, aʊ/ (all presently or historically /u(ː)/) are excluded[18]
  • Sequences of /s/ + C1 + V + C1, where C1 is a consonant other than /t/ and V is a short vowel, are virtually nonexistent[18]
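Two of the syllable-level rules above lend themselves to a mechanical check. The sketch below assumes a syllable is supplied as a separate onset list, nucleus string and coda list, and it approximates "long vowel or diphthong" as any nucleus whose transcription is longer than one character; both choices, and the name `rule_violations`, are assumptions made for illustration.

```python
def rule_violations(onset, nucleus, coda):
    """Return descriptions of the syllable-level rules a syllable breaks."""
    violations = []
    # Cluster-final /j/ must be followed by /uː/ or /ʊə/.
    if len(onset) > 1 and onset[-1] == "j" and nucleus not in {"uː", "ʊə"}:
        violations.append("cluster-final /j/ must precede /uː/ or /ʊə/")
    # Long vowels and diphthongs are not found before /ŋ/.
    # Approximation: a nucleus written with more than one character
    # (e.g. "iː", "ɔɪ") is treated as long or diphthongal.
    if coda[:1] == ["ŋ"] and len(nucleus) > 1:
        violations.append("long vowel or diphthong before /ŋ/")
    return violations

print(rule_violations(["k", "j"], "uː", ["t"]))  # cute
print(rule_violations(["b"], "ɔɪ", ["ŋ"]))       # boing, the noted exception
```

The check flags boing as a violation, which matches the text's description of it as the one mimetic exception rather than a counterexample to the rule.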

Word-level rules

  • /ə/ does not occur in stressed syllables
  • /ʒ/ does not occur in word-initial position in native English words although it can occur syllable-initial, e.g., luxurious /lʌɡˈʒʊəriəs/
  • /θj/ occurs in word-initial position in a few obscure words: thew, thurible, etc.; it is more likely to appear syllable-initially, e.g. enthuse /ɛnˈθjuːz/
  • /m/, /n/, /l/ and, in rhotic varieties, /r/ can be the syllable nucleus (i.e., a syllabic consonant) in an unstressed syllable following another consonant, especially /t/, /d/, /s/ or /z/
  • Certain short vowel sounds, called checked vowels, cannot occur without a coda in a single syllable word. In RP, the following short vowel sounds are checked: /ɛ/, /æ/, /ɒ/ and /ʌ/.

History of English pronunciation

Around the late 14th century, English began to undergo the Great Vowel Shift, in which
  • the high long vowels [iː] and [uː] in words like price and mouth became diphthongized, first to [əɪ] and [əʊ] (where they remain today in some environments in some accents such as Canadian English) and later to their modern values [aɪ] and [aʊ]. This is not unique to English, as this also happened in Dutch (first shift only) and German (both shifts).
The other long vowels became higher:
  • [eː] became [iː] (for example meet),
  • [aː] became [eː] (later diphthongized to [eɪ], for example name),
  • [oː] became [uː] (for example goose), and
  • [ɔː] became [oː] (later diphthongized to [oʊ], for example bone).
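The changes listed above can be summarized as a mapping from each earlier long vowel to its successive later values. The stages are those given in the list; the dictionary layout and the function name `modern_value` are illustrative assumptions.

```python
# Earlier long vowel -> successive later values, as described in the text.
GREAT_VOWEL_SHIFT = {
    "iː": ["əɪ", "aɪ"],  # price
    "uː": ["əʊ", "aʊ"],  # mouth
    "eː": ["iː"],        # meet
    "aː": ["eː", "eɪ"],  # name
    "oː": ["uː"],        # goose
    "ɔː": ["oː", "oʊ"],  # bone
}

def modern_value(earlier_vowel):
    """Return the last stage listed for an earlier long vowel."""
    return GREAT_VOWEL_SHIFT[earlier_vowel][-1]

print(modern_value("aː"))
print(modern_value("eː"))
```

The lookup is deliberately applied once rather than repeatedly: the [iː] that results from earlier [eː] did not go on to diphthongize, so chaining the mapping would give the wrong result for meet.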
Later developments complicate the picture: whereas in Geoffrey Chaucer's time food, good, and blood all had the vowel [oː] and in William Shakespeare's time they all had the vowel [uː], in modern pronunciation good has shortened its vowel to [ʊ] and blood has shortened and lowered its vowel to [ʌ] in most accents. In Shakespeare's day (late 16th-early 17th century), many rhymes were possible that no longer hold today. For example, in his play The Taming of the Shrew, shrew rhymed with woe.[19]


æ-tensing is a phenomenon found in many varieties of American English by which the vowel /æ/ has a longer, higher, and usually diphthongal pronunciation in some environments, usually to something like [eə]. Some American accents, for example that of New York City or Philadelphia, make a marginal phonemic distinction between /æ/ and /eə/ although the two occur largely in mutually exclusive environments.

Bad-lad split

The bad-lad split refers to the situation in some varieties of southern English English and Australian English, where a long phoneme /æː/ in words like bad contrasts with a short /æ/ in words like lad.

Cot-caught merger

The cot-caught merger is a sound change by which the vowel of words like cot, rock, and doll (/ɒ/ in New England, /ɑː/ elsewhere) is pronounced the same as the vowel of words like caught, talk, and tall (/ɔ/). This merger is widespread in North American English, being found in approximately 40% of American speakers and virtually all Canadian speakers.

Father-bother merger

The father-bother merger is the pronunciation of the short O /ɒ/ in words such as "bother" identically to the broad A /ɑː/ of words such as "father", nearly universal in all of the United States and Canada save New England and the Maritime provinces; many American dictionaries use the same symbol for these vowels in pronunciation guides.

