French IPA Transcription - Help
About This Implementation
This French transcription app uses the Wiktionary French Pronunciation Module to generate phonemic transcriptions for French text.
The system uses comprehensive Wiktionary data dumps[1][2] as a lexicon to first retrieve transcriptions from the dictionary. When a word is not found in the lexicon, it falls back to generating transcriptions using the pronunciation module's rule-based approach.
Dialects Supported
- Default - Standard French pronunciation
This option generates IPA based on a traditional, conservative standard of French. It maintains the historical distinctions among all four of the language's nasal vowels. This pronunciation is still widely heard in Belgian and Quebec French and is often used in dictionaries for etymological clarity.
This setting preserves the four-vowel system:
- brun is pronounced /bʁœ̃/.
- brin is pronounced /bʁɛ̃/.
- blanc is pronounced /blɑ̃/.
- bon is pronounced /bɔ̃/.
- Parisian (experimental) - Parisian French pronunciation patterns
This option implements the major phonological shifts characteristic of modern Parisian and Metropolitan French. It transforms the traditional four-vowel system into a new three-vowel system through a series of mergers and shifts.
This setting applies the following crucial changes in a specific order:
- The "un/in" Merger: The distinction between /œ̃/ (in brun) and /ɛ̃/ (in brin) is completely lost. Both sounds are merged into a single, open nasal vowel, transcribed as /ɑ̃/.
  - brun: /bʁœ̃/ → /bʁɑ̃/
  - brin: /bʁɛ̃/ → /bʁɑ̃/
  - Result: brun and brin become perfect homophones.
- The "an/on" Chain Shift: This is a two-step process where one vowel shifts, and another moves into its former place. It is not a simple merger.
  - First, the original vowel /ɔ̃/ (as in bon) is modified to /õ/. This is a subtle phonetic shift, often involving a change in vowel height or rounding, and is represented by a different IPA symbol.
  - Then, the original vowel /ɑ̃/ (as in blanc) shifts to take the now-vacant phonetic space of the original /ɔ̃/.
  - bon: /bɔ̃/ → /bõ/
  - blanc: /blɑ̃/ → /blɔ̃/
  - Result: Crucially, bon and blanc remain distinct. Their vowel qualities have shifted, but they have not merged into one sound.
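The Parisian changes described above can be sketched as a single-pass substitution. A naive chain of string replacements in the listed order would send the freshly merged /ɑ̃/ of brun and brin through the later /ɑ̃/ → /ɔ̃/ step, so this sketch rewrites each nasal vowel exactly once from the original string. The names `PARISIAN` and `to_parisian` are illustrative assumptions, not the app's actual code.

```python
import re
import unicodedata

T = "\u0303"  # combining tilde carried by every nasal vowel

# Hypothetical mapping table built from the two changes described in the text.
PARISIAN = {
    "œ" + T: "ɑ" + T,  # un -> merged open nasal
    "ɛ" + T: "ɑ" + T,  # in -> merged open nasal
    "ɔ" + T: "o" + T,  # on -> /õ/
    "ɑ" + T: "ɔ" + T,  # an -> moves into the old /ɔ̃/ slot
}
_NASAL = re.compile("[œɛɔɑ]" + T)

def to_parisian(ipa: str) -> str:
    """Rewrite each nasal vowel once, so outputs never feed later rules."""
    ipa = unicodedata.normalize("NFD", ipa)  # make sure tildes are combining marks
    return _NASAL.sub(lambda m: PARISIAN[m.group(0)], ipa)
```

Note that the output keeps /õ/ in decomposed form (o followed by a combining tilde); normalize to NFC afterwards if a precomposed character is preferred.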
Transcription Forms
- Phonemic: A simplified, broad transcription that shows only the sounds that are essential for distinguishing meaning (phonemes). It represents the abstract sound system of the language.
Technical Information
This implementation combines the Wiktionary French pronunciation module with a comprehensive lexicon extracted from Wiktionary data dumps, providing both high accuracy for common words and broad coverage through rule-based generation.
Wiktionary French Pronunciation Module: Pipeline Overview
1. Initialization & Normalization
The input string is prepared and normalized to remove orthographic variance that does not affect pronunciation.
- Boundary Markers: The symbol ⁀ is added to the start and end of every word (e.g., ⁀mot⁀).
- Liaison conversion: All apostrophes (e.g., l'arbre, c'est) are converted to the liaison marker ‿.
- Accent Normalization:
  - Strip: à → a, ù → u, î → i (These accents do not change vowel quality in this system).
  - Keep: â (Posterior A), ô (Closed O), ê (Open E) are preserved as they alter the phoneme.
- Special Case (Qu'): The sequence qu' is mapped to k' (preventing false nasalization in qu'on).
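The normalization steps above can be sketched roughly as follows. This is a minimal illustration, not the module's code: the function name `normalize`, the order of the steps, and the handling of typographic apostrophes are assumptions.

```python
import unicodedata

BOUNDARY = "\u2040"  # ⁀ word-boundary marker
LIAISON = "\u203f"   # ‿ liaison marker

# Accents the text says are stripped; â, ô and ê are deliberately left alone.
STRIPPED = str.maketrans({"\u00e0": "a", "\u00f9": "u", "\u00ee": "i"})

def normalize(text: str) -> str:
    text = unicodedata.normalize("NFC", text.lower())
    text = text.replace("\u2019", "'")   # assumption: treat ’ like '
    text = text.translate(STRIPPED)      # à -> a, ù -> u, î -> i
    text = text.replace("qu'", "k'")     # special case, applied before liaison
    text = text.replace("'", LIAISON)    # apostrophe -> liaison marker
    return " ".join(BOUNDARY + w + BOUNDARY for w in text.split())
```

For example, `normalize("qu'on")` yields the boundary-wrapped string `k‿on`, so the later nasalization step never sees the letters `u` and `o` of qu'on side by side.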
2. Early Consonant Mutations
Fixing ambiguous consonants based on the following vowel.
2.1. The "Capital E" Protection
The digraphs œu and oeu are converted to a placeholder Eu.
Example: cœur → cEu... → C remains hard /k/ because it is not followed by a lowercase 'e' or 'œ'.
2.2. C and G Logic
- Soft C: c + e, i, y → ç (later /s/).
- Soft G: g + e, i, y → j (later /ʒ/).
- Forced Soft G: ge + a, o, u → j (The 'e' is purely orthographic and is deleted).
- Hardening: qu + Vowel → k. gu + Vowel → g. (Exception: If diaeresis exists like aigüe, keep the 'u').
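A rough sketch of the "Capital E" protection and the C/G rules, written as ordered regular-expression substitutions. The function name, the vowel sets, and the rule order (softening before hardening, so that a g produced from gu is not softened again) are assumptions made for illustration.

```python
import re

SOFT = "eéèêëiîïy"          # vowels that soften a preceding c or g
VOWEL = "aàâeéèêiîoôuy"     # vowels that trigger qu/gu hardening

def mutate_c_g(word: str) -> str:
    word = re.sub(r"œu|oeu", "Eu", word)          # protect cœur-type words
    word = re.sub(r"ge(?=[aou])", "j", word)      # forced soft g, drop the e
    word = re.sub(rf"c(?=[{SOFT}])", "ç", word)   # soft c
    word = re.sub(rf"g(?=[{SOFT}])", "j", word)   # soft g
    word = re.sub(rf"qu(?=[{VOWEL}])", "k", word) # hard qu
    word = re.sub(rf"gu(?=[{VOWEL}])", "g", word) # hard gu (gü is untouched)
    return word
```

Because the placeholder uses an uppercase E, the soft-C rule does not fire on cœur, which matches the example in section 2.1.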
2.3. Consonant Devoicing
Rule: The letter b is devoiced to p when immediately followed by s or t.
- Example: absurde → apsurde → /ap.syʁd/.
- Example: obtenir → optenir → /ɔp.tə.niʁ/.
3. Verb Morphology Overrides (pos="v")
Applied strictly when the module input is tagged as a verb.
- Final "-ai": Forces ai → é (Standard French distinction /e/ vs /ɛ/).
- Anti-Assibilation (The T-Blocker): The script inserts a separator _ to prevent T from turning into S in specific verb endings:
  - ti + ons/ens → t_i (e.g., portions → /pɔʁtjɔ̃/).
  - t + ienne(nt) → t_ienne (e.g., tiennent → /tjɛn/, preventing /sjɛn/).
- The "-ent" Ending:
  - Standard: ent → e (silent/schwa).
  - Liaison Exception: If followed by a liaison marker ‿, it becomes ət‿.
  - Example: parlent-ils → /paʁlət.il/.
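The verb overrides above might look like the following when written as substitutions on a lowercase verb form. This is a hedged sketch: `verb_overrides` is a hypothetical name, and the sketch assumes a single word with no boundary markers, optionally ending in the liaison marker.

```python
import re

LIAISON = "\u203f"  # ‿

def verb_overrides(word: str) -> str:
    word = re.sub(r"ai$", "é", word)                      # final -ai -> é
    word = re.sub(r"ti(?=(?:ons|ens)$)", "t_i", word)     # block t -> s in -tions/-tiens
    word = re.sub(r"t(?=ienne(?:nt)?$)", "t_", word)      # block t -> s in -tienne(nt)
    word = re.sub("ent" + LIAISON, "ət" + LIAISON, word)  # liaison exception
    word = re.sub(r"ent$", "e", word)                     # standard silent -ent
    return word
```

The blockers run before the -ent rule so that tiennent is still recognized by its full ending.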
4. Orthographic Standardization
Mapping lexical exceptions and fixed spelling patterns.
4.1 Assibilation (T → S)
Rule: ti becomes si before on, en, al, el.
Exceptions (Blockers): Assibilation is cancelled if the T is preceded by:
- s (e.g., question)
- x (e.g., mixtion)
- ç (script safety check)
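The assibilation rule and its blockers fit in one regular expression with a negative lookbehind. This is an illustrative sketch; the function name is an assumption.

```python
import re

# ti -> si before on, en, al, el, unless the t follows s, x or ç.
_ASSIBILATE = re.compile(r"(?<![sxç])ti(?=on|en|al|el)")

def assibilate(word: str) -> str:
    return _ASSIBILATE.sub("si", word)
```

A verb form that already carries the `_` separator from section 3 (for example `port_ions`) contains no literal `ti` sequence, so it passes through unchanged.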
4.2. Special Prefix/Suffix Logic
| Pattern | Logic | Example |
|---|---|---|
| eu- / neuro- | Force eu to closed /ø/ even before R. | Europe → /ø.ʁɔp/ |
| hyper- / super- | Force er to open /ɛʁ/ (preventing /e/). | supermarché → /sy.pɛʁ.../ |
| trans- / intrans- | Force s to voiced /z/ before vowel. | intransigeant → /ɛ̃.tʁɑ̃.zi.ʒɑ̃/ |
| (h)ex- | Initial ex or in+ex → /gz/ (Must be followed by vowel/h). | examen → /ɛg.za.mɛ̃/ |
| désh- | Force s to voiced /z/. | déshabiller → /de.za.bi.je/ |
4.3. Basic Digraphs
- ph → f
- gn → ɲ
- psych- → psik (e.g., psychologie).
- ch →
  - Becomes k before r, l, n (e.g., chrétien, chlore).
  - Becomes ʃ elsewhere.
- -ing → /iŋ/ (Loanwords).
- -emment → /a.mɑ̃/ (Adverbs).
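A small sketch of the first four digraph mappings, assuming they are applied as ordered substitutions (the psych- and ch-before-consonant cases must run before the general ch rule). The function name is hypothetical, and the -ing and -emment endings are left out.

```python
import re

def map_digraphs(word: str) -> str:
    word = word.replace("ph", "f").replace("gn", "ɲ")
    word = re.sub(r"^psych", "psik", word)     # psych- prefix
    word = re.sub(r"ch(?=[rln])", "k", word)   # ch before r, l, n
    return word.replace("ch", "ʃ")             # ch elsewhere
```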
4.4. Posterior A Shift
- as, az (Final) → â /ɑ/.
- Example: pas → /pɑ/, gaz → /gɑz/.
5. Vowels, Ligatures, and Nasalization
5.1. Base Mappings
- au, eau → O (Closed /o/)
- eu, œu → ø (Closed /ø/)
- oi → wA (Generally /wa/).
- ou → U (High back /u/)
5.2. Nasalization Engine
Logic: The script scans for Vowel + n/m followed by a Consonant or End-of-Word.
Special case (oi + n): when oi is followed by n and the nasal conditions above are satisfied, the script overrides the standard output to /wɛ̃/.
Example: point → /pwɛ̃/ (Not /pwɑ̃/).
- a, e + n/m → /ɑ̃/
- o + n/m → /ɔ̃/
- i, ɛ, y + n/m → /ɛ̃/ (Includes pain, syndicat).
- u + n/m → /œ̃/
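The nasalization scan can be sketched with one regular expression: a vowel, then n or m, then a lookahead that rejects a following vowel. Two assumptions are baked in and labeled in the code: a doubled n/m (as in bonne) is treated as blocking nasalization, and the input is an intermediate string in which ai has already become ɛ. The names are illustrative only.

```python
import re

T = "\u0303"  # combining tilde
VOWELS = "aeiouyɛ"
NASAL = {"a": "ɑ", "e": "ɑ", "o": "ɔ", "i": "ɛ", "ɛ": "ɛ", "y": "ɛ", "u": "œ"}

# Assumption: a following n/m also blocks nasalization (bonne, homme).
_NASAL_SITE = re.compile(rf"(oi|[{VOWELS}])[nm](?![{VOWELS}nm])")

def nasalize(word: str) -> str:
    def repl(m: re.Match) -> str:
        vowel = m.group(1)
        if vowel == "oi":            # special case: oi + n -> wɛ̃
            return "wɛ" + T
        return NASAL[vowel] + T      # the n/m itself is consumed
    return _NASAL_SITE.sub(repl, word)
```

The lookahead also succeeds at the end of the string, which covers the end-of-word condition.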
5.3. Special Consonant Handling in Nasals
- -um: Maps to /ɔm/ (e.g., maximum).
- mb, mp: The script converts these to mB, mP. The vowel nasalizes due to the M, and the B/P are subsequently deleted. Example: plomb → /plɔ̃/.
6. Structural Logic: Y and ILL
Handling the complex context-dependency of Y and the "Liquid L" digraphs.
6.1. The Y Hierarchy
The script determines whether y functions as a vowel /i/ or a semi-vowel /j/.
- The "Yi" Hiatus: yi + Vowel → y.y + Vowel. Logic: Forces a syllable split rather than a glide (e.g., yin).
- Contextual Vowel /i/:
  - Initial Y + Consonant: → /i/ (e.g., yttrium).
  - Consonant + Y + Consonant: → /i/ (e.g., cycle).
  - Consonant + Y + Vowel: Maps to i temporarily to ensure syllabification, often becoming /j/ later.
- Standard Glide /j/: Any other y (especially Initial + Vowel like yeux) becomes /j/.
- Exceptions: ay → ai (e.g., Gamay → /ga.mɛ/). éy → éj (Used in specific respellings).
6.2. The ILL Logic (Liquid L)
The script processes ill sequences in a strict order of operations:
- Consonant + uill: → /ɥij/. Example: cuillère → /kɥi.jɛʁ/.
- Vowel + ill: → /j/. Example: travailler → /tʁa.va.je/ (The L is absorbed).
- Standard ill: → /ij/. Example: fille → /fij/.
- Final Vowel + il: → /j/. Example: soleil → /sɔ.lɛj/.
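The ordering matters because the more specific contexts must be consumed before the general ill rule sees them. A minimal sketch, assuming the rules run as an ordered list of substitutions on a lowercase spelling (names and the vowel set are illustrative):

```python
import re

V = "aeiouyéèêø"  # assumed vowel set for the context checks

ILL_RULES = [
    (re.compile(rf"(?<=[^{V}])uill"), "ɥij"),  # consonant + uill
    (re.compile(rf"(?<=[{V}])ill"), "j"),      # vowel + ill
    (re.compile(r"ill"), "ij"),                # standard ill
    (re.compile(rf"(?<=[{V}])il$"), "j"),      # final vowel + il
]

def apply_ill(word: str) -> str:
    for pattern, replacement in ILL_RULES:     # strict order of operations
        word = pattern.sub(replacement, word)
    return word
```

If the standard ill rule ran before the vowel + ill rule, travailler would come out as `travaijer` instead of `travajer`.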
7. Word Endings (Deletions)
Removing silent letters at word boundaries (⁀).
7.1. Standard List
Delete: d, g, p, t, s, z.
7.2. Complex Deletions
- x: Silent only after i, u, ø (e.g., prix, vœux). Pronounced /ks/ or /s/ otherwise.
- -nct: Both C and T are silent (e.g., instinct → /ɛ̃s.tɛ̃/).
- -r: Silent in polysyllabic words ending in -er (becomes /e/). Pronounced in monosyllables (fer, mer).
7.3. Protections
- -st: T is kept (e.g., est).
- -kt: T is kept (e.g., strict).
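Sections 7.1 to 7.3 can be read as a short decision chain in which the protections are checked before the standard deletion list. The sketch below is an assumption about how the pieces fit together: it handles a single word without boundary markers, removes at most one final consonant (two for -nct), and leaves out the -er rule and the special endings.

```python
import re

def drop_silent_final(word: str) -> str:
    if word.endswith("nct"):               # instinct: both c and t are silent
        return word[:-2]
    if re.search(r"(st|kt)$", word):       # protections: final t is kept
        return word
    if re.search(r"[iuø]x$", word):        # x is silent only after i, u, ø
        return word[:-1]
    return re.sub(r"[dgptsz]$", "", word)  # standard deletion list
```

The -kt protection presumes that hard c has already been rewritten as k by an earlier step, which is why the example input for strict would be `strikt`.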
7.4 Special Endings
- -ez: Treated as /e/ (e.g., assez).
- -ied: Treated as /je/ (e.g., pied, assieds).
8. E-Rules & Syllabification
8.1. E + Double Consonant
Rule: e + CC → è + C.
- The vowel opens to /ɛ/ before the double consonant is reduced.
- Example: terre → /tɛʁ/.
8.2. Syllabification Engine
- V.CV: Break before single consonant.
- VC.CV: Break between consonants.
- Onset Maximization: If C2 is a liquid (L/R) and C1 is an obstruent, break before C1 (e.g., ta.bleau).
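The three syllabification rules can be sketched by looking at the consonant cluster between each pair of neighbouring vowels. The segment inventories, the function name, and the treatment of two adjacent vowels (a break is placed between them) are assumptions; the input is taken to be an IPA-like string with one character per segment, plus optional combining tildes.

```python
import unicodedata

VOWELS = set("aeiouyøœɛɔɑə")
LIQUIDS = set("lʁ")
OBSTRUENTS = set("pbtdkgɡfv")

def syllabify(word: str) -> str:
    vowel_idx = [i for i, ch in enumerate(word) if ch in VOWELS]
    breaks = set()
    for v1, v2 in zip(vowel_idx, vowel_idx[1:]):
        start = v1 + 1
        while start < v2 and unicodedata.combining(word[start]):
            start += 1                      # a tilde belongs to the vowel
        cluster = word[start:v2]
        if len(cluster) >= 2 and cluster[-2] in OBSTRUENTS and cluster[-1] in LIQUIDS:
            breaks.add(v2 - 2)              # onset maximization: ta.blo
        elif cluster:
            breaks.add(v2 - 1)              # V.CV and VC.CV
        else:
            breaks.add(v2)                  # assumption: hiatus gets a break
    return "".join("." + ch if i in breaks else ch for i, ch in enumerate(word))
```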
8.3. Schwa (ə) Logic
- Word Final: Deleted (unless monosyllable).
- Internal (VCəCV): Deleted, unless the resulting consonant cluster is forbidden.
- Forbidden Clusters: The script checks a lookup table (e.g., ʁʁ, ɲʁ). If the pair is found, the schwa is preserved.
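A hedged sketch of the schwa logic follows. The lookup table here contains only the two pairs named in the text; the real table is presumably longer. The function names are hypothetical, and the input is assumed to be an IPA-like string without syllable dots or combining marks.

```python
import re

VOWELS = "aeiouyøœɛɔɑ"
FORBIDDEN = {"ʁʁ", "ɲʁ"}  # only the pairs given as examples in the text

# Internal schwa in the context V C ə C V.
_INTERNAL = re.compile(
    rf"(?<=[{VOWELS}])([^{VOWELS}ə])ə([^{VOWELS}ə])(?=[{VOWELS}])"
)

def drop_internal_schwa(ipa: str) -> str:
    def repl(m: re.Match) -> str:
        pair = m.group(1) + m.group(2)
        return m.group(0) if pair in FORBIDDEN else pair
    return _INTERNAL.sub(repl, ipa)

def drop_final_schwa(ipa: str) -> str:
    syllables = sum(ch in VOWELS + "ə" for ch in ipa)
    return ipa[:-1] if ipa.endswith("ə") and syllables > 1 else ipa
```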
9. Glides and Reversion (Cluster Protection)
Converting high vowels to semi-vowels contextually.
9.1. Conversion
- i + Vowel → /j/
- ou + Vowel → /w/
- u + Vowel → /ɥ/
9.2. The Reversion Rule (Missing from most guides)
- Input:
brouette(br + ou + ette). - Attempt: /bʁwɛt/ (Cluster /bʁw/ is considered too heavy).
- Reversion: /bʁu.ɛt/ (Maintains separation).
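Conversion and reversion can be combined in one pass if the pattern optionally captures a preceding consonant cluster. The sketch assumes that "too heavy" means an obstruent followed by a liquid, which is what the brouette example suggests; the exact condition used by the module is not stated here. Input is an IPA-like string in which ou is already /u/ and u is already /y/, and the syllable break shown in /bʁu.ɛt/ is left to the syllabification step.

```python
import re

GLIDE = {"i": "j", "u": "w", "y": "ɥ"}
VOWELS = "aeiouyøœɛɔɑə"

# Optional obstruent + liquid cluster, then a high vowel before another vowel.
_SITE = re.compile(rf"([pbtdkgfv][lʁ])?([iuy])(?=[{VOWELS}])")

def glide(ipa: str) -> str:
    def repl(m: re.Match) -> str:
        if m.group(1):                # heavy cluster: keep the full vowel
            return m.group(0)
        return GLIDE[m.group(2)]
    return _SITE.sub(repl, ipa)
```

This sketch ignores any exceptions the real module may carry; words such as pluie, where a glide does follow a cluster in standard pronunciation, would need extra handling.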
10. Final Harmonization
Final polish of the IPA string.
- R standardization: All rhotic sounds → /ʁ/.
- Loi de Position (Aperture):
  - O: /o/ (closed) only if final or before z. Else /ɔ/.
  - EU: /ø/ (closed) only if final or before z/t. Else /œ/.
  - AU: Opens to /ɔ/ before R (e.g., dinosaure).
- Remove hyphens: Clean up remaining structural artifacts.
- Diaeresis Removal: Any remaining internal markers like ï or ë are converted to their standard IPA vowel equivalents (e.g., ï → i).
11. Liaison Mutations (Sandhi)
When a silent final consonant becomes pronounced due to liaison (‿), or when vowels clash at boundaries, mutations occur.
11.1 Consonant Mutations
| Letter | Mutation | Example | IPA Result |
|---|---|---|---|
| D | Becomes /t/ | grand‿homme | /gʁɑ̃.tɔm/ |
| F | Becomes /v/ | neuf‿heures | /nœ.vœʁ/ |
| S, X | Becomes /z/ | beaux‿arts | /bo.zaʁ/ |
| I | Inserts /j/ | y a-t-il (i‿a) | /i.ja.t‿il/ |
11.2 The Nasal Bridge
The script distinguishes between words that keep their nasal quality in liaison and those that denasalize.
- Group A (Nasal Preserved): un, on, en, mon, ton, son, bien, rien.
- The script appends a marker to force both nasalization and the consonant.
- Example: mon‿ami → /mɔ̃.na.mi/.
- Group B (Denasalized): bon, certain (and most other adjectives).
- The script prevents nasalization of the vowel, resulting in an oral vowel + N.
- Example: bon‿ami → /bɔ.na.mi/ (Not /bɔ̃/).
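The two groups can be sketched as a set lookup followed by either keeping or dropping the combining tilde. This is an illustration under assumptions: `liaison_form` is a hypothetical helper that receives the spelling and the isolated pronunciation, and any word outside Group A is treated as Group B.

```python
T = "\u0303"  # combining tilde

# Group A from the text: the nasal vowel survives in liaison.
KEEP_NASAL = {"un", "on", "en", "mon", "ton", "son", "bien", "rien"}

def liaison_form(word: str, ipa: str) -> str:
    """Return the pronunciation of `word` before a liaison-triggering vowel."""
    if not ipa.endswith(T):
        return ipa                  # no final nasal vowel, nothing to do
    if word in KEEP_NASAL:
        return ipa + "n"            # mɔ̃ -> mɔ̃n
    return ipa[:-1] + "n"           # bɔ̃ -> bɔn (denasalized)
```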
12. Reference: Complete IPA Symbol Map
This table lists every phonetic symbol generated by the script, mapping the French orthography to the specific IPA character output.
| Category | IPA Symbol | Description | Typical Orthography |
|---|---|---|---|
| Oral Vowels | /a/ | Anterior A | a, à (standard) |
| | /ɑ/ | Posterior A | â, as, az |
| | /e/ | Closed E | é, er, ez, ai (final verb) |
| | /ɛ/ | Open E | è, ê, ai, ei, e (+consonant) |
| | /ə/ | Schwa (Mute E) | e (unaccented) |
| | /i/ | High Front Vowel | i, î, y |
| | /y/ | High Front Rounded | u, û |
| | /o/ vs /ɔ/ | Closed O vs Open O | /o/: ô, au, eau, o (+z); /ɔ/: o (standard) |
| | /ø/ vs /œ/ | Closed EU vs Open EU | /ø/: eu, œu (final/z/t); /œ/: eu, œu (standard) |
| | /u/ | High Back Vowel | ou, où |
| Nasal Vowels | /ɑ̃/ | Nasal A | an, en, am, em |
| | /ɛ̃/ | Nasal I | in, ain, ein, yn |
| | /ɔ̃/ | Nasal O | on, om |
| | /œ̃/ | Nasal U | un, um (often merges with /ɛ̃/) |
| Semi-Vowels | /j/ | Yod glide | y, il, ill, i (+vowel) |
| | /w/ | W glide | oi (/wa/), ou (+vowel) |
| | /ɥ/ | U glide | u (+vowel) |
| Complex Consonants | /ʃ/ | Sh sound | ch, sh |
| | /ʒ/ | Soft J | j, g (+e/i/y) |
| | /ɲ/ | Palatal N | gn |
| | /ŋ/ | Velar Nasal | ng (in -ing) |
| | /ʁ/ | Uvular R | r, rr, rh |
| Standard Consonants | /b/ /d/ /f/ /ɡ/ /k/ /l/ /m/ /n/ /p/ /s/ /t/ /v/ /z/ | Invariant Mapping | b, d, f, g (hard), c (hard)/k/qu, l, m, n, p, s/ç, t, v, z/s(voiced) |
For technical issues or suggestions, please visit our GitHub repository.