Polish IPA Transcription - Help
Table of Contents
- Getting Started
- Phonemic vs Phonetic
- Quick Reference (Glossary)
- How to Read IPA Symbols
- Polish Phonetics & Pronunciation Guide
- Orthography to IPA Mapping
- For Developers
- References & Further Reading
About This Tool
This Polish transcription app uses a modified version of the Wiktionary Polish Pronunciation Module (pl-pron_wasm.lua) running inside the browser via Wasmoon (Lua 5.4 WebAssembly). It converts standard Polish orthography into the International Phonetic Alphabet (IPA).
Polish spelling is highly phonetic and rule-governed, far more so than English or even German. The tool applies the regular sound changes of standard Polish pronunciation (devoicing, voicing assimilation, palatalization, and nasal vowel decomposition) and produces both phonemic and narrow phonetic transcriptions without a lookup dictionary. Because the approach is rule-based, it can transcribe any Polish word, including neologisms and rare compounds.
Dialects & Limitations
- Supported: Standard Polish pronunciation (General Polish / polszczyzna ogólna), representing standard Warsaw/Kraków-based speech.
- Not supported: Regional dialectal variations (such as Silesian, Kashubian, Goral, or Masurian features like mazurzenie), or highly colloquial fast-speech reductions.
- Limitations: Words with non-standard stress (e.g., some compounds and recent English loanwords) fall back to the default penultimate stress rule unless they are explicitly handled by the exception tables.
Interactive Features
- Audio playback: Click the speaker icon next to any word or line to hear text-to-speech pronunciation (requires browser TTS support).
- Export results: Use PDF or CSV buttons to save transcriptions in your preferred format.
Note: Unlike German or French, Polish transcription is entirely rule-based — there is no lexicon lookup. This means every word gets a single deterministic transcription based on orthographic rules. There are no "alternative variants" to cycle through.
Phonemic vs Phonetic
- Phonemic: A simplified, broad transcription showing only sounds essential for distinguishing meaning (phonemes). Represents the abstract sound system of Polish. The tool generates phonemic output directly from orthographic rules — no dictionary lookup is needed because Polish spelling is highly regular.
- Phonetic: A detailed, narrow transcription showing subtle allophonic variations. Builds on phonemic output with additional rules for voicing assimilation, nasal decomposition, and palatalization effects.
Quick Reference (Glossary)
- Palatalization (Zmiękczenie)
- The modification of consonant articulation by raising the tongue towards the hard palate. In Polish, it occurs before the letter i or is indicated by a diacritic acute accent (e.g., ć, ś, ń).
- Decomposition of Nasals
- The process where nasal vowels (ą, ę) are split into an oral vowel followed by a nasal consonant (like [ɔn], [ɛm], or [ɔŋ]) depending on the following consonant.
- Final Devoicing (Wygłos)
- The neutralization of contrast between voiced and voiceless obstruents at the end of a word (e.g., chleb "bread" is pronounced as if it ended in p: [xlɛp]).
- Regressive Assimilation
- A phonological change where a consonant adapts its voicing to match the following consonant (e.g., prośba "request" has ś voiced to [ʑ] due to the following voiced b).
- Progressive Devoicing
- A specific Polish assimilation rule where the voiced consonants w (/v/) and rz (/ʐ/) become voiceless [f] and [ʂ] when they follow a voiceless consonant (e.g., kwiat [kfjat], trzy [tʂɨ]).
- Affricate (Afrykata)
- A consonant sound starting as a stop and releasing immediately into a fricative at the same place of articulation (e.g., c /t͡s/, cz /t͡ʂ/, ć /t͡ɕ/).
- Penultimate Stress (Akcent paroksytoniczny)
- The standard stress pattern in Polish where the stress falls on the second-to-last syllable of a word.
- Phoneme
- The smallest unit of sound that distinguishes meaning (e.g., /p/ vs. /b/ in pas "belt" vs. bas "bass").
- Allophone
- A variant pronunciation of a phoneme that doesn't change meaning (e.g., /n/ is realized as velar [ŋ] before k and g, as in bank [baŋk]).
- Sonorant
- A consonant produced with continuous airflow (m, n, ɲ, ŋ, l, r, j, w). Sonorants are "neutral" in Polish voicing assimilation: they do not trigger voicing or devoicing of neighboring obstruents.
- Obstruent
- A consonant produced by obstructing airflow (stops, fricatives, affricates). In Polish, obstruents are subject to voicing assimilation. Contrasts with sonorants.
- Syllable Boundary
- A marker (.) showing where syllables divide. Polish syllabification follows the Sonority Sequencing Principle: consonant clusters are divided so that each syllable rises in sonority toward its nucleus.
- Geminate
- A doubled consonant sound (e.g., /dd/ in oddychać). Polish orthography usually shows this by doubling the consonant letter.
- Onset
- The initial consonant(s) of a syllable (e.g., "p" in "pan"). Polish allows complex onsets like /pr/, /tr/, /kr/, /sp/, /st/, /sk/.
- Coda
- The final consonant(s) of a syllable (e.g., "t" in "kot"). Word-final coda obstruents are subject to final devoicing.
- Tie Bar
- The combining tie bar (◌͡◌) used in IPA to indicate affricates — two segments forming a single phonological unit (e.g., /t͡s/, /t͡ʂ/, /t͡ɕ/).
- Homorganic
- Produced at the same place of articulation. In nasal decomposition, the nasal consonant matches the place of the following stop (e.g., ą + b → [ɔm] — both bilabial).
How to Read IPA Symbols
Vowel Symbols
Polish contains six oral vowels and two nasal diphthongs. Unlike English, there is no phonemic distinction between long and short vowels.
| IPA | Polish Grapheme | Example | English Approximation | Description |
|---|---|---|---|---|
| /a/ | a | tak [tak] | father (shortened) | Open central unrounded vowel |
| /ɛ/ | e | ten [tɛn] | bed | Open-mid front unrounded vowel |
| /i/ | i | pić [pʲit͡ɕ] | meet (shortened) | Close front unrounded vowel |
| /ɨ/ | y | ty [tɨ] | roses (unstressed) | Close central unrounded vowel |
| /ɔ/ | o | to [tɔ] | caught (British short) | Open-mid back rounded vowel |
| /u/ | u, ó | tu [tu], bóg [buk] | boot (shortened) | Close back rounded vowel |
| /ɔw̃/ | ą (word-final / before fricatives) | są [sɔw̃], wąski [ˈvɔw̃skʲi] | French "on" + English "w" | Nasalized back rounded diphthong |
| /ɛw̃/ | ę (before fricatives) | węch [vɛw̃x] | French "in" + English "w" | Nasalized front unrounded diphthong |
Consonant Symbols
Polish consonants are grouped into dental, retroflex (hard), alveolo-palatal (soft), labial, and velar sounds.
| IPA | Polish Graphemes | Example | English Approximation | Description |
|---|---|---|---|---|
| /p/, /b/ | p, b | pan [pan], bok [bɔk] | pin, bin | Bilabial plosives |
| /t/, /d/ | t, d | to [tɔ], dom [dɔm] | toy, do (dental articulation) | Dental plosives |
| /k/, /ɡ/ | k, g | kot [kɔt], gra [ɡra] | keep, go | Velar plosives |
| /f/, /v/ | f, w | fakt [fakt], wóz [vus] | fit, voice | Labiodental fricatives |
| /s/, /z/ | s, z | syn [sɨn], zoolog [zɔˈɔlɔk] | see, zoo | Dental sibilants |
| /ʂ/, /ʐ/ | sz, ż, rz | szum [ʂum], żaba [ˈʐaba] | hash, vision (but retracted) | Voiceless/voiced retroflex fricatives |
| /ɕ/, /ʑ/ | ś, si, ź, zi | śmiech [ɕmʲɛx], ziemia [ˈʑɛmʲa] | sheet, azure (hissing/soft) | Voiceless/voiced alveolo-palatal fricatives |
| /x/ | ch, h | chleb [xlɛp], hak [xak] | Scottish loch | Voiceless velar fricative |
| /t͡s/, /d͡z/ | c, dz | co [t͡sɔ], dzban [d͡zban] | cats, ads | Dental affricates |
| /t͡ʂ/, /d͡ʐ/ | cz, dż | czas [t͡ʂas], dżem [d͡ʐɛm] | church, jam (hard/retracted) | Retroflex affricates |
| /t͡ɕ/, /d͡ʑ/ | ć, ci, dź, dzi | ćma [t͡ɕma], dzień [d͡ʑɛɲ] | cheep, jeep (soft/palatal) | Alveolo-palatal affricates |
| /m/, /n/ | m, n | mama [ˈmama], noc [nɔt͡s] | more, no | Bilabial, dental nasals |
| /ɲ/ | ń, ni | koń [kɔɲ], niebo [ˈɲɛbɔ] | canyon, Spanish ñ | Alveolo-palatal nasal |
| /ŋ/ | n (before k, g) | bank [baŋk], gong [ɡɔŋk] | sing | Velar nasal |
| /l/ | l | las [las] | leaf (clear L) | Alveolar lateral approximant |
| /w/ | ł | ładny [ˈwadnɨ] | wet | Labio-velar approximant |
| /j/ | j | jutro [ˈjutrɔ] | yes | Palatal approximant |
| /r/ | r | ręka [ˈrɛŋka] | Spanish r (rolled) | Alveolar trill |
Diacritical Marks & Stress
| Symbol | Name | Function | Example |
|---|---|---|---|
| ˈ | Primary stress | Marks the onset of the stressed syllable | Warszawa /varˈʂa.va/ |
| . | Syllable boundary | Divides phoneme sequences into syllables | okno /ˈɔk.nɔ/ |
| ʲ | Palatalization marker | Indicates secondary narrowing of the vocal tract (softening) | pies /pʲjɛs/ |
| ͡ | Tie bar | Indicates that two symbols form a single affricate sound | noc /nɔt͡s/ (not t followed by s) |
Polish Phonetics & Pronunciation Guide
Oral Vowels
Polish oral vowels are remarkably stable and monophthongal. The language does not have phonemic vowel length distinctions (like English ship vs. sheep or German Staat vs. Stadt). However, minor contextual shifts exist:
- Unstressed syllables have vowels that are slightly shorter and more centralized, but they maintain their core vocalic identity.
- The vowel y [ɨ] is a close central unrounded vowel, distinctly different from i [i]. While i always softens the preceding consonant, y can only follow hard consonants.
Nasal Vowels (ą, ę) in Depth
Polish is one of the few Slavic languages (alongside Kashubian) that retain nasal vowels (spelled ą and ę). In modern standard Polish, these are pronounced as nasal diphthongs or decompose into an oral vowel plus a nasal consonant, depending on context. The Lua transcription module mimics this behavior using the following rules:
1. Before Obstruent Stops & Affricates (Decomposition)
The nasal vowel loses its vowel nasality and splits into a standard oral vowel (/ɔ/ or /ɛ/) followed by a homorganic nasal consonant matching the place of articulation of the following stop/affricate:
| Context (Next Sound) | Type | ą becomes... | ę becomes... | Example Grapheme | Phonetic Output |
|---|---|---|---|---|---|
| b, p | Bilabial | [ɔm] | [ɛm] | dąb, zęby | [dɔmp], [ˈzɛmbɨ] |
| d, t, c, dz, cz, dż | Dental / Retroflex | [ɔn] | [ɛn] | kąt, chętny | [kɔnt], [ˈxɛntnɨ] |
| ć, dź, ci, dzi | Alveolo-palatal | [ɔɲ] | [ɛɲ] | pięć, dociąć | [pʲjɛɲt͡ɕ], [ˈdɔt͡ɕɔɲt͡ɕ] |
| g, k | Velar | [ɔŋ] | [ɛŋ] | mąka, ręka | [ˈmɔŋka], [ˈrɛŋka] |
2. Before Fricatives (Retained Nasal Diphthong)
Before fricatives (f, w, s, z, sz, ż, rz, ś, ź, ch, h), the nasalization cannot resolve into a distinct nasal stop. Instead, it is pronounced as an oral vowel followed by a nasalized labiovelar approximant: ą → [ɔw̃], ę → [ɛw̃]. E.g., wąski [ˈvɔw̃skʲi], węch [vɛw̃x].
3. Before Laterals and Approximants (Denasalization)
Before the letters l and ł, the nasal vowels completely lose their nasality. ą becomes standard [ɔ] and ę becomes [ɛ]. This is handled at the grapheme-to-phoneme conversion stage — the letters2phones table has explicit entries for ą + l, ą + ł, ę + l, ę + ł that produce denasalized output directly. E.g., wziął [vʑɔw], wzięli [ˈvʑɛlʲi].
4. Word-Final Position
- Final ą: Retains its nasalization, realized as a nasal diphthong [ɔw̃]. E.g., robią [ˈrɔbʲɔw̃].
- Final ę: Denasalizes completely to a simple oral vowel [ɛ] in standard, relaxed pronunciation. Realizing final ę as fully nasal [ɛw̃] is considered hypercorrect or archaic. E.g., idę [ˈidɛ].
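The four contexts above can be condensed into a small lookup, sketched here in Lua. This is an illustration only, not the module's simplify_nasals code: the names NASAL_FOR, DENASALIZE, ORAL, and resolve_nasals are hypothetical, the input is assumed to be an array of IPA phone strings in which ą and ę appear as ɔ̃ and ɛ̃, and the l/ł case is folded into the same function even though the module handles it earlier, in letters2phones.

```lua
-- Hypothetical sketch: resolve ɔ̃ / ɛ̃ by looking at the next phone.
local ORAL = { ["ɔ̃"] = "ɔ", ["ɛ̃"] = "ɛ" }

-- Following stop/affricate -> homorganic nasal consonant.
local NASAL_FOR = {
  p = "m", b = "m", ["pʲ"] = "m", ["bʲ"] = "m",
  t = "n", d = "n", ["t͡s"] = "n", ["d͡z"] = "n", ["t͡ʂ"] = "n", ["d͡ʐ"] = "n",
  ["t͡ɕ"] = "ɲ", ["d͡ʑ"] = "ɲ",
  k = "ŋ", ["\u{0261}"] = "ŋ", ["kʲ"] = "ŋ", ["\u{0261}ʲ"] = "ŋ", -- U+0261 is IPA ɡ
}

-- Phones for l and ł, before which nasality is dropped.
local DENASALIZE = { l = true, ["lʲ"] = true, w = true }

local function resolve_nasals(phones)
  local out = {}
  for i, p in ipairs(phones) do
    local oral, nxt = ORAL[p], phones[i + 1]
    if not oral then
      out[#out + 1] = p                 -- not a nasal vowel: copy through
    elseif nxt == nil then
      out[#out + 1] = oral              -- word-final: ę -> ɛ, ą -> ɔw̃
      if p == "ɔ̃" then out[#out + 1] = "w̃" end
    elseif NASAL_FOR[nxt] then
      out[#out + 1] = oral              -- before stops/affricates: V + nasal C
      out[#out + 1] = NASAL_FOR[nxt]
    elseif DENASALIZE[nxt] then
      out[#out + 1] = oral              -- before l / ł: plain oral vowel
    else
      out[#out + 1] = oral              -- otherwise (fricatives): nasal glide
      out[#out + 1] = "w̃"
    end
  end
  return out
end

print(table.concat(resolve_nasals({ "m", "ɔ̃", "k", "a" })))  -- mąka
```

Every phone that is neither a listed stop nor l/ł falls into the last branch, so the sketch treats all remaining consonants as if they were fricatives; a fuller version would enumerate them explicitly.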
Palatalization & Soft Consonants
Palatalization represents the division between "hard" and "soft" sounds in Polish orthography. There are two primary mechanisms:
- Diacritics: Consonants with acute accents (ć, dź, ś, ź, ń) are inherently soft alveolo-palatal sounds (/t͡ɕ, d͡ʑ, ɕ, ʑ, ɲ/).
- The letter i:
- If followed by another vowel, the i is silent and acts purely as a softening marker for the preceding consonant. E.g., pies [pʲjɛs].
- If followed by a consonant or at the end of a word, the i both softens the preceding consonant and acts as the vowel nucleus [i]. E.g., pili [ˈpʲilʲi].
When dental fricatives/affricates (s, z, c, dz) are palatalized by i or converted to diacritics, they transform completely: s → ś [ɕ], z → ź [ʑ], c → ć [t͡ɕ], dz → dź [d͡ʑ].
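The two roles of the letter i can be expressed as one decision on the letter that follows it. The Lua sketch below is an assumption-laden illustration, not the module's code: soften, SOFT, and VOWEL_LETTERS are hypothetical names, the consonant is passed in as its hard IPA phone, and non-sibilant consonants are simply marked with ʲ.

```lua
-- Hypothetical sketch: what "consonant + i" contributes to the phone list.
local VOWEL_LETTERS = {
  a = true, ["ą"] = true, e = true, ["ę"] = true,
  o = true, ["ó"] = true, u = true, y = true,
}

-- Dental phones that change completely when softened.
local SOFT = { s = "ɕ", z = "ʑ", ["t͡s"] = "t͡ɕ", ["d͡z"] = "d͡ʑ", n = "ɲ" }

-- hard_phone:  phone of the consonant written before i (e.g. "s", "p")
-- next_letter: letter after the i, or nil at the end of the word
local function soften(hard_phone, next_letter)
  local soft = SOFT[hard_phone] or (hard_phone .. "ʲ")
  if next_letter ~= nil and VOWEL_LETTERS[next_letter] then
    return { soft }        -- i is only a softening marker
  end
  return { soft, "i" }     -- i softens and is also the vowel nucleus
end

print(table.concat(soften("s", "a")))  -- as in siarka
print(table.concat(soften("p", "w")))  -- as in piwo
```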
Voicing & Devoicing Rules
Polish exhibits pervasive voicing assimilations within consonant clusters and at word boundaries:
- Final Devoicing (Wygłos): Voiced obstruents (b, d, g, w, z, ż, rz, ź, dz, dż, dź) lose their voicing and become voiceless at the end of a word when followed by a pause. E.g., chleb [xlɛp], nóż [nuʂ], sad [sat].
- Regressive Voicing Assimilation: In a sequence of obstruents, the voicing of the final obstruent in the cluster dictates the voicing of the entire cluster.
- Voiceless + Voiced → Voiced: prośba (ś + b) → [ˈprɔʑba]
- Voiced + Voiceless → Voiceless: odpowiedź (d + p) → [ɔtˈpɔvʲɛt͡ɕ]
Consonant Cluster Assimilations
Polish contains complex consonant clusters. In some contexts, progressive voicing/devoicing rules apply to specific consonants:
- Devoicing of w and rz: The consonants w (/v/) and rz (/ʐ/) are normally voiced. However, if they follow a voiceless consonant, they are progressively devoiced to [f] and [ʂ].
- kwiat (k + w) → [kfjat] (not [kvʲjat])
- trzy (t + rz) → [tʂɨ] (not [tʐɨ])
- przepiórka (p + rz) → [pʂɛˈpʲurka]
- Neutral Sonorants: Sonorants (m, n, ń, l, ł, r, j) do not trigger voicing or devoicing of neighboring obstruents, but they can themselves be voiceless (allophonically) when sandwiched between voiceless sounds.
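The voicing rules from this section and the previous one interact, so it can help to see them applied in a fixed order. The following Lua sketch is a simplified illustration and not the module's terminal_devoice or process_consonant_clusters code: assimilate, DEVOICE, VOICE, and the from_rz argument are hypothetical, phones are plain IPA strings, and palatalized stops such as kʲ are not covered. Because ż and rz share the phone ʐ, the caller marks the positions that come from rz.

```lua
-- Hypothetical sketch: final devoicing, progressive devoicing of w/rz,
-- then regressive assimilation. "\u{0261}" is IPA ɡ (not ASCII g).
local DEVOICE = {
  b = "p", d = "t", ["\u{0261}"] = "k", v = "f", ["vʲ"] = "fʲ", z = "s",
  ["ʐ"] = "ʂ", ["ʑ"] = "ɕ", ["ɣ"] = "x",
  ["d͡z"] = "t͡s", ["d͡ʐ"] = "t͡ʂ", ["d͡ʑ"] = "t͡ɕ",
}
local VOICE = {}
for voiced, voiceless in pairs(DEVOICE) do VOICE[voiceless] = voiced end

-- from_rz: optional set of positions whose ʐ was spelled rz, e.g. { [2] = true }
local function assimilate(phones, from_rz)
  from_rz = from_rz or {}
  local out = {}
  for i, p in ipairs(phones) do out[i] = p end
  local n = #out

  -- 1. Final devoicing: a voiced obstruent at the end becomes voiceless.
  if n > 0 and DEVOICE[out[n]] then out[n] = DEVOICE[out[n]] end

  -- 2. Progressive devoicing: w and rz devoice after a voiceless obstruent.
  for i = 2, n do
    local p = out[i]
    local forward = p == "v" or p == "vʲ" or (p == "ʐ" and from_rz[i])
    if forward and VOICE[out[i - 1]] then out[i] = DEVOICE[p] end
  end

  -- 3. Regressive assimilation, right to left: an obstruent takes the
  --    voicing of the obstruent directly after it. Sonorants and vowels
  --    are not in either map, so they neither change nor trigger a change.
  for i = n - 1, 1, -1 do
    local cur, nxt = out[i], out[i + 1]
    if DEVOICE[nxt] then
      out[i] = VOICE[cur] or cur
    elseif VOICE[nxt] then
      out[i] = DEVOICE[cur] or cur
    end
  end
  return out
end

print(table.concat(assimilate({ "t", "ʐ", "ɨ" }, { [2] = true })))  -- trzy
```

Running the progressive step before the regressive one matters: in kwiat the w is already [f] by the time the k is examined, so the k stays voiceless.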
Stress Patterns & Exceptions
Polish stress is predominantly fixed on the penultimate (second-to-last) syllable. E.g., DO-mu [ˈdɔmu], war-SZA-wa [varˈʂawa]. The Lua module supports several grammatical exceptions:
- Verbs in the Conditional Mood: The particles -bym, -byś, -by, -byśmy, -byście do not shift the lexical stress.
- zrobiłby (3rd person sing. cond.) → stressed on the antepenultimate (3rd from last): ZRO-bił-by [ˈzrɔbʲiwbɨ].
- zrobilibyśmy (1st person plur. cond.) → stressed on the preantepenultimate (4th from last): zro-BI-li-byś-my [zrɔˈbʲilʲibɨɕmɨ].
- Past Tense Plural Inflections: The suffixes -śmy and -ście in the past tense plural do not shift the stress:
- zrobiliśmy → stressed on the antepenultimate: zro-BI-li-śmy [zrɔˈbʲilʲiɕmɨ].
- Classical loanwords: Words borrowed from Latin, Greek, or other languages may retain non-standard stress patterns (e.g., uniwersytet, stressed on the antepenultimate syllable: u-ni-WER-sy-tet). The module defaults to penultimate stress when no exception matches.
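The suffix-based exceptions amount to checking the end of the word against two lists. The Lua sketch below shows the idea under stated assumptions: stress_from_end, ends_with, and both suffix lists are hypothetical and deliberately incomplete, and they are not copied from the module's third_last_syllable_stress or fourth_last_syllable_stress tables.

```lua
-- Hypothetical, abbreviated suffix lists (illustration only).
local FOURTH_LAST = { "libyśmy", "łybyśmy", "libyście", "łybyście" }
local THIRD_LAST = {
  "łbym", "łabym", "łbyś", "łabyś", "łby", "łaby", "łoby", "liby", "łyby",
  "liśmy", "łyśmy", "liście", "łyście",
}

-- Byte-wise suffix comparison; valid for UTF-8 because both sides are UTF-8.
local function ends_with(word, suffix)
  return #word >= #suffix and word:sub(-#suffix) == suffix
end

-- Returns the stressed syllable counted from the end of the word:
-- 4 = preantepenultimate, 3 = antepenultimate, 2 = penultimate (default).
local function stress_from_end(word)
  for _, suffix in ipairs(FOURTH_LAST) do
    if ends_with(word, suffix) then return 4 end
  end
  for _, suffix in ipairs(THIRD_LAST) do
    if ends_with(word, suffix) then return 3 end
  end
  return 2
end

print(stress_from_end("zrobilibyśmy"), stress_from_end("domu"))
```

A purely orthographic check like this one over-matches: the noun liście ("leaves") ends in -liście but has only two syllables, so a practical version would also need the syllable count or grammatical information.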
Orthography to IPA Mapping
Below is a detailed spelling-to-IPA mapping guide reflecting the rule structure of the pl-pron_wasm.lua engine. The tables cover the standard Polish graphemes and digraphs, along with their context-dependent sound modifications.
1. Vowels (Oral & Nasal Contexts)
Unlike English or German, Polish vowels maintain stable lengths. However, nasal vowels (ą and ę) split or lose nasality depending on their phonological environment:
| Spelling | Primary IPA | Context / Phonological Rule | Example Word | IPA Output |
|---|---|---|---|---|
| a | /a/ | Always (open central unrounded) | sam | [sam] |
| e | /ɛ/ | Always (open-mid front unrounded) | ten | [tɛn] |
| i | /i/ | Vowel nucleus (when not preceding another vowel) | mit | [mʲit] |
| o | /ɔ/ | Always (open-mid back rounded) | nos | [nɔs] |
| u, ó | /u/ | Always (close back rounded) | but, bóg | [but], [buk] |
| y | /ɨ/ | Always (close central unrounded; only after hard consonants) | syn | [sɨn] |
| au | /aw/ | Vowel + glide (loanwords, e.g., auto) | auto | [ˈawtɔ] |
| eu | /ɛw/ | Vowel + glide (loanwords, e.g., euro) | euro | [ˈɛwrɔ] |
| ą | /ɔw̃/ | Word-final position (nasal diphthong) | robią | [ˈrɔbʲɔw̃] |
| ą | /ɔw̃/ | Before fricatives (e.g., s, z, sz, ż, rz, ś, ź, f, w, ch) | wąski | [ˈvɔw̃skʲi] |
| ą | /ɔm/ | Before bilabial stops (p, b) | dąb | [dɔmp] (with final devoicing) |
| ą | /ɔn/ | Before dental/alveolar stops/affricates (t, d, c, dz, cz, dż) | kąt | [kɔnt] |
| ą | /ɔɲ/ | Before alveolo-palatal affricates/stops (ć, dź, ci, dzi) | bądź | [bɔɲt͡ɕ] |
| ą | /ɔŋ/ | Before velar stops (k, g) | mąka | [ˈmɔŋka] |
| ą | /ɔ/ | Before laterals and semivowels (l, ł) (denasalized) | wziął | [vʑɔw] |
| ę | /ɛ/ | Word-final position (standard denasalization) | idę | [ˈidɛ] |
| ę | /ɛw̃/ | Before fricatives (e.g., s, z, sz, ż, rz, ś, ź, f, w, ch) | węch | [vɛw̃x] |
| ę | /ɛm/ | Before bilabial stops (p, b) | zęby | [ˈzɛmbɨ] |
| ę | /ɛn/ | Before dental/alveolar stops/affricates (t, d, c, dz, cz, dż) | chętny | [ˈxɛntnɨ] |
| ę | /ɛɲ/ | Before alveolo-palatal affricates/stops (ć, dź, ci, dzi) | pięć | [pʲjɛɲt͡ɕ] |
| ę | /ɛŋ/ | Before velar stops (k, g) | ręka | [ˈrɛŋka] |
| ę | /ɛ/ | Before laterals and semivowels (l, ł) (denasalized) | wzięli | [ˈvʑɛlʲi] |
2. Standard Consonants (Hard / Unpalatalized)
Polish consonants articulated without palatalization or subsequent assimilation:
| Spelling | IPA | Description | Example Word | IPA Output |
|---|---|---|---|---|
| b | /b/ | Voiced bilabial stop | bas | [bas] |
| p | /p/ | Voiceless bilabial stop | pan | [pan] |
| d | /d/ | Voiced dental stop | dom | [dɔm] |
| t | /t/ | Voiceless dental stop | ten | [tɛn] |
| g | /ɡ/ | Voiced velar stop | gra | [ɡra] |
| k | /k/ | Voiceless velar stop | kot | [kɔt] |
| f | /f/ | Voiceless labiodental fricative | fala | [ˈfala] |
| w | /v/ | Voiced labiodental fricative | woda | [ˈvɔda] |
| s | /s/ | Voiceless dental sibilant fricative | sen | [sɛn] |
| z | /z/ | Voiced dental sibilant fricative | ząb | [zɔmp] |
| h, ch | /x/ | Voiceless velar fricative | hak, chleb | [xak], [xlɛp] |
| m | /m/ | Bilabial nasal | mama | [ˈmama] |
| n | /n/ | Dental nasal | noc | [nɔt͡s] |
| l | /l/ | Alveolar lateral approximant (always clear) | las | [las] |
| ł | /w/ | Labiovelar semivowel | ładny | [ˈwadnɨ] |
| r | /r/ | Alveolar trill | rok | [rɔk] |
| j | /j/ | Palatal approximant (semivowel) | jutro | [ˈjutrɔ] |
3. Digraphs & Affricates
Polish makes extensive use of digraphs to represent retroflex, alveolo-palatal, and dental affricates/fricatives:
| Spelling | IPA | Description | Example Word | IPA Output |
|---|---|---|---|---|
| c | /t͡s/ | Voiceless dental affricate | co | [t͡sɔ] |
| cz | /t͡ʂ/ | Voiceless retroflex affricate | czas | [t͡ʂas] |
| dz | /d͡z/ | Voiced dental affricate | dzban | [d͡zban] |
| dż | /d͡ʐ/ | Voiced retroflex affricate | dżem | [d͡ʐɛm] |
| sz | /ʂ/ | Voiceless retroflex fricative | szum | [ʂum] |
| ż | /ʐ/ | Voiced retroflex fricative | żona | [ˈʐɔna] |
| rz | /ʐ/ | Voiced retroflex fricative (standard context) | rzeka | [ˈʐɛka] |
| rz | /ʂ/ | Voiceless retroflex fricative (after voiceless: p, t, k, ch) | trzy, przy | [tʂɨ], [pʂɨ] |
| ć | /t͡ɕ/ | Voiceless alveolo-palatal affricate | ćma | [t͡ɕma] |
| dź | /d͡ʑ/ | Voiced alveolo-palatal affricate | dźwig | [d͡ʑvʲik] |
| ś | /ɕ/ | Voiceless alveolo-palatal fricative | śpi | [ɕpʲi] |
| ź | /ʑ/ | Voiced alveolo-palatal fricative | źle | [ʑlɛ] |
| ń | /ɲ/ | Alveolo-palatal nasal consonant | koń | [kɔɲ] |
4. Palatalized (Soft) Consonants Before "i"
When consonants are followed by the letter i, they undergo palatalization. If i is followed by a vowel, it remains a silent palatal glide marker (marked as /ʲ/ or /j/). If followed by a consonant or word-finally, it acts both as the softening marker and the vowel nucleus /i/:
| Spelling | IPA (Before Vowel) | Example (Before Vowel) | IPA (Before Consonant/Final) | Example (Before Consonant/Final) |
|---|---|---|---|---|
| ci | /t͡ɕ/ | ciasto [ˈt͡ɕastɔ] | /t͡ɕi/ | cicho [ˈt͡ɕixɔ] |
| dzi | /d͡ʑ/ | działo [ˈd͡ʑawɔ] | /d͡ʑi/ | dziki [ˈd͡ʑikʲi] |
| si | /ɕ/ | siarka [ˈɕarka] | /ɕi/ | siła [ˈɕiwa] |
| zi | /ʑ/ | ziarno [ˈʑarnɔ] | /ʑi/ | zima [ˈʑima] |
| ni | /ɲ/ | niebo [ˈɲɛbɔ] | /ɲi/ | niski [ˈɲiskʲi] |
| bi | /bʲ/ or /bʲj/ | biały [ˈbʲawɨ] | /bʲi/ | bitwa [ˈbʲitfa] |
| pi | /pʲ/ or /pʲj/ | pies [pʲjɛs] | /pʲi/ | piwo [ˈpʲivɔ] |
| wi | /vʲ/ or /vʲj/ | wiatr [vʲjatr] | /vʲi/ | wino [ˈvʲinɔ] |
| fi | /fʲ/ or /fʲj/ | fiord [fʲjɔrt] | /fʲi/ | figa [ˈfʲiɡa] |
| mi | /mʲ/ or /mʲj/ | miasto [ˈmʲastɔ] | /mʲi/ | mili [ˈmʲilʲi] |
| ki | /kʲ/ or /kʲj/ | kiedy [ˈkʲɛdɨ] | /kʲi/ | kino [ˈkʲinɔ] |
| gi | /ɡʲ/ or /ɡʲj/ | giąć [ɡʲɔɲt͡ɕ] | /ɡʲi/ | gips [ɡʲips] |
| hi, chi | /xʲ/ or /xʲj/ | hiena [ˈxʲɛna] | /xʲi/ | chichot [ˈxʲixɔt] |
| li | /lʲ/ | liść [lʲiɕt͡ɕ] | /lʲi/ | list [lʲist] |
5. Voicing & Devoicing Assimilation Contexts
Consonants shift their voicing state systematically in clusters and at the end of words:
| Spelling | Underlying Phoneme | Resulting IPA | Triggering Context / Position | Example Word | IPA Output |
|---|---|---|---|---|---|
| b | /b/ | /p/ | Word-final (devoicing) | ząb | [zɔmp] |
| b | /b/ | /p/ | Regressive (before voiceless consonant) | babka | [ˈbapka] |
| d | /d/ | /t/ | Word-final (devoicing) | sad | [sat] |
| d | /d/ | /t/ | Regressive (before voiceless consonant) | wódka | [ˈvutka] |
| g | /ɡ/ | /k/ | Word-final (devoicing) | róg | [ruk] |
| g | /ɡ/ | /k/ | Regressive (before voiceless consonant) | Zygfryd | [ˈzɨkfrɨt] |
| w | /v/ | /f/ | Word-final (devoicing) | krew | [krɛf] |
| w | /v/ | /f/ | Regressive (before voiceless consonant) | drzewko | [ˈdʐɛfkɔ] |
| w | /v/ | /f/ | Progressive (following voiceless consonant) | twój, kwiat | [tfuj], [kfjat] |
| z | /z/ | /s/ | Word-final (devoicing) | mróz | [mrus] |
| z | /z/ | /s/ | Regressive (before voiceless consonant) | rozkaz | [ˈrɔskas] |
| ż, rz | /ʐ/ | /ʂ/ | Word-final (devoicing) | nóż, talerz | [nuʂ], [ˈtalɛʂ] |
| ż, rz | /ʐ/ | /ʂ/ | Regressive (before voiceless consonant) | łóżko | [ˈwuʂkɔ] |
| dz | /d͡z/ | /t͡s/ | Word-final (devoicing) | ksiądz | [kɕɔnt͡s] |
| dź | /d͡ʑ/ | /t͡ɕ/ | Word-final (devoicing) | łódź | [wut͡ɕ] |
| ź | /ʑ/ | /ɕ/ | Word-final (devoicing) | gałąź | [ˈɡawɔɲɕ] |
| k | /k/ | /ɡ/ | Regressive voicing (before voiced obstruent) | także, jakby | [ˈtaɡʐɛ], [ˈjaɡbɨ] |
| t | /t/ | /d/ | Regressive voicing (before voiced obstruent) | futbol | [ˈfudbɔl] |
| cz | /t͡ʂ/ | /d͡ʐ/ | Regressive voicing (before voiced obstruent) | liczba | [ˈlʲid͡ʐba] |
| ś | /ɕ/ | /ʑ/ | Regressive voicing (before voiced obstruent) | prośba | [ˈprɔʑba] |
| ch, h | /x/ | /ɣ/ | Regressive voicing (before voiced obstruent) | niechże | [ˈɲɛɣʐɛ] |
6. Foreign Letters & Loanword Graphemes
Polish handles recent loanwords and non-standard letters through phonetic approximations matching standard Polish orthography rules:
| Spelling | IPA Output | Phonetic Mapping Principle | Example Word | IPA Output |
|---|---|---|---|---|
| x | /ks/ | Decomposed to /k/ + /s/ | xero | [ˈksɛrɔ] |
| q | /k/ | q maps to voiceless velar stop; u follows as /v/ | quiz | [kfis] (q+u always gives /kf/ in cluster) |
| v | /v/ | Mapped directly to voiced labiodental fricative (same as w) | video | [viˈdɛɔ] |
| th | /t/ + /x/ | Not a digraph; t maps to /t/, h maps to /x/ | theatre | [txɛˈatrɛ] |
| ph | /p/ + /x/ | Not a digraph; p maps to /p/, h maps to /x/ | graph | [ɡrapx] |
| qu | /kf/ | q+u mapped to /k/+/v/; /v/ devoices to /f/ after voiceless /k/ | quasi | [ˈkfaɕi] |
How the Lua Module Works
The Lua file pl-pron_wasm.lua transcribes Polish words through a pipeline of character-by-character grapheme-to-phoneme conversion, contextual nasal resolution, voicing assimilation, and syllabification. Unlike the German or French modules, it does not use a lexicon — all output is generated from orthographic rules.
Module Execution Flow
- Stress Identification (get_stressed_syllable): Checks the word against verb suffixes (like -byśmy, -liście) and grammatical rules to establish which vowel nucleus receives the primary stress. If none matches, it defaults to the second-to-last vowel nucleus.
- Phoneme Mapping (convert_to_phones): Processes the string from left to right. Characters are matched against the letters2phones table. Digraphs (like sz, cz, ch) are prioritized over single letters. When i is encountered, it sets a soft flag on the preceding consonant and determines whether the i itself should be kept as a vowel or discarded as a spelling marker.
- Nasal Resolution (simplify_nasals): Scans for nasal vowels and looks ahead to the next phoneme. It replaces the abstract nasals with their contextual combinations (e.g., converting a nasal vowel to an oral vowel + velar nasal before velars).
- Terminal Devoicing (terminal_devoice): If the final phoneme is marked as a voiced obstruent, it is converted to its voiceless counterpart using a direct lookup table (e.g., d → t, g → k).
- Consonant Cluster Voicing (process_consonant_clusters): Identifies consonant clusters (sequences of non-vowel phones). For each cluster, it finds the "determining consonant": the last obstruent in the cluster that is not a forward assimilant. The voicing state of this consonant propagates to the entire cluster. Forward assimilants (w/v and rz) are devoiced progressively: they become voiceless when preceded by a voiceless consonant (e.g., kwiat: k is voiceless, so w → f).
- Syllabification (collect_syllables): Segments the phone array into syllables using the Sonority Sequencing Principle. Vowels form nuclei, and valid clusters are shifted to syllable onsets.
- Formatting & Post-Processing: Applies stress markers (ˈ) and syllable boundaries (.), then performs final IPA cleanup: removes redundant ʲ before i/j, converts remaining ʲ to a j glide, and replaces nasal vowels with diphthong notation (ɛ̃ → ɛw̃, ɔ̃ → ɔw̃).
Execution Pipeline
Raw Polish Word
│
▼
┌─────────────────────────┐
│ 1. get_stressed_syllable │ Identify stress from verb suffixes
└────────────┬────────────┘
▼
┌─────────────────────────┐
│ 2. convert_to_phones │ letters2phones lookup, digraph priority
└────────────┬────────────┘
▼
┌─────────────────────────┐
│ 3. simplify_nasals │ ą/ę → oral V + nasal C (contextual)
└────────────┬────────────┘
▼
┌─────────────────────────┐
│ 4. terminal_devoice │ Final voiced obstruent → voiceless
└────────────┬────────────┘
▼
┌─────────────────────────┐
│ 5. process_consonant_ │ Regressive voicing assimilation
│ clusters │ Progressive devoicing of w/rz
└────────────┬────────────┘
▼
┌─────────────────────────┐
│ 6. collect_syllables │ Sonority-based syllabification
└────────────┬────────────┘
▼
┌─────────────────────────┐
│ 7. Format output │ Stress ˈ, syllable ., post-processing
└────────────┬────────────┘
▼
IPA Output
Key Data Structures
| Structure | Type | Purpose |
|---|---|---|
| letters2phones | Nested table | Maps graphemes to IPA phones. Uses [false] key as fallback for unknown characters. Supports multi-character lookahead (e.g., c → h → i → vowel chain). |
| valid_phone | Set | All recognized IPA phone strings. Used to distinguish phones from boundary markers during syllabification. |
| vowel | Set | {a, ɛ, ɛ̃, i, ɨ, ɔ, ɔ̃, u}: vowel nuclei for syllable detection. |
| devoice / voice | Map | Voicing pairs. Sonorants (m, n, ɲ, ŋ, l, j, r, w) map to themselves: they are "neutral" and do not trigger assimilation. |
| nasal_map | Map | Following consonant → homorganic nasal: p/b→m, t/d→n, k/g→ŋ, ć/dź→ɲ, etc. |
| forward_assimilants | Set | {v, vʲ} (and rz via flags). These consonants are devoiced progressively (by a preceding voiceless C), unlike standard regressive assimilation. |
| third_last_syllable_stress | Suffix list | Conditional mood suffixes (-łbym, -łabym, -łbyś, etc.) that shift stress to the antepenultimate syllable. |
| fourth_last_syllable_stress | Suffix list | -libyśmy, -łybyśmy, -libyście, -łybyście: stress on the 4th syllable from the end. |
Post-Processing Rules
After syllabification, the module applies final transformations to the IPA string:
| Rule | Input | Output | Description |
|---|---|---|---|
| Palatalization cleanup | ʲi, ʲj | i, j | ʲ before i or j is redundant — removed. |
| ʲ → j | bʲa | bja | All remaining ʲ converted to j glide. |
| Nasal diphthongization | ɛ̃ | ɛw̃ | Nasal vowels rewritten as oral + nasalized approximant. |
| Nasal diphthongization | ɔ̃ | ɔw̃ | Same for back nasal vowel. |
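These four rules are plain string substitutions, so they can be sketched with string.gsub. The function name postprocess is hypothetical, and the order shown (cleanup first, then the general ʲ → j rewrite) is an assumption based on the table above, not a copy of the module's code.

```lua
-- Hypothetical sketch of the final IPA cleanup pass.
local function postprocess(ipa)
  -- ʲ directly before i or j is redundant: drop it, keep the i/j.
  ipa = ipa:gsub("ʲ([ij])", "%1")
  -- Any ʲ that is left becomes a j glide.
  ipa = ipa:gsub("ʲ", "j")
  -- Nasal vowels are written as oral vowel + nasalized approximant.
  ipa = ipa:gsub("ɛ̃", "ɛw̃")
  ipa = ipa:gsub("ɔ̃", "ɔw̃")
  return ipa
end

print(postprocess("bʲawɨ"))  -- biały
```

string.gsub works on bytes, which is safe here because none of the IPA characters involved contain Lua pattern metacharacters; assigning its result to a single variable also discards the substitution count it returns as a second value.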
Extending the Module
To add custom grapheme mappings (e.g., handling recent loanwords), insert entries into the letters2phones dictionary at the top of pl-pron_wasm.lua. The table uses nested lookahead: each key maps to either a phone string, a phone array, or a sub-table for the next character.
Example structure for a simple consonant:
["b"] = {
["i"] = { "bʲ", "i" }, -- before i → palatalized
[false] = "b" -- default/fallback
}
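To see how such nested entries might be consumed, here is a minimal Lua walker over a tiny sample table. Everything in it is an assumption made for illustration: sample_letters2phones and to_phones are hypothetical names, a table with an array part is taken to be a phone array while any other table is taken to be a lookahead level, and unknown characters are passed through unchanged. It does not model the module's soft flag or its handling of i before a vowel.

```lua
-- Hypothetical miniature of a letters2phones-style table.
local sample_letters2phones = {
  b = { i = { "bʲ", "i" }, [false] = "b" },
  s = { z = "ʂ", [false] = "s" },
  a = "a", i = "i", u = "u", m = "m",
}

local function to_phones(word)
  -- Split into UTF-8 characters so that letters like ł or ą stay intact.
  local chars = {}
  for ch in word:gmatch(utf8.charpattern) do chars[#chars + 1] = ch end

  local phones, i = {}, 1
  while i <= #chars do
    local entry = sample_letters2phones[chars[i]]
    local consumed = 1
    -- Descend while the entry is a lookahead level (table without array part).
    while type(entry) == "table" and entry[1] == nil do
      local nxt = chars[i + consumed]
      if nxt ~= nil and entry[nxt] ~= nil then
        entry = entry[nxt]          -- longer match: consume one more letter
        consumed = consumed + 1
      else
        entry = entry[false]        -- no longer match: use the fallback
      end
    end
    if type(entry) == "string" then
      phones[#phones + 1] = entry
    elseif type(entry) == "table" then
      for _, phone in ipairs(entry) do phones[#phones + 1] = phone end
    else
      phones[#phones + 1] = chars[i]  -- unknown character: pass through
    end
    i = i + consumed
  end
  return phones
end

print(table.concat(to_phones("szum")))
```

Because the walker always tries the longer match first, adding a digraph is a matter of nesting its second letter under the first, with [false] preserving the single-letter reading.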
Known Issues & Limitations
| Input | System Output | Expected | Cause |
|---|---|---|---|
| Rzeczpospolita | [ˈʐɛt͡ʂpɔspɔlʲita] | Stress on -po- | Compound boundary not detected; defaults to penultimate |
| quiz | [kfis] | [kwis] (some speakers) | Loanword qu is always mapped to /kv/ (surfacing as [kf]) by the module |
| partner | [ˈpartnɛr] | [ˈpartnɛr] | Correct — but stress may vary in casual speech |
| wziął | [vʑɔw] | [vʑɔw] | Correct — ą denasalized before ł |
References & Further Reading
- Polish Phonology (Wikipedia)
- International Phonetic Association (IPA)
- Polish Pronouncing Dictionary (Rhapsody)
For technical issues or suggestions, please visit our GitHub repository.