Polish IPA Transcription - Help

About This Tool

This Polish transcription app uses a modified version of the Wiktionary Polish Pronunciation Module (pl-pron_wasm.lua) running inside the browser via Wasmoon (Lua 5.4 WebAssembly). It converts standard Polish orthography into the International Phonetic Alphabet (IPA).

Polish spelling is highly phonetic and rule-governed, far more so than English or even German. The tool applies the regular sound changes of standard Polish (devoicing, assimilation, palatalization, and nasal vowel decomposition) directly to the spelling, producing both phonemic and narrow phonetic transcriptions without a lookup dictionary. Because the approach is rule-based, it can transcribe any Polish word, including neologisms and rare compounds.

Dialects & Limitations

Interactive Features

Note: Unlike the German and French tools, the Polish transcription is entirely rule-based, with no lexicon lookup. Every word therefore gets a single deterministic transcription derived from orthographic rules, and there are no "alternative variants" to cycle through.


Phonemic vs Phonetic

Quick Reference (Glossary)

Palatalization (Zmiękczenie)
The modification of consonant articulation by raising the tongue towards the hard palate. In Polish, it occurs before the letter i or is indicated by a diacritic acute accent (e.g., ć, ś, ń).
Decomposition of Nasals
The process where nasal vowels (ą, ę) are split into an oral vowel followed by a nasal consonant (like [ɔn], [ɛm], or [ɔŋ]) depending on the following consonant.
Final Devoicing (Wygłos)
The neutralization of contrast between voiced and voiceless obstruents at the end of a word (e.g., chleb "bread" is pronounced as if it ended in p: [xlɛp]).
Regressive Assimilation
A phonological change where a consonant adapts its voicing to match the following consonant (e.g., prośba "request" has ś voiced to [ʑ] due to the following voiced b).
Progressive Devoicing
A specific Polish assimilation rule where the voiced consonants w (/v/) and rz (/ʐ/) become voiceless [f] and [ʂ] when they follow a voiceless consonant (e.g., kwiat [kfjat], trzy [tʂɨ]).
Affricate (Afrykata)
A consonant sound starting as a stop and releasing immediately into a fricative at the same place of articulation (e.g., c /t͡s/, cz /t͡ʂ/, ć /t͡ɕ/).
Penultimate Stress (Akcent paroksytoniczny)
The standard stress pattern in Polish where the stress falls on the second-to-last syllable of a word.
Phoneme
The smallest unit of sound that distinguishes meaning (e.g., /p/ vs. /b/ in pas "belt" vs. bas "bass").
Allophone
A variant pronunciation of a phoneme that doesn't change meaning (e.g., English aspirated [pʰ] vs. unaspirated [p]; in Polish, [ŋ] as the realization of /n/ before k or g, as in bank [baŋk]).
Sonorant
A consonant produced without turbulent airflow (m, n, ɲ, ŋ, l, r, j, w). Sonorants are "neutral" in Polish voicing assimilation: they neither trigger voicing changes in neighboring obstruents nor undergo them.
Obstruent
A consonant produced by obstructing airflow (stops, fricatives, affricates). In Polish, obstruents are subject to voicing assimilation. Contrasts with sonorants.
Syllable Boundary
A marker (.) showing where syllables divide. Polish syllabification follows the Sonority Sequencing Principle: consonant clusters are divided so that each syllable rises in sonority toward its nucleus.
Geminate
A doubled consonant sound (e.g., /dd/ in oddychać). Polish orthography often shows this with consonant doubling.
Onset
The initial consonant(s) of a syllable (e.g., "p" in "pan"). Polish allows complex onsets like /pr/, /tr/, /kr/, /sp/, /st/, /sk/.
Coda
The final consonant(s) of a syllable (e.g., "t" in "kot"). Polish codas are subject to final devoicing.
Tie Bar
The combining tie bar (◌͡◌) used in IPA to indicate affricates — two segments forming a single phonological unit (e.g., /t͡s/, /t͡ʂ/, /t͡ɕ/).
Homorganic
Produced at the same place of articulation. In nasal decomposition, the nasal consonant matches the place of the following stop (e.g., ą + b → [ɔm] — both bilabial).

How to Read IPA Symbols

Vowel Symbols

Polish contains six oral vowels and two nasal diphthongs. Unlike English, there is no phonemic distinction between long and short vowels.

IPA Polish Grapheme Example English Approximation Description
/a/ a tak [tak] father (shortened) Open central unrounded vowel
/ɛ/ e ten [tɛn] bed Open-mid front unrounded vowel
/i/ i pić [pʲit͡ɕ] meet (shortened) Close front unrounded vowel
/ɨ/ y ty [tɨ] roses (unstressed) Close central unrounded vowel
/ɔ/ o to [tɔ] caught (British short) Open-mid back rounded vowel
/u/ u, ó tu [tu], bóg [buk] boot (shortened) Close back rounded vowel
/ɔw̃/ ą (word-final / before fricatives) są [sɔw̃], wąski [ˈvɔw̃skʲi] French "on" + English "w" Nasalized back rounded diphthong
/ɛw̃/ ę (before fricatives) węch [vɛw̃x] French "in" + English "w" Nasalized front unrounded diphthong

Consonant Symbols

Polish consonants are grouped into dental, retroflex (hard), alveolo-palatal (soft), labial, and velar sounds.

IPA Polish Graphemes Example English Approximation Description
/p/, /b/ p, b pan [pan], bok [bɔk] pin, bin Bilabial plosives
/t/, /d/ t, d to [tɔ], dom [dɔm] toy, do (dental articulation) Dental plosives
/k/, /ɡ/ k, g kot [kɔt], gra [ɡra] keep, go Velar plosives
/f/, /v/ f, w fakt [fakt], wóz [vus] fit, voice Labiodental fricatives
/s/, /z/ s, z syn [sɨn], zoolog [zɔˈɔlɔk] see, zoo Dental sibilants
/ʂ/, /ʐ/ sz, ż, rz szum [ʂum], żaba [ˈʐaba] hash, vision (but retracted) Voiceless/voiced retroflex fricatives
/ɕ/, /ʑ/ ś, si, ź, zi śmiech [ɕmʲɛx], ziemia [ˈʑɛmʲa] sheet, azure (hissing/soft) Voiceless/voiced alveolo-palatal fricatives
/x/ ch, h chleb [xlɛp], hak [xak] Scottish loch Voiceless velar fricative
/t͡s/, /d͡z/ c, dz co [t͡sɔ], dzban [d͡zban] cats, ads Dental affricates
/t͡ʂ/, /d͡ʐ/ cz, dż czas [t͡ʂas], dżem [d͡ʐɛm] church, jam (hard/retracted) Retroflex affricates
/t͡ɕ/, /d͡ʑ/ ć, ci, dź, dzi ćma [t͡ɕma], dzień [d͡ʑɛɲ] cheap, jeep (soft/palatal) Alveolo-palatal affricates
/m/, /n/ m, n mama [ˈmama], noc [nɔt͡s] more, no Bilabial, dental nasals
/ɲ/ ń, ni koń [kɔɲ], niebo [ˈɲɛbɔ] canyon, Spanish ñ Alveolo-palatal nasal
/ŋ/ n (before k, g) bank [baŋk], gong [ɡɔŋk] sing Velar nasal
/l/ l las [las] leaf (clear L) Alveolar lateral approximant
/w/ ł ładny [ˈwadnɨ] wet Labio-velar approximant
/j/ j jutro [ˈjutrɔ] yes Palatal approximant
/r/ r ręka [ˈrɛŋka] Spanish r (rolled) Alveolar trill

Diacritical Marks & Stress

Symbol Name Function Example
ˈ Primary stress Marks the onset of the stressed syllable Warszawa /varˈʂa.va/
. Syllable boundary Divides phoneme sequences into syllables okno /ˈɔk.nɔ/
ʲ Palatalization marker Indicates secondary narrowing of the vocal tract (softening) pies /pʲjɛs/
͡ Tie bar Indicates that two letters form a single affricate sound noc /nɔt͡s/ (not t followed by s)

Polish Phonetics & Pronunciation Guide

Oral Vowels

Polish oral vowels are remarkably stable and monophthongal. The language does not have phonemic vowel length distinctions (like English ship vs. sheep or German Staat vs. Stadt). However, minor contextual shifts exist:

Nasal Vowels (ą, ę) in Depth

Polish is one of the few Slavic languages to retain nasal vowels (spelled ą and ę); Kashubian is another. In modern standard Polish, these are pronounced either as nasal diphthongs or as an oral vowel plus a nasal consonant, depending on context. The Lua transcription module mimics this behavior using the following rules:

1. Before Obstruent Stops & Affricates (Decomposition)

The nasal vowel loses its vowel nasality and splits into a standard oral vowel (/ɔ/ or /ɛ/) followed by a homorganic nasal consonant matching the place of articulation of the following stop/affricate:

Context (Next Sound) Type ą becomes... ę becomes... Example Grapheme Phonetic Output
b, p Bilabial [ɔm] [ɛm] dąb, zęby [dɔmp], [ˈzɛmbɨ]
d, t, c, dz, cz, dż Dental / Retroflex [ɔn] [ɛn] kąt, chętny [kɔnt], [ˈxɛntnɨ]
ć, dź, ci, dzi Alveolo-palatal [ɔɲ] [ɛɲ] pięć, dociąć [pʲjɛɲt͡ɕ], [ˈdɔt͡ɕɔɲt͡ɕ]
g, k Velar [ɔŋ] [ɛŋ] mąka, ręka [ˈmɔŋka], [ˈrɛŋka]

2. Before Fricatives (Retained Nasal Diphthong)

Before fricatives (f, w, s, z, sz, ż, rz, ś, ź, ch, h), the nasalization cannot resolve into a distinct nasal stop. Instead, the vowel is pronounced as an oral vowel followed by a nasalized labiovelar approximant: ą → [ɔw̃], ę → [ɛw̃]. E.g., wąski [ˈvɔw̃skʲi], węch [vɛw̃x].

3. Before Laterals and Approximants (Denasalization)

Before the letters l and ł, the nasal vowels completely lose their nasality. ą becomes standard [ɔ] and ę becomes [ɛ]. This is handled at the grapheme-to-phoneme conversion stage — the letters2phones table has explicit entries for ą + l, ą + ł, ę + l, ę + ł that produce denasalized output directly. E.g., wziął [vʑɔw], wzięli [ˈvʑɛlʲi].

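4. Word-Final Position

Word-finally, ą keeps its nasal diphthong [ɔw̃] (e.g., robią [ˈrɔbʲɔw̃]), while ę is denasalized to [ɛ] (e.g., idę [ˈidɛ]), as listed in the vowel mapping table.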
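
The nasal contexts in this section can be summarized in a few lines of Lua. This is a minimal sketch under stated assumptions, not the code of pl-pron_wasm.lua: the names resolve_nasal, nasal_for, and denasalizing are hypothetical, phones are plain strings, and the word-final behavior follows the vowel mapping table. It assumes Lua 5.3 or later for the \u{...} escapes.

```lua
-- Hypothetical sketch; names are illustrative, not taken from the module.
local TILDE = "\u{303}"                  -- combining tilde
local TIE = "\u{361}"                    -- combining tie bar used in affricates
local NASAL_O, NASAL_E = "ɔ" .. TILDE, "ɛ" .. TILDE

local function aff(a, b) return a .. TIE .. b end

-- Following stop/affricate -> homorganic nasal consonant.
local nasal_for = {
  p = "m", b = "m",
  t = "n", d = "n",
  [aff("t", "s")] = "n", [aff("d", "z")] = "n",
  [aff("t", "ʂ")] = "n", [aff("d", "ʐ")] = "n",
  [aff("t", "ɕ")] = "ɲ", [aff("d", "ʑ")] = "ɲ",
  k = "ŋ", ["ɡ"] = "ŋ",
}
local denasalizing = { l = true, w = true }  -- phones of the letters l and ł

-- Returns the phones that replace one abstract nasal vowel.
-- next_phone is nil when the nasal vowel is word-final.
local function resolve_nasal(nasal, next_phone)
  local oral = (nasal == NASAL_O) and "ɔ" or "ɛ"
  if next_phone == nil then
    -- Word-final: ą keeps the nasal glide, ę is denasalized.
    if nasal == NASAL_O then return { oral, "w" .. TILDE } end
    return { oral }
  end
  local base = next_phone:gsub("ʲ", "")  -- ignore palatalization for lookup
  if denasalizing[base] then return { oral } end
  local n = nasal_for[base]
  if n then return { oral, n } end
  return { oral, "w" .. TILDE }          -- default: before fricatives
end

print(table.concat(resolve_nasal(NASAL_O, "k")))  -- mąka: ą before k
```

Keeping the homorganic nasals in a lookup table mirrors the nasal_map idea described later on this page; the real module may organize the same information differently.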

Palatalization & Soft Consonants

Palatalization represents the division between "hard" and "soft" sounds in Polish orthography. There are two primary mechanisms:

  1. Diacritics: Consonants with acute accents (ć, dź, ś, ź, ń) are inherently soft alveolo-palatal sounds (/t͡ɕ, d͡ʑ, ɕ, ʑ, ɲ/).
  2. The letter i:
    • If followed by another vowel, the i is silent and acts purely as a softening marker for the preceding consonant. E.g., pies [pʲjɛs].
    • If followed by a consonant or at the end of a word, the i both softens the preceding consonant and acts as the vowel nucleus [i]. E.g., pili [ˈpʲilʲi].

When dental fricatives/affricates (s, z, c, dz) are palatalized by i or converted to diacritics, they transform completely: s → ś [ɕ], z → ź [ʑ], c → ć [t͡ɕ], dz → dź [d͡ʑ].
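
The two roles of the letter i described above can be illustrated with a small classifier. This is a rough sketch, assuming Lua 5.3 or later (for utf8.charpattern); the function names letters and i_roles are hypothetical, and the module itself tracks this with a soft flag during phone conversion rather than with a separate pass.

```lua
-- Hypothetical sketch; not the module's own implementation.
local vowels = {}
for _, v in ipairs({ "a", "ą", "e", "ę", "i", "o", "ó", "u", "y" }) do
  vowels[v] = true
end

-- Split a UTF-8 word into individual characters.
local function letters(word)
  local out = {}
  for ch in word:gmatch(utf8.charpattern) do out[#out + 1] = ch end
  return out
end

-- For every letter i, report whether it is only a softening marker
-- (followed by a vowel) or also the vowel nucleus (otherwise).
local function i_roles(word)
  local chars, roles = letters(word), {}
  for idx, ch in ipairs(chars) do
    if ch == "i" then
      local nxt = chars[idx + 1]
      roles[#roles + 1] = (nxt and vowels[nxt]) and "marker" or "nucleus"
    end
  end
  return roles
end

print(table.concat(i_roles("pies"), ","))
print(table.concat(i_roles("pili"), ","))
```

The sketch only classifies the letter; choosing the resulting consonant (for example s → [ɕ] versus p → [pʲ]) would be a separate table lookup.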

Voicing & Devoicing Rules

Polish exhibits pervasive voicing assimilations within consonant clusters and at word boundaries:

Consonant Cluster Assimilations

Polish contains complex consonant clusters. In some contexts, progressive voicing/devoicing rules apply to specific consonants:

Stress Patterns & Exceptions

Polish stress is predominantly fixed on the penultimate (second-to-last) syllable. E.g., DO-mu [ˈdɔmu], war-SZA-wa [varˈʂawa]. The Lua module supports several grammatical exceptions:

Orthography to IPA Mapping

Below is a detailed spelling-to-IPA mapping guide reflecting the rule structure of the pl-pron_wasm.lua engine. The tables cover the Polish graphemes and digraphs together with their context-dependent sound changes.

1. Vowels (Oral & Nasal Contexts)

Unlike English or German, Polish vowels maintain stable lengths. However, nasal vowels (ą and ę) split or lose nasality depending on their phonological environment:

Spelling Primary IPA Context / Phonological Rule Example Word IPA Output
a /a/ Always (open central unrounded) sam [sam]
e /ɛ/ Always (open-mid front unrounded) ten [tɛn]
i /i/ Vowel nucleus (when not preceding another vowel) mit [mʲit]
o /ɔ/ Always (open-mid back rounded) nos [nɔs]
u, ó /u/ Always (close back rounded) but, bóg [but], [buk]
y /ɨ/ Always (close central unrounded; only after hard consonants) syn [sɨn]
au /aw/ Vowel + glide (loanwords, e.g., auto) auto [ˈawtɔ]
eu /ɛw/ Vowel + glide (loanwords, e.g., euro) euro [ˈɛwrɔ]
ą /ɔw̃/ Word-final position (nasal diphthong) robią [ˈrɔbʲɔw̃]
ą /ɔw̃/ Before fricatives (e.g., s, z, sz, ż, rz, ś, ź, f, w, ch) wąski [ˈvɔw̃skʲi]
ą /ɔm/ Before bilabial stops (p, b) dąb [dɔmp] (with final devoicing)
ą /ɔn/ Before dental/alveolar stops/affricates (t, d, c, dz, cz, dż) kąt [kɔnt]
ą /ɔɲ/ Before alveolo-palatal affricates/stops (ć, dź, ci, dzi) bądź [bɔɲt͡ɕ]
ą /ɔŋ/ Before velar stops (k, g) mąka [ˈmɔŋka]
ą /ɔ/ Before laterals and semivowels (l, ł) (denasalized) wziął [vʑɔw]
ę /ɛ/ Word-final position (standard denasalization) idę [ˈidɛ]
ę /ɛw̃/ Before fricatives (e.g., s, z, sz, ż, rz, ś, ź, f, w, ch) węch [vɛw̃x]
ę /ɛm/ Before bilabial stops (p, b) zęby [ˈzɛmbɨ]
ę /ɛn/ Before dental/alveolar stops/affricates (t, d, c, dz, cz, dż) chętny [ˈxɛntnɨ]
ę /ɛɲ/ Before alveolo-palatal affricates/stops (ć, dź, ci, dzi) pięć [pʲjɛɲt͡ɕ]
ę /ɛŋ/ Before velar stops (k, g) ręka [ˈrɛŋka]
ę /ɛ/ Before laterals and semivowels (l, ł) (denasalized) wzięli [ˈvʑɛlʲi]

2. Standard Consonants (Hard / Unpalatalized)

Polish consonants articulated without palatalization or subsequent assimilation:

Spelling IPA Description Example Word IPA Output
b /b/ Voiced bilabial stop bas [bas]
p /p/ Voiceless bilabial stop pan [pan]
d /d/ Voiced dental stop dom [dɔm]
t /t/ Voiceless dental stop ten [tɛn]
g /ɡ/ Voiced velar stop gra [ɡra]
k /k/ Voiceless velar stop kot [kɔt]
f /f/ Voiceless labiodental fricative fala [ˈfala]
w /v/ Voiced labiodental fricative woda [ˈvɔda]
s /s/ Voiceless dental sibilant fricative sen [sɛn]
z /z/ Voiced dental sibilant fricative ząb [zɔmp]
h, ch /x/ Voiceless velar fricative hak, chleb [xak], [xlɛp]
m /m/ Bilabial nasal mama [ˈmama]
n /n/ Dental nasal noc [nɔt͡s]
l /l/ Alveolar lateral approximant (always clear) las [las]
ł /w/ Labiovelar semivowel ładny [ˈwadnɨ]
r /r/ Alveolar trill rok [rɔk]
j /j/ Palatal approximant (semivowel) jutro [ˈjutrɔ]

3. Digraphs & Affricates

Polish makes extensive use of digraphs to represent retroflex, alveolo-palatal, and dental affricates/fricatives:

Spelling IPA Description Example Word IPA Output
c /t͡s/ Voiceless dental affricate co [t͡sɔ]
cz /t͡ʂ/ Voiceless retroflex affricate czas [t͡ʂas]
dz /d͡z/ Voiced dental affricate dzban [d͡zban]
dż /d͡ʐ/ Voiced retroflex affricate dżem [d͡ʐɛm]
sz /ʂ/ Voiceless retroflex fricative szum [ʂum]
ż /ʐ/ Voiced retroflex fricative żona [ˈʐɔna]
rz /ʐ/ Voiced retroflex fricative (standard context) rzeka [ˈʐɛka]
rz /ʂ/ Voiceless retroflex fricative (after voiceless: p, t, k, ch) trzy, przy [tʂɨ], [pʂɨ]
ć /t͡ɕ/ Voiceless alveolo-palatal affricate ćma [t͡ɕma]
dź /d͡ʑ/ Voiced alveolo-palatal affricate dźwig [d͡ʑvʲik]
ś /ɕ/ Voiceless alveolo-palatal fricative śpi [ɕpʲi]
ź /ʑ/ Voiced alveolo-palatal fricative źle [ʑlɛ]
ń /ɲ/ Alveolo-palatal nasal consonant koń [kɔɲ]

4. Palatalized (Soft) Consonants Before "i"

When consonants are followed by the letter i, they undergo palatalization. If i is followed by a vowel, it remains a silent palatal glide marker (marked as /ʲ/ or /j/). If followed by a consonant or word-finally, it acts both as the softening marker and the vowel nucleus /i/:

Spelling IPA (Before Vowel) Example (Before Vowel) IPA (Before Consonant/Final) Example (Before Consonant/Final)
ci /t͡ɕ/ ciasto [ˈt͡ɕastɔ] /t͡ɕi/ cicho [ˈt͡ɕixɔ]
dzi /d͡ʑ/ działo [ˈd͡ʑawɔ] /d͡ʑi/ dziki [ˈd͡ʑikʲi]
si /ɕ/ siarka [ˈɕarka] /ɕi/ siła [ˈɕiwa]
zi /ʑ/ ziarno [ˈʑarnɔ] /ʑi/ zima [ˈʑima]
ni /ɲ/ niebo [ˈɲɛbɔ] /ɲi/ niski [ˈɲiskʲi]
bi /bʲ/ or /bʲj/ biały [ˈbʲawɨ] /bʲi/ bitwa [ˈbʲitfa]
pi /pʲ/ or /pʲj/ pies [pʲjɛs] /pʲi/ piwo [ˈpʲivɔ]
wi /vʲ/ or /vʲj/ wiatr [vʲjatr] /vʲi/ wino [ˈvʲinɔ]
fi /fʲ/ or /fʲj/ fiord [fʲjɔrt] /fʲi/ figa [ˈfʲiɡa]
mi /mʲ/ or /mʲj/ miasto [ˈmʲastɔ] /mʲi/ mili [ˈmʲilʲi]
ki /kʲ/ or /kʲj/ kiedy [ˈkʲɛdɨ] /kʲi/ kino [ˈkʲinɔ]
gi /ɡʲ/ or /ɡʲj/ giąć [ɡʲɔɲt͡ɕ] /ɡʲi/ gips [ɡʲips]
hi, chi /xʲ/ or /xʲj/ hiena [ˈxʲɛna] /xʲi/ chichot [ˈxʲixɔt]
li /lʲ/ liść [lʲiɕt͡ɕ] /lʲi/ list [lʲist]

5. Voicing & Devoicing Assimilation Contexts

Consonants shift their voicing state systematically in clusters and at the end of words:

Spelling Underlying Phoneme Resulting IPA Triggering Context / Position Example Word IPA Output
b /b/ /p/ Word-final (devoicing) ząb [zɔmp]
Regressive (before voiceless consonant) babka [ˈbapka]
d /d/ /t/ Word-final (devoicing) sad [sat]
Regressive (before voiceless consonant) wódka [ˈvutka]
g /ɡ/ /k/ Word-final (devoicing) róg [ruk]
Regressive (before voiceless consonant) lekki [ˈlɛkʲi]
w /v/ /f/ Word-final (devoicing) krew [krɛf]
Progressive (following voiceless consonant) twój, kwiat [tfuj], [kfjat]
z /z/ /s/ Word-final (devoicing) mróz [mrus]
Regressive (before voiceless consonant) rozkaz [ˈrɔskas]
ż, rz /ʐ/ /ʂ/ Word-final (devoicing) nóż, talerz [nuʂ], [ˈtalɛʂ]
Regressive (before voiceless consonant) łyżka [ˈwɨʂka]
dz /d͡z/ /t͡s/ Word-final (devoicing) ksiądz [kɕɔnt͡s]
dź /d͡ʑ/ /t͡ɕ/ Word-final (devoicing) łódź [wut͡ɕ]
ź /ʑ/ /ɕ/ Word-final (devoicing) gałąź [ˈɡawɔɲɕ]
k /k/ /ɡ/ Regressive voicing (before voiced obstruent) także, jakby [ˈtaɡʐɛ], [ˈjaɡbɨ]
p /p/ /b/ Regressive voicing (before voiced obstruent; same rule as k)
t /t/ /d/ Regressive voicing (before voiced obstruent; same rule as k)
cz /t͡ʂ/ /d͡ʐ/ Regressive voicing (before voiced obstruent) liczba [ˈlʲid͡ʐba]
ś /ɕ/ /ʑ/ Regressive voicing (before voiced obstruent) prośba [ˈprɔʑba]
ch, h /x/ /ɣ/ Regressive voicing (before voiced obstruent) niechże [ˈɲɛɣʐɛ]

6. Foreign Letters & Loanword Graphemes

Polish handles recent loanwords and non-standard letters through phonetic approximations matching standard Polish orthography rules:

Spelling IPA Output Phonetic Mapping Principle Example Word IPA Output
x /ks/ Decomposed to /k/ + /s/ xero [ˈksɛrɔ]
q /k/ q maps to voiceless velar stop; u follows as /v/ quiz [kfis] (q+u always gives /kf/ in cluster)
v /v/ Mapped directly to voiced labiodental fricative (same as w) video [viˈdɛɔ]
th /t/ + /x/ Not a digraph; t maps to /t/, h maps to /x/ theatre [txɛˈatrɛ]
ph /p/ + /x/ Not a digraph; p maps to /p/, h maps to /x/ graph [ɡrapx]
qu /kf/ q+u mapped to /k/+/v/; /v/ devoices to /f/ after voiceless /k/ quasi [ˈkfaɕi]

How the Lua Module Works

The Lua file pl-pron_wasm.lua transcribes Polish words through a pipeline of character-by-character grapheme-to-phoneme conversion, contextual nasal resolution, voicing assimilation, and syllabification. Unlike the German or French modules, it does not use a lexicon — all output is generated from orthographic rules.

Module Execution Flow

  1. Stress Identification (get_stressed_syllable):

    Analyzes the verb suffixes (like -byśmy, -liście) and grammatical rules to establish which vowel nucleus receives the primary stress. If none matches, it defaults to the second-to-last vowel nucleus.

  2. Phoneme Mapping (convert_to_phones):

    Processes the string from left to right. Characters are matched against the letters2phones table. Digraphs (like sz, cz, ch) are prioritized over single letters. When i is encountered, it sets a soft flag on the preceding consonant and determines whether the i itself should be kept as a vowel or discarded as a spelling marker.

  3. Nasal Resolution (simplify_nasals):

    Scans for nasal vowels and looks ahead to the next phoneme. It replaces the abstract nasals with their contextual combinations (e.g., converting nasal to oral + velar nasal before velars).

  4. Terminal Devoicing (terminal_devoice):

    If the final phoneme is marked as a voiced obstruent, it is converted to its voiceless counterpart using a direct lookup table (e.g., d → t, g → k).

  5. Consonant Cluster Voicing (process_consonant_clusters):

    Identifies consonant clusters (sequences of non-vowel phones). For each cluster, it finds the "determining consonant": the last obstruent that is not a forward assimilant (sonorants are skipped). The voicing state of this consonant propagates to the entire cluster. Forward assimilants (w/v and rz) are devoiced progressively: they become voiceless when preceded by a voiceless consonant (e.g., kwiat: k is voiceless, so w → f).

  6. Syllabification (collect_syllables):

    Segments the phone array into syllables using the Sonority Sequencing Principle. Vowels form nuclei, and valid clusters are shifted to syllable onsets.

  7. Formatting & Post-Processing:

    Applies stress markers (ˈ) and syllable boundaries (.), then performs final IPA cleanup: removes redundant ʲ before i/j, converts remaining ʲ to a j glide, and replaces nasal vowels with diphthong notation (ɛ̃ → ɛw̃, ɔ̃ → ɔw̃).
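
Steps 4 and 5 above can be sketched on plain phone arrays. The following is a simplified illustration, not the module's code: the devoice table is a small subset, the helper names are reused from the description only for readability, and rz is deliberately left out because the module tracks it with flags. In this sketch a [ʐ] that comes from rz would therefore be treated as an ordinary obstruent.

```lua
-- Illustrative subset of voicing pairs (voiced -> voiceless).
local devoice = {
  b = "p", d = "t", ["ɡ"] = "k", v = "f",
  z = "s", ["ʐ"] = "ʂ", ["ʑ"] = "ɕ", ["ɣ"] = "x",
}
local voice = {}
for voiced, voiceless in pairs(devoice) do voice[voiceless] = voiced end

-- Only w (/v/) is modelled as a forward assimilant here; rz is omitted.
local forward_assimilant = { v = true }

local function is_obstruent(p)
  return devoice[p] ~= nil or voice[p] ~= nil
end

-- Step 4: devoice a word-final voiced obstruent.
local function terminal_devoice(phones)
  local last = phones[#phones]
  if last and devoice[last] then phones[#phones] = devoice[last] end
  return phones
end

-- Step 5: spread the voicing of the determining consonant over a cluster.
local function assimilate_cluster(cluster)
  local det
  for i = #cluster, 1, -1 do
    if is_obstruent(cluster[i]) and not forward_assimilant[cluster[i]] then
      det = cluster[i]
      break
    end
  end
  if not det then return cluster end       -- e.g. a lone w stays voiced
  -- A voiced determiner voices the cluster; a voiceless one devoices it.
  local map = devoice[det] and voice or devoice
  for i, p in ipairs(cluster) do
    cluster[i] = map[p] or p               -- sonorants are left untouched
  end
  return cluster
end

print(table.concat(terminal_devoice({ "x", "l", "ɛ", "b" })))  -- chleb
print(table.concat(assimilate_cluster({ "k", "v" })))          -- kwiat onset
```

Because the determining consonant is searched from the end of the cluster and w is skipped, the same loop covers both the regressive case (prośba) and the progressive devoicing of w (kwiat).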

Execution Pipeline

Raw Polish Word
       │
       ▼
┌─────────────────────────┐
│ 1. get_stressed_syllable │  Identify stress from verb suffixes
└────────────┬────────────┘
             ▼
┌─────────────────────────┐
│ 2. convert_to_phones     │  letters2phones lookup, digraph priority
└────────────┬────────────┘
             ▼
┌─────────────────────────┐
│ 3. simplify_nasals       │  ą/ę → oral V + nasal C (contextual)
└────────────┬────────────┘
             ▼
┌─────────────────────────┐
│ 4. terminal_devoice      │  Final voiced obstruent → voiceless
└────────────┬────────────┘
             ▼
┌─────────────────────────┐
│ 5. process_consonant_    │  Regressive voicing assimilation
│    clusters              │  Progressive devoicing of w/rz
└────────────┬────────────┘
             ▼
┌─────────────────────────┐
│ 6. collect_syllables     │  Sonority-based syllabification
└────────────┬────────────┘
             ▼
┌─────────────────────────┐
│ 7. Format output         │  Stress ˈ, syllable ., post-processing
└────────────┬────────────┘
             ▼
       IPA Output

Key Data Structures

Structure Type Purpose
letters2phones Nested table Maps graphemes to IPA phones. Uses [false] key as fallback for unknown characters. Supports multi-character lookahead (e.g., chi → vowel chain).
valid_phone Set All recognized IPA phone strings. Used to distinguish phones from boundary markers during syllabification.
vowel Set {a, ɛ, ɛ̃, i, ɨ, ɔ, ɔ̃, u} — vowel nuclei for syllable detection.
devoice / voice Map Voicing pairs. Sonorants (m, n, ɲ, ŋ, l, j, r, w) map to themselves — they are "neutral" and do not trigger assimilation.
nasal_map Map Following consonant → homorganic nasal: p/b→m, t/d→n, k/g→ŋ, ć/dź→ɲ, etc.
forward_assimilants Set {v, vʲ} (and rz via flags). These consonants are devoiced progressively (by a preceding voiceless C), unlike standard regressive assimilation.
third_last_syllable_stress Suffix list Conditional mood suffixes (-łbym, -łabym, -łbyś, etc.) that shift stress to antepenultimate syllable.
fourth_last_syllable_stress Suffix list -libyśmy, -łybyśmy, -libyście, -łybyście — stress on 4th from end.
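
As a rough illustration of how the two suffix lists could drive stress placement, the sketch below picks the stressed syllable from a word and its syllable count. It assumes only the suffixes quoted in the table (the third-last list is partial there), and the functions ends_with and stressed_syllable_index are hypothetical names, not the module's get_stressed_syllable.

```lua
-- Suffix lists as quoted in the table above; the first one is incomplete.
local third_last_syllable_stress = { "łbym", "łabym", "łbyś" }
local fourth_last_syllable_stress = {
  "libyśmy", "łybyśmy", "libyście", "łybyście",
}

-- Byte-wise suffix test; safe for UTF-8 because both strings are valid UTF-8.
local function ends_with(word, suffix)
  return #word >= #suffix and word:sub(-#suffix) == suffix
end

local function matches_any(word, suffixes)
  for _, s in ipairs(suffixes) do
    if ends_with(word, s) then return true end
  end
  return false
end

-- Returns the 1-based index (from the start) of the stressed syllable.
-- syllable_count is supplied by the caller in this sketch.
local function stressed_syllable_index(word, syllable_count)
  local from_end = 2                         -- default: penultimate
  if matches_any(word, fourth_last_syllable_stress) then
    from_end = 4
  elseif matches_any(word, third_last_syllable_stress) then
    from_end = 3
  end
  return math.max(1, syllable_count - from_end + 1)
end

print(stressed_syllable_index("Warszawa", 3))
print(stressed_syllable_index("zrobiłbym", 3))
```

The longer fourth-last suffixes are checked first so that a more specific match is never shadowed by a shorter one, and the result is clamped so monosyllables still get a valid index.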

Post-Processing Rules

After syllabification, the module applies final transformations to the IPA string:

Rule Input Output Description
Palatalization cleanup ʲi, ʲj i, j ʲ before i or j is redundant — removed.
ʲ → j bʲa bja All remaining ʲ converted to j glide.
Nasal diphthongization ɛ̃ ɛw̃ Nasal vowels rewritten as oral + nasalized approximant.
Nasal diphthongization ɔ̃ ɔw̃ Same for back nasal vowel.
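
The four rules in the table map naturally onto a sequence of string.gsub calls. This is a minimal sketch of that idea, assuming Lua 5.3 or later for the \u{...} escape; post_process is a hypothetical name and the actual module may apply the rules differently.

```lua
local TILDE = "\u{303}"  -- combining tilde used in ɛ̃, ɔ̃ and w̃

-- Apply the post-processing rules in the order listed in the table.
local function post_process(ipa)
  ipa = ipa:gsub("ʲi", "i")                     -- redundant ʲ before i
  ipa = ipa:gsub("ʲj", "j")                     -- redundant ʲ before j
  ipa = ipa:gsub("ʲ", "j")                      -- remaining ʲ becomes a glide
  ipa = ipa:gsub("ɛ" .. TILDE, "ɛw" .. TILDE)   -- front nasal vowel
  ipa = ipa:gsub("ɔ" .. TILDE, "ɔw" .. TILDE)   -- back nasal vowel
  return ipa
end

print(post_process("bʲawɨ"))
print(post_process("pʲjɛs"))
```

Order matters here: the two cleanup rules must run before the general ʲ → j rule, otherwise ʲi would become ji instead of i. Each gsub result is assigned to a variable so that only the string, not the replacement count, is passed on.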

Extending the Module

To add custom grapheme mappings (e.g., handling recent loanwords), insert entries into the letters2phones dictionary at the top of pl-pron_wasm.lua. The table uses nested lookahead: each key maps to either a phone string, a phone array, or a sub-table for the next character.

Example structure for a simple consonant:

["b"] = {
    ["i"] = { "bʲ", "i" },  -- before i → palatalized
    [false] = "b"            -- default/fallback
}
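
To show how such an entry might be consulted, here is a small lookup sketch with one letter of lookahead. It is an assumption-laden illustration: the lookup function is hypothetical, it handles only the phone-string and phone-array cases (not deeper sub-tables), and the table contains just the entry from the example above.

```lua
local letters2phones = {
  ["b"] = {
    ["i"] = { "bʲ", "i" },  -- before i: palatalized consonant plus vowel
    [false] = "b",          -- default/fallback
  },
}

-- Hypothetical helper: returns the phones for `letter` and the number of
-- letters consumed, looking at most one letter ahead.
local function lookup(letter, next_letter)
  local entry = letters2phones[letter]
  if entry == nil then return nil, 0 end        -- unknown letter
  if type(entry) == "string" then return { entry }, 1 end
  local hit = next_letter and entry[next_letter]
  if hit then
    local phones = (type(hit) == "table") and hit or { hit }
    return phones, 2
  end
  return { entry[false] }, 1                    -- fall back to the default
end

local phones, used = lookup("b", "i")
print(table.concat(phones, " "), used)
```

Returning the number of consumed letters lets a caller advance its position in the word by one or two characters, which is one simple way to give digraph-style entries priority over single letters.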

Known Issues & Limitations

Input System Output Expected Cause
Rzeczpospolita [ʐɛt͡ʂpɔspɔˈlʲita] Stress on -po- Compound boundary not detected; defaults to penultimate (-li-)
quiz [kfis] [kfis], or [kwis] for some speakers Loanword qu always mapped as /kv/ → [kf] by module
partner [ˈpartnɛr] [ˈpartnɛr] Correct — but stress may vary in casual speech
wziął [vʑɔw] [vʑɔw] Correct — ą denasalized before ł


For technical issues or suggestions, please visit our GitHub repository.