LangSync · learn

What is ICU MessageFormat?

ICU MessageFormat is a syntax for embedding plural, gender, and formatting rules inside translation strings — so one source string can correctly render "1 item" / "2 items" / "5 items" in any language, including the ones with three or four plural categories. It is the de-facto standard for software UI strings that depend on counts, choices, or formatted numbers.

Supported by formatjs, i18next, messageformat, react-intl
Plural / select / selectordinal built into the spec
CLDR plural-category data covers ~200 languages

Definition

The 30-second version

ICU MessageFormat is a small templating syntax that lives inside the value half of an i18n key. Instead of writing one source string per plural form ("1 item" / "items"), you write a single template and the i18n library substitutes the right form at render time based on the argument value and the user's locale.

A minimal example:

{count, plural, one {# item} other {# items}}

Read left to right: take an argument called count, switch on its plural category, render # item when the category is one, render # items otherwise. The # is a magic placeholder for the original number.
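To make that lookup concrete, here is a minimal TypeScript sketch of what happens for this one template. It is an illustration, not how any particular library is implemented: renderPlural and PluralBranches are hypothetical names, and the branches are passed in as an object instead of being parsed from the string. Only Intl.PluralRules and Intl.NumberFormat are real built-in APIs.

```typescript
// Hypothetical helper: resolves a single plural block by hand.
// A real library parses the template string; here the branches are passed in directly.
type PluralBranches = { other: string } & Partial<Record<Intl.LDMLPluralRule, string>>;

function renderPlural(locale: string, count: number, branches: PluralBranches): string {
  // Ask the runtime's CLDR data which plural category this number falls into.
  const category = new Intl.PluralRules(locale).select(count);
  // Fall back to the mandatory "other" branch when the category has no branch of its own.
  const branch = branches[category] ?? branches.other;
  // "#" stands for the number, formatted for the locale.
  const formatted = new Intl.NumberFormat(locale).format(count);
  return branch.replace(/#/g, () => formatted);
}

console.log(renderPlural("en", 1, { one: "# item", other: "# items" })); // 1 item
console.log(renderPlural("en", 5, { one: "# item", other: "# items" })); // 5 items
```

The locale argument is what makes the same template behave differently per language: the template stays fixed, and the category lookup changes.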

That's the whole concept. The complexity comes from two places: languages have different plural categories (CLDR lists four for Czech and Russian and six for Arabic), and you can nest ICU clauses to handle gender + count + tense in one string.

Plurals

Why "one and other" is not enough

English has two plural categories: one (for the number 1) and other (for everything else, including 0 and fractions). Most languages don't fit that shape.

The Unicode CLDR project maintains the canonical list of plural categories per language. The six possible categories are zero, one, two, few, many, and other — each language uses a subset.

Language                     Categories                        Example for "items"
English (en)                 one, other                        1 item / 0, 2, 5 items
Czech (cs)                   one, few, many, other             1 položka / 2–4 položky / 1.5 položky / 5+ položek
Polish (pl)                  one, few, many, other             1 plik / 2–4 pliki / 5+ plików / 1.5 pliku
Russian (ru)                 one, few, many, other             1 файл / 2–4 файла / 5+ файлов / 1.5 файла
Arabic (ar)                  zero, one, two, few, many, other  6 categories, including a dedicated dual
Japanese / Chinese / Korean  other only                        One form for every count

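A correct ICU plural block for Czech needs one, few, and other for whole-number counts, plus many if the count can be fractional; Polish and Russian need four branches; Arabic, up to six. If a translated message only carries one and other because the source language is English, Czech will be wrong for some counts: 2 and 5 take different noun forms, and a single other branch can only hold one of them.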

{count, plural,
  one {# soubor}
  few {# soubory}
  many {# souboru}
  other {# souborů}
}
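If you want to check which categories a locale uses without reading the CLDR charts, the built-in Intl.PluralRules API exposes the same data. The sketch below assumes a runtime with full ICU data (the default in current Node.js releases); pluralCategoriesFor is a hypothetical helper name.

```typescript
// Lists the cardinal plural categories the runtime's CLDR data defines for a locale.
function pluralCategoriesFor(locale: string): string[] {
  const categories = new Intl.PluralRules(locale).resolvedOptions().pluralCategories;
  // The order is not something to rely on across engines, so sort for stable output.
  return [...categories].sort();
}

for (const locale of ["en", "cs", "pl", "ru", "ar", "ja"]) {
  console.log(locale, pluralCategoriesFor(locale).join(", "));
}

// The category for a specific count comes from select():
const czech = new Intl.PluralRules("cs");
console.log(czech.select(1), czech.select(3), czech.select(1.5), czech.select(5));
// one few many other
```

The exact category lists depend on the CLDR version bundled with the runtime, so treat the output as a check against your deployment target, not as a universal constant.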

The =N literal form is also valid for exact-count overrides:

{count, plural,
  =0 {No items}
  one {1 item}
  other {# items}
}

=0 matches when count is exactly zero and takes precedence over the other branch, which would otherwise catch it.
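The precedence rule can be sketched in a few lines of TypeScript: look for an exact =N key first, and only then fall back to the plural category. pickBranch and cartBranches are hypothetical names for illustration, and the # substitution is deliberately left out to keep the focus on branch selection.

```typescript
// Hypothetical resolver: explicit "=N" keys are checked before the plural category.
function pickBranch(locale: string, count: number, branches: Record<string, string>): string {
  const exact = branches[`=${count}`];
  if (exact !== undefined) return exact;
  const category = new Intl.PluralRules(locale).select(count);
  return branches[category] ?? branches["other"];
}

const cartBranches = { "=0": "No items", one: "1 item", other: "# items" };
console.log(pickBranch("en", 0, cartBranches)); // No items
console.log(pickBranch("en", 1, cartBranches)); // 1 item
console.log(pickBranch("en", 7, cartBranches)); // # items
```

Without the =0 branch, a count of 0 in English resolves to the other category and would render "0 items".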

Select

Branching on arbitrary tags — gender, tone, role

select is the same shape as plural but matches an arbitrary string instead of a CLDR plural category. It's most often used for grammatical gender:

{gender, select,
  female {She added a comment}
  male {He added a comment}
  other {They added a comment}
}

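The other branch is always required as a fallback: if the argument doesn't match a named branch, other renders. Most ICU parsers reject a select block without it as soon as the message is parsed, which means at build time if you lint or precompile your messages and at runtime otherwise.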

select works for any closed set: user role, payment status, device type, anything where one of N strings should render.

{role, select,
  admin {Manage all settings}
  editor {Manage content}
  viewer {Read only}
  other {No access}
}
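Conceptually, select is an exact string lookup with a mandatory default. A minimal sketch, with renderSelect, SelectBranches, and roleLabels as hypothetical names:

```typescript
// Hypothetical resolver for a select block: match the tag by exact string, else "other".
type SelectBranches = { other: string } & Record<string, string>;

function renderSelect(value: string, branches: SelectBranches): string {
  // Own-property check so inherited keys such as "toString" never match a branch.
  return Object.prototype.hasOwnProperty.call(branches, value) ? branches[value] : branches.other;
}

const roleLabels: SelectBranches = {
  admin: "Manage all settings",
  editor: "Manage content",
  viewer: "Read only",
  other: "No access",
};

console.log(renderSelect("editor", roleLabels));  // Manage content
console.log(renderSelect("billing", roleLabels)); // No access
```

Unlike plural, no locale is involved in picking the branch; the tag values are whatever your application passes in, so keep them stable and do not translate them.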

Selectordinal

Ordinal numbers — 1st, 2nd, 3rd

English uses 1st, 2nd, 3rd, then 4th through 20th, then starts over at 21st, 22nd, 23rd, and so on. That's four ordinal categories: one, two, few, and other. ICU's selectordinal keyword handles them the same way plural handles cardinals, via CLDR ordinal-category data per language.

{place, selectordinal,
  one {#st place}
  two {#nd place}
  few {#rd place}
  other {#th place}
}

Most languages have a much simpler ordinal system than their cardinal system — many use a single ordinal suffix regardless of number. Check CLDR for your target locale before assuming.
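The ordinal categories are available from the same built-in API as the cardinal ones, by passing type: "ordinal". The sketch below is an illustration only; ordinal and suffixes are hypothetical names, and the suffix table is English-specific.

```typescript
const englishOrdinals = new Intl.PluralRules("en", { type: "ordinal" });
const suffixes: Record<string, string> = { one: "st", two: "nd", few: "rd", other: "th" };

// Appends the English ordinal suffix that matches the number's ordinal category.
function ordinal(n: number): string {
  return `${n}${suffixes[englishOrdinals.select(n)]}`;
}

console.log([1, 2, 3, 4, 11, 12, 13, 21, 22, 23, 101].map(ordinal).join(" "));
// 1st 2nd 3rd 4th 11th 12th 13th 21st 22nd 23rd 101st

// German, by contrast, has a single ordinal category.
const germanOrdinalCategories = new Intl.PluralRules("de", { type: "ordinal" }).resolvedOptions().pluralCategories;
console.log(germanOrdinalCategories.join(", ")); // other
```

Note that 11, 12, and 13 land in other, which is why a naive "last digit" rule gets English ordinals wrong.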

Nesting

Combining plural, select, and free text

ICU clauses nest. A string can switch on gender AND count in the same template:

{gender, select,
  female {She {count, plural,
    one {posted # photo}
    other {posted # photos}
  }}
  male {He {count, plural,
    one {posted # photo}
    other {posted # photos}
  }}
  other {They {count, plural,
    one {posted # photo}
    other {posted # photos}
  }}
}

The string above takes two arguments (gender and count) and renders a single grammatically correct sentence in English. The same source string, translated to Czech, would need the Czech plural categories (one, few, and other for whole numbers, plus many if fractions can occur) inside each gender branch.

Practical tip: nesting beyond two levels gets unreadable fast, and translators struggle with deeply nested templates. If the grammar of every target language allows it, prefer breaking the string into two keys with separate placeholders.
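One way to keep an eye on nesting is a small lint step over your source strings. The helper below is a rough sketch under stated assumptions: maxBraceDepth is a hypothetical name, it only counts braces, and it ignores ICU apostrophe quoting, so quoted literal braces would be counted too. Each plural or select block adds two brace levels (one for the argument, one for its branches), so a depth of 4 means two nested blocks.

```typescript
// Hypothetical lint helper: reports the deepest brace nesting in an ICU template.
// Simplification: apostrophe-quoted braces are not treated specially.
function maxBraceDepth(template: string): number {
  let depth = 0;
  let max = 0;
  for (const ch of template) {
    if (ch === "{") max = Math.max(max, ++depth);
    else if (ch === "}") depth--;
  }
  return max;
}

const flat = "{count, plural, one {# photo} other {# photos}}";
const nested =
  "{gender, select, female {She {count, plural, one {posted # photo} other {posted # photos}}} " +
  "other {They {count, plural, one {posted # photo} other {posted # photos}}}}";

console.log(maxBraceDepth(flat));   // 2
console.log(maxBraceDepth(nested)); // 4
```

A threshold such as "warn above depth 4" is a team convention, not something the ICU syntax itself enforces.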

Formatting

Numbers, dates, and currency

ICU also defines argument formatters for numbers, dates, and times. JavaScript libraries such as formatjs delegate to Intl.NumberFormat / Intl.DateTimeFormat under the hood and respect the user's locale:

{amount, number, ::compact-short}
{amount, number, ::currency/USD}
{when, date, medium}
{when, time, short}

Real-world example for a notification timestamp:

Posted {when, date, medium} at {when, time, short}

Renders as Posted Jun 2, 2026 at 9:48 PM in en-US, Veröffentlicht 2. Juni 2026 um 21:48 in German, and so on. The locale handles the rendering; you just declare the format.

For currencies, the ::currency/USD skeleton picks the symbol and decimal placement from the user's locale — $1,000.00 in en-US, 1.000,00 $ in de-DE. Your AI translator does not need to know that Germans use a period as the thousands separator; the formatter handles it.
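To see what the formatter layer does on its own, you can call the built-in Intl APIs directly. This is a sketch, not a library's internals: the option objects are an assumption about what the currency skeleton and the medium / short styles roughly expand to, and the helper names (usd, mediumDate, shortTime) are hypothetical. The time zone is pinned to UTC so the output does not depend on the machine running it.

```typescript
const amount = 1000;
const when = new Date(Date.UTC(2026, 5, 2, 21, 48)); // 2 June 2026, 21:48 UTC

// Roughly what {amount, number, ::currency/USD} asks for.
const usd = (locale: string): string =>
  new Intl.NumberFormat(locale, { style: "currency", currency: "USD" }).format(amount);

// Assumed expansion of {when, date, medium}.
const mediumDate = (locale: string): string =>
  new Intl.DateTimeFormat(locale, { year: "numeric", month: "short", day: "numeric", timeZone: "UTC" }).format(when);

// Assumed expansion of {when, time, short}.
const shortTime = (locale: string): string =>
  new Intl.DateTimeFormat(locale, { hour: "numeric", minute: "numeric", timeZone: "UTC" }).format(when);

for (const locale of ["en-US", "de-DE"]) {
  console.log(locale, "|", usd(locale), "|", mediumDate(locale), "|", shortTime(locale));
}
// usd("en-US") gives "$1,000.00"; de-DE puts the symbol after the number,
// separated by a non-breaking space.
```

The spaces in formatted output are often non-breaking or narrow no-break characters, so compare against expected strings with care in tests.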

Tooling

Which i18n libraries actually parse ICU?

ICU support varies. The libraries with first-class ICU MessageFormat parsing:

  • @formatjs/intl / react-intl — full ICU support, the reference JavaScript implementation. Also ships a CLI to validate ICU syntax at build time.
  • @messageformat/core — standalone ICU parser, used as a backend by many other libraries.
  • @formatjs/cli — extracts ICU strings from source code and emits the JSON.
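  • ICU4J for Java and PyICU for Python (bindings to ICU4C) offer full ICU support on the backend side. Python's Babel covers CLDR number, date, and plural data but does not parse ICU MessageFormat strings.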

Libraries with partial support (plurals work, formatters may be limited or rely on a plugin):

  • i18next with i18next-icu postprocessor — adds ICU parsing on top of i18next's own plural-suffix convention.
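  • Vue I18n formats numbers and dates through its Intl.NumberFormat / Intl.DateTimeFormat integrations, but pluralization uses its own pipe-separated syntax by default; ICU message syntax requires plugging in a custom formatter or compiler.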

Libraries that prefer their own convention:

  • i18next native — cart_one, cart_other key suffixes, no ICU by default.
  • gettext / .po files use msgid_plural + msgstr[N] indexed forms, separate from ICU.

Pick your library first, then check whether the ICU syntax you want to write is supported. Mixing ICU and library-native plural conventions in the same file confuses both translators and parsers.
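A quick way to catch accidental mixing is a check over each message catalog. The sketch below assumes a flat key-to-string JSON catalog and i18next-style suffix keys; mixesPluralConventions and both regular expressions are hypothetical and intentionally loose, so expect false positives on keys that merely happen to end in _one or _other.

```typescript
// Hypothetical lint: flags a flat message catalog that mixes i18next-style plural
// suffix keys (cart_one, cart_other) with ICU plural/select syntax in its values.
const SUFFIX_KEY = /_(zero|one|two|few|many|other)$/;
const ICU_BLOCK = /\{\s*\w+\s*,\s*(plural|select|selectordinal)\s*,/;

function mixesPluralConventions(catalog: Record<string, string>): boolean {
  const keys = Object.keys(catalog);
  const hasSuffixKeys = keys.some((key) => SUFFIX_KEY.test(key));
  const hasIcuBlocks = keys.some((key) => ICU_BLOCK.test(catalog[key]));
  return hasSuffixKeys && hasIcuBlocks;
}

const mixedCatalog = {
  cart_one: "{{count}} item",
  cart_other: "{{count}} items",
  photos: "{count, plural, one {# photo} other {# photos}}",
};
const icuOnlyCatalog = {
  photos: "{count, plural, one {# photo} other {# photos}}",
  greeting: "Hello {name}",
};

console.log(mixesPluralConventions(mixedCatalog));   // true
console.log(mixesPluralConventions(icuOnlyCatalog)); // false
```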

In LangSync

How LangSync handles ICU strings

LangSync's translation pipeline today asks the model to maintain formatting, punctuation style, and special characters in the source string. In practice that's the contract that covers ICU MessageFormat blocks — a source like {count, plural, one {# item} other {# items}} is asked to come back with the ICU structure intact.

A few things worth being honest about:

  • The AI does not auto-generate plural categories the source doesn't already have. If your English source has one and other branches and you translate to Czech (which needs one, few, and other), the model translates the two branches it sees and stops. The Czech output will be grammatically broken at counts like 2.
  • Branch-level fidelity is empirical. Modern LLMs are generally good at preserving JSON-shaped templates, but deeply nested ICU (plural inside select inside plural) can confuse any model — especially when target languages reshuffle word order. Spot-check the output for nested cases before shipping.
  • Glossary terms are fed into every translation call, but whether the model applies them consistently inside every branch of a complex template is up to the model, not an enforced post-processing step. For domain-critical vocabulary, review the output.

Practical recommendation: structure source strings as ICU MessageFormat upfront when the string depends on counts or selectors, use the full set of plural categories your target languages need, and review the translated branches before shipping. For non-ICU strings, the same "maintain formatting" contract covers {name} and {{name}} interpolation tokens.
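Part of that review can be automated. The sketch below is not a LangSync feature; it is a hypothetical pre-ship check you could run yourself. pluralSelectors and missingPluralCategories are made-up names, only the first plural block in a template is inspected, ICU apostrophe quoting is ignored, and the required categories come from whatever CLDR data your runtime ships.

```typescript
// Collects the branch selectors (one, few, =0, ...) of the first plural block in a template.
function pluralSelectors(template: string): string[] {
  const start = template.search(/\{\s*\w+\s*,\s*plural\s*,/);
  if (start < 0) return [];
  // Skip past "{arg, plural," to the first branch selector.
  const bodyStart = template.indexOf(",", template.indexOf(",", start) + 1) + 1;
  const selectors: string[] = [];
  let depth = 0;
  let token = "";
  for (let i = bodyStart; i < template.length; i++) {
    const ch = template[i];
    if (ch === "{") {
      if (depth === 0 && token.trim()) selectors.push(token.trim());
      token = "";
      depth++;
    } else if (ch === "}") {
      if (depth === 0) break; // closing brace of the plural argument itself
      depth--;
    } else if (depth === 0) {
      token += ch;
    }
  }
  return selectors;
}

// Lists the CLDR categories of the target locale that the template has no branch for.
function missingPluralCategories(template: string, locale: string): string[] {
  const present = pluralSelectors(template);
  const required = new Intl.PluralRules(locale).resolvedOptions().pluralCategories;
  return required.filter((category) => present.indexOf(category) === -1).sort();
}

const czechFromEnglish = "{count, plural, one {# položka} other {# položek}}";
console.log(missingPluralCategories(czechFromEnglish, "cs")); // [ 'few', 'many' ]
console.log(missingPluralCategories(czechFromEnglish, "en")); // []
```

A missing many for Czech only matters if the count can be fractional, so treat the result as a prompt for review, not as a hard failure.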

This describes how LangSync's translation pipeline works today. The prompt, the underlying model, and the post-processing pipeline may evolve; check the LangSync docs if you build critical workflows on top of specific behavior.

FAQ

Common questions about ICU MessageFormat

Do I need ICU MessageFormat for every string?
No. ICU is for strings that depend on counts, choices, or formatted numbers. A static button label like "Save" does not need ICU. A rule of thumb: if the string contains an argument value that affects grammar or pluralization, ICU pays for itself. If it does not, plain interpolation is shorter and easier to translate.
Why does my i18n library throw an error on `{count, plural, one {# item}}`?
The other branch is mandatory in ICU plurals. Every plural and select block must have an other fallback — the parser rejects the template without it. Same for selectordinal. Add other {# items} and the error goes away.
What does `#` mean inside a plural branch?
It is shorthand for the original argument value. {count, plural, one {# item} other {# items}} with count = 5 renders 5 items. You can also use {count} directly, but # is the canonical ICU placeholder and recognized by every ICU parser.
Can I use ICU with i18next?
Yes, via the i18next-icu postprocessor. By default i18next uses its own plural-suffix convention (cart_one, cart_other). With i18next-icu loaded, ICU strings parse correctly. Some teams use both — ICU for complex strings, suffix convention for simple plurals — but it makes the file harder to read. Pick one per namespace.
How do I know which plural categories my target language needs?
Look up the language code in the Unicode CLDR plural-rules data. The CLDR project publishes the canonical list; libraries like make-plural package it. As a quick reference: English / German / Dutch use one + other; Czech / Slovak / Polish / Russian use one + few + many + other; Arabic uses all six. East Asian languages like Japanese / Chinese / Korean use other only. Recent CLDR releases also add a many category to Spanish and French for very large numbers, so check the data version your runtime ships.
Should the AI generate the extra plural branches my target language needs?
Not today on LangSync. The AI translates the branches present in the source — if the English source has two and Czech needs three, the AI translates the two and stops. The honest fix is to structure source strings with all branches your target languages need from the start. We are tracking auto-expansion of plural categories as a roadmap item.
What about ICU on the backend (server-rendered HTML, emails)?
ICU works the same on the server. Use ICU4J for Java, PyICU for Python, and intl-messageformat (formatjs) or @messageformat/core for Node.js. Intl.MessageFormat is still a TC39 proposal and does not ship in Node.js today. The MessageFormat syntax is largely the same across these implementations; only the parser differs.