Five languages from day one: internationalization in a full-stack monorepo

Neural Summary supports five languages: English, Dutch, German, French, and Spanish. That covers not just the UI buttons and labels but also the AI-generated content: every summary, every lens output, and every chat response respects the user's language preference.

We added i18n on day one to serve both our US and European audiences; the first commit already included multi-language support. This post explains why that was the best architectural decision we made, how we implemented it across the full stack, and what makes AI-content localization different from traditional i18n.

Three layers of language

Traditional internationalization has one layer: UI strings. Button labels, error messages, page titles. You put them in translation files and swap them based on locale.

Our system has three layers:

Layer 1: UI strings. The standard i18n: labels, buttons, navigation, error messages, tooltips. Five JSON translation files, one per locale.

Layer 2: AI-generated content. These are the summaries, lens outputs, and chat responses generated in the user's language. This requires language-aware prompts, not just string replacement.

Layer 3: Content translation. Users can translate an existing summary or lens output into any of the five supported languages. This is on-demand translation using an LLM, not pre-computed.

We learned that each layer has different technical requirements and different failure modes.

Layer 1: UI strings

The frontend uses an i18n library with locale-based routing. URLs include the locale: /en/dashboard, /nl/dashboard, /de/dashboard. The middleware detects the locale from the URL and loads the corresponding translation file.
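A minimal sketch of the locale detection step, assuming English is the fallback when a path has no recognized prefix. The names `splitLocale`, `isLocale`, and `DEFAULT_LOCALE` are hypothetical and are not the i18n library's API; only the list of five locales comes from this post.

```typescript
const SUPPORTED_LOCALES = ["en", "nl", "de", "fr", "es"] as const;
type Locale = (typeof SUPPORTED_LOCALES)[number];
const DEFAULT_LOCALE: Locale = "en"; // assumption: English is the fallback

function isLocale(value: string): value is Locale {
  return (SUPPORTED_LOCALES as readonly string[]).indexOf(value) !== -1;
}

// Split "/nl/dashboard" into its locale prefix and the rest of the path.
function splitLocale(pathname: string): { locale: Locale; rest: string } {
  const segments = pathname.split("/").filter((s) => s.length > 0);
  const first = segments[0];
  if (first !== undefined && isLocale(first)) {
    return { locale: first, rest: "/" + segments.slice(1).join("/") };
  }
  // No recognized prefix: keep the whole path and use the fallback locale.
  return { locale: DEFAULT_LOCALE, rest: "/" + segments.join("/") };
}

splitLocale("/nl/dashboard"); // { locale: "nl", rest: "/dashboard" }
splitLocale("/pricing");      // { locale: "en", rest: "/pricing" }
```

The `locale` value is what would select the translation file; the `rest` value is what the router would match against.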

Each locale has a JSON file with nested keys:

{
  "dashboard": {
    "greeting": "Good morning",
    "conversations": "Conversations",
    "folders": "Folders"
  },
  "lens": {
    "apply": "Apply a lens",
    "generating": "Generating..."
  }
}

React components use a hook to access translated strings:

const t = useTranslations('dashboard');
return <h1>{t('greeting')}, {user.name}</h1>;
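To make the lookup concrete, here is a rough sketch of what such a hook does with the nested JSON: it walks the namespace and key, segment by segment. `createTranslator` is a hypothetical stand-in written for illustration, not the library's implementation, and returning the full key for a missing string is an assumption.

```typescript
type Messages = { [key: string]: string | Messages };

// Hypothetical stand-in for the lookup a translation hook performs.
function createTranslator(messages: Messages, namespace: string) {
  return (key: string): string => {
    let node: string | Messages | undefined = messages;
    for (const part of `${namespace}.${key}`.split(".")) {
      if (node === undefined || typeof node === "string") {
        node = undefined; // the path runs deeper than the data
        break;
      }
      node = node[part];
    }
    // Returning the full key makes a missing translation visible in the UI.
    return typeof node === "string" ? node : `${namespace}.${key}`;
  };
}

const en: Messages = {
  dashboard: { greeting: "Good morning", conversations: "Conversations" },
  lens: { apply: "Apply a lens", generating: "Generating..." },
};

const t = createTranslator(en, "dashboard");
t("greeting"); // "Good morning"
t("missing");  // "dashboard.missing"
```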

This is standard, and most modern frameworks support it. The interesting challenges are in the other two layers.

The capitalization problem

Different languages have different capitalization rules, and this matters for headings, labels, and UI text.

English: Sentence case for most UI text. "How it works" not "How It Works."

German: Nouns are always capitalized, verbs are not. In "So funktioniert es", "funktioniert" stays lowercase because it is a verb, while a heading like "Wichtigste Erkenntnisse" capitalizes "Erkenntnisse" because it is a noun.

Dutch, Spanish, French: Similar to English. Only capitalize the first word and proper nouns.

We decided early against a global capitalization function. There is no rule you can apply across five languages without breaking at least one of them. Instead, each translated string is capitalized correctly for its own language by the translator (human or AI-assisted), and the files simply follow each language's rules.

The worst bug in this layer had nothing to do with capitalization. The i18n library was quietly auto-detecting browser language preferences and overriding the user's explicit choice. A Dutch user who selected English would see Dutch again on their next visit, because their browser's Accept-Language header listed nl first. From the user's perspective, the app simply refused to stay in English. We disabled browser detection entirely and now rely on the URL locale and a preference cookie, nothing else.
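The resulting precedence can be sketched as follows. This is an illustration under assumptions, not the production middleware: the order (URL first, then cookie) and the English fallback are guesses, and `resolveLocale` and `LocaleSignals` are hypothetical names. The point is that the Accept-Language value is accepted but never consulted.

```typescript
const SUPPORTED = ["en", "nl", "de", "fr", "es"];

interface LocaleSignals {
  urlLocale?: string;      // from the path prefix, e.g. "nl" in /nl/dashboard
  cookieLocale?: string;   // the user's explicit, stored choice
  acceptLanguage?: string; // browser header; deliberately never read below
}

function resolveLocale(signals: LocaleSignals, fallback = "en"): string {
  // Assumed order: an explicit URL locale wins, then the preference cookie.
  const candidates = [signals.urlLocale, signals.cookieLocale];
  for (const candidate of candidates) {
    if (candidate !== undefined && SUPPORTED.indexOf(candidate) !== -1) {
      return candidate;
    }
  }
  return fallback;
}

// The Dutch user from the example: browser says nl, explicit choice says en.
resolveLocale({ cookieLocale: "en", acceptLanguage: "nl,en;q=0.8" }); // "en"
```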

Layer 2: AI-generated content

This is where i18n gets interesting.

When Neural Summary generates a summary or lens output, everything must come back in the user's language. Not just the prose, but every field: headings, descriptions, action item text, labels, chart titles.

Our first prompts asked for this politely. The model treated it as a suggestion: roughly 30% of fields in non-English outputs came back in English anyway. Structural text like section headers and labels was the worst offender, because the model has seen far more English examples of those patterns than Dutch or German ones.

So every prompt now carries a hard language rule. In sketch form (the production block is worded differently):

LANGUAGE RULE (non-negotiable): every string value in the
output is written in German. Headings, labels, body content,
all of it. An English fallback in any field is a defect.

The forcefulness is deliberate and tested. Every softer formulation we tried leaked more English.
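One way to inject such a rule into every prompt is a small builder keyed by locale. The sketch below reuses the wording of the sketch above rather than the production block; `buildLanguageRule`, `buildPrompt`, and the choice to put the rule at the top of the prompt are assumptions made for illustration.

```typescript
type Locale = "en" | "nl" | "de" | "fr" | "es";

const LANGUAGE_NAMES: Record<Locale, string> = {
  en: "English",
  nl: "Dutch",
  de: "German",
  fr: "French",
  es: "Spanish",
};

function buildLanguageRule(locale: Locale): string {
  const lines = [
    `LANGUAGE RULE (non-negotiable): every string value in the output is written in ${LANGUAGE_NAMES[locale]}.`,
    "Headings, labels, body content, all of it.",
  ];
  // The English-fallback warning only makes sense for non-English targets.
  if (locale !== "en") {
    lines.push("An English fallback in any field is a defect.");
  }
  return lines.join("\n");
}

// Assumption: the rule is placed before the template's own instructions.
function buildPrompt(templateInstructions: string, locale: Locale): string {
  return `${buildLanguageRule(locale)}\n\n${templateInstructions}`;
}
```

Keeping the rule in one function means a wording change that reduces leakage applies to every template at once.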

Template-level language awareness

The blanket rule turned out to be necessary but not sufficient. Each lens template has its own language quirks.

Action items taught us about verb forms. An action item must use the imperative in the target language: "Ship the pricing page" in English, "Veröffentliche die Preisseite" in German. Without explicit instruction, the LLM drifts into infinitives and subjunctives that read like suggestions instead of tasks.

Professional terminology varies by language, and we learned not to translate it ourselves. A "strategy brief" is a "Strategiebrief" in German, "note stratégique" in French, "nota estratégica" in Spanish. We instruct the LLM to use whatever terminology a professional in that language would actually use.

Date and number formatting stayed out of the prompts entirely. German writes 1.234,56 where English writes 1,234.56. The rendering layer handles this, not the LLM.
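In a TypeScript rendering layer, the built-in `Intl` API covers this without any model involvement. A minimal sketch, assuming the app's locale codes can be passed straight to `Intl`; the helper names are hypothetical.

```typescript
// Assumption: the app's locale codes ("en", "de", ...) are valid Intl locales.
function formatNumber(value: number, locale: string): string {
  return new Intl.NumberFormat(locale).format(value);
}

function formatDate(date: Date, locale: string): string {
  return new Intl.DateTimeFormat(locale, {
    year: "numeric",
    month: "long",
    day: "numeric",
    timeZone: "UTC", // fixed zone so a timestamp renders as the same day everywhere
  }).format(date);
}

formatNumber(1234.56, "en"); // "1,234.56"
formatNumber(1234.56, "de"); // "1.234,56"
```

The model only ever emits the raw number or timestamp; the locale-specific presentation is decided at render time.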

Layer 3: On-demand translation

Generating in your own language covers most everyday use. But there is a third situation: the output already exists, and you need it in another language. Think of a Dutch consultant who records a client call in Dutch and then owes an English summary to a colleague abroad. That is Layer 3: translating existing content on demand, separate from generating it in a language.

The pipeline:

The translation preserves the JSON structure: a translated executive summary has the same schema as the original, with every string value translated. This is where the decision to generate structured output kept paying off. The same rendering component works for any language, because the shape never changes.
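Because an LLM produces the translated JSON, it is worth checking that the shape really did survive. The sketch below is one possible check, not the production pipeline: `sameShape` is a hypothetical helper that accepts any change to string values but requires identical keys, array lengths, and non-string values.

```typescript
type Json = string | number | boolean | null | Json[] | { [key: string]: Json };

function sameShape(a: Json, b: Json): boolean {
  if (Array.isArray(a) && Array.isArray(b)) {
    if (a.length !== b.length) return false;
    for (let i = 0; i < a.length; i++) {
      if (!sameShape(a[i] as Json, b[i] as Json)) return false;
    }
    return true;
  }
  if (Array.isArray(a) || Array.isArray(b)) return false;
  if (a !== null && b !== null && typeof a === "object" && typeof b === "object") {
    const keys = Object.keys(a);
    if (keys.length !== Object.keys(b).length) return false;
    for (const key of keys) {
      if (!(key in b) || !sameShape(a[key] as Json, b[key] as Json)) return false;
    }
    return true;
  }
  // Leaves: strings may differ in content; everything else must be identical.
  if (typeof a === "string" && typeof b === "string") return true;
  return a === b;
}

const original: Json = { title: "Key takeaways", items: ["Ship the pricing page"], count: 1 };
const translated: Json = { title: "Wichtigste Erkenntnisse", items: ["Veröffentliche die Preisseite"], count: 1 };
sameShape(original, translated); // true
```

A translation that fails this check can be rejected or retried before it ever reaches the rendering component.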

Parallel translation

We also decided users should not wait for translations one piece at a time. When a translation is requested, the summary and every generated lens output translate simultaneously: a conversation with a summary and three lens outputs sends four concurrent requests.

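// Start every translation at once; total time is roughly that of the slowest request.
// Note: Promise.all rejects as soon as any single translation fails.
await Promise.all([
  translateSummary(transcriptionId, targetLanguage),
  ...lensOutputs.map((lens) => translateLens(lens.id, targetLanguage)),
]);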

The effect: a conversation with five lens outputs translates in roughly the same time as a conversation with one.

Translation preference persistence

Translate a conversation to French once and it opens in French from then on. We deliberately store that preference per conversation, not globally: the same user might want their French client meetings in French and their internal team meetings in English. A global setting would get one of the two wrong.
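As a data-model sketch: the preference lives on the conversation record itself. `preferredTranslationLanguage` is the field name used on our shared Transcription type; `originalLanguage`, the helper functions, and the fallback to the original language are assumptions made for illustration.

```typescript
type Locale = "en" | "nl" | "de" | "fr" | "es";

interface Transcription {
  id: string;
  originalLanguage: Locale;              // hypothetical field
  preferredTranslationLanguage?: Locale; // set once a translation is requested
}

// Called after a successful translation: remember the choice on this record only.
function rememberTranslation(t: Transcription, language: Locale): Transcription {
  return { ...t, preferredTranslationLanguage: language };
}

// Assumption: without a stored preference, the original language is shown.
function languageToOpen(t: Transcription): Locale {
  return t.preferredTranslationLanguage ?? t.originalLanguage;
}

const clientCall = rememberTranslation({ id: "a", originalLanguage: "en" }, "fr");
const teamSync: Transcription = { id: "b", originalLanguage: "en" };
languageToOpen(clientCall); // "fr"
languageToOpen(teamSync);   // "en"
```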

Shared transcript translations

Sharing forced one more design decision. When someone opens a shared conversation link, do we translate on demand for them, or do we hand them whatever translations already exist?

We chose the second. Recipients get read-only access to every translation the sender has generated, and the language switcher in the shared view is instant: no API calls, no cost, no latency, because the data is already there. To the recipient, the document simply exists in every language the sender has translated it into.
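In code, that choice reduces the shared view's language switcher to a plain lookup. A rough sketch with a hypothetical payload shape (`SharedConversation`, `versions`); the real shared payload may look different.

```typescript
interface SharedConversation {
  originalLanguage: string;
  // Every version that already exists, keyed by locale, including the original.
  versions: { [locale: string]: string };
}

// The switcher offers exactly the languages that are already stored.
function availableLanguages(shared: SharedConversation): string[] {
  return Object.keys(shared.versions);
}

// A plain lookup: returns undefined instead of requesting a new translation.
function versionFor(shared: SharedConversation, locale: string): string | undefined {
  return Object.prototype.hasOwnProperty.call(shared.versions, locale)
    ? shared.versions[locale]
    : undefined;
}

const shared: SharedConversation = {
  originalLanguage: "nl",
  versions: { nl: "Samenvatting", en: "Summary" },
};
availableLanguages(shared); // ["nl", "en"]
versionFor(shared, "de");   // undefined
```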

The monorepo advantage

This is where the monorepo quietly earned its keep.

When we added the preferredTranslationLanguage field to the shared Transcription type, TypeScript immediately flagged every place in both the frontend and the backend that needed to handle it. No grep, no guesswork, no field that the API writes and the UI forgets to read.

The supported locales list (['en', 'nl', 'de', 'fr', 'es']) lives in one shared configuration that both the frontend routing and the backend translation service import. If we ever add a sixth language, the change is one entry in that file plus one translation JSON.
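A sketch of how such a shared file can make the compiler do the bookkeeping. The file path, `MESSAGE_FILES`, and `parseLocale` are hypothetical; only the locale list is taken from above.

```typescript
// shared/locales.ts (hypothetical path), imported by frontend and backend alike
const SUPPORTED_LOCALES = ["en", "nl", "de", "fr", "es"] as const;
type Locale = (typeof SUPPORTED_LOCALES)[number];

// Record<Locale, ...> must contain every locale. Adding a sixth entry to
// SUPPORTED_LOCALES makes this object fail to compile until it is extended too.
const MESSAGE_FILES: Record<Locale, string> = {
  en: "messages/en.json",
  nl: "messages/nl.json",
  de: "messages/de.json",
  fr: "messages/fr.json",
  es: "messages/es.json",
};

// Guard for untrusted input such as a request parameter.
function parseLocale(input: string): Locale | undefined {
  for (const locale of SUPPORTED_LOCALES) {
    if (locale === input) return locale;
  }
  return undefined;
}

parseLocale("de"); // "de"
parseLocale("it"); // undefined
```

Any other per-locale table typed as `Record<Locale, ...>` gets the same compile-time completeness check for free.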

What we would do differently

Introduce human sense-checks earlier. We initially trusted AI translations for UI strings and only reviewed them when something looked off. Introducing native-speaker spot-checks from the start, even informally, would have caught awkward phrasing and false-friend translations before they reached production.

Add a translation memory system. Our current system translates each piece of content independently. If the same phrase appears in 50 conversations, it is translated 50 times. A translation memory that caches common phrases and maintains consistency across translations would reduce costs and improve quality.

Test German more aggressively. German capitalization, compound word formation, and formal/informal address conventions make it the hardest language in our set. We kept finding and fixing German-specific issues throughout development; a German-speaking tester from day one would have caught most of them before we did.
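For the translation memory idea above, the simplest form is a cache in front of the translation call. This is a sketch of something we have not built: `withTranslationMemory` and `TranslateFn` are hypothetical names, the cache is in-memory only, and it matches exact phrases after trimming whitespace rather than doing any fuzzy matching.

```typescript
type TranslateFn = (text: string, target: string) => Promise<string>;

function withTranslationMemory(translate: TranslateFn): TranslateFn {
  // Promises are cached so concurrent requests for one phrase share one call.
  const memory = new Map<string, Promise<string>>();
  return (text: string, target: string): Promise<string> => {
    const phrase = text.trim();
    const key = JSON.stringify([target, phrase]);
    let cached = memory.get(key);
    if (cached === undefined) {
      cached = translate(phrase, target);
      // Do not keep failures: a later request should retry.
      cached.catch(() => {
        memory.delete(key);
      });
      memory.set(key, cached);
    }
    return cached;
  };
}
```

Wrapping the existing translation function this way would leave callers unchanged; only repeated phrases stop costing a request.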

What this changed about our architecture

Adding i18n on day one cost us a few extra hours per feature. Each new component was built with translated strings. Each new prompt included the language injection. Each new data field was designed to support translations. That small incremental effort compounded into something significant.

Here is what that early investment bought us:

1. Every new feature ships in five languages automatically. There is no separate localization sprint. The translated strings are written alongside the component. The language-aware prompt is part of the template. It is just how we build.
2. The architecture never needs retrofitting. URL routing already includes locale prefixes. Database schemas already support multilingual content. AI prompts already inject language requirements. None of these need to be added after the fact.
3. SEO works across markets from launch. Each locale has its own metadata, structured data, and hreflang tags. We rank in five languages without a separate internationalization project.
4. The codebase stays clean. No hardcoded strings to extract. No rendering components to retrofit. No prompt rewrites to schedule. The i18n patterns are baked into the way we write code, not bolted on.
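With locale-prefixed URLs, the hreflang tags mentioned above can be derived mechanically from the locale list. A minimal sketch: `hreflangTags` is a hypothetical helper, `https://example.com` is a placeholder origin, and pointing `x-default` at the English version is an assumption.

```typescript
const LOCALES = ["en", "nl", "de", "fr", "es"];

// Build one alternate link per locale for a locale-independent path.
function hreflangTags(origin: string, path: string): string[] {
  const tags = LOCALES.map(
    (locale) =>
      `<link rel="alternate" hreflang="${locale}" href="${origin}/${locale}${path}" />`
  );
  // Assumption: English is what visitors with an unmatched language should get.
  tags.push(`<link rel="alternate" hreflang="x-default" href="${origin}/en${path}" />`);
  return tags;
}

hreflangTags("https://example.com", "/dashboard");
```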

We have heard from other teams that adding i18n months after launch takes weeks of refactoring with high regression risk. By investing a few hours per feature from the start, we avoided that entirely. It is one of the clearest cases for early architectural investment we have encountered.

Five languages. Three layers. Built from the first commit. We have not regretted it once.