On August 8, 2025, the first commit landed. Six months and 671 commits later, Neural Summary processes audio files up to 5GB, generates 30 types of professional documents, supports five languages, and serves users who trust us with their most important conversations.
This is the full story of how we built it. Not a highlight reel. The real timeline, with what worked, what broke, and what we learned along the way.
We are sharing this because we believe in building in public. Because every team that ships something ambitious deserves to know they are not alone in the messy middle. And because our users deserve to know the care that goes into the product they rely on.
Month 1: The prototype (August 2025)
The first version was simple. Upload an audio file. Get a transcript. Get a summary.
But even in week one, we felt the weight of the problem we were solving. People record their most important conversations: client calls, strategy sessions, coaching moments. They trust us with words that matter. That responsibility shaped every technical decision from day one.
The first real engineering problem arrived immediately: audio files are large, and speech-to-text APIs have size limits. We built an audio splitting pipeline that chunks large files into segments, processes them in parallel, and merges the results in chronological order. That pipeline, with hardening added over time, still runs in production today.
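The chunk-and-merge logic can be sketched in a few lines. This is a minimal illustration assuming a fixed maximum segment length; the names (`splitIntoSegments`, `mergeTranscripts`) are illustrative, not the production API.

```typescript
interface Segment {
  start: number; // offset into the original audio, in seconds
  end: number;
}

interface TranscriptChunk {
  segment: Segment;
  text: string;
}

// Split a file of `duration` seconds into segments no longer than `maxLen`,
// so each segment stays under the speech-to-text API's size limit.
function splitIntoSegments(duration: number, maxLen: number): Segment[] {
  const segments: Segment[] = [];
  for (let start = 0; start < duration; start += maxLen) {
    segments.push({ start, end: Math.min(start + maxLen, duration) });
  }
  return segments;
}

// Chunks processed in parallel may finish out of order; sort by original
// offset before joining so the merged transcript reads chronologically.
function mergeTranscripts(chunks: TranscriptChunk[]): string {
  return [...chunks]
    .sort((a, b) => a.segment.start - b.segment.start)
    .map((c) => c.text)
    .join(" ");
}
```

The production pipeline adds hardening around this core (overlap at boundaries, retries), but the shape of the problem is exactly this: split under a limit, process in parallel, reassemble in order.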
In the same week, we made a decision that would pay off for months: multi-language support from day one. English, Dutch, German, French, Spanish. Not just UI translations, but language-aware AI generation. Every document the system produces respects the user's language.
It felt like overkill for a prototype. It was not. Our users work internationally. Their conversations cross language boundaries. Supporting that from the start meant we never had to retrofit it, and our users never had to wait for it.
By mid-August, we had a working deployment with Docker, a monorepo architecture (NestJS backend, Next.js frontend, shared TypeScript types), and the first version of what would become our analysis system.
Key decisions in Month 1:
- Monorepo from the start. Shared types between frontend and backend prevented an entire class of bugs.
- Queue-based processing architecture. Audio transcription is slow, so we built it asynchronous from day one with a job queue and WebSocket progress updates.
- Language-aware from birth. Adding i18n later is painful. Adding it to AI-generated content later is worse.
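The queue-based decision above can be sketched as a tiny sequential processor. This is a simplified stand-in: in production the queue is persistent and progress goes out over WebSockets, so a plain callback here takes the socket's place, and the job shape is illustrative.

```typescript
interface TranscriptionJob {
  id: string;
  // Long-running work that reports percent complete as it goes.
  run: (report: (percent: number) => void) => Promise<void>;
}

async function processQueue(
  jobs: TranscriptionJob[],
  onProgress: (jobId: string, percent: number) => void,
): Promise<void> {
  for (const job of jobs) {
    // Jobs are awaited here, outside the HTTP request/response cycle,
    // so a slow transcription never blocks the upload endpoint.
    await job.run((pct) => onProgress(job.id, pct));
    onProgress(job.id, 100); // final completion event
  }
}
```

The important property is that the upload request only enqueues; everything slow happens behind the queue, with progress pushed to the client as it occurs.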
Month 2: The first users (September 2025)
September was about making the product shareable. We added link-based transcript sharing with email distribution, proper analytics, and a landing page that communicated what we were building.
Real people started using it. That changed everything.
We watched how they interacted with the product. We read their feedback. We noticed the gap between what we built and what they needed. The product worked, but the output was generic. A sales call and a coaching session produced identical summaries. Our users deserved better than one-size-fits-all.
Month 3: The V2 rewrite (October 2025)
October was the biggest single month in the project's history. We rebuilt the analysis system from scratch.
The old system generated markdown blobs. The new system generates structured JSON. This sounds like a technical detail, but it changed everything. Structured output meant we could render the same analysis differently across platforms, search inside generated documents, translate them programmatically, and build entirely new features on top of existing data.
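A small sketch makes the difference concrete. The `ActionItem` shape below is hypothetical, but it shows the property that mattered: one JSON structure, many consumers.

```typescript
interface ActionItem {
  owner: string;
  task: string;
  due?: string; // ISO date, when one could be extracted
}

// The same structured data can be rendered for any surface...
function toMarkdown(items: ActionItem[]): string {
  return items
    .map((i) => `- [ ] ${i.task} (${i.owner}${i.due ? `, due ${i.due}` : ""})`)
    .join("\n");
}

// ...or searched and filtered programmatically, which a markdown blob
// would only allow through fragile re-parsing.
function itemsForOwner(items: ActionItem[], owner: string): ActionItem[] {
  return items.filter((i) => i.owner === owner);
}
```

With a markdown blob, the renderer is the data. With structured JSON, rendering, search, and translation are all just functions over the same object.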
We launched 15 analysis templates in a single release: action items, email drafts, blog posts, coaching notes, CRM notes, executive summaries, and more. Each template was a different "lens" through which the same conversation could be viewed. A sales call could produce a follow-up email, a CRM update, and action items. A strategy session could produce a brief, a decision log, and a process diagram. Same conversation. Multiple outputs. Each one crafted to be something a professional would be proud to send.
In the same release, we added subscription billing, usage tracking, and a free tier. And then we found eight security vulnerabilities in our own code during a pre-launch audit: command injection in our media pipeline, missing rate limiting, weak hashing for verification codes, database injection risks, and cross-site scripting vectors. We fixed all eight before launch. Our users' data is sacred to us. There was no discussion about shipping with known vulnerabilities.
V2.0.0 shipped on October 31.
What we learned:
- Structured output over markdown. Always. The upfront cost of defining JSON schemas pays for itself within weeks.
- Security auditing before launch is non-negotiable. We found a CVSS 9.8 command injection vulnerability. It would have been catastrophic in production.
Month 4: Polish and performance (November-December 2025)
This period brought three releases in quick succession: V2.1, V2.2, and V2.3.
V2.1 introduced folder organization, profile photos, recording recovery, and a better upload experience. The kind of features that make software feel like software instead of a prototype. These were not glamorous features. They were the features that told our users: we care about your daily experience, not just the headline capabilities.
V2.2 was the brand pivot. We changed the primary color from pink to purple, switched the typeface to Montserrat, and updated 92 component files in a single release. We also ran a comprehensive performance audit: memoized React contexts to prevent cascading re-renders, pre-computed folder counts with Map structures to eliminate O(n × m) rendering, parallelized database writes (cutting folder deletion from 30 seconds to under 1 second for 50+ items), and cached authentication tokens.
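The folder-count optimization is a good example of how simple these wins were. Instead of filtering the full conversation list once per folder (O(n × m)), a single pass builds a Map of counts in O(n). The shapes below are illustrative.

```typescript
interface Conversation {
  id: string;
  folderId: string | null; // null for conversations not in any folder
}

// One pass over all conversations; each folder's count is a Map lookup.
function folderCounts(conversations: Conversation[]): Map<string, number> {
  const counts = new Map<string, number>();
  for (const c of conversations) {
    if (c.folderId === null) continue;
    counts.set(c.folderId, (counts.get(c.folderId) ?? 0) + 1);
  }
  return counts;
}
```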
V2.3 added dark mode with an inline theme script that prevents the white flash on page load, skeleton loading states that match real layouts, and micro-animations that make the interface feel alive without slowing it down.
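The decision logic behind an inline theme script looks something like the pure function below. The real script runs inline in the document head before first paint, reading localStorage and the prefers-color-scheme media query; this extraction of the logic is a sketch, and the names are illustrative.

```typescript
type Theme = "dark" | "light";

// Prefer the user's saved choice; fall back to the OS preference.
// Running this before first paint is what prevents the white flash.
function resolveTheme(stored: string | null, prefersDark: boolean): Theme {
  if (stored === "dark" || stored === "light") return stored;
  return prefersDark ? "dark" : "light";
}
```

In the browser, the result is applied immediately, e.g. by setting a class on the root element, before React hydrates.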
Three releases in one month. Looking back, we should have combined V2.2 and V2.3. The rebranding and dark mode were really one initiative. But the urgency came from a good place: we wanted our users to feel the improvement immediately, not wait for a monolithic release.
Month 5: The lens redesign (January-February 2026)
This is the phase we are most proud of.
We took every analysis template and redesigned it from scratch. The goal was simple: if a senior consultant, product manager, or sales professional would not be proud to send the output to a client, the template was not good enough.
We spent hours reading consulting deliverables, sales playbooks, and product specs. Not to copy them, but to understand what makes a professional document feel professional. Structure. Clarity. The right level of detail. A point of view.
Executive summaries became MBB-style consulting briefs with the Pyramid Principle structure. CRM notes became sales intelligence with BANT qualification snapshots and stakeholder mapping. Coaching notes became developmental intelligence with classified key moments and accountability tracking. Action items got consulting-grade quality with domain-expert prompts.
We went from 15 templates to 30, organized into five categories that progress from passive to active: Capture, Analyze, Communicate, Deliver, Act.
The key insight: the prompts matter more than the model. A well-designed prompt with clear structure, examples of good and bad output, word limits, and domain-expert positioning consistently produces better output than a vague prompt on a more powerful model. We iterated on every single template dozens of times. The quality our users experience today is the result of that obsessive attention to detail.
Month 6: Streaming chat and cross-conversation intelligence (March-April 2026)
Two features defined this month.
First, streaming AI chat. Users can now have a conversation about their conversation. Ask questions, get citations with timestamps and speaker names, explore the content interactively. The responses stream token by token using Server-Sent Events, with citation buffering that holds back incomplete references until they resolve. It feels instant. It feels like talking to someone who was in the room.
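The citation-buffering idea can be sketched as a small stateful filter over the token stream. The `[cite:...]` marker format here is an assumption for illustration, not the production wire format: text is shown as soon as it arrives, but a partially received citation is held back until its closing bracket resolves.

```typescript
class CitationBuffer {
  private pending = "";

  // Returns the portion of the stream that is safe to display now.
  push(chunk: string): string {
    this.pending += chunk;
    const open = this.pending.lastIndexOf("[");
    if (open !== -1 && !this.pending.includes("]", open)) {
      // An unclosed citation marker: emit everything before it and
      // hold the incomplete marker until the next chunk arrives.
      const emit = this.pending.slice(0, open);
      this.pending = this.pending.slice(open);
      return emit;
    }
    const emit = this.pending;
    this.pending = "";
    return emit;
  }

  // Flush whatever remains when the stream ends.
  flush(): string {
    const rest = this.pending;
    this.pending = "";
    return rest;
  }
}
```

The effect is that the user never sees a half-rendered reference: plain text streams token by token, while citations pop in whole.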
Second, folder-level intelligence. Organize conversations into folders, then ask questions that span all of them. "What decisions have we made about pricing across the last 12 sales calls?" The system indexes all conversations and returns answers with citations referencing the specific conversation, timestamp, and speaker.
These two features together transformed the product from a document generator into a knowledge base. Your conversations are no longer isolated recordings. They are a searchable, queryable body of institutional knowledge.
What we got right
The monorepo. Shared TypeScript types between frontend and backend prevented countless integration bugs. When a type changes in one place, everything that depends on it breaks at compile time, not at runtime.
Queue-based processing from day one. We never had to retrofit asynchronous processing. The architecture scaled from one user to concurrent jobs without a rewrite.
Structured JSON output. This single decision enabled multi-language support, semantic search, programmatic exports, and the entire lens template system. Markdown would have been a dead end.
Investing in the lens templates. The product differentiation is not in the transcription. Every competitor transcribes. The differentiation is in what the system produces afterward: the consulting brief, the backlog, the follow-up email, the process diagram. The templates are the product. And we treat them with the same care a craftsperson treats their finest work.
What we would change
Earlier user research. We built the first three months primarily from our own intuition about what users needed. Talking to users earlier would have accelerated the lens redesign and helped us find product-market fit faster.
Fewer releases, more testing. Three releases in one month was too many. Each one introduced subtle regressions that took time to find. A two-week stabilization cycle between major releases would have been worth the slower pace.
The brand pivot earlier. We changed the brand twice: once from pink to purple, once from "voice-to-output creation platform" to "AI meeting notes for business professionals." Both pivots were right. But making them months into the project meant updating hundreds of components and translations twice. If we had invested two weeks in brand clarity before writing code, we would have saved many more weeks later.
What is next
We are building a mobile companion app so users can record meetings anywhere. The lens template library continues to expand. Deeper integrations with the tools teams already use are underway. And the AI interview feature, where Neural Summary asks questions to extract and structure ideas into deliverable documents, is in development.
671 commits. 30 lens templates. Five languages. Streaming AI chat. Cross-conversation intelligence. Built by a small team in six months.
We built this for the professionals who spend hours after every meeting writing the follow-up email, the spec, the backlog, the brief. We built it because that work matters, and because your time matters more.
The product is live. The roadmap is long. And we are publishing the engineering stories behind every major feature in the coming weeks. If you are building something ambitious with a small team, we hope this timeline gives you energy. It is possible. Ship it.


