Lesson 01 · Market

Market: how to read the rings.

Markets are not single numbers. They are nested rings — total opportunity, the slice you can reach, and the wedge you can actually win in three years. This lesson teaches you to read those rings, treat competitors as evidence that the problem is real, and find the empty quadrant on a 2×2 map. We apply each idea to Babelio's space: AI dubbing, speech-to-speech translation, and consumer voice-OS overlays.

Duration

14 min read

Format

read + checklist

Goal

TAM / SAM / SOM

Outcome

5 numbers about your market

why this matters for you

context
Babelio sits inside three growing markets at once: AI dubbing ($1.15B, 17.7% CAGR), speech-to-speech translation, and the voice-OS overlay category that Wispr Flow proved fundable at a $700M valuation. You need to know which ring funds which kind of story.
risk
Apple, Microsoft and DeepL all moved in 2025: AirPods Live Translation, Teams Interpreter, DeepL Voice. Mis-size the market and you'll either pick a fight you can't win or miss a wedge sitting in plain sight.

What this lesson does / does not do.

Does

Teach the TAM / SAM / SOM frame and how to sanity-check each ring.
Show how to use competitors as validation, not threat.
Read a 2×2 positioning map and find empty quadrants.
Translate "market trends" into specific founder decisions.

Does not

Hand you a finished investor-pitch market slide.
Cover audience psychographics — that is Lesson 02.
Price a category (Lesson 04) or pick channels (Lesson 06).
Replace primary research with paid analyst reports.

01.

Concept 01 · TAM / SAM / SOM

Three rings of the same market.

4 min read

TAM, SAM and SOM are not three different markets — they are three views of the same market at different distances. Confusing them is the most common pitch-deck mistake.

TAM is total addressable market: every dollar that could be spent on this category by everyone, everywhere, if there were no friction. It tells you whether the problem is large enough to matter. SAM is the slice you can serve with your current product, geography and channel — same problem, narrower lens. SOM is what you can realistically capture in three years given your team, money and execution speed.

The mistake is using TAM as a proof of seriousness ("$100B market!") and SOM as if it were guaranteed. The correct reading is inverse: TAM proves the category isn't a hobby; SOM proves you've done the maths. A founder who can't compute SOM bottom-up is asking you to trust their vibes.
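A bottom-up SOM is just a funnel multiplied through, so it can be sketched in a few lines. The version below is a minimal illustration, not a forecasting model: the function name `bottom_up_arr` and its parameters are hypothetical, and the inputs are the US remote-worker figures quoted in this lesson (32.6M workers, 15% multilingual, 5% reachable, $8/mo).

```python
def bottom_up_arr(population, filters, price_per_month):
    """Annual recurring revenue from a funnel.

    population      -- size of the starting segment (people)
    filters         -- fractions applied in sequence (each between 0 and 1)
    price_per_month -- monthly price per paying user
    """
    reachable = float(population)
    for share in filters:
        if not 0.0 <= share <= 1.0:
            raise ValueError("each filter must be a fraction between 0 and 1")
        reachable *= share
    return reachable * price_per_month * 12


# Inputs quoted in this lesson for the US remote-worker segment.
us_remote_workers = 32_600_000
funnel = [0.15, 0.05]  # multilingual share, reachable share

arr = bottom_up_arr(us_remote_workers, funnel, price_per_month=8)
print(f"US remote-worker ARR: ${arr:,.0f}")
```

With these inputs the result lands at roughly $23.5M. The point of writing it this way is that every filter is a separate, challengeable assumption: halve the reachable share and the ARR halves with it.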

in your startup

TAM
AI in Language Translation umbrella: $2.94B (2025) → $3.68B (2026), 25.2% CAGR. Language Translation Software adjacent: $116.55B by 2035.
SAM
Speech-to-Speech Translation: $481.61M (2025) → $1.19B by 2030 (9.5–10.1% CAGR). AI Dubbing Tools adjacent: $1.15B → $2.56B (17.7% CAGR).
SOM Y3
Bottom-up: 32.6M US remote workers × 15% multilingual × 5% reachable ≈ 244,500 users; × $8/mo × 12 ≈ $23M ARR, US alone. Add streamer + student segments → ceiling ~$50M ARR, conservative target $5M ARR Y2.
implication
You sit in two double-digit-CAGR rings simultaneously. Story is "consumer voice-OS layer", anchored in the AI-dubbing ring for pitch and the speech-to-speech ring for technical depth.
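One cheap way to sanity-check a ring is to recompute the growth rate from the two endpoints and compare it with the CAGR quoted next to them. The sketch below does that for the figures above; the helper names `implied_cagr` and `project` are hypothetical, and the year spans (one year for 2025 to 2026, five for 2025 to 2030) are read off the quoted dates.

```python
def implied_cagr(start, end, years):
    """Compound annual growth rate implied by two endpoints."""
    return (end / start) ** (1 / years) - 1


def project(start, cagr, years):
    """Value after compounding `start` at `cagr` for `years` years."""
    return start * (1 + cagr) ** years


# AI in Language Translation: $2.94B (2025) -> $3.68B (2026)
ai_translation = implied_cagr(2.94, 3.68, 1)

# Speech-to-Speech Translation: $481.61M (2025) -> $1.19B (2030)
s2s = implied_cagr(481.61, 1190.0, 5)

# What the quoted upper-bound 10.1% CAGR would give by 2030, in $M
s2s_at_quoted_rate = project(481.61, 0.101, 5)

print(f"AI translation implied CAGR: {ai_translation:.1%}")
print(f"Speech-to-speech implied CAGR: {s2s:.1%}")
print(f"Speech-to-speech 2030 at 10.1%: ${s2s_at_quoted_rate:,.0f}M")
```

The AI translation endpoints reproduce the quoted 25.2%. The speech-to-speech endpoints imply a rate close to 20%, well above the quoted 9.5–10.1% range, which suggests the endpoints and the rate come from different reports. Worth tracing back to the sources before either number goes on a slide.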

02.

Concept 02 · Competitors as validation

Competitors are your audit, not your enemy.

4 min read

A market with no competitors is usually a market with no customers. Competitors are third-party evidence that the willingness-to-pay you assume actually exists.

Read competitors in three layers. Direct: same job, same channel — they reveal your real price ceiling and feature floor. Adjacent: same job, different shape — they teach you which assumptions to challenge. Substitutes: not products at all — the hack people use today (a side tab with Google Translate, a teammate translating inline). Substitutes are the truest measure of pain.

The worst category is one with many adjacent competitors and zero direct ones: it usually means the problem is too narrow to fund a dedicated tool. The best category is one with several well-funded direct competitors all attacking from one side and ignoring another. The ignored side is your wedge.

in your startup

direct
DeepL Voice ($8.74/mo, $2B valuation) — best MT, but app-locked to Teams/Zoom/mobile. Microsoft Teams Interpreter — Teams-only, enterprise-bundled. Krisp ($37.7M revenue, S2S in 2025) — system-audio, but pivoting B2B call-center.
adjacent
HeyGen (~$100M ARR) and ElevenLabs ($3.3B val) — async dubbing for video. Otter.ai ($100M ARR, 35M users) — captions, English-centric. Wispr Flow ($700M val, Nov 2025) — proves OS-overlay is fundable.
substitute
The real competitor: a Chrome tab with Google Translate + manual copy-paste, plus "asking a teammate to repeat".
read
No one occupies system-wide × voice-dub × consumer. DeepL/Teams are app-locked; HeyGen/Rask are async; Krisp is B2B. Wispr proves OS-overlay distribution. Combined, that is your fundability story.

03.

Concept 03 · The 2×2

Find the empty quadrant.

3 min read

A 2×2 is not a marketing artefact. It is the cheapest tool for seeing which orthogonal axis everyone else is ignoring.

Pick two axes that are independent — not correlated with each other and not a restatement of price-vs-quality. Plot every direct and adjacent competitor. The valuable output is not where you sit; it is the quadrant that stays empty. An empty quadrant is either a wedge (no one has solved it yet) or a graveyard (no one wants it). The interview script in Lesson 02 tells you which.

Two warnings. First, do not invent axes to put yourself top-right; investors see this in seconds. Second, an empty quadrant is a hypothesis, not a moat — Lesson 06 explains how to defend it before Big Tech notices.

in your startup

axes
X: per-app scope ↔ system-wide. Y: subtitles ↔ voice dub. Both axes describe real architectural choices, not vibes.
filled
Lower-left: Otter, Chrome Live Caption (per-app captions). Lower-right: Krisp, Wispr (system-wide but no translate). Upper-left: HeyGen, DeepL Voice, Teams Interpreter (voice dub but app-locked).
empty
Upper-right: system-wide voice dubbing for consumers. Babelio's lane. No incumbent has it as of 2026-05.
window
System-wide translation in Apple's macOS 16 is rumoured for WWDC. Your wedge has a ~12-month window; ship inside it.
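The same map can be kept as data, which makes it easy to re-run when a competitor moves. This is a minimal sketch: each axis is reduced to a yes/no flag, the placements copy the "filled" row above, and the names `competitors`, `QUADRANTS` and `fill_quadrants` are hypothetical.

```python
# (system_wide, voice_dub) flags per competitor, following the lesson's placement.
competitors = {
    "Otter": (False, False),
    "Chrome Live Caption": (False, False),
    "Krisp": (True, False),
    "Wispr": (True, False),
    "HeyGen": (False, True),
    "DeepL Voice": (False, True),
    "Teams Interpreter": (False, True),
}

# X axis: per-app (False) <-> system-wide (True)
# Y axis: subtitles (False) <-> voice dub (True)
QUADRANTS = {
    (False, False): "lower-left",
    (True, False): "lower-right",
    (False, True): "upper-left",
    (True, True): "upper-right",
}


def fill_quadrants(players):
    """Group player names by quadrant; quadrants with no players stay as empty lists."""
    grid = {name: [] for name in QUADRANTS.values()}
    for player, position in players.items():
        grid[QUADRANTS[position]].append(player)
    return grid


grid = fill_quadrants(competitors)
empty = [quadrant for quadrant, names in grid.items() if not names]
print("empty quadrants:", empty)
```

With the placements above, the only quadrant left without a name is upper-right. The code cannot tell a wedge from a graveyard; that still takes interviews.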

04.

Concept 04 · Trends become decisions

A trend is a decision in disguise.

3 min read

Most "trends" slides are wallpaper. The useful trick is to convert each trend into a specific decision you make differently because of it.

For every trend you write down, finish this sentence: "because of this, we will do X and not Y." If you cannot finish it, the trend is decoration. Three buckets matter: technology curves (what becomes possible), consumer behaviour (what becomes habitual), and regulation (what becomes mandatory). Each demands different proof — benchmarks for tech, surveys for behaviour, statutes for regulation.

Founders fail at this in opposite directions. They either invoke trends abstractly ("AI is changing everything") or worship one trend so hard they ignore the others ("on-device latency dropped, therefore we're a winner"). The market sits at the intersection of all three. So does your decision.

in your startup

tech
Three latency curves crossed in 2025: STT <300ms, MT <200ms, TTS <200ms — total <700ms end-to-end. Decision: ship now, before the window normalises.
os
macOS 14+ CoreAudio Process Taps + Windows 11 WASAPI per-process loopback. Decision: no kernel extensions, no driver install — ship a 3-10MB Tauri binary, not 80MB Electron.
behaviour
52.9% of EU enterprises run remote meetings; Gen Z consumes ~30% foreign-language content. Decision: consumer wedge first, not enterprise sales cycle.
reg
EU AI Act Article 50 applies from Aug 2026: synthetic voice must be labelled, with fines for transparency breaches of up to €15M or 3% of worldwide annual turnover. Decision: watermarking + UI disclosure from day 1, not retrofitted later.
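The latency claim in the tech row is an addition of three stage budgets, so it is easy to hold measurements against it. A minimal sketch, assuming the per-stage ceilings quoted above (STT 300 ms, MT 200 ms, TTS 200 ms); the `sample` measurements are invented for illustration and `over_budget` is a hypothetical helper.

```python
# Per-stage ceilings quoted in the tech row, in milliseconds.
STAGE_BUDGET_MS = {"stt": 300, "mt": 200, "tts": 200}
END_TO_END_BUDGET_MS = sum(STAGE_BUDGET_MS.values())


def over_budget(measured_ms):
    """Return {stage: overshoot_ms} for every stage slower than its ceiling."""
    return {
        stage: ms - STAGE_BUDGET_MS[stage]
        for stage, ms in measured_ms.items()
        if ms > STAGE_BUDGET_MS[stage]
    }


# Hypothetical measurements, not real benchmark data.
sample = {"stt": 280, "mt": 230, "tts": 150}

total = sum(sample.values())
print("end-to-end ms:", total, "within budget:", total <= END_TO_END_BUDGET_MS)
print("stages over budget:", over_budget(sample))
```

In this invented sample the pipeline clears the end-to-end ceiling while the MT stage misses its own, which is why it helps to track stages separately rather than only the total.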

05.

Concept 05 · Gaps and moats

Wedges are not moats yet.

3 min read

A gap is what you find on the 2×2. A moat is what keeps the gap closed behind you. They are different objects with different lifecycles.

Wedges expire. The minute your gap is visible — through a Product Hunt launch, a TechCrunch piece, an investor announcement — the clock starts. Moats grow inside that window: distribution lead, integration depth, data flywheel, brand. A founder who confuses "no one has done this" with "no one can do this" is borrowing a moat they have not earned.

The honest read on most software moats is brutal — model quality is commodity, APIs are interchangeable, and design is copyable. What is not copyable in twelve months: cumulative OS-integration work, a base of paying habits, and a creator community that names you as the default.

Don't claim "AI moat" if your AI is GPT/Gemini/Claude APIs

Investors will read it as naïve. Foundation models are commodity in 2026 — your moat is the audio routing, OS integrations, per-app glossaries, and creator distribution you build on top.

Don't pick a wedge that's also Apple's roadmap

macOS 16 is rumoured to ship system-wide translation. If Apple ships it Apple-only at WWDC, you need a Windows-first counter-position and an obvious "works everywhere Apple doesn't" story already in market.

in your startup

wedge
OS-wide audio capture + auto-mute + voice dub at $5–$15/mo consumer price. Empty quadrant on the 2×2. None of DeepL, Teams, HeyGen, Otter or Krisp combine all three.
moat
Three candidates from the research: (1) audio-routing IP (auto-duck + dub mix), (2) latency optimisation across STT→MT→TTS, (3) prosumer distribution before Big Tech ships consumer equivalent.
analog
Krisp proved system-wide audio enhancement scales to $37.7M revenue bootstrapped at $8/mo. Same architecture, 10× larger problem. Use this story directly.

Checklist for this week.

Six concrete actions. By Friday you should be able to recite five numbers about your market from memory and point at one empty quadrant on a 2×2 you drew yourself.

Re-derive SOM bottom-up from three independent inputs (US remote workers, Twitch creators, language learners). Compare to the $5M ARR Y2 target and adjust if reality bites.
Update the competitor table from research/market.md with current pricing and ARR: DeepL, Krisp, Otter, HeyGen and Wispr move every quarter.
Draw the 2×2 (per-app ↔ system-wide × subtitles ↔ voice dub) by hand on paper, plot every competitor, screenshot it.
Pick three trends from §4 of market.md and write the "because of this we will do X not Y" decision for each.
Write the one-paragraph Apple/macOS-16 contingency plan: what happens to Babelio if WWDC 2026 ships system-wide translate.
List the three substitutes a current user pays $0 for today (Google Translate side tab, teammate translator, YouTube auto-captions). These are what you have to beat.

lesson mantra

«A market is a rumour about money. Your job is to verify it.»

— onward to Lesson 02 · Audience
