Lifecycle

Understanding the journey from voice to action. Every recording flows through distinct phases, each with natural extension points where you can plug in custom logic.

Overview

Talkie processes voice in two main flows: Dictation (real-time, handled by TalkieAgent) and Memos (deliberate recordings, handled by the main app). Both share similar phases but differ in timing and intent.

Dictation

Press hotkey, speak, release. Text appears where your cursor is. Fast, in-flow, ephemeral.

~500ms to paste

Memo

Deliberate recording that becomes a searchable note. Triggers workflows for processing, summarizing, extracting.

Permanent, indexed

Dictation Lifecycle

The dictation flow is optimized for speed. From hotkey press to text appearing, every millisecond counts. Here's the complete journey:

Capture

~50ms setup

Audio flows from the microphone to a temporary file. Before recording starts, Talkie captures context so it knows where you were when you began speaking.

Hotkey detected

Carbon event handler fires immediately

Context captured (Hook)

Which app, window, selected text

Audio capture starts

TalkieAgent begins recording via AudioCapture

State broadcast

XPC notifies main app; UI updates

onCaptureStart

Inspect or modify the capture context. Could auto-route based on which app is active.

When: After context captured, before audio starts
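
As a rough illustration of what such a hook might look like, the Python sketch below models the capture context as a small dataclass and picks a destination from the active app. None of this is an actual Talkie API: `CaptureContext`, its fields, `on_capture_start`, and the bundle identifiers in the routing table are hypothetical names chosen for the example.

```python
from dataclasses import dataclass, replace
from typing import Optional


# Hypothetical model of the context captured at hotkey press.
# Field names are assumptions based on "which app, window, selected text".
@dataclass(frozen=True)
class CaptureContext:
    app_bundle_id: str
    window_title: str
    selected_text: Optional[str] = None
    route: str = "paste"  # assumed default: "paste", "clipboard", or "scratchpad"


# Illustrative per-app routing preferences (assumed, not shipped defaults).
APP_ROUTES = {
    "com.apple.Terminal": "clipboard",
    "com.example.editor": "scratchpad",
}


def on_capture_start(ctx: CaptureContext) -> CaptureContext:
    """Return a copy of the context with the route chosen from the active app."""
    route = APP_ROUTES.get(ctx.app_bundle_id, ctx.route)
    return replace(ctx, route=route)
```

Returning a modified copy rather than mutating the context keeps the hook easy to reason about: apps without an entry in the table simply keep the route they already had.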

Transcription

~300-800ms

Audio is sent to TalkieEngine for local Whisper transcription. The audio file is saved permanently first—your recording is never lost.

Hotkey released

Stop capture, transition to transcribing

Audio saved

Copied to permanent storage before processing

Transcription request (Hook)

Sent to TalkieEngine via XPC

Text returned (Hook)

Whisper model returns transcript

onTranscriptionComplete

Transform or validate the transcript. Apply custom corrections, filter content, or route differently based on what was said.

When: After transcription, before routing
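
A custom-corrections hook could be as simple as a lookup table applied to the transcript. The sketch below is a minimal Python illustration under that assumption; `on_transcription_complete` and the entries in `CORRECTIONS` are hypothetical and not part of Talkie.

```python
import re

# Hypothetical correction table: phrases a speech model might mishear,
# mapped to the intended text.
CORRECTIONS = {
    "talky": "Talkie",
    "x p c": "XPC",
}


def on_transcription_complete(transcript: str) -> str:
    """Apply whole-word, case-insensitive corrections and tidy whitespace."""
    text = transcript
    for wrong, right in CORRECTIONS.items():
        # \b anchors keep "talky" from matching inside longer words.
        pattern = r"\b" + re.escape(wrong) + r"\b"
        text = re.sub(pattern, right, text, flags=re.IGNORECASE)
    # Collapse runs of whitespace left over from dictation pauses.
    return re.sub(r"\s+", " ", text).strip()
```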

Routing

~50ms

The transcript reaches its destination—pasted into the active app, copied to clipboard, or routed to the scratchpad for editing.

Routing decision (Hook)

Paste, clipboard, or scratchpad

Text delivered

Keyboard simulation or clipboard write

Sound feedback

Confirmation that delivery succeeded

beforeRoute

Intercept before delivery. Could trigger different behavior based on keywords, app context, or custom rules.

When: After routing decision, before text delivery
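
To make the keyword idea concrete, here is a small Python sketch of a hook that overrides the destination when the transcript begins with a spoken prefix. `RouteDecision`, `before_route`, and the keyword table are hypothetical; only the three destinations (paste, clipboard, scratchpad) come from the routing phase described above.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class RouteDecision:
    destination: str  # "paste", "clipboard", or "scratchpad"
    text: str


# Hypothetical spoken prefixes that override the destination.
KEYWORD_ROUTES = {
    "scratchpad": "scratchpad",
    "copy this": "clipboard",
}


def before_route(decision: RouteDecision) -> RouteDecision:
    """Redirect delivery when the transcript starts with a known keyword."""
    lowered = decision.text.lower()
    for keyword, destination in KEYWORD_ROUTES.items():
        if lowered.startswith(keyword):
            # Drop the keyword plus any punctuation or spaces that follow it.
            remainder = decision.text[len(keyword):].lstrip(" ,.:")
            return RouteDecision(destination, remainder)
    return decision
```

The prefix match is deliberately naive. A real rule set would probably want word-boundary checks so that ordinary sentences beginning with a keyword are not redirected by accident.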

Storage

~10ms

The dictation is saved to the local database with full metadata. Available for search, review, and later processing.

Record created (Hook)

LiveDictation saved to GRDB

Context enrichment

Async enhancement with bridge mapping

XPC notification

Main app notified of new dictation

State reset

Ready for next dictation

onDictationStored

React to completed dictations. Could trigger follow-up actions, sync to external services, or update statistics.

When: After database write, before state reset
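
The statistics use case might look something like the following Python sketch, which keeps running totals as dictations are stored. `StoredDictation` and `DictationStats` are hypothetical, and the fields shown are an assumed subset of whatever metadata the real record carries.

```python
from collections import Counter
from dataclasses import dataclass


@dataclass(frozen=True)
class StoredDictation:
    # Hypothetical subset of the metadata saved with a dictation.
    text: str
    app_bundle_id: str
    duration_seconds: float


class DictationStats:
    """Running totals, updated once per stored dictation."""

    def __init__(self) -> None:
        self.count = 0
        self.words = 0
        self.seconds = 0.0
        self.per_app: Counter = Counter()

    def on_dictation_stored(self, record: StoredDictation) -> None:
        self.count += 1
        self.words += len(record.text.split())
        self.seconds += record.duration_seconds
        self.per_app[record.app_bundle_id] += 1

    def words_per_minute(self) -> float:
        # Guard against division by zero before any audio has been recorded.
        return self.words * 60 / self.seconds if self.seconds else 0.0
```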

Memo Lifecycle

Memos are deliberate recordings that persist and get processed. Unlike dictation, memos trigger workflows that can summarize, extract tasks, or integrate with other systems.

Creation

When you create a memo (via the main Talkie app), the recording follows a similar path but ends differently:

Record: Audio captured via AVAudioRecorder

Transcribe: Sent to TalkieEngine via EngineClient

Save as Memo: Stored in GRDB with audio file reference

Trigger Workflows: Auto-run workflows execute in order

Workflow Execution

Workflows are multi-step pipelines that process memo content. Each step can transform, extract, or route the content.

Workflow Execution Flow

1. Load workflow definition (JSON)

2. Build context from memo (transcript, title, date)

3. Execute steps sequentially (Hook)

4. Each step's output becomes available to the next step

5. Save workflow run record with results

Step types include LLM prompts (transform via AI), file actions (save to disk), clipboard (copy result), and webhooks (POST to external URL).
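
The five-step flow above can be sketched in a few lines of Python. This is an illustration only: the JSON schema, the `{{name}}` placeholder syntax, and the `run_workflow` and `render` functions are assumptions made for the example, and the step handlers are passed in by the caller so that no real LLM or clipboard is involved.

```python
import json
import re
from typing import Callable, Dict

# Hypothetical workflow definition; the real JSON schema is not shown here.
WORKFLOW_JSON = """
{
  "name": "Summarize memo",
  "steps": [
    {"id": "summary", "type": "llm", "input": "Summarize: {{transcript}}"},
    {"id": "copied", "type": "clipboard", "input": "{{title}} ({{date}}): {{summary}}"}
  ]
}
"""


def render(template: str, context: Dict[str, str]) -> str:
    """Substitute {{name}} placeholders with values from the context."""
    return re.sub(r"\{\{(\w+)\}\}", lambda m: context[m.group(1)], template)


def run_workflow(
    definition_json: str,
    memo: Dict[str, str],
    handlers: Dict[str, Callable[[str], str]],
) -> dict:
    workflow = json.loads(definition_json)   # 1. load the definition
    context = dict(memo)                     # 2. build context from the memo
    for step in workflow["steps"]:           # 3. execute steps sequentially
        handler = handlers[step["type"]]
        output = handler(render(step["input"], context))
        context[step["id"]] = output         # 4. output is visible to later steps
    # 5. return a run record that the caller could persist
    outputs = {step["id"]: context[step["id"]] for step in workflow["steps"]}
    return {"workflow": workflow["name"], "outputs": outputs}
```

A caller would supply one handler per step type, for example `{"llm": call_model, "clipboard": copy_text}`, where both functions are stand-ins for whatever the host application provides. Because each step writes its output back into the shared context under its `id`, the second step can reference `{{summary}}` without any extra wiring.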

Extension Points

These are the natural seams in the lifecycle where custom logic could be injected. They represent moments where the flow pauses, a decision is made, or data transforms.

Hook	Phase	Use Cases
onCaptureStart	Capture	Auto-route based on app, disable in certain contexts
onTranscriptionComplete	Transcription	Custom corrections, keyword detection, content filtering
beforeRoute	Routing	Intercept commands ("hey talkie"), transform output
onDictationStored	Storage	Sync to external service, trigger notifications
onMemoCreated	Memo	Auto-categorize, trigger custom workflows
beforeWorkflowStep	Workflow	Inject data, modify prompts, skip steps conditionally

These extension points aren't implemented yet. This page documents where they could exist: the lifecycle naturally pauses at these moments, making them ideal for hooks.
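
If these hooks were ever built, one plausible shape is a small registry that maps hook names to ordered lists of callbacks. The Python sketch below shows that idea; `HookRegistry` and its methods are hypothetical, and only the hook names are taken from the table above.

```python
from collections import defaultdict
from typing import Any, Callable, DefaultDict, List

# Hook names as listed in the Extension Points table.
HOOK_NAMES = {
    "onCaptureStart",
    "onTranscriptionComplete",
    "beforeRoute",
    "onDictationStored",
    "onMemoCreated",
    "beforeWorkflowStep",
}

Hook = Callable[[Any], Any]


class HookRegistry:
    """Hypothetical dispatcher: each callback receives and returns the payload."""

    def __init__(self) -> None:
        self._hooks: DefaultDict[str, List[Hook]] = defaultdict(list)

    def register(self, name: str, fn: Hook) -> None:
        if name not in HOOK_NAMES:
            raise ValueError(f"unknown hook: {name}")
        self._hooks[name].append(fn)

    def run(self, name: str, payload: Any) -> Any:
        # Callbacks run in registration order; with none registered,
        # the payload passes through unchanged.
        for fn in self._hooks.get(name, []):
            payload = fn(payload)
        return payload
```

Chaining callbacks this way means several small hooks can cooperate on the same payload, at the cost of making their order significant.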
