Private beta · macOS 15+

One conversation,
every meeting.

Private, on-device transcription with an AI that knows what was said, who said it, and exactly when — and cites every line.

Download for Mac · See how it works

macOS 15+ · Apple silicon · Free during beta

kiromi · 248 meetings

Today, May 10

When did Sarah commit to the launch date?

June 11. Sarah anchored Q2 to the second week of June after the data-readiness exchange, with a buffer after the offsite.

Q2 plan · 04:38
Pricing sync · 12:14

Ask 248 meetings… ⌘ K

Audio + transcripts stay on your Mac · Apple silicon native · No telemetry · Citable AI

Meetings searchable

From a single index

1.2×

Realtime transcription

On-device, neural

Bytes uploaded

Until you opt in

Citable answers

No cite, no claim

How it works

From meeting to memory in three steps.

No upload. No third party in the room. Nothing leaves your machine until you ask it to.

01

Detects the meeting.

kiromi notices when a call starts (Zoom, Meet, Teams, or FaceTime) and puts a recording offer in your menu bar. One click and you're capturing.

02

Transcribes on your Mac.

FluidAudio transcribes locally on Apple silicon at 1.2× realtime. Mic and system audio stay on separate tracks so speakers stay distinct.

03

Answers across every meeting.

Ask anything. Each answer cites the exact transcript line — click to play the original audio. Filter by person, project, or date.

Local-first

Your meetings never leave your Mac.

FluidAudio's neural transcription runs natively on Apple silicon. Audio, transcripts, and voice profiles stay on your machine — no upload, no third-party processing, no audio crossing the internet unless you explicitly turn on cloud features.

Whisper-class accuracy on-device
Speaker diarization per microphone channel
Voice profiles encrypted in the macOS keychain
Disable the network entirely and everything except AI chat keeps working

Transcribing locally · 1.2× realtime

Cite every line

Every AI answer is citable.

Ask anything across all your meetings. Each claim links back to the exact transcript timestamp; click and you're playing the original audio.

No cite, no claim — answers without sources are flagged
One click to the exact moment in the audio
Scope conversations to a meeting, person, or date range
Tools execute locally; the LLM sees only the relevant snippets, never your full transcript archive
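For readers who want to picture the "no cite, no claim" rule, here is a minimal Python sketch. It is not kiromi's code. The `Citation` and `Claim` types, the `flag_uncited` function, and the set-based transcript index are all hypothetical names chosen for illustration; the second claim in the usage lines is a made-up uncited statement.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class Citation:
    meeting: str    # meeting title, e.g. "Q2 plan"
    timestamp: str  # position in the recording, e.g. "04:38"


@dataclass
class Claim:
    text: str
    citations: list = field(default_factory=list)


def flag_uncited(claims, transcript_index):
    """Return the claims that have no citation resolving to a known transcript line.

    transcript_index is assumed to be a set of (meeting, timestamp) pairs
    built from the local transcripts.
    """
    flagged = []
    for claim in claims:
        resolved = [
            c for c in claim.citations
            if (c.meeting, c.timestamp) in transcript_index
        ]
        if not resolved:
            flagged.append(claim)
    return flagged


# Usage: the first claim cites two indexed moments, the second cites nothing.
transcript_index = {("Q2 plan", "04:38"), ("Pricing sync", "12:14")}
claims = [
    Claim(
        "June 11, agreed after the data-readiness exchange.",
        [Citation("Q2 plan", "04:38"), Citation("Pricing sync", "12:14")],
    ),
    Claim("A hypothetical statement with no source."),
]
flagged = flag_uncited(claims, transcript_index)
for claim in flagged:
    print("FLAGGED:", claim.text)
```

A citation that points at a timestamp missing from the index counts the same as no citation at all, so a claim survives only if at least one of its sources can actually be played back.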

When did Sarah commit to the launch date?

June 11 — agreed after the data-readiness exchange.

Q2 plan · 04:38
Pricing sync · 12:14

Voice profiles

It learns who's who, once.

Tag a speaker once and kiromi extracts a voice embedding stored in your macOS keychain. Future meetings auto-attribute that voice — across Zoom, Meet, Teams, FaceTime, and the phone.

Tag once, recognized everywhere after
256-dim voice embeddings, encrypted at rest
Confidence-tiered labels (high / medium / low)
Delete a profile and it's gone — no shadow copies
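One common way to match a voice against stored profiles is cosine similarity between embeddings, with the score bucketed into tiers. The Python sketch below assumes that approach; the page does not say which metric or cutoffs kiromi uses. The thresholds `HIGH_THRESHOLD` and `MEDIUM_THRESHOLD` and the function names are hypothetical.

```python
import math

EMBEDDING_DIM = 256  # dimension stated on this page

# Hypothetical cutoffs; the real tier boundaries are not published.
HIGH_THRESHOLD = 0.90
MEDIUM_THRESHOLD = 0.80


def cosine_similarity(a, b):
    """Cosine similarity of two equal-length vectors; 0.0 if either is all zeros."""
    if len(a) != len(b):
        raise ValueError("embeddings must have the same dimension")
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def confidence_tier(score):
    """Map a similarity score to a high / medium / low label."""
    if score >= HIGH_THRESHOLD:
        return "high"
    if score >= MEDIUM_THRESHOLD:
        return "medium"
    return "low"


def attribute_speaker(embedding, profiles):
    """Return (name, score, tier) for the closest stored profile, or None if there are none."""
    best = None
    for name, stored in profiles.items():
        score = cosine_similarity(embedding, stored)
        if best is None or score > best[1]:
            best = (name, score)
    if best is None:
        return None
    name, score = best
    return name, score, confidence_tier(score)


# Usage with toy vectors (real embeddings would come from the voice model).
profiles = {
    "Sarah K.": [1.0] + [0.0] * (EMBEDDING_DIM - 1),
    "Marcus L.": [0.0, 1.0] + [0.0] * (EMBEDDING_DIM - 2),
}
sample = [0.96, 0.28] + [0.0] * (EMBEDDING_DIM - 2)
print(attribute_speaker(sample, profiles))
```

Returning the tier next to the raw score lets the interface show a label for a low-confidence match differently from a confident one, rather than silently attributing the line.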

Sarah K.

96%

Marcus L.

91%

Priya R.

87%

Tom B.

79%

Alex P.

74%

and more

Everything kiromi does

Built around the stuff that should have been in calls all along.

Each feature is small, focused, and works whether the network is up or not.

On-device transcription

Apple silicon native ASR + diarization. The audio never leaves your Mac.

Cite-every-line AI

Every claim is hyperlinked to the exact moment in the recording.

Voice profiles

Tag once, recognized forever. Encrypted in the keychain, deletable in one click.

Action items

Spotted automatically and tracked across meetings until they're closed.

Calendar attribution

Pulls invite metadata to give speakers their right names from the start.

Cross-meeting search

Find any phrase, scope by person or project, jump to the audio.

Live waveform inspector

See where each speaker spoke and what they said — at a glance.

Export anywhere

Markdown, JSON, plain text. Your transcripts are your transcripts.

Network-off mode

Disable network entirely and detection, recording, transcription, and search keep working.

Privacy by design

Built for conversations you can't risk leaking.

Voice prints are biometric data, and we treat them that way: encrypted at rest in the macOS keychain, local by default, deletable in one click, with a signed consent receipt for every recording.

Audio + transcripts stay on your Mac
One-click delete with a signed receipt
Voice prints encrypted in the keychain
AI queries via Cloudflare AI Gateway
Open-source detection signals

From beta

What testers tell us.

A few representative quotes from the private beta. Identifying details have been blurred to keep things polite.

"I stopped pretending to take notes. The cited answers are the unlock — every claim links back to the exact moment in the call."

Priya R. · Product, fintech

"We had to verify that nothing left the machine before we could run it. kiromi's network panel and the airplane-mode test sold our security team."

Marcus L. · Engineering lead

"The voice-profile stuff is uncanny. It correctly attributes my co-founder across calls she joined on three different devices."

Sarah K. · Founder

Frequently asked

Things people want to verify.

If your question isn't answered here, the docs are at /docs.

Where does my audio go?

Nowhere. Audio files and transcripts are written to your Mac's local storage. The transcription model runs natively on Apple silicon, with no upload and no third-party processing.

What about the AI? Doesn't that need a server?

Tool execution and search run locally against your transcript index. When the AI needs to compose a response, kiromi sends only the relevant transcript snippets to your chosen LLM provider (Anthropic, OpenAI, or Gemini) via Cloudflare AI Gateway. We never see the content; the gateway logs only token counts.

How do voice profiles work?

When you tag a speaker, kiromi extracts a 256-dimensional voice embedding and stores it in your macOS keychain, encrypted at rest and never synced. Future meetings auto-recognize that voice. You can delete a profile at any time and it's gone.

Can I use it without an internet connection?

Yes. Detection, recording, transcription, speaker recognition, and search all work offline. Only the AI chat needs a network connection; disable the network and the rest of the app keeps working.

Does it work with Zoom / Meet / Teams?

kiromi captures system audio at the OS level, so it works with any meeting app: Zoom, Meet, Teams, Webex, FaceTime, even calls via the Phone app. No per-app integration required.

What does it not do?

kiromi doesn't transcribe video. It doesn't translate. It doesn't summarize meetings you weren't in. It doesn't sell ads against your data. The roadmap is at /changelog.

What macOS versions are supported?

macOS 15 Sequoia and newer, on Apple silicon (M1 or later). Intel Macs are not supported — the on-device model needs the Neural Engine.

Stop taking notes.

One conversation,
every meeting.