Cue reads your voice acoustically — no transcription, no cloud upload — and delivers a peripheral nudge in under 500ms. Coaching that arrives while the conversation is still in flight.
On-device only · No audio leaves your device · HIPAA-compatible by architecture
Live session · 04:38
Slow down.
Pace 18% above baseline for 640ms. Listeners read this as defensive.
Architecture
Cue processes raw audio locally using an AudioWorklet, extracting acoustic features 128ms at a time. No speech-to-text. No NLP. No upload.
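To make the windowing concrete, here is a minimal sketch of how fixed-size audio blocks could be gathered into 128ms analysis windows before any features are computed. It is an illustration, not Cue's actual code: the class name `WindowAccumulator` and the 48 kHz sample rate in the usage note are assumptions.

```typescript
// Hypothetical helper (not from Cue): collects small audio blocks into
// fixed-length analysis windows. Nothing here touches the network.
class WindowAccumulator {
  private readonly buffer: Float32Array;
  private filled = 0;

  constructor(sampleRate: number, windowMs: number) {
    const size = Math.round((sampleRate * windowMs) / 1000);
    if (size < 1) throw new Error("window must contain at least one sample");
    this.buffer = new Float32Array(size);
  }

  /** Appends one block and returns copies of any windows it completed. */
  push(block: Float32Array): Float32Array[] {
    const completed: Float32Array[] = [];
    let offset = 0;
    while (offset < block.length) {
      // Copy as much of the block as still fits in the current window.
      const take = Math.min(block.length - offset, this.buffer.length - this.filled);
      this.buffer.set(block.subarray(offset, offset + take), this.filled);
      this.filled += take;
      offset += take;
      if (this.filled === this.buffer.length) {
        completed.push(this.buffer.slice()); // copy so the buffer can be reused
        this.filled = 0;
      }
    }
    return completed;
  }
}
```

In a browser, an `AudioWorkletProcessor`'s `process()` callback receives 128-sample blocks by default, so the first channel of the first input (`inputs[0][0]`) could be passed straight to `push()`. At an assumed 48 kHz, one 128ms window is 6,144 samples, or 48 such blocks.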
Volume and projection proxy. Sustained elevation reads as aggression or defensiveness before words register.
Zero-crossing rate as a pace estimator relative to your personal calibrated baseline. Above baseline signals cognitive overload.
Laryngeal tension proxy. Higher values indicate sympathetic activation — tense voices escalate conversations via listener mirroring.
Harmonic-to-noise ratio proxy. Tracks vocal quality and breath support under pressure.
Continuous unbroken speech detection. Triggers after 30 seconds without a 2-second pause — the turn-taking signal.
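The signals above can be sketched in a few lines. The example below shows two of the simpler ones (an RMS volume proxy and zero-crossing rate), a baseline comparison, and the monologue trigger with the 30-second and 2-second thresholds quoted above. All function and class names are hypothetical, and the speech/silence decision fed to the detector is assumed to come from elsewhere.

```typescript
// Hypothetical helpers for illustration; not Cue's actual implementation.

/** Root-mean-square energy of one window: a simple volume proxy. */
function rms(frame: Float32Array): number {
  if (frame.length === 0) return 0;
  let sumSquares = 0;
  for (let i = 0; i < frame.length; i++) sumSquares += frame[i] * frame[i];
  return Math.sqrt(sumSquares / frame.length);
}

/** Fraction of adjacent sample pairs whose sign differs. */
function zeroCrossingRate(frame: Float32Array): number {
  if (frame.length < 2) return 0;
  let crossings = 0;
  for (let i = 1; i < frame.length; i++) {
    if ((frame[i - 1] >= 0) !== (frame[i] >= 0)) crossings++;
  }
  return crossings / (frame.length - 1);
}

/** Relative deviation from a calibrated baseline; 0.18 means 18% above. */
function deviationFromBaseline(value: number, baseline: number): number {
  return baseline === 0 ? 0 : (value - baseline) / baseline;
}

/** Fires once speech has run for maxSpeechMs without a pause of pauseMs. */
class MonologueDetector {
  private speechStartMs: number | null = null;
  private silenceStartMs: number | null = null;
  private readonly maxSpeechMs: number;
  private readonly pauseMs: number;

  constructor(maxSpeechMs: number = 30000, pauseMs: number = 2000) {
    this.maxSpeechMs = maxSpeechMs;
    this.pauseMs = pauseMs;
  }

  /** Feed one speech/silence decision; returns true while the trigger is active. */
  update(nowMs: number, isSpeech: boolean): boolean {
    if (isSpeech) {
      this.silenceStartMs = null;
      if (this.speechStartMs === null) this.speechStartMs = nowMs;
    } else {
      if (this.silenceStartMs === null) this.silenceStartMs = nowMs;
      // A long enough pause ends the current stretch of speech.
      if (nowMs - this.silenceStartMs >= this.pauseMs) this.speechStartMs = null;
    }
    return this.speechStartMs !== null && nowMs - this.speechStartMs >= this.maxSpeechMs;
  }
}
```

Comparing each value to a per-user baseline rather than a fixed threshold is what lets the same code serve naturally loud or fast speakers. Pauses shorter than two seconds deliberately do not reset the monologue timer.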
End-to-end latency
<500ms
Escalation
When all three primary signals sustain above baseline simultaneously, Cue activates a 24-second breathing pacer (three cycles of 4s inhale / 4s exhale) in your peripheral vision.
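The escalation rule can be sketched as a small state machine plus a phase function for the pacer. This is an illustration under stated assumptions: the text does not say which three signals are primary or how long they must stay elevated, so the field names (`energy`, `pace`, `tension`) and the `sustainMs` parameter are hypothetical. Only the 24-second duration and the 4s/4s split come from the description above.

```typescript
// Hypothetical sketch; signal names and sustainMs are assumptions.
interface SignalValues {
  energy: number;  // volume proxy
  pace: number;    // pace estimate
  tension: number; // laryngeal tension proxy
}

class EscalationDetector {
  private aboveSinceMs: number | null = null;
  private readonly baseline: SignalValues;
  private readonly sustainMs: number;

  constructor(baseline: SignalValues, sustainMs: number) {
    this.baseline = baseline;
    this.sustainMs = sustainMs;
  }

  /** Returns true once all three signals have stayed above baseline for sustainMs. */
  update(nowMs: number, current: SignalValues): boolean {
    const allAbove =
      current.energy > this.baseline.energy &&
      current.pace > this.baseline.pace &&
      current.tension > this.baseline.tension;
    if (!allAbove) {
      this.aboveSinceMs = null; // any signal dropping back resets the timer
      return false;
    }
    if (this.aboveSinceMs === null) this.aboveSinceMs = nowMs;
    return nowMs - this.aboveSinceMs >= this.sustainMs;
  }
}

type BreathPhase = "inhale" | "exhale" | "done";

/** Phase of the pacer at elapsedMs; defaults give three 8-second cycles. */
function pacerPhase(
  elapsedMs: number,
  totalMs: number = 24000,
  inhaleMs: number = 4000,
  exhaleMs: number = 4000,
): BreathPhase {
  if (elapsedMs < 0 || elapsedMs >= totalMs) return "done";
  return elapsedMs % (inhaleMs + exhaleMs) < inhaleMs ? "inhale" : "exhale";
}
```

Requiring every signal to stay elevated for the whole interval means a single loud word or a brief fast phrase does not start the pacer.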
Post-session
Integration Tape: EQ score, emotional arc, the moment you missed, and one micro-skill to practice next time.
Scientific foundation
Listeners decode vocal emotion in milliseconds — before words are processed (System 1). Cue intervenes at the acoustic layer, where behavior actually forms.
Vocal emotion decodes across cultures in milliseconds
Listeners respond to acoustic features before processing words — your tone is already working on the other person.
Speech rate varies systematically with arousal
Pace above personal baseline reads as defensiveness or anxiety to listeners — independent of what you're actually saying.
Spectral centroid rises with laryngeal tension
Tense voices trigger sympathetic mirroring in listeners, escalating conversations through acoustic contagion.
Feedback timing governs skill transfer
Post-action feedback doesn't build durable habits. In-moment feedback does. This is why post-call tools plateau.
Turn gaps cluster near 200ms cross-culturally
Pause behavior is the primary turn-taking signal. Cue's monologue detector catches the gap before it becomes a rupture.
Real-world labeled emotional signal data
The free consumer tier generates a signal corpus that competitors cannot purchase or replicate. The moat compounds.
Why post-call dashboards fail
Winstein (1994) showed that feedback timing governs skill transfer. Post-action feedback does not produce durable behavior change. In-moment feedback does.
The Cue thesis
Gong, Chorus, and Salesforce Einstein are post-call. Their feedback arrives hours after the moment has passed. Cue arrives in 432ms.
Category distinction
Gong, Chorus, and Otter are transcription-first. By the time their feedback arrives, the conversation is over. Cue operates in a different category.
| Capability | Cue | Gong / Chorus | Otter / Fireflies |
|---|---|---|---|
| Feedback latency | <500ms | 2–8 hours | Post-call |
| Audio uploaded to cloud | No | Yes | Yes |
| Speech-to-text transcription | No | Yes | Yes |
| Works during the conversation | Yes | No | No |
| Personal baseline calibration | Yes | | |
| HIPAA / legal / clinical safe | Yes | | |
| Works on any audio source | Yes | Zoom only | Zoom / Meet |
| Peripheral nudge (non-intrusive) | Yes | | |
The privacy moat
HIPAA, FERPA, attorney-client privilege, clinical confidentiality, and M&A discretion make cloud-based transcription tools unusable in the segments where real-time coaching matters most. Cue's on-device architecture isn't a feature — it's the only viable form factor.
Pricing
Enterprise pricing is outcome-based — you pay on measurable conversation deltas, not per seat.
Full signal processing. Unlimited sessions. Your data never leaves your device.
Session history, trend analysis, and advanced coaching for individuals serious about their craft.
Priced on measurable conversation deltas for revenue teams. Baseline established, outcomes tracked.
Enterprise contracts are priced on measurable outcomes — deal-loss rate reduction, rep ramp time, retention success rate.
You don't pay for seats. You pay for results.