v1.0 · Sub-500ms on-device signal processing

The moment
before you react.

Cue reads your voice acoustically — no transcription, no cloud upload — and delivers a peripheral nudge in under 500ms. Coaching that arrives while the conversation is still in flight.

On-device only · No audio leaves your device · HIPAA-compatible by architecture

Live session · 04:38

Speech Rate (ZCR)
Energy (RMS)
Spectral Centroid
Spectral Flatness
Monologue Duration
432ms

Slow down.

Pace 18% above baseline for 640ms. Listeners read this as defensive.

Architecture

Five signals.
No words.

Cue processes raw audio locally using an AudioWorklet, extracting acoustic features 128ms at a time. No speech-to-text. No NLP. No upload.
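A minimal sketch of how fixed-length analysis windows could be assembled, not Cue's actual implementation. An AudioWorklet typically hands its processor blocks of 128 sample-frames, so a 128ms window has to be accumulated across many blocks. The class name `WindowAccumulator` and the 48 kHz sample rate used in the usage note are assumptions for illustration.

```typescript
// Hypothetical helper: collects small audio blocks into fixed-length windows.
class WindowAccumulator {
  private buffer: Float32Array;
  private filled = 0;

  constructor(sampleRate: number, windowMs: number) {
    // Window length in samples, e.g. 48000 Hz * 128 ms / 1000 = 6144 samples.
    this.buffer = new Float32Array(Math.round((sampleRate * windowMs) / 1000));
  }

  get windowSize(): number {
    return this.buffer.length;
  }

  // Appends one block and returns a copy of every window completed by it.
  push(block: Float32Array): Float32Array[] {
    const completed: Float32Array[] = [];
    let offset = 0;
    while (offset < block.length) {
      const n = Math.min(block.length - offset, this.buffer.length - this.filled);
      this.buffer.set(block.subarray(offset, offset + n), this.filled);
      this.filled += n;
      offset += n;
      if (this.filled === this.buffer.length) {
        completed.push(this.buffer.slice());
        this.filled = 0;
      }
    }
    return completed;
  }
}
```

At an assumed 48 kHz, one window is 6144 samples, so a window completes on every 48th block of 128 frames. Inside a real processor, `push` would be called from `process()` with the first input channel.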

01

Energy (RMS)

Volume and projection proxy. Sustained elevation reads as aggression or defensiveness before words register.

Root Mean Square
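As a sketch of the measure itself, root mean square is the square root of the average squared sample value in a window. The function name `rms` is an assumption for illustration, not a Cue API.

```typescript
// Root mean square of one analysis window; returns 0 for an empty window.
function rms(frame: Float32Array): number {
  if (frame.length === 0) return 0;
  let sumSquares = 0;
  for (let i = 0; i < frame.length; i++) {
    sumSquares += frame[i] * frame[i];
  }
  return Math.sqrt(sumSquares / frame.length);
}
```

A louder window yields a larger value, which is why the measure works as a volume proxy.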
02

Speech Rate (ZCR)

Zero-crossing rate as a pace estimator relative to your personal calibrated baseline. A rate above baseline signals cognitive overload.

Zero-Crossing Rate
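A minimal sketch of the two pieces described here, under stated assumptions: zero-crossing rate counted as sign changes between adjacent samples, and a relative deviation from a stored baseline. Both function names are hypothetical, and how the baseline is calibrated is not shown.

```typescript
// Fraction of adjacent sample pairs whose sign differs (0 counts as non-negative).
function zeroCrossingRate(frame: Float32Array): number {
  if (frame.length < 2) return 0;
  let crossings = 0;
  for (let i = 1; i < frame.length; i++) {
    const prevNonNegative = frame[i - 1] >= 0;
    const currNonNegative = frame[i] >= 0;
    if (prevNonNegative !== currNonNegative) crossings++;
  }
  return crossings / (frame.length - 1);
}

// Relative deviation from a personal baseline: 0.18 means 18% above baseline.
function relativeToBaseline(value: number, baseline: number): number {
  return baseline > 0 ? (value - baseline) / baseline : 0;
}
```

A nudge like "Pace 18% above baseline" would correspond to `relativeToBaseline` returning roughly 0.18 for the current window.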
03

Spectral Centroid

Laryngeal tension proxy. Higher values indicate sympathetic activation — tense voices escalate conversations via listener mirroring.

Frequency centroid
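The spectral centroid is the magnitude-weighted mean frequency of a spectrum. The sketch below assumes a magnitude spectrum has already been computed by an FFT elsewhere; the function name and the `binWidthHz` parameter (frequency spacing between bins) are assumptions for illustration.

```typescript
// Magnitude-weighted mean frequency in Hz; returns 0 for a silent spectrum.
function spectralCentroid(magnitudes: Float32Array, binWidthHz: number): number {
  let weighted = 0;
  let total = 0;
  for (let k = 0; k < magnitudes.length; k++) {
    weighted += k * binWidthHz * magnitudes[k]; // bin k sits at k * binWidthHz
    total += magnitudes[k];
  }
  return total > 0 ? weighted / total : 0;
}
```

More energy in higher bins pulls the centroid upward, which is the effect the card describes.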
04

Spectral Flatness

Harmonic-to-noise ratio proxy. Tracks vocal quality and breath support under pressure.

Wiener entropy
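Spectral flatness is the ratio of the geometric mean to the arithmetic mean of a power spectrum: close to 1 for noise-like sound, close to 0 for strongly harmonic sound. This is a sketch assuming a precomputed power spectrum; the function name and the small `epsilon` guard against `log(0)` are assumptions.

```typescript
// Geometric mean divided by arithmetic mean of the power spectrum, in [0, 1].
function spectralFlatness(power: Float32Array, epsilon = 1e-12): number {
  if (power.length === 0) return 0;
  let logSum = 0;
  let sum = 0;
  for (let k = 0; k < power.length; k++) {
    logSum += Math.log(power[k] + epsilon); // epsilon avoids log(0)
    sum += power[k];
  }
  if (sum === 0) return 0;
  const geometricMean = Math.exp(logSum / power.length);
  const arithmeticMean = sum / power.length;
  return geometricMean / arithmeticMean;
}
```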
05

Monologue Duration

Continuous unbroken speech detection. Triggers after 30 seconds without a 2-second pause — the turn-taking signal.

Unbroken speech ≥ 30s
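The 30-second rule can be sketched as a small state machine, assuming each analysis window has already been classified as speech or silence by some upstream step. The class name `MonologueDetector` and its `update` method are hypothetical; the 30-second limit and 2-second pause come from the description above.

```typescript
// Hypothetical detector: fires once speech has run 30 s without a 2 s pause.
class MonologueDetector {
  private runMs = 0;     // time since the last qualifying pause
  private silenceMs = 0; // length of the current silence
  private limitMs: number;
  private pauseMs: number;

  constructor(limitMs = 30000, pauseMs = 2000) {
    this.limitMs = limitMs;
    this.pauseMs = pauseMs;
  }

  // Feed one window; returns true while the monologue threshold is exceeded.
  update(isSpeech: boolean, windowMs: number): boolean {
    this.silenceMs = isSpeech ? 0 : this.silenceMs + windowMs;
    if (this.silenceMs >= this.pauseMs) {
      this.runMs = 0; // a full pause ends the run
      return false;
    }
    // Short pauses count toward the run; leading silence does not start one.
    if (isSpeech || this.runMs > 0) this.runMs += windowMs;
    return this.runMs >= this.limitMs;
  }
}
```

In this sketch a 1-second breath does not reset the timer, while a full 2-second pause does.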

End-to-end latency

<500ms

Escalation

When all three primary signals sustain above baseline simultaneously, Cue activates a 24-second breathing pacer (three cycles of 4s inhale / 4s exhale) in your peripheral vision.
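A rough sketch of the escalation logic, with loud assumptions: the page does not say which three signals are primary or how long they must stay elevated, so `allAboveBaseline` simply checks whatever readings it is given, and the `Reading` type and both function names are hypothetical. The pacer timing (24 seconds, 4s inhale, 4s exhale) comes from the text, which works out to three 8-second cycles.

```typescript
type Reading = { value: number; baseline: number };
type PacerPhase = "inhale" | "exhale" | "done";

// True only when every supplied signal is above its own baseline.
function allAboveBaseline(readings: Reading[]): boolean {
  return readings.length > 0 && readings.every((r) => r.value > r.baseline);
}

// Which phase the pacer shows at a given time since activation.
function pacerPhase(
  elapsedMs: number,
  totalMs = 24000,
  inhaleMs = 4000,
  exhaleMs = 4000
): PacerPhase {
  if (elapsedMs < 0 || elapsedMs >= totalMs) return "done";
  const positionInCycle = elapsedMs % (inhaleMs + exhaleMs);
  return positionInCycle < inhaleMs ? "inhale" : "exhale";
}
```

A UI layer could poll `pacerPhase` on each animation frame and stop drawing once it returns `"done"`.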

Post-session

Integration Tape: EQ score, emotional arc, the moment you missed, and one micro-skill to practice next time.

Scientific foundation

Acoustic truth
precedes thought.

Listeners decode vocal emotion in milliseconds — before words are processed (System 1). Cue intervenes at the acoustic layer, where behavior actually forms.

Scherer / Juslin & Laukka

Vocal emotion decodes across cultures in milliseconds

Listeners respond to acoustic features before processing words — your tone is already working on the other person.

Goldman-Eisler

Speech rate varies systematically with arousal

Pace above personal baseline reads as defensiveness or anxiety to listeners — independent of what you're actually saying.

Sundberg / Titze / Banse & Scherer

Spectral centroid rises with laryngeal tension

Tense voices trigger sympathetic mirroring in listeners, escalating conversations through acoustic contagion.

Winstein

Feedback timing governs skill transfer

Post-action feedback doesn't build durable habits. In-moment feedback does. This is why post-call tools plateau.

Duncan / Sacks / Stivers

Turn gaps cluster near 200ms cross-culturally

Pause behavior is the primary turn-taking signal. Cue's monologue detector catches the gap before it becomes a rupture.

Proprietary corpus

Real-world labeled emotional signal data

The free consumer tier generates a signal corpus that competitors cannot purchase or replicate. The moat compounds.

Why post-call dashboards fail

Winstein (1994) showed that feedback timing governs skill transfer. Post-action feedback does not produce durable behavior change. In-moment feedback does.

The Cue thesis

Gong, Chorus, and Salesforce Einstein are post-call. Their feedback arrives hours after the moment has passed. Cue arrives in under 500ms.

Category distinction

Post-call analytics
is a different category.

Gong, Chorus, and Otter are transcription-first. By the time their feedback arrives, the conversation is over. Cue operates in a different category.

Capability | Cue | Gong / Chorus | Otter / Fireflies
Feedback latency | <500ms | 2–8 hours | Post-call
Audio uploaded to cloud
Speech-to-text transcription
Works during the conversation
Personal baseline calibration
HIPAA / legal / clinical safe
Works on any audio source | | Zoom only | Zoom / Meet
Peripheral nudge (non-intrusive)

The privacy moat

HIPAA, FERPA, attorney-client privilege, clinical confidentiality, and M&A discretion make cloud-based transcription tools unusable in the segments where real-time coaching matters most. Cue's on-device architecture isn't a feature — it's the only viable form factor.

B2B SaaS Sales · Customer Success · Healthcare / Clinicians · Legal / Attorneys · Executive Coaching · Financial Advisors · Therapists · Insurance

Pricing

Start free.
Scale on outcomes.

Enterprise pricing is outcome-based — you pay on measurable conversation deltas, not per seat.

Free
$0 forever

Full signal processing. Unlimited sessions. Your data never leaves your device.

  • All 5 acoustic signal features
  • Personal baseline calibration
  • Peripheral nudge + breathing pacer
  • Integration Tape post-session
  • macOS + Windows desktop app
Download free

Pro
$15 / month

Session history, trend analysis, and advanced coaching for individuals serious about their craft.

  • Everything in Free
  • Full session history + arc charts
  • Cross-session trend analysis
  • Moment-you-missed callouts
  • Priority access to new features
Start Pro

Enterprise
Outcome contract

Priced on measurable conversation deltas for revenue teams. Baseline established, outcomes tracked.

  • Everything in Pro, for the entire team
  • Outcome contract pricing (no seats)
  • Baseline + delta measurement
  • CRO / VP Sales outcome reporting
  • Dedicated onboarding + CSM
Contact us

Enterprise contracts are priced on measurable outcomes — deal-loss rate reduction, rep ramp time, retention success rate.
You don't pay for seats. You pay for results.