TECHNOLOGY
A proprietary voice AI pipeline and a real-time database let conversation, analysis, and alerts run simultaneously at sub-500ms end-to-end latency.
Voice AI Pipeline
Co-located Model Architecture
Speech recognition, synthesis, turn-taking, and voice activity models run on co-located infrastructure. Streaming LLM integration delivers consistent sub-500ms end-to-end latency.
Encrypted Real-time Transport
Captures audio from the browser microphone in real time and streams it peer-to-peer with end-to-end encryption and sub-100ms transport latency.
Voice Activity Detection
Proprietary VAD model detects speech start/end boundaries. Co-optimized with turn-taking for natural conversation timing with elderly users.
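To illustrate what detecting speech start/end boundaries means, here is a minimal sketch of a simple energy-threshold detector. It is not the proprietary VAD model; the function name, the per-frame energy input, the threshold, and the `hangover_frames` parameter are all assumptions made for illustration. A larger hangover tolerates longer mid-sentence pauses before a turn is treated as finished, which is one plausible way timing could be tuned for slower speakers.

```python
from typing import List, Tuple


def detect_speech_segments(
    energies: List[float],
    threshold: float = 0.5,      # hypothetical energy level that counts as speech
    hangover_frames: int = 2,    # hypothetical number of silent frames tolerated
) -> List[Tuple[int, int]]:
    """Return (start, end) frame indices of speech segments; end is exclusive."""
    segments: List[Tuple[int, int]] = []
    start = None   # frame index where the current segment began
    silence = 0    # consecutive below-threshold frames inside a segment

    for i, energy in enumerate(energies):
        if energy >= threshold:
            if start is None:
                start = i          # speech start boundary
            silence = 0
        elif start is not None:
            silence += 1
            if silence > hangover_frames:
                # Speech end boundary: close at the first silent frame.
                segments.append((start, i - silence + 1))
                start = None
                silence = 0

    if start is not None:
        # Audio ended mid-segment; trim any trailing silent frames.
        segments.append((start, len(energies) - silence))
    return segments
```

With `hangover_frames=2`, a single quiet frame between two loud ones stays inside one segment rather than splitting the utterance.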
Real-time Recognition
Real-time speech recognition engine with ~80ms latency across 90+ languages. Predictive transcription generates text before speech completes.
Memory + Mood + Medicine
Fetches conversation history, mood state, and medication info from the real-time database, then generates personalized, contextual responses.
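As a rough sketch of how fetched context could shape a response, the snippet below folds conversation history, mood state, and medication info into a prompt string. The `UserContext` type, its field names, the prompt wording, and `build_system_prompt` are hypothetical; the actual schema and prompt format are not described here.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class UserContext:
    """Hypothetical shape of the per-user context read from the database."""
    recent_turns: List[str] = field(default_factory=list)
    mood: str = "unknown"
    medications: List[str] = field(default_factory=list)


def build_system_prompt(ctx: UserContext, max_turns: int = 3) -> str:
    """Combine mood, medications, and the latest turns into one prompt."""
    lines = ["You are a warm, patient voice companion."]
    lines.append(f"Current mood: {ctx.mood}")
    if ctx.medications:
        lines.append("Medications: " + ", ".join(ctx.medications))
    # Keep only the most recent turns so the prompt stays short.
    for turn in ctx.recent_turns[-max_turns:]:
        lines.append(f"Earlier: {turn}")
    return "\n".join(lines)
```

Capping the history at `max_turns` is a design choice in this sketch: a shorter prompt generally helps keep first-token latency low.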
Large Language Model
The streaming LLM generates its first token in ~150ms and handles emotion classification, crisis detection, and response generation in parallel.
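Running the three tasks in parallel can be sketched with `asyncio.gather`. The three coroutines below are stand-ins with keyword rules and short sleeps in place of real model calls; their names, return values, and logic are assumptions for illustration only.

```python
import asyncio
from typing import Dict


async def classify_emotion(text: str) -> str:
    await asyncio.sleep(0.01)  # stand-in for a model call
    return "sad" if "lonely" in text.lower() else "neutral"


async def detect_crisis(text: str) -> bool:
    await asyncio.sleep(0.01)  # stand-in for a model call
    return "hurt myself" in text.lower()


async def generate_reply(text: str) -> str:
    await asyncio.sleep(0.01)  # stand-in for a streaming LLM call
    return "I'm here with you. Tell me more."


async def handle_utterance(text: str) -> Dict[str, object]:
    # The three coroutines run concurrently, so the total wait is close to
    # the slowest task rather than the sum of all three.
    emotion, crisis, reply = await asyncio.gather(
        classify_emotion(text),
        detect_crisis(text),
        generate_reply(text),
    )
    return {"emotion": emotion, "crisis": crisis, "reply": reply}
```

A call such as `asyncio.run(handle_utterance("I feel lonely today"))` returns all three results together once the slowest task finishes.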
High-quality Synthesis
Multilingual voice synthesis at ~75ms inference latency. Streaming output delivers the first audio bytes as soon as they are synthesized.
Real-time Streaming
Synthesized voice streams to the user in real time. High-quality audio is delivered reliably at low bandwidth.
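The stage figures quoted above can be checked against the sub-500ms end-to-end target with simple arithmetic. This is a sketch under stated assumptions: the stages are treated as strictly sequential, "sub-100ms" transport is taken at its upper bound, and the return audio leg is not counted separately. How the end-to-end figure is actually measured is not specified.

```python
# Stage figures quoted in the pipeline description, in milliseconds.
STAGE_LATENCY_MS = {
    "transport": 100,          # "sub-100ms", taken at its upper bound
    "speech_recognition": 80,  # "~80ms"
    "llm_first_token": 150,    # "~150ms"
    "speech_synthesis": 75,    # "~75ms"
}
END_TO_END_BUDGET_MS = 500     # "sub-500ms"


def total_latency_ms(stages: dict) -> int:
    """Sum the per-stage latencies, assuming no overlap between stages."""
    return sum(stages.values())


def headroom_ms(stages: dict, budget_ms: int) -> int:
    """Return how much of the budget is left after all stages."""
    return budget_ms - total_latency_ms(stages)
```

Under these assumptions the stages sum to 405ms, leaving roughly 95ms of headroom for turn-taking decisions, database reads, and network jitter.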
Async Processing Channels
Parallel analysis systems running alongside the main voice pipeline.
Voice transcript → Emotion classification → Mood journal storage
A psychology-based model analyzes conversation tone and topics in real time. Results are automatically logged as mood journal entries.
Conversation pattern analysis → Loneliness scoring → Family alert trigger
Applies validated clinical scales to conversation data. When thresholds are exceeded, real-time alerts are sent to the family dashboard.
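The threshold-to-alert step might look like the sketch below. The score range, the threshold of 6.0, the three-conversation averaging window, and the `Alert` type are all hypothetical; the specific clinical scale and its cut-off are not named here, and averaging over a window is an added assumption meant to avoid alerting on a single bad day.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Alert:
    """Hypothetical payload pushed to the family dashboard."""
    user_id: str
    score: float
    message: str


def loneliness_alert(
    user_id: str,
    scores: List[float],
    threshold: float = 6.0,  # hypothetical cut-off, not a clinical value
    window: int = 3,         # hypothetical number of recent conversations
) -> Optional[Alert]:
    """Return an Alert when the recent average score exceeds the threshold."""
    if len(scores) < window:
        return None  # not enough conversations to judge
    recent = scores[-window:]
    average = sum(recent) / window
    if average > threshold:
        return Alert(
            user_id=user_id,
            score=round(average, 1),
            message="Loneliness score above threshold",
        )
    return None
```

For example, scores of 4, 7, 7, 8 give a recent average above 6.0 and produce an alert, while 5, 6, 6 do not.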
Conversation transcription → Summary generation → Real-time DB storage
AI auto-generates conversation summaries. Real-time sync reflects changes instantly on the dashboard.
Camera capture → Vision AI → Drug info extraction → Voice guidance
Vision AI model performs OCR analysis on prescriptions. Extracted medication info is injected into voice conversation context.
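Once OCR has produced raw text, drug info extraction can be sketched as line-by-line pattern matching. The line format (name, dose, unit, schedule), the regular expression, and `parse_prescription` are assumptions for illustration; real prescriptions vary widely, and the Vision AI model may extract structured fields directly.

```python
import re
from typing import Dict, List

# Hypothetical line format: "<name> <dose><unit> <schedule>".
LINE_PATTERN = re.compile(
    r"^(?P<name>[A-Za-z][A-Za-z \-]*?)\s+"
    r"(?P<dose>\d+(?:\.\d+)?)\s*(?P<unit>mg|mcg|g|ml)\b\s*"
    r"(?P<schedule>.*)$",
    re.IGNORECASE,
)


def parse_prescription(ocr_text: str) -> List[Dict[str, str]]:
    """Extract medication entries from OCR text, skipping unrelated lines."""
    medications: List[Dict[str, str]] = []
    for raw_line in ocr_text.splitlines():
        match = LINE_PATTERN.match(raw_line.strip())
        if match:
            medications.append({
                "name": match.group("name").strip(),
                "dose": f"{match.group('dose')} {match.group('unit').lower()}",
                "schedule": match.group("schedule").strip(),
            })
    return medications
```

Lines that do not match, such as a patient name, are ignored, and each returned entry can be rendered as a short sentence for the voice conversation context.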
IntuneLabs Voice AI Platform
Vertically integrated voice AI stack architecture
STT, TTS, VAD, and turn-taking models run on co-located infrastructure. IntuneLabs connects its optimized LLM and RAG context on top of this platform.
Real-time Transport
End-to-end encryption, high-quality audio codec, NAT traversal
Fallback Transport
Bidirectional streaming, inactivity auto-close
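Inactivity auto-close can be sketched as a receive loop that stops once no data arrives within an idle timeout. The queue-based stream, the function name, and the timeout value are assumptions; the actual fallback transport protocol is not specified.

```python
import asyncio
from typing import List


async def receive_until_idle(
    queue: asyncio.Queue,
    idle_timeout_s: float,  # hypothetical idle window before closing
) -> List[bytes]:
    """Collect chunks until none arrives within idle_timeout_s seconds."""
    received: List[bytes] = []
    while True:
        try:
            # The timeout restarts on every chunk, so only a quiet period
            # longer than idle_timeout_s ends the loop.
            chunk = await asyncio.wait_for(queue.get(), timeout=idle_timeout_s)
        except asyncio.TimeoutError:
            break  # connection considered inactive: close it
        received.append(chunk)
    return received
```

Because the timer resets per chunk, a long but active session stays open while an abandoned one is released.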
SDK
Web and Mobile (iOS/Android) multi-platform support
Real-time STT
~80ms latency, 90+ languages, predictive transcription, auto VAD
High-quality TTS
~75ms inference, multilingual voices, Expressive Mode
Turn-Taking Model
Proprietary conversation timing, optimized for elderly users, natural interruption handling
LLM Server
Streaming response, real-time function calling support
Large Language Model
Fast first-token generation, strong instruction following, Vision support
RAG Knowledge Base
Conversation memory, mood state, and medication info, injected in real time
Emotion Analysis
Clinically validated loneliness scale + emotion classification
Family Dashboard
Real-time mood tracking, loneliness alerts, auto-sent conversation summaries
Medicine Guide
Vision AI OCR → drug info extraction → voice guidance integration
Full Technology Stack
Integrated Voice Platform
STT + TTS + VAD all-in-one agent
Real-time STT
~80ms latency, 90+ languages, predictive transcription
High-quality TTS
~75ms inference, multilingual voices
Expressive Voice
Natural intonation and emotion in speech
Real-time Transport
End-to-end encrypted, high-quality audio streaming
VAD + Turn-Taking
Proprietary speech/turn detection models
Large Language Model
Streaming LLM server, real-time function calling
Vision AI
Prescription OCR, image analysis
Loneliness Detection
Clinically validated conversation-based scoring
Emotion Classification
Real-time emotion/sentiment analysis
RAG Context Engine
Conversation memory + mood + medicine context
Auto Summarization
AI-powered conversation title/summary generation
React Full-stack
Server Components, Streaming SSR
Real-time Database
Serverless functions, real-time sync
Edge Network
Global CDN, Edge Functions
TypeScript
Strict mode, full type safety
Utility CSS
Component-based styling system
Interaction Animations
Physics-based animation system
SSO / MFA / RBAC
Enterprise-grade authentication platform
E2E Encryption
AES-256 end-to-end encryption
GDPR / PIPA
EU/Korea privacy regulation compliance
SOC 2 Type II
Service organization security attestation report
Zero-log Policy
Voice data never stored, immediate discard
HIPAA Ready
Healthcare information protection readiness
CI/CD
Automated pipeline, build, deploy
Auto Deployments
Preview + Production auto-deploy
Real-time Monitoring
Data and performance dashboards
Internationalization
Multilingual (ko/en) i18n framework
Error Tracking
Error and performance monitoring
Server-side PDF
Report PDF server-side rendering
Voice Agent SDK
Voice agent client integration
Real-time Data SDK
Real-time data subscriptions/mutations
Auth SDK
Auth UI components, session management
Accessible UI
Accessibility-first headless components
Image Optimization
Automatic optimization, lazy loading
React Server Components
Server Components, Suspense