Deepgram Teams with Fortanix and NVIDIA to Offer Private On‑Prem Voice AI

Deepgram announced a partnership with Fortanix that lets enterprises run its real‑time voice AI models on‑premises while keeping both audio data and proprietary model weights encrypted during active inference. The solution uses Fortanix Confidential AI and NVIDIA Confidential Computing‑enabled GPUs and targets highly regulated sectors that must protect data at rest, in transit, and in use. By combining Deepgram’s low‑latency, high‑accuracy speech‑to‑text, text‑to‑speech, and speech‑to‑speech capabilities with hardware‑rooted isolation, the joint offering promises the same real‑time performance as cloud deployments without exposing raw audio or intellectual property to the underlying infrastructure. The on‑prem stack is positioned for organizations that handle patient conversations, financial transactions, or classified briefings, use cases where cloud‑based APIs are often ruled out by strict compliance and data‑sovereignty requirements.

Deepgram, Fortanix, and NVIDIA Enable In‑Use Encryption for Voice AI

The three companies have built a pre‑integrated stack that runs Deepgram’s speech‑to‑text, text‑to‑speech, and speech‑to‑speech models inside a hardware‑isolated Trusted Execution Environment (TEE). NVIDIA GPUs provide the Confidential Computing platform, while Fortanix Confidential AI creates the enclave that encrypts both the incoming audio and the model weights while they are being processed. According to the announcement, this architecture prevents the host operating system, privileged administrators, or any other infrastructure component from accessing the data or the model during inference.

As the announcement describes it, Fortanix Confidential AI builds on NVIDIA’s Confidential Computing to form TEEs that isolate AI workloads from the underlying OS and hardware, keeping data and model weights encrypted in memory. The design is intended to ensure that even a privileged insider cannot read the audio stream or extract the model’s proprietary parameters. Deepgram’s CEO and Co‑Founder Scott Stephenson emphasized that the joint solution “makes it possible to run voice AI without compromise. Your data will stay protected, your models will stay protected, and you still get real‑time performance.” Fortanix CEO Anand Kashyap added that the approach gives model owners a “secure, proven path to unlock new markets and enterprise customers without exposing their most valuable IP.” NVIDIA’s VP of Enterprise AI Platforms Justin Boitano highlighted that the combined platform offers “sovereignty and trust” for regulated industries.

Beyond the high‑level description, the source notes that the stack is engineered for “any environment including those with the highest confidentiality and regulatory needs,” and that Deepgram’s models deliver “accuracy, consistency, and low latency that enterprise use demands.” By pre‑integrating the three technologies, the solution eliminates the need for customers to stitch together separate security layers, providing a single, production‑grade pathway to deploy voice AI on‑premises while meeting stringent compliance mandates.

Platform Architecture and Security Guarantees

Fortanix Confidential AI builds on NVIDIA’s Confidential Computing to create TEEs that isolate AI workloads from the underlying OS and hardware. Within these enclaves, data and model weights remain encrypted in memory, making them inaccessible even to privileged insiders. The solution is designed to help organizations meet their obligations under regimes such as HIPAA and GDPR, as well as national data‑residency requirements.

Deepgram’s voice models are engineered for low latency and high accuracy, and the partners claim the on‑prem stack delivers “real‑time performance” comparable to cloud deployments. The joint offering is positioned for environments where security policies prohibit sending raw audio or model parameters to external clouds, such as hospitals processing patient conversations, financial firms handling transaction calls, or government agencies dealing with classified briefings.

Implications for Regulated Enterprises

The announcement opens several use cases for organizations that have previously avoided voice AI due to data‑privacy concerns:

Private, on‑prem voice agents that can handle sensitive customer or patient interactions without exposing recordings or model logic.
Enterprise‑wide transcription layers that capture calls, meetings, and internal conversations for analytics, compliance, or search while keeping the content encrypted during processing.
Voice‑enabled IT, operations, and service‑desk tools that run entirely within a secure perimeter, eliminating the need for external API calls.

By keeping both audio and model weights encrypted throughout inference, the solution aims to satisfy the “security, confidentiality, and regulatory requirements” that many enterprises cite as blockers to adopting real‑time voice interfaces.

Key Takeaways

Enterprises can now run Deepgram’s voice AI models on‑premises using Fortanix Confidential AI and NVIDIA Confidential Computing‑enabled GPUs, with data and model weights encrypted during active inference.
The joint stack creates a hardware‑isolated Trusted Execution Environment that protects against access by the host OS or privileged administrators and is designed to support compliance with regimes such as HIPAA and GDPR.
The solution targets regulated sectors—healthcare, finance, government—by enabling private voice agents, secure transcription layers, and voice‑enabled IT tools without sacrificing real‑time performance.

TechInsyte's Take

This collaboration provides a concrete path for enterprises that must keep audio and AI assets within their own security perimeter, removing a key barrier to adopting voice AI in regulated environments. Buyers should verify integration complexity, performance benchmarks on their specific hardware, and ongoing support commitments, as the practical rollout details remain limited in the announcement.

Source: Business Wire