What is PII redaction?


PII Redaction

PII redaction automatically detects and masks sensitive personal data in voice AI transcripts and logs before it gets stored.


Definition

PII redaction is the automatic detection and masking of sensitive personal information, such as full card numbers, CVVs, Aadhaar numbers, PAN (Permanent Account Number) and passwords, in voice AI transcripts, logs and recordings before they are written to storage. Masked values are typically replaced with tokens like [CARD] or [AADHAAR], so operations teams can still audit calls without ever seeing the raw sensitive data.

What PII redaction does

Redaction is the last line of defence between a live voice conversation and the long-term log of that conversation. As the caller speaks an account number or reads a card's CVV, the redaction layer identifies the sensitive span and rewrites it to a placeholder like [CARD] or [OTP] in the stored transcript. The audio recording can be muted or beeped over the same span.
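As a rough illustration of the step described above, the sketch below assumes the speech recogniser returns word-level timestamps and that an upstream detector has already labelled the sensitive words. The `Word` class, its `label` field and the `redact_words` function are hypothetical names chosen for this example, not part of any ThinnestAI API.

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple


@dataclass
class Word:
    text: str
    start: float                  # start time in seconds
    end: float                    # end time in seconds
    label: Optional[str] = None   # e.g. "CVV"; set by a detector (assumed)


def redact_words(words: List[Word]) -> Tuple[str, List[Tuple[float, float]]]:
    """Return the redacted transcript and the audio ranges to mute or beep."""
    out: List[str] = []
    mute: List[Tuple[float, float]] = []
    prev_label = None
    for w in words:
        if w.label is None:
            out.append(w.text)
        elif w.label == prev_label:
            # Same sensitive span continues: extend the current mute range.
            mute[-1] = (mute[-1][0], w.end)
        else:
            # A new sensitive span starts: emit one placeholder for it.
            out.append(f"[{w.label}]")
            mute.append((w.start, w.end))
        prev_label = w.label
    return " ".join(out), mute


words = [
    Word("my", 0.0, 0.2), Word("cvv", 0.2, 0.5), Word("is", 0.5, 0.6),
    Word("1", 0.6, 0.8, "CVV"), Word("2", 0.8, 1.0, "CVV"),
    Word("3", 1.0, 1.2, "CVV"), Word("thanks", 1.3, 1.7),
]
transcript, mute_ranges = redact_words(words)
print(transcript)    # my cvv is [CVV] thanks
print(mute_ranges)   # [(0.6, 1.2)]
```

Consecutive words with the same label collapse into a single placeholder and a single mute range, so the stored transcript and the recording hide the same span.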

Why it matters

For any business handling payments, collections or KYC, redaction is a large part of what makes voice AI legally usable. It supports the data minimisation expected under India's DPDP Act, and PCI DSS prohibits storing CVVs after authorisation and requires any stored card number (the primary account number, also abbreviated PAN) to be rendered unreadable. Redacted logs also make audits and QA review safer: supervisors can listen to coaching calls without being exposed to customer secrets.

How ThinnestAI does it

ThinnestAI uses a hybrid approach. Deterministic regex catches well-structured fields (12-digit Aadhaar, 10-character PAN, 16-digit card numbers, OTPs) with near-zero false negatives when the transcript itself is accurate. A lightweight LLM classifier catches free-form sensitive spans that regex misses, like a spoken password. Detection runs on both the ASR output and the TTS input, so the agent also cannot accidentally speak back a full card number. Redaction rules are configurable per agent workload.
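The hybrid idea can be sketched roughly as follows. This is not ThinnestAI's implementation: the patterns are simplified illustrations, `redact` and `regex_spans` are hypothetical names, and `toy_password_classifier` is a regex stand-in for the LLM classifier so that the example runs without a model.

```python
import re
from typing import Callable, List, Optional, Tuple

Span = Tuple[int, int, str]  # (start, end, label)

# Illustrative patterns, listed in priority order (most specific first).
PATTERNS = [
    ("CARD", re.compile(r"\b\d{4}[ -]?\d{4}[ -]?\d{4}[ -]?\d{4}\b")),
    ("AADHAAR", re.compile(r"\b\d{4}[ -]?\d{4}[ -]?\d{4}\b")),
    ("PAN", re.compile(r"\b[A-Z]{5}\d{4}[A-Z]\b")),
    # Deliberately broad: also hits years and amounts, i.e. it can over-redact.
    ("OTP", re.compile(r"\b\d{4,6}\b")),
]


def _free(spans: List[Span], start: int, end: int) -> bool:
    """True if [start, end) overlaps none of the spans accepted so far."""
    return all(end <= s or start >= e for s, e, _ in spans)


def regex_spans(text: str) -> List[Span]:
    spans: List[Span] = []
    for label, pattern in PATTERNS:
        for m in pattern.finditer(text):
            if _free(spans, m.start(), m.end()):
                spans.append((m.start(), m.end(), label))
    return spans


def redact(text: str,
           classifier: Optional[Callable[[str], List[Span]]] = None) -> str:
    spans = regex_spans(text)
    if classifier is not None:
        # Classifier spans only fill gaps that the regex pass left uncovered.
        for start, end, label in classifier(text):
            if _free(spans, start, end):
                spans.append((start, end, label))
    # Replace right to left so earlier offsets stay valid.
    for start, end, label in sorted(spans, reverse=True):
        text = text[:start] + f"[{label}]" + text[end:]
    return text


def toy_password_classifier(text: str) -> List[Span]:
    """Stand-in for an LLM classifier (assumption for this sketch)."""
    m = re.search(r"password is (\S+)", text)
    return [(m.start(1), m.end(1), "PASSWORD")] if m else []


print(redact("my card is 4111 1111 1111 1111 and pan ABCDE1234F"))
# my card is [CARD] and pan [PAN]
print(redact("my password is hunter2", toy_password_classifier))
# my password is [PASSWORD]
```

The same `redact` call could be applied to the agent's reply text before it is sent to speech synthesis, which mirrors running detection on both the ASR output and the TTS input.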

Limits and tradeoffs

Redaction is never perfect. Aggressive rules can over-redact and destroy audit value; permissive rules risk leakage. Voice adds extra difficulty because ASR can mis-hear a digit and break a regex match. Teams should tune redaction per use case and regularly sample redacted transcripts for quality.
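One way to make the sampling step concrete is a small leak check over already-redacted transcripts. The sketch below is an assumption about how a team might do this, not a described ThinnestAI feature: `sample_for_review` and the 8-digit threshold are hypothetical choices. It flags any sampled transcript that still contains a long run of digits, which is the kind of residue left behind when ASR drops or mis-hears a digit and the regex no longer matches.

```python
import random
import re
from typing import List, Optional, Tuple

# Eight or more digits, optionally separated by single spaces or hyphens.
DIGIT_RUN = re.compile(r"(?:\d[ -]?){8,}")


def sample_for_review(redacted: List[str], k: int,
                      seed: Optional[int] = None) -> List[Tuple[str, bool]]:
    """Pick up to k redacted transcripts and mark those with suspected leaks."""
    rng = random.Random(seed)
    sample = rng.sample(redacted, min(k, len(redacted)))
    return [(t, bool(DIGIT_RUN.search(t))) for t in sample]


transcripts = [
    "card is [CARD]",
    "card is 4111 1111 1111 111",   # 15 digits: ASR dropped one, regex missed it
    "otp [OTP]",
]
results = sample_for_review(transcripts, k=3, seed=0)
flagged = sorted(t for t, leak in results if leak)
print(flagged)   # ['card is 4111 1111 1111 111']
```

A check like this only catches digit-shaped leaks; free-form secrets such as spoken passwords still need human review of the sample.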

More definitions

Voice AI Agent

A voice AI agent is an AI system that holds real-time spoken conversations via phone, web or SIP — combining speech recognition, an LLM and speech synthesis.

Voice AI

Voice AI is the umbrella term for AI that understands and generates human speech in real time — powering voice assistants, phone agents and translation.

Conversational AI

Conversational AI is the category of AI that interacts with humans in natural language across chat, voice, email and messaging — using NLU, LLMs and tools.

IVR vs Voice AI

IVR is a rigid scripted tree (press 1 for sales). Voice AI is a natural-language agent that understands free-form speech, reasons and calls tools.

BYOK (Bring Your Own Key)

BYOK lets you bring your own LLM, STT and TTS API keys — the voice AI platform routes usage through your accounts instead of bundling provider costs.

BYON (Bring Your Own Number)

BYON lets you bring your own phone number — via Twilio, Vobiz or Exotel — and connect it to the voice AI platform via SIP instead of renting one.
