Production ML Track

Speech Safety, Privacy, And Security

Speech engineers must protect users from prompt injection, impersonation, unsafe voice cloning, PII leakage, retention mistakes, and abuse while still shipping useful ASR, TTS, and speech-to-speech systems.

Threat Model

Speech Systems Expand The Attack Surface

Audio systems inherit normal ML risks, then add speaker identity, acoustic context, transcription uncertainty, replay attacks, consent, and the emotional impact of generated voices.

Spoken Prompt Injection

A user or background speaker says instructions that try to override policy, reveal private context, or change tool behavior.

Hidden answer: strong mitigation

Treat ASR output as untrusted user input. Keep system and tool policy outside the transcript, classify tool intent separately, require explicit confirmations for sensitive actions, evaluate retrieval grounding, and log sanitized decision traces rather than raw audio by default.

Voice Impersonation

A generated or replayed voice attempts to pass as a trusted person, approve a transaction, or access private data.

Hidden answer: strong mitigation

Do not treat voice likeness as authentication. Use separate factors, liveness and replay checks where appropriate, risk-based step-up confirmation, consented enrollment, anti-spoof evals, and clear product boundaries around what a cloned voice may do.

PII In Audio And Transcripts

Raw audio, transcripts, embeddings, traces, and labels may contain private names, addresses, health data, or account facts.

Hidden answer: strong mitigation

Design retention before collection. Prefer derived metrics, explicit opt-in review pools, scoped access, redaction, encryption, dataset lineage, deletion workflows, and synthetic fixtures for CI. A debugging need is not a blanket reason to store raw audio.

TTS Abuse And Harmful Output

TTS can produce deceptive calls, harassment, unsafe instructions, or brand-damaging speech if generation is under-controlled.

Hidden answer: strong mitigation

Gate text before synthesis, restrict high-risk voices, watermark or disclose where suitable, rate-limit abuse patterns, maintain voice owner consent, and create rapid takedown and rollback paths. Measure abuse response latency as an operational metric.

Control Plane

Build Safety Into Release Gates

Safety should not be a final checklist after model quality passes. It should be part of dataset contracts, eval design, CI/CD, monitoring, incident response, and cost-aware serving.

  1. Data contract: define consent, retention, redaction, allowed features, labeling access, and deletion behavior.
  2. Model contract: document input modalities, output risks, unsupported use cases, safety filters, and fallback modes.
  3. Eval contract: include prompt injection, replay, PII leakage, toxic synthesis, language slices, and noisy-background cases.
  4. Serving contract: isolate tenants, version policies, record sanitized traces, enforce rate limits, and keep rollback fast.
  5. Incident contract: define severity, owner, mitigation lever, user communication path, and evidence preservation rules.
Question: What is the difference between model safety and system safety?

Model safety asks whether the model behaves acceptably for tested inputs. System safety asks whether the full product remains safe when transcription is wrong, retrieval is stale, policies change, tools are available, attackers adapt, latency spikes, and operators need to debug without exposing private data.

Coding Labs

Small Safety Utilities

These exercises are deliberately compact. They train the habit of turning abstract safety concerns into inspectable controls and tests.

Lab 1: Transcript Redaction Gate

Given a transcript and a list of sensitive patterns, return a redacted transcript and a flag saying whether storage requires restricted handling.

Hidden answer: invariant, tests, and Python solution

Invariant: each sensitive detector is applied before persistence, and any match upgrades handling. Test no matches, repeated matches, overlapping categories, and transcripts that should be discarded rather than stored.

import re


def redact_transcript(text, detectors):
    restricted = False
    redacted = text
    for label, pattern in detectors:
        redacted, count = re.subn(pattern, f"[REDACTED_{label}]", redacted)
        if count:
            restricted = True
    return redacted, restricted


detectors = [
    ("EMAIL", r"\b[\w.+-]+@[\w-]+\.[\w.-]+\b"),
    ("PHONE", r"\b\d{3}[-.]\d{3}[-.]\d{4}\b"),
]

Lab 2: Sensitive Tool Confirmation

Given a parsed voice intent, decide whether the assistant may execute it, needs confirmation, or must refuse.

Hidden answer: invariant, tests, and Python solution

Invariant: high-risk action classes require explicit confirmation from the active user context, and disallowed actions are refused even when the transcript sounds confident. Test background speaker commands, money movement, account deletion, and harmless queries.

def tool_decision(intent, policy):
    action = intent["action"]
    if action in policy["blocked"]:
        return "refuse"
    if action in policy["requires_confirmation"]:
        if intent.get("confirmed") and intent.get("speaker") == "active_user":
            return "execute"
        return "confirm"
    return "execute"


policy = {
    "blocked": {"share_secret", "impersonate_person"},
    "requires_confirmation": {"send_message", "delete_file", "place_order"},
}

Lab 3: Abuse Spike Detector

Given per-window counts for blocked TTS requests, detect a possible abuse spike while ignoring tiny sample sizes.

Hidden answer: invariant, tests, and Python solution

Invariant: alert only when volume is large enough and the blocked fraction meaningfully exceeds baseline. Test zero traffic, low volume, high volume with normal rate, and high volume with a sharp blocked-rate increase.

def abuse_spike(window, baseline_rate, min_requests, multiplier):
    total = window["total"]
    blocked = window["blocked"]
    if total < min_requests:
        return False
    if baseline_rate <= 0:
        return blocked > 0
    return (blocked / total) >= baseline_rate * multiplier

Interview Prompts

Advanced Safety And Privacy Questions

Prompt 1: A Voice Agent Executes Background Instructions

A user reports that a support agent followed instructions spoken by a TV in the background. How do you triage and prevent recurrence?

Hidden answer: strong response

Mitigate first by disabling or confirming sensitive tools, then inspect sanitized traces for ASR segments, speaker labels, intent confidence, tool policy version, and confirmation state. Prevent recurrence with active-speaker checks, tool-risk tiers, confirmation prompts, injection evals, and a canary gate that includes noisy background audio.

Prompt 2: A Customer Requests Deletion Of Their Voice Data

What should a production speech platform delete, and what metadata may be kept?

Hidden answer: strong response

Delete raw audio, transcripts, labels, embeddings, derived examples, and model-training references linked to the user according to the retention contract. Keep only permitted aggregate metrics and audit records that do not reconstruct the user's content. The answer should mention lineage, backups, downstream datasets, and proof of deletion.

Prompt 3: A TTS Voice Is Used For Fraudulent Calls

Your platform detects a spike in generated calls impersonating a public-facing employee. What actions do you take?

Hidden answer: strong response

Rate-limit or disable the abusive route, preserve evidence, notify safety/legal owners, revoke the voice if consent or policy is violated, add detection rules, and review gaps in voice enrollment, output filtering, and abuse monitoring. Measure time to mitigation and ship regression tests before re-enabling risky paths.