A practical guide to SOC2 compliance for AI and LLM applications—controls, audit gaps, and how to build a compliance-ready AI stack.
SOC2 audits are expanding to cover AI systems. Auditors are asking: Where does AI data flow? Is it logged? Encrypted? Who has access? If you're building or scaling an AI application, you need to understand how the Trust Service Criteria apply to LLM integrations and what controls auditors expect. This guide walks through each criterion, common gaps, and how to build a compliance-ready AI stack from day one.
SOC2 (System and Organization Controls 2) is a framework for demonstrating that your organization has controls over security, availability, processing integrity, confidentiality, and privacy. It's commonly required by enterprise customers in healthcare, finance, and SaaS. As AI features become core to products, auditors are scrutinizing AI data flows.
AI introduces new data pipelines: user prompts → your backend → third-party LLM provider → responses. Each step must be controlled, and auditors will probe exactly those questions: where the data flows, whether it's logged and encrypted, and who can access it.
Without answers, you risk audit findings that delay deals or require costly remediation.
Security covers protection of systems and data against unauthorized access.
Encryption: Prompts and responses often contain sensitive data. Encrypt at rest (AES-256) and in transit (TLS 1.2+). Logs must also be encrypted—audit trails that store full prompts are a high-value target.
Access controls: Who can view AI logs? Who can change guardrail rules? Implement role-based access (RBAC) and principle of least privilege. API keys for AI providers should be stored in a secrets manager, not in code.
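As a minimal sketch of the "no keys in code" rule: load the provider key from the environment, where a secrets manager injects it at deploy time, and refuse to start without it. The variable name `LLM_PROVIDER_API_KEY` and the function name are hypothetical.

```python
import os

def load_provider_key(env_var: str = "LLM_PROVIDER_API_KEY") -> str:
    """Fetch the provider API key injected by a secrets manager.

    A secrets manager (e.g. Vault, AWS Secrets Manager) populates the
    environment at deploy time, so the key never appears in source control.
    """
    key = os.environ.get(env_var)
    if not key:
        raise RuntimeError(f"{env_var} is not set; refusing to start")
    return key
```

Failing fast at startup (rather than on the first request) makes a missing or revoked key an obvious deployment error instead of a silent runtime failure.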
API key management: Rotate keys periodically. Use separate keys per environment. Revoke keys when employees leave. Document key lifecycle in your security policies.
Incident response: Define how you respond to AI-related incidents (e.g., PII leak, prompt injection). Include AI systems in your incident response plan.
Availability covers system uptime and resilience.
Uptime: If your AI proxy or guardrail service is down, do requests fail open or closed? Document the failure mode. Many teams fail open (bypass guardrails) to avoid blocking users; that creates a compliance gap during outages.
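One way to make the failure mode explicit is to fail closed in code: if the guardrail check itself errors, block the request instead of silently bypassing controls. This is a sketch with hypothetical names; `check_guardrails` and `call_llm` stand in for your guardrail service and provider client.

```python
def guarded_completion(prompt: str, check_guardrails, call_llm) -> dict:
    """Fail closed: if the guardrail service is unreachable, block the
    request rather than bypassing controls during the outage."""
    try:
        verdict = check_guardrails(prompt)
    except Exception:
        # Guardrail service down: treat as a block, not a bypass.
        return {"blocked": True, "reason": "guardrails_unavailable"}
    if not verdict.get("allowed", False):
        return {"blocked": True, "reason": verdict.get("reason", "policy")}
    return {"blocked": False, "response": call_llm(prompt)}
```

Whichever mode you choose, the point for SOC2 is that the behavior is deliberate and documented, not an accident of exception handling.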
Error handling: How do you handle LLM provider errors (rate limits, timeouts)? Retries, fallbacks, and circuit breakers should be documented.
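A documented retry policy can be as simple as exponential backoff with jitter. This sketch assumes a custom `TransientProviderError` standing in for whatever rate-limit or timeout exception your provider client raises.

```python
import random
import time

class TransientProviderError(Exception):
    """Stand-in for a provider rate-limit (429) or timeout error."""

def call_with_retries(request_fn, max_attempts: int = 4, base_delay: float = 0.5):
    """Retry transient provider errors with exponential backoff and
    jitter; re-raise after max_attempts so the failure is visible."""
    for attempt in range(max_attempts):
        try:
            return request_fn()
        except TransientProviderError:
            if attempt == max_attempts - 1:
                raise
            # Jittered backoff: 0.5x-1.5x of base_delay * 2^attempt.
            delay = base_delay * (2 ** attempt) * (0.5 + random.random())
            time.sleep(delay)
```

Only transient errors are retried; authentication and validation errors should surface immediately rather than burn retry budget.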
Rate limiting: Protect your AI integration from abuse. Rate limits prevent a single user or attacker from exhausting your API quota or causing cascading failures.
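A token bucket is one common way to implement per-user rate limits. This is a minimal single-process sketch; production systems typically back the bucket with Redis or an API gateway.

```python
import time

class TokenBucket:
    """Per-user token bucket: refills at `rate` tokens/sec up to `capacity`."""

    def __init__(self, rate: float, capacity: int):
        self.rate = rate
        self.capacity = capacity
        self.tokens = float(capacity)
        self.last = time.monotonic()

    def allow(self) -> bool:
        """Consume one token if available; otherwise reject the request."""
        now = time.monotonic()
        self.tokens = min(self.capacity,
                          self.tokens + (now - self.last) * self.rate)
        self.last = now
        if self.tokens >= 1:
            self.tokens -= 1
            return True
        return False
```

Rejected requests should also be logged: a spike in rate-limit rejections for one user is itself an abuse signal worth reviewing.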
Processing integrity ensures systems achieve their intended purpose without unauthorized modification.
Audit trails: Every AI request and response must be logged with sufficient metadata (user, timestamp, model, outcome). Auditors will sample logs to verify completeness.
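The metadata above can be captured in a small record type. This sketch hashes the prompt rather than storing raw text, one possible design when the log itself must not become a second copy of sensitive data; field names are illustrative.

```python
import hashlib
from dataclasses import dataclass, asdict
from datetime import datetime, timezone

@dataclass
class AuditLogEntry:
    """Minimal metadata auditors expect on every AI request."""
    user_id: str
    timestamp: str       # ISO 8601, UTC
    model: str
    prompt_sha256: str   # hash, so the log need not hold raw prompt text
    outcome: str         # "allowed" | "blocked" | "error"
    tokens_in: int
    tokens_out: int

def make_entry(user_id, model, prompt, outcome, tokens_in, tokens_out):
    return AuditLogEntry(
        user_id=user_id,
        timestamp=datetime.now(timezone.utc).isoformat(),
        model=model,
        prompt_sha256=hashlib.sha256(prompt.encode()).hexdigest(),
        outcome=outcome,
        tokens_in=tokens_in,
        tokens_out=tokens_out,
    )
```

If your PII policy requires storing full prompts (e.g. for dispute resolution), store them encrypted and keep the hash in the searchable log.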
Rule enforcement: If you have guardrails (PII detection, secret scanning), prove they're enforced. Log rule evaluations and violations. Demonstrate that blocked requests never reached the LLM provider.
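Logging every evaluation, not just violations, is what lets you prove enforcement. A minimal sketch, with rules modeled as hypothetical predicate functions:

```python
def evaluate_rules(prompt: str, rules: dict):
    """Run every guardrail rule and record its verdict, pass or fail.

    Returns (allowed, evaluations); the caller forwards the request to
    the LLM provider only if `allowed` is True, and attaches
    `evaluations` to the audit log entry either way.
    """
    evaluations = []
    allowed = True
    for name, rule_fn in rules.items():
        passed = rule_fn(prompt)
        evaluations.append({"rule": name, "passed": passed})
        if not passed:
            allowed = False
    return allowed, evaluations
```

Because the provider call is gated on `allowed`, the log of a blocked request is itself evidence that the prompt never left your infrastructure.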
Data validation: Validate inputs before sending to AI providers. Document validation rules and how they're applied.
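Documented validation rules map naturally to a function that returns every failure at once. The limit below is a hypothetical example value, not a recommendation:

```python
MAX_PROMPT_CHARS = 8000  # example limit; set per your documented policy

def validate_prompt(prompt) -> list:
    """Return a list of validation errors; empty means the prompt may
    proceed to guardrail evaluation."""
    if not isinstance(prompt, str):
        return ["prompt must be a string"]
    errors = []
    if not prompt.strip():
        errors.append("prompt is empty")
    if len(prompt) > MAX_PROMPT_CHARS:
        errors.append("prompt exceeds maximum length")
    return errors
```

Returning all errors (rather than raising on the first) gives you a complete record of why a request was rejected, which is useful evidence for auditors.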
Change management: Changes to guardrail rules, logging configuration, or AI integrations should follow change control. Auditors expect evidence of review and approval.
Confidentiality covers protection of designated confidential information.
PII handling: Prompts may contain PII. Define how you handle it: block, redact, or allow with logging. Document the decision and implement consistently.
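If your documented decision is "redact," a sketch of the redaction path might look like the following. The regex patterns are illustrative only; production systems typically combine patterns with NER or ML-based detection.

```python
import re

# Illustrative patterns; real detectors cover many more PII categories.
PII_PATTERNS = {
    "email": re.compile(r"[\w.+-]+@[\w-]+\.[\w.]+"),
    "ssn": re.compile(r"\b\d{3}-\d{2}-\d{4}\b"),
}

def redact_pii(prompt: str):
    """Replace detected PII with typed placeholders.

    Returns (redacted_prompt, found_categories); the categories are
    logged so the handling decision is evidenced, without logging the
    PII itself.
    """
    found = []
    for label, pattern in PII_PATTERNS.items():
        if pattern.search(prompt):
            found.append(label)
            prompt = pattern.sub(f"[{label.upper()}_REDACTED]", prompt)
    return prompt, found
```

Logging the category ("email found and redacted") rather than the value keeps the audit trail useful without re-creating the confidentiality problem inside the log.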
Encryption at rest and in transit: Already covered under Security, but confidentiality explicitly requires protection of confidential data. Ensure AI logs and any stored prompts/responses are encrypted.
Data classification: Classify AI data (e.g., "confidential" if it contains PII or business data). Apply controls based on classification.
Privacy covers collection, use, retention, and disposal of personal information.
Data minimization: Send only the data necessary for the AI task. Avoid including unnecessary PII in prompts. Document what you collect and why.
Retention: Define how long you keep AI logs. Align with your privacy policy and regulatory requirements. Automate deletion—manual processes fail.
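Automated deletion can be a scheduled job that filters on a cutoff. This sketch assumes log entries carry timezone-aware `datetime` timestamps; the 90-day window is an example, not a recommendation.

```python
from datetime import datetime, timedelta, timezone

RETENTION_DAYS = 90  # align with your documented retention policy

def purge_expired(entries: list) -> list:
    """Keep only audit entries inside the retention window.

    Run on a schedule (cron or a worker) so deletion is automatic;
    manual purges get forgotten and become audit findings.
    """
    cutoff = datetime.now(timezone.utc) - timedelta(days=RETENTION_DAYS)
    return [e for e in entries if e["timestamp"] >= cutoff]
```

In a real store this would be a `DELETE ... WHERE timestamp < cutoff` (or a TTL index); the point is that the window is a constant tied to policy, not a judgment call made at purge time.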
Deletion: Support the right to erasure. Can you delete a user's AI interaction history? Implement export and delete capabilities.
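Export and erasure reduce to filtering the same store by user. A sketch, assuming entries are dicts keyed by a `user_id` field:

```python
import json

def export_user_history(entries: list, user_id: str) -> str:
    """Serialize one user's AI interaction log for a data-access request."""
    return json.dumps([e for e in entries if e["user_id"] == user_id])

def erase_user_history(entries: list, user_id: str) -> list:
    """Drop one user's entries to honor a right-to-erasure request."""
    return [e for e in entries if e["user_id"] != user_id]
```

Exporting before erasing is the common order: produce the record the user (or auditor) asked for, then delete, and log that the erasure itself happened.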
Consent and lawful basis: Under GDPR, sending personal data to AI providers requires a lawful basis (consent, legitimate interest, etc.). Document your basis and ensure users are informed.
No AI-specific logging: General application logs don't capture prompts, responses, token counts, or user attribution. Auditors need dedicated AI audit trails.
Unencrypted logs: Storing prompts in plain text in a database or log files creates a finding. Encrypt at rest.
Missing retention policy: "We keep logs forever" or "we delete when we feel like it" fails. Document and automate.
No export capability: Compliance teams need to produce logs for auditors. If there's no export (CSV, JSON, API), you'll scramble during audit prep.
Guardrails not enforced: You have PII detection rules, but they're optional or enabled only in development. Auditors will ask for evidence that rules are active in production.
Third-party AI providers: Sending data to OpenAI, Anthropic, etc. without a DPA or data processing agreement. Ensure DPAs are in place and documented.
Log everything: Encrypted audit trail for every AI request and response. Include user, timestamp, model, tokens, and rule outcomes.
Enforce guardrails: PII detection, secret scanning, token limits. Block or redact before data leaves your infrastructure.
Set retention policies: Define retention (e.g., 90 days) and automate deletion. Support export before deletion.
Encrypt at rest: Logs and any stored prompts/responses must be encrypted.
Document everything: Document your AI architecture, data flows, and controls. Keep policies up to date.
DPAs with AI providers: Ensure data processing agreements are signed and on file.
SignalVault provides an encrypted audit trail for every AI interaction, automatic retention based on plan, and export (CSV/JSON) for compliance requests. Guardrails (PII detection, secret scanning, token limits, model allowlists) run inline and are logged with each evaluation. Access is controlled via the dashboard and API keys. Together, these features address Security (encryption, access), Processing Integrity (audit trails, rule enforcement), Confidentiality (PII handling, encryption), and Privacy (retention, export).