The Complete Guide to AI Audit Logging
Learn what AI audit logging is, what to log, encryption requirements, retention policies, and how audit logs enable SOC2/GDPR compliance.
AI audit logging is the practice of recording every interaction between your application and an LLM provider—prompts, responses, metadata, and context. It's not just a debugging tool. For production AI systems, audit logs are the foundation of compliance, security, and cost governance. This guide covers what to log, how to store it, and common pitfalls that leave teams exposed during audits.
What AI Audit Logging Is and Why It Matters
When you call an LLM API, you're creating a data flow: user input → your backend → third-party AI provider → response back to the user. Without logging, that flow is invisible. You can't answer basic questions: What was sent? When? By whom? What did the model return?
**Compliance.** SOC2, GDPR, HIPAA, and PCI-DSS all require demonstrable controls over data processing. Auditors ask: "Show me how you handle AI requests." If you can't produce logs, you fail the control.
**Security.** Logs let you detect abuse—prompt injection attempts, credential stuffing, or anomalous usage patterns. They also support incident response: when a breach occurs, you need to know what data was exposed.
**Cost tracking.** LLM APIs charge per token. Without logging token counts per request, you can't attribute costs to users, features, or environments. Budget overruns become mysteries.
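To make cost attribution concrete, here is a minimal sketch of turning logged token counts into per-request cost. The per-1K-token rates below are illustrative placeholders, not real price sheets; provider pricing varies and changes over time, so treat the table as an assumption.

```typescript
// Illustrative per-1K-token rates in USD. These are NOT real prices —
// look up your provider's current price sheet before using in production.
const PRICE_PER_1K = {
  "gpt-4o": { input: 0.0025, output: 0.01 },
  "claude-3-opus": { input: 0.015, output: 0.075 },
} as const;

type ModelId = keyof typeof PRICE_PER_1K;

// Compute the USD cost of one logged request from its token counts.
function requestCost(model: ModelId, inputTokens: number, outputTokens: number): number {
  const rate = PRICE_PER_1K[model];
  return (inputTokens / 1000) * rate.input + (outputTokens / 1000) * rate.output;
}
```

With token counts and model identifiers in every log entry, this calculation can be run as an aggregation over the log store to attribute spend by user, feature, or environment.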
**Debugging.** Yes, logs help with debugging. But treating them as a debugging-only tool leads to ad-hoc implementations that don't meet compliance requirements.
What to Log
A complete AI audit log entry should capture:
Prompts and Responses
- **Input prompt** (or a hash if you must redact for storage)
- **Model response** (full text or truncated with metadata)
- **System prompt** (if used)—often contains sensitive instructions
Storing full text creates retention and PII concerns. Some teams hash prompts and store only metadata for compliance; others encrypt and store full content. The key is consistency and a documented policy.
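If you take the hash-and-metadata route, the hashing itself is simple. A minimal sketch using Node's built-in `crypto` module, producing the `sha256:`-prefixed format shown in the example log entry below:

```typescript
import { createHash } from "node:crypto";

// Store a deterministic hash instead of the raw prompt text. The same
// prompt always produces the same hash, so duplicate requests remain
// correlatable without retaining the content itself.
function hashPrompt(prompt: string): string {
  return "sha256:" + createHash("sha256").update(prompt, "utf8").digest("hex");
}
```

Note that a plain hash of a short or guessable prompt can be brute-forced; if that matters for your threat model, use a keyed hash (HMAC) with a secret key instead.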
Metadata
- **Model identifier** (e.g., `gpt-4o`, `claude-3-opus`)
- **Token counts**: input tokens, output tokens, total
- **Latency** (ms from request to response)
- **User ID** or session identifier (for attribution)
- **Environment** (production, staging, development)
- **Request ID** (for correlation across systems)
- **Timestamp** (ISO 8601, UTC)
Context
- **Application or feature** (e.g., "support chatbot", "code assistant")
- **API key or client identifier** (for multi-tenant cost allocation)
- **Rule violations** (if guardrails blocked or warned)
Example log structure:
```json
{
  "id": "req_abc123",
  "timestamp": "2026-02-27T14:32:01.234Z",
  "user_id": "user_xyz",
  "environment": "production",
  "model": "gpt-4o",
  "input_tokens": 450,
  "output_tokens": 120,
  "latency_ms": 1234,
  "violations": ["pii_detected"],
  "prompt_hash": "sha256:..."
}
```
Encryption Requirements
Plain-text logs are a liability. If logs contain PII, prompts, or responses, they must be encrypted at rest. Compliance frameworks such as SOC2 and GDPR expect encryption for sensitive data.
**At rest:** Use AES-256 or equivalent. Cloud providers (AWS, GCP, Azure) offer managed encryption for object storage and databases.
**In transit:** TLS 1.2+ for all log ingestion and retrieval.
**Key management:** Rotate keys periodically. Use a key management service (KMS) rather than hardcoding keys. If an attacker gains access to your log store, encrypted data with proper key separation limits exposure.
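Field-level encryption of sensitive log fields can be sketched with Node's built-in `crypto` module and AES-256-GCM. This is a simplified illustration, not a production design: in practice the key would come from a KMS, never be generated or hardcoded alongside the code.

```typescript
import { createCipheriv, createDecipheriv, randomBytes } from "node:crypto";

interface EncryptedField { iv: string; tag: string; data: string }

// Encrypt one sensitive field (e.g. a prompt) with AES-256-GCM.
// GCM provides both confidentiality and tamper detection via the auth tag.
function encryptField(plaintext: string, key: Buffer): EncryptedField {
  const iv = randomBytes(12); // 96-bit nonce, the standard size for GCM
  const cipher = createCipheriv("aes-256-gcm", key, iv);
  const data = Buffer.concat([cipher.update(plaintext, "utf8"), cipher.final()]);
  return {
    iv: iv.toString("base64"),
    tag: cipher.getAuthTag().toString("base64"),
    data: data.toString("base64"),
  };
}

// Decrypt and verify; throws if the ciphertext or tag was tampered with.
function decryptField(enc: EncryptedField, key: Buffer): string {
  const decipher = createDecipheriv("aes-256-gcm", key, Buffer.from(enc.iv, "base64"));
  decipher.setAuthTag(Buffer.from(enc.tag, "base64"));
  return Buffer.concat([
    decipher.update(Buffer.from(enc.data, "base64")),
    decipher.final(),
  ]).toString("utf8");
}
```

Storing the IV and auth tag alongside the ciphertext, as above, is what makes each record independently decryptable later during an audit export.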
Many teams discover too late that their `console.log` or file-based logs are unencrypted. Migrating to an encrypted audit trail after an audit finding is expensive and risky.
Retention Policies
How long you keep logs depends on:
- **Regulatory requirements:** GDPR's storage-limitation principle allows retention only as long as necessary for the stated purpose. Some industries (finance, healthcare) have specific retention mandates (e.g., 7 years).
- **Storage costs:** Logs grow quickly. A high-traffic AI app can generate gigabytes per day.
- **Operational needs:** Debugging and incident response typically need 30–90 days of hot data.
**Recommendations:**
- Define a retention policy in writing (e.g., "90 days hot, 1 year cold, then delete").
- Automate deletion. Manual processes fail.
- Support export for compliance requests before deletion.
- Document exceptions (e.g., litigation hold).
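The "90 days hot, 1 year cold, then delete" example policy above can be automated with a simple age check per record. A sketch, assuming a two-tier store and ignoring litigation holds for brevity:

```typescript
// Sketch of an automated retention sweep for the example policy
// "90 days hot, 1 year cold, then delete". Tier boundaries are the
// policy's own numbers; adapt them to whatever you document in writing.
interface LogRecord { id: string; timestamp: string; tier: "hot" | "cold" }

const DAY_MS = 24 * 60 * 60 * 1000;

// Decide what to do with one record given the current time.
function retentionAction(record: LogRecord, now: Date): "keep" | "move_to_cold" | "delete" {
  const ageDays = (now.getTime() - Date.parse(record.timestamp)) / DAY_MS;
  if (ageDays > 365) return "delete";
  if (ageDays > 90) return record.tier === "hot" ? "move_to_cold" : "keep";
  return "keep";
}
```

Run a sweep like this on a schedule (cron, scheduled Lambda, etc.); a pre-deletion export hook covers the compliance-request requirement above.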
Structured vs. Unstructured Logging
**Unstructured logging** (plain text, `printf`-style) is human-readable but hard to query. Searching for "all requests from user X that used gpt-4" requires grep or full-text search. It doesn't scale.
**Structured logging** (JSON, key-value pairs) enables:
- Filtering by user, model, environment
- Aggregation (cost by user, latency percentiles)
- Integration with SIEM and analytics tools
Prefer structured logs with a consistent schema. If you use multiple AI providers, normalize the schema so you can query across them.
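Cross-provider normalization can be as simple as mapping each provider's usage shape onto one audit schema. The field names below mirror the usage objects OpenAI and Anthropic return at the time of writing, but treat them as assumptions and verify against each provider's current API reference.

```typescript
// One normalized schema for the audit log, regardless of provider.
interface AuditUsage { model: string; inputTokens: number; outputTokens: number }

// OpenAI-style responses report usage as prompt_tokens / completion_tokens.
function fromOpenAI(resp: { model: string; usage: { prompt_tokens: number; completion_tokens: number } }): AuditUsage {
  return {
    model: resp.model,
    inputTokens: resp.usage.prompt_tokens,
    outputTokens: resp.usage.completion_tokens,
  };
}

// Anthropic-style responses report usage as input_tokens / output_tokens.
function fromAnthropic(resp: { model: string; usage: { input_tokens: number; output_tokens: number } }): AuditUsage {
  return {
    model: resp.model,
    inputTokens: resp.usage.input_tokens,
    outputTokens: resp.usage.output_tokens,
  };
}
```

With adapters like these at the ingestion boundary, queries such as "cost by user across all models" never need provider-specific logic.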
Common Mistakes
Logging to stdout
Developers often log to stdout and rely on log aggregation (e.g., Datadog, Splunk). Problems:
- Stdout logs are often unencrypted in transit and at rest
- Retention is tied to your log vendor, not your policy
- Export for audits may be difficult or costly
- No guarantee of completeness (buffers, truncation)
Use a dedicated audit log store with encryption and retention controls.
Missing metadata
Logging only the prompt and response misses the "who, when, where." Auditors need user attribution and timestamps. Cost allocation requires token counts and model identifiers.
No encryption
Storing prompts and responses in plain text creates a compliance gap. Encrypt sensitive fields at minimum; ideally, encrypt the entire log store.
Inconsistent retention
Ad-hoc retention (some logs kept forever, others deleted quickly) confuses auditors and increases storage costs. Define and automate.
No export capability
Compliance teams need to produce logs for auditors. If your system has no export (CSV, JSON, or API), you'll scramble during audit prep.
How Audit Logs Enable SOC2 and GDPR Compliance
**SOC2** focuses on security, availability, processing integrity, confidentiality, and privacy. AI audit logs support:
- **Security (CC6.1):** Logs demonstrate access controls and monitoring
- **Processing Integrity (PI1.1):** Logs show that AI requests were processed as intended
- **Confidentiality (C1.1):** Encrypted logs protect sensitive data
**GDPR** requires lawful processing, data minimization, and the right to erasure. Audit logs help you:
- Demonstrate what data was sent to AI providers
- Support data subject access requests (export a user's AI interactions)
- Implement retention and deletion policies
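A data subject access request boils down to filtering the audit store by user and serializing the result. A minimal sketch, where the in-memory array stands in for whatever store you actually use:

```typescript
// Sketch of a DSAR-style export: collect one user's AI interactions
// from the audit store and serialize them as JSON evidence.
interface AuditEntry { id: string; user_id: string; timestamp: string; model: string }

function exportUserLogs(entries: AuditEntry[], userId: string): string {
  const matches = entries.filter((e) => e.user_id === userId);
  return JSON.stringify(matches, null, 2);
}
```

The same filter, pointed at a deletion routine instead of a serializer, is the basis for honoring erasure requests.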
Tools like SignalVault provide an encrypted audit trail, automatic retention based on plan, and export capabilities (CSV/JSON) so you can produce evidence for auditors without building custom infrastructure.