The AI attack surface & secure lifecycle
Before the specific attacks, fix in mind the two maps you’ll reuse throughout. The first is the attack surface. It has four regions, and every later section lives in at least one of them: data (training, fine-tune, RAG corpora - II.2, II.13), model (weights, inference behavior - II.1), application (prompts, tools, agent logic, protocols - II.3, II.5-II.10), and infrastructure (serving, vector stores, pipelines, cloud - II.7, II.11, II.12, II.13). Google’s SAIF maps cleanly onto these four areas, which is why it crosswalks well to everything else.
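As a rough aid, the four-region map in the paragraph above can be written down as a lookup table. This is only a sketch: the names `SURFACE` and `regions_for` are invented here, and expanding "II.5-II.10" into six consecutive sections is an assumption about how the sections are numbered.

```python
# Region -> section references, copied from the paragraph above.
# Assumption: "II.5-II.10" means the consecutive sections II.5 through II.10.
SURFACE = {
    "data": ["II.2", "II.13"],
    "model": ["II.1"],
    "application": ["II.3"] + [f"II.{n}" for n in range(5, 11)],
    "infrastructure": ["II.7", "II.11", "II.12", "II.13"],
}


def regions_for(section: str) -> list[str]:
    """Return every region whose section list contains `section`."""
    return [region for region, sections in SURFACE.items() if section in sections]


if __name__ == "__main__":
    # Some sections straddle two regions, so the lookup returns a list.
    for section in ("II.1", "II.7", "II.13"):
        print(section, "->", regions_for(section))
```

The lookup returns a list rather than a single region because, as cited, II.7 and II.13 each appear under two regions.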
[ ] Which features are model-backed? (search, summarize, chat, autocomplete)
[ ] What model/version + guardrail sits behind each? (fingerprint, II.17 Ch2)
[ ] What can the model reach? tools, RAG corpus, memory, other agents (MCP/A2A)
[ ] Which actions are irreversible / outbound? (email, payments, code exec)
[ ] Where does untrusted content enter? (user, web fetch, files, tool results)
# the answers are the map you attack (II.17) and defend (III.1)

The lifecycle is the second map: data collection → training/fine-tuning → evaluation → deployment → monitoring → retirement. Attacks attach at each stage (poisoning at training, extraction and injection at inference, drift and abuse in production), and so do controls. Thinking in lifecycle stages is what turns a list of attacks into a defensible program - it tells you where a given control belongs.
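One way to make the lifecycle map concrete is to record which attacks attach to which stage and then ask which of those stages have no control yet. The sketch below is illustrative only. It assumes "inference" corresponds to the deployment stage and "production" to the monitoring stage, and the control name in the usage example is a hypothetical placeholder, not a recommendation from this text.

```python
# Lifecycle stages in the order given above.
STAGES = [
    "data collection",
    "training/fine-tuning",
    "evaluation",
    "deployment",
    "monitoring",
    "retirement",
]

# Attacks named in the text, attached to stages.
# Assumption: "inference" -> deployment, "production" -> monitoring.
ATTACKS = {
    "training/fine-tuning": ["poisoning"],
    "deployment": ["extraction", "injection"],
    "monitoring": ["drift", "abuse"],
}


def uncovered(controls: dict[str, list[str]]) -> list[str]:
    """Return stages that have known attacks but no control recorded."""
    return [
        stage
        for stage in STAGES
        if ATTACKS.get(stage) and not controls.get(stage)
    ]


if __name__ == "__main__":
    # Hypothetical control inventory, for illustration only.
    controls = {"training/fine-tuning": ["dataset provenance checks"]}
    print(uncovered(controls))
```

Keying both attacks and controls by stage is the point of the exercise: a stage with attacks and no controls is a visible gap, not something hidden in a flat list.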