Model APIs & the tool-use loop

An AI model API is a stateless HTTPS endpoint: you POST messages, the model returns a completion. The security-relevant evolution is tool use (function calling): you declare tools (name, description, JSON-schema args) and the model emits a structured call your code executes, feeding the result back. This loop turns a chatbot into an agent - the moment output becomes action.

sequenceDiagram
  autonumber
  participant App as Client App
  participant API as Model API
  participant Tool as External Tool / API
  App->>API: messages + tool definitions
  API-->>App: tool_use request (name, args)
  App->>Tool: execute call (real credentials)
  Tool-->>App: result data
  App->>API: tool_result appended to context
  API-->>App: final answer (or another tool_use)
  Note over App,API: Untrusted tool output re-enters the same channel as trusted instructions

Each return trip is a chance for attacker-controlled content (a page, file, email) to enter the model’s context and be read as an instruction.

Classic API hygiene - still mandatory

# the AI feature is still a web API - test authz, IDOR/BOLA, injection on its params
POST /v1/chat   { "session_id": "../victim-tenant/42", "prompt": "summarize my data" }
# BOLA: swap an object/tenant id to read another user context or RAG corpus
# also probe: unauthenticated /v1/embeddings, verbose errors leaking model/version, no rate-limit

Key management. Hardcoded keys leak via git history, client bundles, decompiled mobile binaries, container logs. Use a secrets manager, separate keys per environment, rotate, and front shared provider keys with an identity-aware gateway issuing per-agent virtual keys.
Token-aware rate limiting. An agent chains 10-20 calls per task in bursts that look like a DDoS, and an 8k-token completion costs ~100× a metadata lookup yet ticks the same “one request.” Limit by tokens/cost per identity with hard spend caps. (LLM10.)
Monitoring. Calls from unexpected geographies, off-hours spikes, sudden volume - treat as possible key compromise.