Skip to content

How LLMs work

A large language model is a network trained to do one deceptively small thing: predict the next token. Everything else — answering, coding, reasoning — emerges from doing that extremely well, repeatedly.

How an LLM turns a prompt into output (concrete trace)
"AI security is" --tokenize--> ["AI"," security"," is"] (-> token ids)
-> model scores next-token probabilities -> sample/argmax -> " hard"
-> append and repeat (autoregressive) -> "AI security is hard to get right."
# everything the model "knows" lives in weights; the prompt is the only runtime control surface
# which is exactly why prompt injection is the defining new attack class
  • Tokens & tokenization. Text is chopped into subword units called tokens (roughly ¾ of a word each). The model only ever reads and writes tokens, not characters or “words” as you think of them.
  • Embeddings & vector space. Each token is turned into an embedding — a long list of numbers, a vector. Vectors that sit close together mean similar things. This is the basis of search-by-meaning, of RAG, and of the embedding attacks.
  • The transformer & attention. Today’s LLMs use the transformer, whose key trick is attention: for every token, the model weighs how much every other token in view matters. Crucially, attention makes no distinction between tokens from a trusted system prompt and tokens from a web page it just read — they’re all in one stream.
  • The context window. The fixed-size span of tokens the model can “see” at once — its entire working memory for this request. The system prompt, your message, the conversation, and any retrieved or tool-returned content all live together inside it.
  • Generation & temperature. The model emits one token, appends it, and predicts again. A temperature setting controls how random the choice is. Because output is sampled, behavior is inherently variable — which is why, later, defenses are measured as success rates, not pass/fail.
flowchart LR
  subgraph CTX["Context window - one shared stream"]
    PR["System prompt + user input<br/>+ retrieved / tool content"]
  end
  PR --> TK["Tokenize"]
  TK --> EM["Embed → vectors"]
  EM --> TF["Transformer layers<br/>attention weighs relationships"]
  TF --> NT["Predict next token"]
  NT -->|"append, repeat"| TF
  NT --> OUT["Generated text"]
  classDef c fill:#26200c,stroke:#e4a23f,color:#f3dca0;
  classDef n fill:#0f1a18,stroke:#5bd1c5,color:#bdeee2;
  class PR c;
  class TK,EM,TF,NT,OUT n;