The Q, K, V Matrices

Summary

At the core of the attention mechanism in LLMs are three matrices: Query, Key, and Value. These matrices are how transformers actually pay attention to different parts of the input. In this write-up, we will go through the construction of these matrices from the ground up.

Why Q, K, V Matrices Matter

When we read a sentence like “The cat sat on the mat because it was comfortable,” our brain automatically knows that “it” refers to “the mat” and not “the cat.” This is attention in action: our brain selectively focuses on the relevant words to understand the context. In neural networks, we need a similar mechanism.

Traditional recurrent neural networks processed sequences one token at a time, maintaining hidden states that carry information forward from previous steps. The RNN process looks something like this:

Step 1: Process "The" → Hidden state h1 (knows only about "The")
Step 2: Process "cat" → Takes h1 + "cat" → produces h2 → Now h2 knows about "The" and "cat"
Step 3: Process "sat" → Takes h2 + "sat" → produces h3 → Now h3 knows about "The", "cat", and "sat"
Step 4: Process "on" → Takes h3 + "on" → produces h4
... and so on

The transformer architecture introduced in 2017 flipped this approach by replacing recurrence with attention. The attention mechanism removed the sequential bottleneck by allowing the model to look at all words simultaneously and decide which words are important for understanding each word. The Q, K, V matrices are what let the model decide which words matter for each other: they reshape the input so the model can highlight useful connections instead of treating every word equally. Instead of processing tokens sequentially, attention lets each token directly attend to every other token in the sequence simultaneously. Every word can check every other word to see how much it should care about it. For example, the model can link "sat" and "cat" right away, instead of passing information along one word at a time:

"sat" attends to:
- "The": 5% (low attention)
- "cat": 60% (high attention)
...
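To make the sequential bottleneck concrete, here is a minimal NumPy sketch of the RNN update described in the steps above. The weight names (W_h, W_x), the toy embeddings, and the sizes are illustrative assumptions, not values from the article; the point is that information about "The" only reaches later steps by surviving every intermediate hidden-state update.

    import numpy as np

    # Minimal sketch of the sequential RNN update (illustrative values only).
    np.random.seed(0)
    hidden_size, embed_size = 8, 8
    W_h = np.random.randn(hidden_size, hidden_size) * 0.1  # hidden-to-hidden weights (made up)
    W_x = np.random.randn(hidden_size, embed_size) * 0.1   # input-to-hidden weights (made up)

    tokens = ["The", "cat", "sat", "on"]
    embeddings = {t: np.random.randn(embed_size) for t in tokens}  # toy embeddings

    h = np.zeros(hidden_size)  # h0: knows nothing yet
    for step, token in enumerate(tokens, start=1):
        # Each hidden state mixes the previous state with the current token,
        # so "The" must be carried forward through every later update to reach "sat".
        h = np.tanh(W_h @ h + W_x @ embeddings[token])
        print(f"Step {step}: processed {token!r} -> h{step}")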
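And here is a minimal, self-contained sketch of how Q, K, V projections turn that idea into per-token attention weights, assuming standard scaled dot-product attention. The projection matrices W_q, W_k, W_v and the toy embeddings are random stand-ins for what a real model learns during training, so the printed percentages will not match the 5% / 60% example above.

    import numpy as np

    # Minimal sketch of scaled dot-product attention (illustrative values only).
    np.random.seed(0)
    tokens = ["The", "cat", "sat", "on", "the", "mat"]
    d_model, d_k = 16, 16

    X = np.random.randn(len(tokens), d_model)      # one embedding row per token (toy values)
    W_q = np.random.randn(d_model, d_k) * 0.1      # query projection (made up)
    W_k = np.random.randn(d_model, d_k) * 0.1      # key projection (made up)
    W_v = np.random.randn(d_model, d_k) * 0.1      # value projection (made up)

    Q, K, V = X @ W_q, X @ W_k, X @ W_v            # Q, K, V: one row per token

    scores = Q @ K.T / np.sqrt(d_k)                # how much each token "cares" about each other token
    scores -= scores.max(axis=-1, keepdims=True)   # stabilize the softmax
    weights = np.exp(scores) / np.exp(scores).sum(axis=-1, keepdims=True)  # rows sum to 1
    output = weights @ V                           # each token's output is a weighted mix of value vectors

    # Row 2 is "sat": its weights over all six tokens play the role of the
    # percentage list in the example above (these particular numbers are random).
    print({t: round(float(w), 2) for t, w in zip(tokens, weights[2])})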

First seen: 2026-01-08 00:45

Last seen: 2026-01-08 18:48