March 31, 2023 · Joseph Thornton, MD · LLM, Models

Tweet by Cameron R. Wolfe on Twitter

Cameron R. Wolfe (@cwolferesearch), 3/31/23, 6:34 PM:
"Self-attention is the primary building block of large language models (LLMs) and transformers in general. But how exactly does it work? 🧵 [1/8]" pic.twitter.com/55mRSdRqOR
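The thread goes on to explain how self-attention works. As a rough illustration of the idea the tweet is pointing at, here is a minimal sketch of single-head scaled dot-product self-attention in NumPy. The function names, shapes, and the softmax helper are assumptions for illustration, not code from the thread.

```python
import numpy as np

def softmax(x, axis=-1):
    # Numerically stable softmax along the given axis.
    x = x - x.max(axis=axis, keepdims=True)
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)

def self_attention(X, W_q, W_k, W_v):
    """Single-head scaled dot-product self-attention (illustrative sketch).

    X:             (seq_len, d_model) token embeddings
    W_q, W_k, W_v: (d_model, d_head) projection matrices
    Returns:       (seq_len, d_head) attended representations
    """
    Q = X @ W_q                          # queries
    K = X @ W_k                          # keys
    V = X @ W_v                          # values
    d_head = Q.shape[-1]
    scores = Q @ K.T / np.sqrt(d_head)   # (seq_len, seq_len) pairwise similarity
    weights = softmax(scores, axis=-1)   # each row is a probability distribution
    return weights @ V                   # weighted sum of value vectors

# Toy example: 4 tokens with 8-dimensional embeddings and an 8-dimensional head.
rng = np.random.default_rng(0)
X = rng.normal(size=(4, 8))
W_q = rng.normal(size=(8, 8))
W_k = rng.normal(size=(8, 8))
W_v = rng.normal(size=(8, 8))
out = self_attention(X, W_q, W_k, W_v)
print(out.shape)  # (4, 8)
```

Each output row mixes information from every token in the sequence, with the mixing weights learned through the query and key projections; that is the core mechanism the thread unpacks.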