Evolution of Large Language Models (LLMs): From Transformers to Agentic AI — Part 1
How did we get from simple text generators to AI that retrieves data, reasons step-by-step, and even improves itself?
From Transformer breakthroughs to self-learning AI agents, we are witnessing an era where machines are no longer passive text generators. They retrieve knowledge, interact with APIs, make decisions, and optimize themselves through feedback loops. In this article series, I explore the evolution of LLMs, starting from Transformer architectures and working up to agentic AI and agentic RAG. These are the topics I aim to explain in this short article series:
Part 1(this article) covers:
- How Transformers work and power LLMs like GPT-4, LLaMA, and Falcon
- How Causal and Masked Language Models are teaching AI to understand context
- How Reinforcement Learning from Human Feedback (RLHF) is aligning AI with human values
Part 2 covers:
- Reasoning Step-by-Step using Chain-of-Thought Prompting
- How Retrieval-Augmented Generation (RAG) is enhancing LLMs with external knowledge
- How Agentic AI is revolutionizing autonomy
1. Transformers: The Architecture That Powers LLMs
“If you only pay attention to one thing at a time, you might miss the bigger picture.”
This statement holds true in human cognition, and it’s the core idea behind Multi-Head Attention (MHA) in Transformer architectures. Unlike traditional models that process words sequentially, MHA allows AI to focus on multiple words at once, capturing the relationships between them in parallel. Multi-Head Attention is the reason LLMs like GPT-4, LLaMA, and Falcon can understand context so well.
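To make the idea concrete, here is a minimal NumPy sketch of multi-head self-attention: the model dimension is split into several heads, each head computes scaled dot-product attention over the whole sequence in parallel, and the results are concatenated. Random weight matrices stand in for the learned projections of a real Transformer; shapes and names here are illustrative, not from any particular library.

```python
import numpy as np

def softmax(x, axis=-1):
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def multi_head_attention(x, n_heads, rng):
    """Toy multi-head self-attention: split the model dimension into
    n_heads subspaces, attend in each subspace in parallel, concatenate."""
    seq_len, d_model = x.shape
    d_head = d_model // n_heads
    # Random projections stand in for the learned Wq, Wk, Wv parameters.
    Wq, Wk, Wv = (rng.standard_normal((d_model, d_model)) for _ in range(3))
    q, k, v = x @ Wq, x @ Wk, x @ Wv
    heads = []
    for h in range(n_heads):
        s = slice(h * d_head, (h + 1) * d_head)
        # Every token scores every other token: (seq_len, seq_len)
        scores = q[:, s] @ k[:, s].T / np.sqrt(d_head)
        # Attention weights mix the value vectors of all tokens at once.
        heads.append(softmax(scores) @ v[:, s])
    return np.concatenate(heads, axis=-1)  # back to (seq_len, d_model)

rng = np.random.default_rng(0)
x = rng.standard_normal((4, 8))  # 4 tokens, model dimension 8
out = multi_head_attention(x, n_heads=2, rng=rng)
print(out.shape)  # (4, 8)
```

Each head sees the full sequence, so relationships between distant words are captured in a single step rather than one word at a time; using several heads lets different subspaces attend to different kinds of relationships.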

Multi-head attention is a refined form of self-attention, a mechanism used in machine learning, particularly in natural language processing (NLP) and computer vision, to capture dependencies and relationships within input sequences.
It consists of 3 major parts…