UNDERSTANDING TRANSFORMERS

Understanding Transformers: The Mathematical Foundations of Large Language Models

In recent years, two major breakthroughs have revolutionized the field of Large Language Models (LLMs): 1. 2017: The publication of Google’s seminal paper, “Attention Is All You Need” (https://arxiv.org/abs/1706.03762) by Vaswani et al., which introduced the Transformer architecture, a neural network design that fundamentally changed Natural Language Processing (NLP). 2. 2022: The launch of ChatGPT by OpenAI, a transformer-based chatbot…
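As a quick preview of the mathematics the article covers, the core operation that paper introduced is scaled dot-product attention, where Q, K, and V are the query, key, and value matrices and d_k is the key dimension:

\[
\mathrm{Attention}(Q, K, V) = \mathrm{softmax}\!\left(\frac{QK^{\top}}{\sqrt{d_k}}\right)V
\]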

Read More
TYPES OF LLMs

Understanding Different Types of LLMs: Distilled, Quantized, and More – A Training Guide

Large Language Models (LLMs) come in various optimized forms, each designed for specific use cases and efficiency or performance targets. In this guide, we’ll explore the different types of LLMs (such as distilled, quantized, sparse, and MoE models) and how each is trained. In the fast-evolving world of LLMs, different model types serve different performance and deployment goals…
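To make one of these concrete: post-training quantization stores weights as low-precision integers plus a scale factor. The NumPy sketch below is a minimal illustration of symmetric per-tensor int8 quantization; the function names and the per-tensor scaling choice are illustrative assumptions, not code from the guide.

    import numpy as np

    def quantize_int8(weights):
        # Symmetric per-tensor quantization: map floats onto [-127, 127].
        scale = np.abs(weights).max() / 127.0
        q = np.clip(np.round(weights / scale), -127, 127).astype(np.int8)
        return q, scale

    def dequantize(q, scale):
        # Recover an approximation of the original float weights.
        return q.astype(np.float32) * scale

    w = np.random.randn(4, 4).astype(np.float32)
    q, scale = quantize_int8(w)
    print("max abs error:", np.abs(w - dequantize(q, scale)).max())

The int8 copy uses a quarter of the memory of float32, at the cost of a small rounding error per weight.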

Read More
CHATGPT VS DEEPSEEK

DeepSeek vs ChatGPT: A Technical Deep Dive into Modern LLM Architectures

The large language model (LLM) landscape is rapidly evolving, and two powerful contenders, DeepSeek and ChatGPT, are emerging as core engines in generative AI applications. While both excel at generating human-like text, answering questions, and powering chatbots, they differ significantly in architecture, training objectives, inference capabilities, and deployment paradigms. Not long ago, I had my first…
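One concrete architectural difference worth knowing before diving in: DeepSeek’s recent models use a Mixture-of-Experts (MoE) design, in which a learned router activates only a few expert networks per token. The NumPy sketch below illustrates top-k gating in general terms; all names and the top-2 choice are illustrative assumptions, not DeepSeek’s implementation.

    import numpy as np

    def top_k_gate(x, gate_w, k=2):
        # Score every expert for this token, then keep only the k best.
        logits = x @ gate_w
        top = np.argsort(logits)[-k:]
        # Softmax over the selected experts gives their mixing weights.
        weights = np.exp(logits[top] - logits[top].max())
        weights /= weights.sum()
        return top, weights

    rng = np.random.default_rng(0)
    x = rng.standard_normal(16)            # one token's hidden state
    gate_w = rng.standard_normal((16, 8))  # router weights for 8 experts
    experts, weights = top_k_gate(x, gate_w)
    print(experts, weights)                # two expert ids and their weights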

Read More