  1. Transformer (deep learning) - Wikipedia

    In deep learning, the transformer is an artificial neural network architecture based on the multi-head attention mechanism, in which text is converted to numerical representations called tokens, and each …

  2. Architecture and Working of Transformers in Deep Learning

    Oct 18, 2025 · The Transformer model is built on an encoder-decoder architecture, where both the encoder and decoder are composed of a series of layers that use self-attention mechanisms and feed-forward …

  3. How Transformers Work: A Detailed Exploration of Transformer Architecture

    Feb 27, 2026 · Explore the architecture of Transformers, the models that have revolutionized data handling through self-attention mechanisms, surpassing traditional RNNs, and paving the way for …

  4. The Transformer Architecture Explained | Let's Data Science

    4 days ago · The complete guide to the Transformer architecture: self-attention, multi-head attention, positional encoding, and why this single paper changed AI forever.
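    Of the components this guide names, positional encoding is the easiest to show concretely. A minimal sketch of the sinusoidal variant (one common choice; the guide itself may cover others), where even dimensions get sine and odd dimensions get cosine at geometrically spaced wavelengths:

    ```python
    import numpy as np

    def positional_encoding(seq_len, d_model):
        # Each position gets a unique vector: even dims are sin, odd dims are cos,
        # with wavelengths spaced geometrically from 2*pi up to 10000*2*pi.
        pos = np.arange(seq_len)[:, None]              # (seq_len, 1)
        i = np.arange(d_model // 2)[None, :]           # (1, d_model/2)
        angles = pos / np.power(10000.0, 2 * i / d_model)
        pe = np.zeros((seq_len, d_model))
        pe[:, 0::2] = np.sin(angles)
        pe[:, 1::2] = np.cos(angles)
        return pe

    pe = positional_encoding(seq_len=10, d_model=8)
    print(pe.shape)  # (10, 8)
    ```

    These encodings are simply added to the token embeddings, giving the otherwise order-blind attention layers a notion of position.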

  5. 11.7. The Transformer Architecture — Dive into Deep Learning 1.0.3 ...

    Now we provide an overview of the Transformer architecture in Fig. 11.7.1. At a high level, the Transformer encoder is a stack of multiple identical layers, where each layer has two sublayers …
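    The two sublayers this overview refers to are self-attention and a position-wise feed-forward network, each wrapped in a residual connection and layer normalization (as in the original architecture). A minimal numpy sketch of one encoder layer, with single-head attention and identity projections as simplifications:

    ```python
    import numpy as np

    def layer_norm(x, eps=1e-5):
        # Normalize each token vector to zero mean and unit variance.
        mu = x.mean(axis=-1, keepdims=True)
        var = x.var(axis=-1, keepdims=True)
        return (x - mu) / np.sqrt(var + eps)

    def attention(X):
        # Single-head self-attention with identity Q/K/V projections, for brevity.
        scores = X @ X.T / np.sqrt(X.shape[-1])
        w = np.exp(scores - scores.max(axis=-1, keepdims=True))
        w /= w.sum(axis=-1, keepdims=True)
        return w @ X

    def feed_forward(X, W1, b1, W2, b2):
        # Applied to every token position independently.
        return np.maximum(0, X @ W1 + b1) @ W2 + b2

    def encoder_layer(X, W1, b1, W2, b2):
        # Sublayer 1: self-attention + residual connection + layer norm.
        X = layer_norm(X + attention(X))
        # Sublayer 2: feed-forward + residual connection + layer norm.
        return layer_norm(X + feed_forward(X, W1, b1, W2, b2))

    rng = np.random.default_rng(1)
    X = rng.normal(size=(5, 16))                       # 5 tokens, model dim 16
    W1, b1 = rng.normal(size=(16, 32)), np.zeros(32)   # hidden dim 32
    W2, b2 = rng.normal(size=(32, 16)), np.zeros(16)
    out = encoder_layer(X, W1, b1, W2, b2)
    print(out.shape)  # (5, 16)
    ```

    Because every layer maps (seq_len, d_model) to the same shape, identical layers can be stacked arbitrarily deep, which is what "a stack of multiple identical layers" refers to.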

  6. Transformer Architecture Explained: How LLMs Work

    What is Transformer Architecture? Transformer architecture is a neural network design that processes sequential data through a mechanism called self-attention, allowing the model to weigh the …
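    The "weighing" this snippet describes is scaled dot-product attention: each token's query is compared against every token's key, and the resulting weights mix the value vectors. A minimal sketch, with hypothetical projection matrices Wq, Wk, Wv standing in for learned parameters:

    ```python
    import numpy as np

    def softmax(x, axis=-1):
        # Numerically stable softmax.
        e = np.exp(x - x.max(axis=axis, keepdims=True))
        return e / e.sum(axis=axis, keepdims=True)

    def self_attention(X, Wq, Wk, Wv):
        # Project the same sequence into queries, keys, and values,
        # then weigh every position against every other position.
        Q, K, V = X @ Wq, X @ Wk, X @ Wv
        scores = Q @ K.T / np.sqrt(K.shape[-1])   # (seq, seq) similarities
        weights = softmax(scores, axis=-1)        # each row sums to 1
        return weights @ V                        # weighted mix of values

    rng = np.random.default_rng(0)
    X = rng.normal(size=(4, 8))                   # 4 tokens, model dim 8
    Wq, Wk, Wv = (rng.normal(size=(8, 8)) for _ in range(3))
    out = self_attention(X, Wq, Wk, Wv)
    print(out.shape)  # (4, 8)
    ```

    Multi-head attention runs several such maps in parallel on smaller projections and concatenates the results, letting different heads attend to different relationships.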

  7. Transformers AI Architecture Explained | Complete Guide 2026

    Comprehensive guide to Transformer architecture: attention mechanism, encoder-decoder, multi-head attention, real-world applications from GPT to BERT. With an interactive experience.

  8. Transformer models: an introduction and catalog - arXiv.org

    We described the Transformer architecture as being made up of an Encoder and a Decoder, and that is true for the original Transformer. However, since then, different advances have been made that have …

  9. Understanding Transformer Architecture and Attention Mechanisms

    Explore the core components of transformer architecture including positional encoding, masking, layer normalization, and Flash Attention for AI engineering.

  10. Transformer Architectures - Hugging Face LLM Course

    In this section, we’re going to dive deeper into the three main architectural variants of Transformer models and understand when to use each one. Remember that most Transformer models use one of …