Google Unveils Eighth-Generation TPUs With Two New Chips Built For The Agentic AI Era

Google is redesigning its AI infrastructure strategy as enterprises move beyond chatbots and copilots toward autonomous AI agents capable of reasoning, planning, and executing tasks independently. To support that shift, the company has launched its eighth-generation Tensor Processing Units (TPUs), introducing two separate chips built for distinct workloads, one for training and another for inference.

The announcement came during Google Cloud Next 2026, where the company positioned the launch as a foundational step toward supporting the growing demands of agentic AI systems that require persistent reasoning, memory handling, tool usage, and real-time decision making.

TL;DR

Google launched TPU 8t for large-scale AI model training.
Google launched TPU 8i for inference and real-time AI agent execution.
TPU 8t scales up to 9,600 chips and 121 exaflops of computing power.
Google says AI agents require specialized infrastructure instead of one universal chip.
The launch increases competition with Nvidia in enterprise AI infrastructure.
Google also introduced new agent deployment tools for enterprises.

Why Google Split Its TPU Strategy

For years, Google designed its TPU lineup to handle both training and inference workloads. However, the company now believes the rise of AI agents has fundamentally changed infrastructure requirements.

AI agents differ significantly from traditional generative AI tools because they constantly process live data, retrieve memory, interact with applications, and execute multi-step tasks in real time. These workloads place very different demands on hardware than training large language models does.

According to Google, using one chip architecture for both functions is no longer efficient.

TPU 8t For Training Frontier Models

TPU 8t is designed for training large foundation models and handling compute-intensive workloads.

Google said TPU 8t deployments can scale up to 9,600 liquid-cooled chips, delivering 121 exaflops of performance. The infrastructure also includes nearly 2 petabytes of high-bandwidth memory, allowing organizations to train increasingly large AI models faster and more efficiently.
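For a rough sense of scale, the system totals above can be divided across the chip count. The short Python sketch below does that arithmetic. It is a back-of-envelope estimate, not a Google specification: it assumes decimal SI units, treats "nearly 2 petabytes" as an upper bound of exactly 2 PB, and assumes compute and memory are split evenly across all 9,600 chips. The function names are illustrative only.

```python
# Back-of-envelope per-chip estimates from the quoted system totals.
# Assumptions (not from Google): decimal SI units, an even split across
# chips, and "nearly 2 PB" rounded up to exactly 2 PB (an upper bound).

CHIPS = 9_600          # maximum chip count quoted for TPU 8t
TOTAL_EXAFLOPS = 121   # quoted aggregate compute
TOTAL_HBM_PB = 2.0     # upper bound on the quoted aggregate memory


def per_chip_petaflops(total_exaflops: float, chips: int) -> float:
    """Convert an exaflops total to petaflops per chip (1 EF = 1,000 PF)."""
    return total_exaflops * 1_000 / chips


def per_chip_hbm_gb(total_pb: float, chips: int) -> float:
    """Convert a petabyte total to gigabytes per chip (1 PB = 1,000,000 GB)."""
    return total_pb * 1_000_000 / chips


if __name__ == "__main__":
    pf = per_chip_petaflops(TOTAL_EXAFLOPS, CHIPS)
    gb = per_chip_hbm_gb(TOTAL_HBM_PB, CHIPS)
    print(f"Compute per chip: about {pf:.1f} petaflops")
    print(f"Memory per chip: at most about {gb:.0f} GB")
```

Under those assumptions, the totals work out to roughly 12.6 petaflops and at most about 208 GB of high-bandwidth memory per chip. The announcement as reported here does not state the numeric precision behind the exaflops figure, so the per-chip compute number is not directly comparable to other chips without that detail.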

This makes TPU 8t ideal for frontier AI developers building next-generation models similar to Google’s Gemini family.

TPU 8i For Inference And AI Agents

Meanwhile, TPU 8i is built for inference workloads, where AI models respond to live user requests.

This chip is optimized for low latency, cost efficiency, and large-scale deployment of AI agents. Google said enterprises deploying customer service agents, coding assistants, enterprise automation systems, and digital workers would benefit from the architecture.

TPU 8i also directly challenges Nvidia’s dominance in inference as enterprises search for cheaper alternatives to expensive GPU infrastructure.

Bigger Push Into Agentic AI

Alongside the TPU announcement, Google introduced new enterprise tools that help businesses deploy autonomous AI agents. These tools allow organizations to build agents capable of handling workflows with minimal human supervision.

The launch reinforces Google’s broader strategy of controlling the full AI stack, from chips and cloud infrastructure to foundation models and enterprise software.

As companies race toward agentic AI adoption, infrastructure providers are now competing just as aggressively as model developers. Google’s latest TPU announcement suggests the AI arms race is no longer just about building smarter models; it is increasingly about building the hardware required to run them efficiently at scale.