Cerebras Expands Data Center Network for High-Performance AI Inference | Turtles AI
Cerebras Systems announces six new AI inference data centers across North America and Europe, strengthening its market position with high-performance infrastructure based on CS-3 systems. The expansion addresses growing demand for real-time inference services and sharpens the company's competitiveness in the industry.
Key Points:
- New data centers: Six new sites operational in 2025, with two wholly owned facilities and four in partnership with G42.
- Increased capacity: New centers will handle more than 40 million tokens per second.
- Strategic partnerships: Agreements with leading companies such as Mistral, Perplexity and Hugging Face.
- Advanced technology: Use of CS-3 systems with wafer-scale architecture for superior performance compared to traditional GPUs.
Cerebras Systems, a pioneer in accelerating generative AI, announced the upcoming opening of six new AI inference data centers, adding to its existing network to deliver unprecedented processing capacity. The expansion, which quadruples the company's current operating capacity, includes sites in Oklahoma City and Montreal that Cerebras will operate wholly, while the other four facilities will be run in partnership with G42, strengthening the company's technological leadership. The infrastructure will be based on thousands of Cerebras CS-3 systems, platforms designed for efficient, high-performance inference of advanced models.
With an aggregate capacity of more than 40 million Llama 70B tokens per second, Cerebras positions itself as the world's leading provider of high-speed AI inference. The investment in infrastructure expansion responds to growing demand for large language model processing, with performance superior to GPU-based systems. This capacity will be crucial to support next-generation applications, including AI search engines, conversational assistants, and advanced analytics tools.
Companies already using Cerebras systems include Mistral, a leading French AI startup, Perplexity, an innovative AI search engine, and AlphaSense, a market intelligence platform. Hugging Face, a leader in the open-source AI community, recently integrated Cerebras inference into its offering, allowing developers to access the technology through a unified interface.
One of the most advanced centers will be the Scale data center in Oklahoma City, operational by June 2025. Equipped with over 300 CS-3 systems, it will be built to Tier 3+ infrastructure standards and integrate advanced cooling solutions to support large-scale AI processing. The Montreal facility, operated by Enovum, will come online by July 2025 and will bring wafer-scale AI inference to Canada for the first time, with performance up to ten times faster than latest-generation GPUs.
Cerebras infrastructure is designed to meet the inference needs of increasingly complex models, such as DeepSeek R1 and OpenAI o3, which require significant computational power for sequential reasoning. The acceleration provided by CS-3 systems dramatically reduces processing times, enabling near-instantaneous responses for even the most advanced models. This is achieved through a combination of wafer-scale architecture and speculative decoding techniques, which improve token processing efficiency.
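The speculative-decoding idea mentioned above can be illustrated with a toy sketch: a cheap draft model proposes several tokens ahead, and the large target model verifies them, accepting the matching prefix and correcting the first mismatch. The two dictionary-backed "models" below are purely hypothetical stand-ins; a real system verifies draft tokens against the target model's probabilities in a single batched forward pass.

```python
# Toy sketch of speculative decoding. DRAFT and TARGET are hypothetical
# greedy "models" that map the last token to the next one; real systems
# use actual LLMs and verify all k draft tokens in one parallel pass.

DRAFT = {"the": "cat", "cat": "sat", "sat": "on", "on": "a"}
TARGET = {"the": "cat", "cat": "sat", "sat": "on", "on": "the"}

def draft_model(context):
    """Small, cheap model: predicts the next token from the last one."""
    return DRAFT.get(context[-1], "<eos>")

def target_model(context):
    """Large, authoritative model: the prediction that must be matched."""
    return TARGET.get(context[-1], "<eos>")

def speculative_step(context, k=4):
    # 1) The draft model proposes k tokens autoregressively (cheap).
    proposal, ctx = [], list(context)
    for _ in range(k):
        tok = draft_model(ctx)
        proposal.append(tok)
        ctx.append(tok)
    # 2) The target model verifies the proposals; accepted tokens cost
    #    only one target-model step in aggregate, which is the speedup.
    accepted, ctx = [], list(context)
    for tok in proposal:
        expected = target_model(ctx)
        if tok != expected:
            accepted.append(expected)  # correct the first mismatch, stop
            break
        accepted.append(tok)
        ctx.append(tok)
    return list(context) + accepted

print(speculative_step(["the"]))  # ['the', 'cat', 'sat', 'on', 'the']
```

In this run the draft's fourth guess is wrong, so three tokens are accepted and the target's correction is appended: four tokens emitted for roughly one large-model verification step.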
With memory bandwidth of up to 21 petabytes per second, CS-3 processors overcome the limitations of traditional GPUs, streamlining data flow and improving performance in AI workloads. However, per-node memory remains a constraint for models larger than 70 billion parameters, which require multi-node configurations to run at even larger scale.
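A back-of-the-envelope calculation shows why per-node memory becomes the constraint at this scale. Assuming fp16/bf16 weights at 2 bytes per parameter (and ignoring KV cache and activations, which add more on top), a 70B-parameter model already needs on the order of 140 GB just for its weights:

```python
# Rough weight-memory estimate (assumption: fp16/bf16 storage,
# 2 bytes per parameter; KV cache and activations are excluded).

def weight_memory_gb(num_params, bytes_per_param=2):
    """Return weight memory in decimal gigabytes (1 GB = 1e9 bytes)."""
    return num_params * bytes_per_param / 1e9

print(weight_memory_gb(70e9))   # 140.0 -> ~140 GB for a 70B model
```

Anything much larger than this must be sharded across multiple nodes, which is exactly the multi-node requirement the article describes.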
Accelerating AI inference represents a strategic advantage in the industry, allowing companies to implement advanced solutions with greater efficiency and reduced response times.
With the new datacenters, Cerebras is positioning itself as a key player in the evolution of AI, offering a high-performance infrastructure to support the development of next-generation models and increasingly sophisticated applications.