How Does the NVIDIA H100 GPU Perform in MLPerf Benchmarks? Generative AI Performance Measured in 2026

2026/4/17
AI Summary (BLUF)

NVIDIA H100 Tensor Core GPUs set new records across all eight MLPerf training benchmarks, delivering exceptional performance for generative AI and large language models at both per-accelerator and massive scale configurations.

Leading users and industry-standard benchmarks agree: NVIDIA H100 Tensor Core GPUs deliver the best AI performance, especially on the large language models that power generative AI.

In the latest MLPerf training benchmarks released today, H100 GPUs set new records across all eight tests, excelling particularly in the newly added generative AI benchmark. This excellence is demonstrated both per-accelerator and at-scale within massive server clusters.

For instance, on a commercially available cluster of 3,584 H100 GPUs, co-developed by startup Inflection AI and operated by CoreWeave—a cloud service provider specializing in GPU-accelerated workloads—the system completed the massive GPT-3-based training benchmark in under eleven minutes.

"Thanks to our thousands of H100 GPUs deployed on fast, low-latency InfiniBand networks, our customers are building state-of-the-art generative AI and large language models at scale today," said Brian Venturo, co-founder and CTO of CoreWeave. "Our joint MLPerf submission with NVIDIA clearly demonstrates the outstanding performance our customers are experiencing."

Top Performance Available Today

Inflection AI leveraged this performance to build the advanced large language model behind its first personal AI, "Pi," which stands for personal intelligence. The company will operate as an AI studio, creating personal AIs that users can interact with in simple, natural ways.

"Anyone can experience the power of a personal AI today, based on our state-of-the-art large language model trained on CoreWeave's powerful network of H100 GPUs," said Mustafa Suleyman, CEO of Inflection AI.

Co-founded in early 2022 by Mustafa Suleyman, Karén Simonyan of DeepMind, and Reid Hoffman, Inflection AI aims to collaborate with CoreWeave to build one of the largest computing clusters in the world using NVIDIA GPUs.

Performance at a Glance

These user experiences reflect the performance demonstrated in the MLPerf benchmarks announced today.

H100 GPUs delivered the highest performance on every benchmark, including large language models, recommendation systems, computer vision, medical imaging, and speech recognition. They were the only chips to run all eight tests, demonstrating the versatility of the NVIDIA AI platform.

Excelling at Scale

Training is typically a large-scale task run by many GPUs working in tandem. On every MLPerf test, H100 GPUs set new at-scale performance records for AI training.

Optimizations across the full technology stack enabled near-linear performance scaling on the demanding LLM benchmark as submissions scaled from hundreds to thousands of H100 GPUs.
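To make "near-linear scaling" concrete, the sketch below computes speedup and parallel efficiency from a baseline run. All GPU counts and timings here are hypothetical placeholders, not measured MLPerf results.

```python
# Hypothetical illustration of parallel scaling efficiency.
# None of these figures are measured MLPerf results.

def scaling_efficiency(base_gpus, base_minutes, scaled_gpus, scaled_minutes):
    """Return (speedup, efficiency) for scaling a fixed training job.

    Efficiency of 1.0 means perfectly linear scaling: doubling the
    GPU count halves the time to train.
    """
    speedup = base_minutes / scaled_minutes
    ideal_speedup = scaled_gpus / base_gpus
    return speedup, speedup / ideal_speedup

# Placeholder timings: 512 GPUs in 64 minutes vs. 3,584 GPUs in 10.9 minutes.
speedup, eff = scaling_efficiency(512, 64.0, 3584, 10.9)
print(f"speedup: {speedup:.2f}x, efficiency: {eff:.1%}")
```

At 100% efficiency, a cluster 7x larger would finish 7x faster; real clusters fall somewhat short of that because of communication overhead, which is exactly where low-latency interconnects such as InfiniBand matter.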

Furthermore, CoreWeave delivered performance from the cloud that was similar to what NVIDIA achieved from an AI supercomputer running in a local data center. This is a testament to the low-latency characteristics of the NVIDIA Quantum-2 InfiniBand networking used by CoreWeave.

In this round, MLPerf also updated its benchmark for recommendation systems. The new test uses a larger dataset and a more modern AI model to better reflect the challenges faced by cloud service providers. NVIDIA was the only company to submit results on this enhanced benchmark.

A Growing NVIDIA AI Ecosystem

In this round, nearly a dozen companies submitted results using the NVIDIA platform. Their work demonstrates that NVIDIA AI is backed by the industry's broadest ecosystem in machine learning.

Submissions came from major system manufacturers including ASUS, Dell Technologies, GIGABYTE, Lenovo, and QCT. More than 30 submissions were run on H100 GPUs.

This level of participation assures users that they can achieve great performance with NVIDIA AI, both in the cloud and on servers running in their own data centers.

Performance Across All Workloads

NVIDIA ecosystem partners participate in MLPerf because they recognize it as a valuable tool for customers evaluating AI platforms and vendors.

The benchmarks cover workloads that users care about—including computer vision, translation, and reinforcement learning, in addition to generative AI and recommendation systems.

Users can rely on MLPerf results to make informed purchasing decisions because the tests are transparent and objective. The benchmarks enjoy support from a broad consortium that includes Arm, Baidu, Facebook AI, Google, Harvard University, Intel, Microsoft, Stanford University, and the University of Toronto.

MLPerf results are now available for the H100, L4, and NVIDIA Jetson platforms across AI training, inference, and HPC benchmarks. We will also be making submissions based on NVIDIA Grace Hopper systems in future MLPerf rounds.

The Importance of Energy Efficiency

As the performance demands of AI grow, it is essential to improve the efficiency with which that performance is achieved. This is precisely the role of accelerated computing.

Data centers accelerated with NVIDIA GPUs use fewer server nodes, thereby consuming less rack space and energy. Additionally, accelerated networking enhances both efficiency and performance, while ongoing software optimizations deliver extra gains on the same hardware.

Energy-efficient performance is beneficial for both the planet and business. Increased performance can accelerate time-to-market and enable organizations to build more advanced applications.

Energy efficiency also reduces costs because data centers accelerated with NVIDIA GPUs require fewer server nodes. Indeed, NVIDIA powers 22 of the top 30 supercomputers on the latest Green500 list.
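As a back-of-the-envelope sketch of why this matters for cost, the comparison below uses performance per watt, the metric behind the Green500 ranking. Both systems and all figures are hypothetical, invented purely for illustration.

```python
# Hypothetical comparison of energy efficiency (performance per watt).
# Both systems and all numbers below are invented for illustration.

def gflops_per_watt(gflops, watts):
    """Energy efficiency expressed as GFLOPS delivered per watt consumed."""
    return gflops / watts

accelerated = gflops_per_watt(gflops=65_000_000, watts=1_000_000)
cpu_only = gflops_per_watt(gflops=10_000_000, watts=2_000_000)
print(f"accelerated: {accelerated:.0f} GFLOPS/W, cpu-only: {cpu_only:.0f} GFLOPS/W")
```

For a fixed performance target, a higher GFLOPS-per-watt figure translates directly into fewer server nodes, less rack space, and a smaller energy bill.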

Software Available to Everyone

NVIDIA AI Enterprise, the software layer of the NVIDIA AI platform, enables optimized performance on leading accelerated computing infrastructure. This software comes with the enterprise-grade support, security, and reliability required to run AI in corporate data centers.

All the software used for these tests is available from the MLPerf repository, meaning virtually anyone can achieve these world-class results.

Optimizations are continuously integrated into containers available on NGC, NVIDIA's catalog for GPU-accelerated software.
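As a sketch of how to get started (the repository URL is the public MLCommons training repo; the container tag is an assumption, so check the NGC catalog for current releases):

```shell
# Clone the MLPerf training reference implementations (public MLCommons repo).
git clone https://github.com/mlcommons/training.git

# Pull an NVIDIA-optimized framework container from NGC.
# The tag 24.01-py3 is illustrative; browse the NGC catalog for current tags.
docker pull nvcr.io/nvidia/pytorch:24.01-py3
```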

Read this technical blog for a deeper dive into the optimizations that fuel NVIDIA's performance and efficiency in MLPerf.

Frequently Asked Questions (FAQ)

What specific results did the NVIDIA H100 GPU achieve in the MLPerf benchmarks?

NVIDIA H100 Tensor Core GPUs set new records across all eight MLPerf training benchmarks, including generative AI, recommendation systems, and computer vision, demonstrating top performance across every workload.

How do H100 GPUs perform on large-scale clusters?

On a commercially available cluster of 3,584 H100 GPUs, the system completed the GPT-3-based training benchmark in under 11 minutes with near-linear performance scaling, demonstrating equally exceptional performance in large-scale configurations.

What do these performance breakthroughs mean for real-world applications?

Companies such as Inflection AI have already used H100 GPU clusters to train advanced large language models, such as the personal AI "Pi," showing how the top performance available today is driving practical innovation in generative AI.

