Tech Field Day

The Independent IT Influencer Event

MLCommons Presents at AI Field Day 6



AI Field Day 6

Curtis Anderson and David Kanter presented for MLCommons at AI Field Day 6.

Presentation date: January 29, 2025, 8:00-9:00.

Presenters: Curtis Anderson, David Kanter


MLCommons and MLPerf – An Introduction



MLCommons is a non-profit industry consortium dedicated to improving AI for everyone by focusing on accuracy, safety, speed, and power efficiency. The organization boasts over 125 members across six continents and leverages community participation to achieve its goals. A key project is MLPerf, an open industry standard benchmark suite for measuring the performance and efficiency of AI systems, providing a common framework for comparison and progress tracking. This transparency fosters collaboration among researchers, vendors, and customers, driving innovation and preventing inflated claims.

The presentation highlights the crucial relationship between big data, big models, and big compute in achieving AI breakthroughs. A key chart illustrates how AI model performance significantly improves with increased data, but eventually plateaus. This necessitates larger models and more powerful computing resources, leading to an insatiable demand for compute power. MLPerf benchmarks help navigate this landscape by providing a standardized method of measuring performance across various factors including hardware, algorithms, software optimization, and scale, ensuring that improvements are verifiable and reproducible.

MLPerf offers a range of benchmarks covering diverse AI applications, including training, inference (data center, edge, mobile, tiny, and automotive), storage, and client systems. The benchmarks are designed to be representative of real-world use cases and are regularly updated to reflect technological advancements and evolving industry practices. While acknowledging the limitations of any benchmark, the presenter emphasizes MLPerf’s commitment to transparency and accountability through open-source results, peer review, and audits, ensuring that reported results are not merely flukes but can be validated and replicated. This promotes a collaborative, data-driven path to more efficient and impactful AI solutions.

Personnel: David Kanter

MLCommons MLPerf Client Overview



MLCommons presented MLPerf Client, a new benchmark designed to measure the performance of PC-class systems, including laptops and desktops, on large language model (LLM) tasks. Released in December 2024, it’s an installable, open-source application (available on GitHub) that allows users to easily test their systems and provides early access for feedback and improvement. The initial release focuses on a single large language model, Llama 2 7B (the 7-billion-parameter model), using the OpenOrca dataset, and includes four tests simulating different LLM usage scenarios such as content generation and summarization. The benchmark prioritizes response latency as its primary metric, mirroring real-world user experience.
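As a rough illustration of what a latency-first measurement could look like, the sketch below times each query from submission to completed response, one query at a time. It is not the MLPerf Client harness: `measure_latencies`, `summarize`, and the stand-in `fake_generate` model are hypothetical names, and the summary statistics chosen here are an assumption rather than the benchmark's reported metrics.

```python
import time
from statistics import mean
from typing import Callable, Dict, Iterable, List


def measure_latencies(
    generate: Callable[[str], str],
    prompts: Iterable[str],
    clock: Callable[[], float] = time.perf_counter,
) -> List[float]:
    """Run prompts one at a time and return per-query latency in seconds."""
    latencies = []
    for prompt in prompts:
        start = clock()
        generate(prompt)  # blocks until the full response has been produced
        latencies.append(clock() - start)
    return latencies


def summarize(latencies: List[float]) -> Dict[str, float]:
    """Summarize measured latencies (the choice of statistics is an assumption)."""
    return {
        "count": len(latencies),
        "mean_s": mean(latencies),
        "max_s": max(latencies),
    }


if __name__ == "__main__":
    # Hypothetical stand-in for an LLM call; a real run would invoke a model.
    def fake_generate(prompt: str) -> str:
        time.sleep(0.01)
        return prompt.upper()

    prompts = ["Summarize this paragraph.", "Draft a short product description."]
    print(summarize(measure_latencies(fake_generate, prompts)))
```

Passing the clock in as a parameter keeps the timing logic testable without real delays.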

A key aspect of MLPerf Client is its emphasis on accuracy. While prioritizing performance, it incorporates the MMLU (Massive Multitask Language Understanding) benchmark to ensure the measured performance is achieved with acceptable accuracy. This prevents optimizations that might drastically improve speed but severely compromise the quality of the LLM’s output. The presenters emphasized that this is not intended to evaluate production-ready LLMs, but rather to provide a standardized and impartial way to compare the performance of different hardware and software configurations on common LLM tasks.

The benchmark uses a single-stream approach, feeding queries one at a time, and supports multiple GPU acceleration paths via ONNX Runtime and Intel OpenVINO. The presenters highlighted that hardware vendors are free to optimize the model (Llama 2 7B) for their specific devices, even down to 4-bit integer quantization, as long as accuracy stays above the MMLU threshold. Future plans include expanding hardware support, adding more tests and models, and implementing a graphical user interface (GUI) to improve usability.
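To make the idea of 4-bit integer quantization concrete, here is a minimal sketch of one common scheme: symmetric, per-tensor rounding to signed 4-bit values. The presentation summary does not say which quantization scheme any vendor uses, so the scheme and the names `quantize_int4` and `dequantize` are assumptions for illustration only.

```python
from typing import List, Tuple


def quantize_int4(weights: List[float]) -> Tuple[List[int], float]:
    """Symmetric per-tensor quantization to signed 4-bit integers in [-8, 7]."""
    max_abs = max((abs(w) for w in weights), default=0.0)
    # Map the largest magnitude to 7; fall back to 1.0 for an all-zero tensor.
    scale = max_abs / 7 if max_abs > 0 else 1.0
    quantized = [max(-8, min(7, round(w / scale))) for w in weights]
    return quantized, scale


def dequantize(quantized: List[int], scale: float) -> List[float]:
    """Recover approximate float weights from 4-bit codes and the scale."""
    return [q * scale for q in quantized]


if __name__ == "__main__":
    weights = [0.7, -0.3, 0.0, 0.12]
    codes, scale = quantize_int4(weights)
    restored = dequantize(codes, scale)
    # Each weight is reproduced to within half a quantization step.
    for w, r in zip(weights, restored):
        print(f"{w:+.3f} -> {r:+.3f}")
```

The rounding error introduced here is the kind of loss an accuracy gate is meant to bound: a quantized model only counts if its output quality stays acceptable.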

Personnel: David Kanter

MLCommons MLPerf Storage



MLCommons’ MLPerf Storage benchmark addresses the rapidly growing need for high-performance storage in AI training. Driven by the exponential increase in data volume and the even faster growth in data access demands, the benchmark aims to provide a standardized way to compare storage systems’ capabilities for AI workloads. This benefits purchasers seeking informed decisions, researchers developing better storage technologies, and vendors optimizing their products for AI’s unique data access patterns, which are characterized by random reads and massive data volume exceeding the capacity of most on-node storage solutions.

The benchmark currently supports three training workloads (3D U-Net, ResNet-50, and CosmoFlow) using PyTorch and TensorFlow, each imposing distinct demands on storage systems. Future versions will incorporate additional workloads, including a retrieval-augmented generation (RAG) pipeline with a vector database, reflecting the evolving needs of large language model training and inference. A key aspect is the focus on maintaining high accelerator utilization (aiming for 95%): if storage cannot deliver data fast enough, expensive GPUs sit idle. The benchmark offers both “closed” (apples-to-apples comparisons) and “open” (allowing for vendor-specific optimizations) categories to foster innovation.
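The accelerator utilization target can be illustrated with a small calculation. The sketch below assumes utilization is the fraction of wall-clock time the accelerator spends computing rather than waiting on storage; the summary does not give the benchmark's exact formula, so this definition and the names `accelerator_utilization` and `meets_target` are assumptions. The 95% default comes from the figure quoted above.

```python
def accelerator_utilization(compute_seconds: float, io_stall_seconds: float) -> float:
    """Fraction of total time spent computing (assumed definition)."""
    if compute_seconds < 0 or io_stall_seconds < 0:
        raise ValueError("durations must be non-negative")
    total = compute_seconds + io_stall_seconds
    if total == 0:
        raise ValueError("total time must be positive")
    return compute_seconds / total


def meets_target(
    compute_seconds: float, io_stall_seconds: float, target: float = 0.95
) -> bool:
    """Check whether storage kept the accelerator busy enough."""
    return accelerator_utilization(compute_seconds, io_stall_seconds) >= target


if __name__ == "__main__":
    # 570 s of compute with 30 s spent waiting on data.
    print(accelerator_utilization(570.0, 30.0))
    # The same compute time with 60 s of stalls falls short of the target.
    print(meets_target(570.0, 60.0))
```

Under this definition, a storage system that adds even a few percent of stall time to each training step can push a run below the target.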

MLPerf Storage has seen significant adoption since its initial release, with a substantial increase in the number of submissions and participating organizations. This reflects the growing importance of AI in the market and the need for a standardized benchmark for evaluating storage solutions designed for these unique demands. The benchmark’s community-driven nature and transparency are enabling more informed purchasing decisions, moving beyond arbitrary vendor claims and providing a more objective way to assess the performance of storage systems in the critical context of modern AI applications.

Personnel: Curtis Anderson, David Kanter