Guy Currier of The Futurum Group presented on “The Next Wave of AI,” focusing on the increasing importance of inference outside hyperscale data centers. He highlighted a maturing AI market characterized by a significant shift in investment from AI training towards AI inference. This trend, supported by Futurum Group research, indicates faster growth in inference, particularly between 2025 and 2027, with fine-tuning and smaller model training also categorized under this evolving landscape. This development aligns with the Open Compute Project’s new AI Compute Continuum initiative, which acknowledges the critical role of AI extending beyond traditional enterprise data centers to the furthest reaches of the edge.
Currier employed a historical analogy of the Roman Empire’s expansion through “colonias” or outposts to illustrate the progression of AI deployment. He described the initial phase of AI development as being concentrated in elite, core, cloud-based environments, similar to these powerful yet contained Roman settlements. The current and future wave involves extending AI capabilities, much like building Roman roads, by distributing AI services and applications to the edge. This expansion is driven by the practical need to place AI where it offers optimal performance, considering factors such as data proximity and specific physical environment requirements, ultimately providing users and architects with enhanced control and flexibility across a continuum from core to edge.
The presentation underscored the emerging concept of “hyper-hybridity,” suggesting that every edge deployment is fundamentally a hybrid one, spanning cloud, on-premises, and diverse edge locations. While the core AI market remains larger, the edge market is projected to grow substantially, approaching parity with it in size by 2030. Enduring questions for this expansion include the need for robust standards, particularly given the traditional hyperscale focus of organizations like OCP. Currier also pointed to the immense diversity of processing units beyond GPUs, the indispensable role of resilient networking in overcoming edge discontinuities, and the intricate interplay between hardware and software in driving efficient, distributed AI solutions, including the rapid development of custom chips.
Personnel: Guy Currier
Ryan Booth presented his exploratory work on agent-to-agent (A2A) and agent-to-payment (A2P) protocols, focusing on their potential to revolutionize how knowledge is shared and communication occurs between AI agents. A2A provides a structured method for agents to interact, even across different infrastructures or without inherent trust, by managing communication structures more precisely than human-centric chatbots. A2P extends this by integrating transactional and financial details, enabling payment systems between agents, though its infrastructure is still developing. As a network engineer turned software developer and AI consultant, Booth specializes in “pathfinding” to understand the true capabilities of emerging technologies, encouraging open-minded thinking about future applications.
His primary project demonstrating these concepts is an agent-first personal portfolio, designed to distribute his professional information more effectively than traditional platforms. This interactive portfolio uses an embedded chatbot, powered by a Retrieval Augmented Generation (RAG) system hosted on Cloudflare, to respond to specific user queries about his skills and projects. The goal is to allow an individual’s chat agent to find and retrieve relevant information, leading to more efficient engagement. Any interaction can be handed off to him with its full context, streamlining initial discovery calls and accelerating the connection process.
Booth is further developing this idea into a marketplace platform called Tokuru, which operates similarly to federated social networks like Mastodon. This platform allows individuals to deploy and manage their own agent-friendly portfolios, presenting detailed skill sets that go beyond what traditional sites can offer. The vision is to enable nuanced, natural language queries to match talent with needs, moving beyond simple keyword searches. While acknowledging challenges in user experience and agent discovery, Booth aims for this open source-aligned initiative to cut through hiring inefficiencies and better connect skilled individuals with opportunities.
Personnel: Ryan Booth
David Kanter detailed the ongoing evolution of MLPerf benchmarks, which have been an industry standard for seven years. He highlighted the need for fundamental changes, particularly in the visualization of results, moving from an outdated, spreadsheet-like format to a more modern and understandable interface. MLPerf, backed by MLCommons, is widely used by over 100 members for internal testing, showcasing capabilities, and informing purchasing decisions. Its success stems from core principles of relevance, fairness, neutrality, reproducibility, and inclusiveness, all working together to foster trust and drive industry advancement.
The landscape of AI performance has radically shifted with the explosion of generative AI, marked by immense user adoption and an unprecedented velocity of change, with new models appearing almost fortnightly. To keep pace and better serve buyers, MLPerf is transitioning to an API-centric benchmarking approach. This involves moving away from a complex, locally installed load generator to a decoupled, Python-based test infrastructure that interacts with the system under test via a standard API, similar to the OpenAI API. This new architecture simplifies setup, accelerates the integration of new datasets and benchmarks, and supports measurement at each of several concurrency levels, capturing critical metrics like time to first token, throughput, and full response latency directly rather than by interpolation.
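The measurement idea described above can be illustrated with a small, self-contained sketch. It is not MLPerf's actual harness: `fake_streaming_endpoint` is a hypothetical stand-in for a system under test that streams tokens, and the names `RequestMetrics`, `timed_request`, `run_level`, and `sweep` are invented for illustration. A real harness would presumably issue streaming HTTP requests to an OpenAI-style endpoint instead of calling a local coroutine.

```python
import asyncio
import time
from dataclasses import dataclass


@dataclass
class RequestMetrics:
    ttft_s: float     # time to first token, in seconds
    latency_s: float  # full response latency, in seconds
    tokens: int       # number of tokens received


async def fake_streaming_endpoint(prompt: str, n_tokens: int = 8):
    """Hypothetical system under test: streams n_tokens with simulated delays."""
    await asyncio.sleep(0.005)      # simulated prefill delay
    for i in range(n_tokens):
        await asyncio.sleep(0.001)  # simulated per-token decode delay
        yield f"tok{i}"


async def timed_request(prompt: str) -> RequestMetrics:
    """Send one request and time the first token and the full response."""
    start = time.perf_counter()
    first = None
    count = 0
    async for _ in fake_streaming_endpoint(prompt):
        if first is None:
            first = time.perf_counter()
        count += 1
    end = time.perf_counter()
    if first is None:               # no tokens were streamed at all
        first = end
    return RequestMetrics(first - start, end - start, count)


async def run_level(concurrency: int, total_requests: int) -> dict:
    """Run total_requests with at most `concurrency` requests in flight."""
    sem = asyncio.Semaphore(concurrency)

    async def guarded(i: int) -> RequestMetrics:
        async with sem:
            return await timed_request(f"prompt {i}")

    wall_start = time.perf_counter()
    results = await asyncio.gather(*(guarded(i) for i in range(total_requests)))
    wall = time.perf_counter() - wall_start
    n = len(results)
    return {
        "concurrency": concurrency,
        "requests": n,
        "mean_ttft_s": sum(r.ttft_s for r in results) / n,
        "mean_latency_s": sum(r.latency_s for r in results) / n,
        "tokens_per_s": sum(r.tokens for r in results) / wall,
    }


def sweep(levels=(1, 4, 8), total_requests=8) -> list:
    """Measure each concurrency level directly instead of interpolating."""
    return [asyncio.run(run_level(c, total_requests)) for c in levels]


if __name__ == "__main__":
    for row in sweep():
        print(row)
```

Because the load generator only needs a streaming interface, swapping the fake endpoint for a real client would leave the timing and sweep logic unchanged, which is the decoupling the API-centric approach is meant to provide.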
This strategic shift aims to significantly increase the velocity of benchmark submissions, allowing for more frequent updates than the current six-month cycle, while rigorously maintaining peer review and auditability to preserve trust. Kanter acknowledged the complex and multidimensional challenge of assessing quality in generative AI and agentic applications, a problem MLPerf is actively addressing in its long-term roadmap. He concluded by inviting feedback from the community, especially from enterprise buyers and analysts, to ensure the benchmarks remain relevant, understandable, and valuable for the widespread deployment of generative AI.
Personnel: David Kanter