Mirantis IaaS Technology Stack with Shaun O’Meara

Event: AI Infrastructure Field Day 3

Appearance: Mirantis presents at AI Infrastructure Field Day 3

Company: Mirantis

Video Links:

Personnel: Anjelica Ambrosio, Shaun O’Meara

Shaun O’Meara, CTO at Mirantis, described the infrastructure layer that underpins Mirantis k0rdent AI. The IaaS stack is designed to manage bare metal, networking, and storage resources in a way that removes friction from GPU operations. It provides operators with a tested foundation where GPU servers can be rapidly added, tracked, and made available for higher level orchestration.

O’Meara emphasized that Mirantis has long experience operating infrastructure at scale. This history informed a design that automates many of the tasks that traditionally consume engineering time. The stack handles bare metal provisioning, integrates with heterogeneous server and network vendors, and applies governance for tenancy and workload isolation. It includes validated drivers for GPU hardware, which reduces the risk of incompatibility and lowers the time to get workloads running.

Anjelica Ambrosio demonstrated how the stack works in practice. She created a new GPU cluster through the Mirantis k0rdent AI interface, with the system automatically discovering hardware, configuring network overlays, and assigning compute resources. The demo illustrated how administrators can track GPU usage down to the device level, observing both allocation and health data in real time. What would normally involve manual integration of provisioning tools, firmware updates, and network templates was shown as a guided workflow completed in minutes.
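
The demo was UI-driven, but k0rdent is built around declarative, Kubernetes-style management, so the same cluster creation can be pictured as applying a custom resource to the management cluster. The sketch below uses the official Kubernetes Python client; the API group, kind, template name, and spec fields are assumptions made for illustration, not names confirmed in the presentation.

```python
# Hypothetical sketch: declaring a GPU cluster as a custom resource on the
# k0rdent management cluster. The group/version/kind, plural, template name,
# and spec fields below are assumed for illustration only.
from kubernetes import client, config

config.load_kube_config()  # kubeconfig pointing at the management cluster

gpu_cluster = {
    "apiVersion": "k0rdent.mirantis.com/v1alpha1",   # assumed API group/version
    "kind": "ClusterDeployment",                      # assumed kind
    "metadata": {"name": "gpu-cluster-01", "namespace": "tenant-a"},
    "spec": {
        "template": "baremetal-gpu",                  # assumed cluster template
        "config": {"workerCount": 4, "gpusPerNode": 8},
    },
}

client.CustomObjectsApi().create_namespaced_custom_object(
    group="k0rdent.mirantis.com", version="v1alpha1",
    namespace="tenant-a", plural="clusterdeployments", body=gpu_cluster,
)
```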

O’Meara pointed out that the IaaS stack is not intended as a general-purpose cloud platform. It is narrowly focused on preparing infrastructure for GPU workloads and passing those resources upward into the PaaS layer. This focus reduces complexity but also introduces tradeoffs. Operators who need extensive support for legacy virtualization may need to run separate systems in parallel. However, for organizations intent on scaling AI, the IaaS layer provides a clear and efficient baseline.

By combining automation with vendor neutrality, the Mirantis approach reduces the number of unique integration points that operators must maintain. This lets smaller teams manage environments that previously demanded much larger staff. O’Meara concluded that the IaaS layer is what makes the higher levels of Mirantis k0rdent AI possible, giving enterprises a repeatable way to build secure, observable, and tenant-aware GPU foundations.


Mirantis Solution Approach: GPU Cloud in a Box with Shaun O’Meara

Event: AI Infrastructure Field Day 3

Appearance: Mirantis presents at AI Infrastructure Field Day 3

Company: Mirantis

Video Links:

Personnel: Anjelica Ambrosio, Shaun O’Meara

Shaun O’Meara, CTO at Mirantis, presented the company’s approach to simplifying GPU infrastructure with what he described as a “GPU Cloud in a Box.” The concept addresses operational bottlenecks that enterprises and service providers face when deploying GPU environments: fragmented technology stacks, resource scheduling difficulties, and lack of integrated observability. Rather than forcing customers to assemble and maintain a full hyperscaler-style AI platform, Mirantis packages a complete, production-ready system that can be deployed as a single solution and then scaled or customized as requirements evolve.

The design is centered on Mirantis k0rdent AI, a composable platform that converts racks of GPU servers into consumable services. Operators can partition GPU resources into tenant-aware allocations, apply policy-based access, and expose these resources through service catalogs aligned with existing cloud consumption models. Lifecycle automation for Kubernetes clusters, GPU-aware scheduling, and tenant isolation are embedded into the system, reducing the engineering burden that is typically required to make such environments reliable.
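
At the workload level, GPU-aware scheduling ultimately rests on Kubernetes extended resources: a tenant's pod requests GPUs, and the scheduler places it only on nodes advertising free devices. Below is a minimal sketch using the Kubernetes Python client; the namespace and container image are placeholders, and it shows the general mechanism rather than k0rdent-specific policy.

```python
# Minimal sketch of GPU-aware scheduling via Kubernetes extended resources.
from kubernetes import client, config

config.load_kube_config()

pod = client.V1Pod(
    metadata=client.V1ObjectMeta(name="train-job", namespace="tenant-a"),
    spec=client.V1PodSpec(
        restart_policy="Never",
        containers=[client.V1Container(
            name="trainer",
            image="nvcr.io/nvidia/pytorch:24.01-py3",   # example image
            resources=client.V1ResourceRequirements(
                limits={"nvidia.com/gpu": "2"},          # request two GPUs
            ),
        )],
    ),
)

# The scheduler will only bind this pod to a node with two unallocated GPUs.
client.CoreV1Api().create_namespaced_pod(namespace="tenant-a", body=pod)
```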

A live demonstration was presented by Anjelica Ambrosio, AI Developer Advocate. The first demo walked through the customer experience with the Product Builder. She showed how a user can log into the Mirantis k0rdent AI self-service portal and provision products with the Product Builder within minutes, selecting from preconfigured service templates. The demo included creating a new cluster product, setting parameters, and deploying the product to the marketplace. Real-time observability dashboards displayed GPU utilization, job performance, and service health. The demonstration highlighted how the platform turns what was once a multi-week manual integration process into a repeatable and governed workflow. The second demo showed the Product Builder from the operator's perspective, demonstrating how products can be built from nodes and how dependencies are configured with the Graph View.

O’Meara explained that the “Cloud in a Box” model is not a closed appliance but a composable building block. It can be deployed in a data center, at an edge location, or within a hybrid model where a public cloud-hosted control plane manages distributed GPU nodes. Customers can adopt the system incrementally, beginning with internal workloads and later extending services to external markets or partners. This flexibility is particularly important for organizations pursuing sovereign cloud strategies, where speed of deployment, transparent governance, and monetization are essential.

The value is both technical and commercial. Technically, operators gain a validated baseline architecture that reduces common failure modes and accelerates time-to-service. Commercially, they can monetize GPU investments by offering consumption-based services that resemble hyperscaler offerings without requiring the same level of capital investment or staffing. O’Meara positioned the solution as a direct response to the core challenge confronting enterprises and service providers: transforming expensive GPU hardware into sustainable and revenue-generating AI infrastructure.


Mirantis Company Overview

Event: AI Infrastructure Field Day 3

Appearance: Mirantis presents at AI Infrastructure Field Day 3

Company: Mirantis

Video Links:

Personnel: Kevin Kamel, Shaun O’Meara

Kevin Kamel, VP of Product Management at Mirantis, opened with a wide-ranging overview of the company’s heritage, its evolution, and its current mission to redefine enterprise AI infrastructure. Mirantis began as a private cloud pioneer, gained deep expertise operating some of the world’s largest clouds, and later played a formative role in advancing cloud-native technologies, including early stewardship of Kubernetes and acquisitions such as Docker Enterprise and Lens. Today, Mirantis leverages this pedigree to address the pressing complexity of building and operating GPU-accelerated AI infrastructure at scale.

Kamel highlighted three key challenges driving market demand: the difficulty of transforming single-tenant GPU hardware into multi-tenant services; the talent drain that leaves enterprises and cloud providers without the expertise to operationalize these environments; and the rising expectation among customers for hyperscaler-style experiences, including self-service portals, integrated observability, and efficient resource monetization. Against this backdrop, Mirantis positions its Mirantis k0rdent AI platform as a turnkey solution that enables public clouds, private clouds, and sovereign “NeoClouds” to operationalize and monetize GPU resources quickly.

What sets Mirantis apart, Kamel emphasized, is its composable architecture. Rather than locking customers into vertically integrated stacks, Mirantis k0rdent AI provides configurable building blocks and a service catalog that allows operators to design bespoke offerings—such as proprietary training or inference services—while maintaining efficiency through features like configuration reconciliation and validated GPU support. Customers can launch services internally, expose them to external markets, or blend both models using hybrid deployment approaches that include a unique public-cloud-hosted control plane.

The section also introduced Nebul, a sovereign AI cloud in the Netherlands, as a case study. Nebul initially struggled with the technical sprawl of standing up GPU services—managing thousands of Kubernetes clusters, enforcing strict multi-tenancy, and avoiding stranded GPU resources. By adopting Mirantis k0rdent AI, Nebul streamlined cluster lifecycle management, enforced tenant isolation, and gained automation capabilities that allowed its small technical team to focus on business growth rather than infrastructure firefighting.

Finally, Kamel discussed flexible pricing models (OPEX consumption-based and CAPEX-aligned licensing), Mirantis’ ability to support highly regulated environments with FedRAMP and air-gapped deployments, and its in-house professional services team that can deliver managed services or bridge skills gaps. He drew parallels to the early OpenStack era, where enterprises faced similar knowledge gaps and relied on Mirantis to deliver production-grade private clouds. That same depth of expertise, combined with long-standing open source and ecosystem relationships, underpins Mirantis’ differentiation in today’s AI infrastructure market.


Jericho4: Enabling Distributed AI Computing Across Data Centers with Broadcom

Event: AI Infrastructure Field Day 3

Appearance: Broadcom presents at AI Infrastructure Field Day 3

Company: Broadcom

Video Links:

Personnel: Henry Wu

Jericho4 – Ethernet Fabric Router is a purpose-built platform for the next generation of distributed AI infrastructure. In this session, we will examine how Jericho4 pushes beyond traditional scaling limits, delivering unmatched bandwidth, integrated security, and true lossless performance—while interconnecting more than one million XPUs across multiple data centers.

The presentation discusses the Jericho4 solution for scaling AI infrastructure across data centers. Current limitations in power and space capacity necessitate interconnecting smaller data centers via high-speed networks. Jericho4 addresses the growing challenges of load balancing, congestion control, traffic management, and security at scale by offering four key features. First, it allows building a single system with 36K ports, acting as a single, non-blocking routing domain. Second, it provides high bandwidth with hyper ports (3.2T), a native solution for the large data flows characteristic of AI workloads. Third, its embedded deep buffer supports lossless RDMA interconnections over distances exceeding 100 kilometers. Finally, Jericho4 has embedded security engines to enable security without impacting performance.
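
The need for a deep buffer on 100-kilometer links follows directly from bandwidth-delay arithmetic. The back-of-envelope calculation below is illustrative and not a Broadcom specification:

```python
# Back-of-envelope estimate (not vendor figures): buffering needed to absorb
# a full round trip of in-flight data over 100 km of fiber on one hyper port.
link_bps = 3.2e12            # one 3.2 Tb/s hyper port
distance_km = 100
prop_us_per_km = 5           # ~5 microseconds per km of fiber
rtt_s = 2 * distance_km * prop_us_per_km * 1e-6   # ~1 ms round trip

bdp_bytes = link_bps * rtt_s / 8
print(f"RTT ≈ {rtt_s*1e3:.1f} ms, in-flight data ≈ {bdp_bytes/1e9:.1f} GB per port")
# Roughly 0.4 GB can be in flight per 3.2 Tb/s port at 100 km, which is why a
# deep buffer matters for lossless RDMA over long-haul links.
```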

The Jericho4 family offers various derivatives to suit different deployment scenarios, including modular and centralized systems. The architecture supports scaling as a single system through various form factors, from compact boxes to disaggregated chassis, and further scaling across a fabric. Hyper ports improve link utilization by avoiding hashing and collisions, leading to reduced training times. The deep buffer handles the bursty nature of AI workloads, minimizing congestion and ensuring lossless data transmission even over long distances. The embedded security engine addresses security concerns by enabling point-to-point MACsec and end-to-end IPsec with no performance impact.


Broadcom Tomahawk Ultra: Low Latency, High Performance, and Reliable Ethernet for HPC and AI

Event: AI Infrastructure Field Day 3

Appearance: Broadcom presents at AI Infrastructure Field Day 3

Company: Broadcom

Video Links:

Personnel: Robin Grindley

Tomahawk Ultra shatters the myths about Ethernet’s ability to address high-performance networking. In this session we will show how we added features for lossless networking, reduced latency and increased performance – all while maintaining compatibility with Ethernet.

The presentation introduces the Broadcom Tomahawk Ultra, a 51.2 terabit per second switch chip designed to bring high-performance Ethernet to markets traditionally dominated by InfiniBand, specifically HPC and AI. Addressing perceived limitations of Ethernet such as high latency, small frame size constraints, packet overhead, and lossy nature, the Tomahawk Ultra is a clean-slate design focused on ultra-low latency, high packet rates, and reliability. The chip is pin-compatible with Tomahawk 5, enabling quick adoption by OEMs and ODMs, and it’s currently shipping to partners who are building boxes with it.

Key features of the Tomahawk Ultra include a 250 nanosecond ball-to-ball (first bit in to first bit out) latency, high packet-per-second processing optimized for small message sizes common in HPC and AI inferencing, and support for in-network collectives (INC) to offload computation from XPUs during AI training. The chip also incorporates an optimized header format to reduce packet overhead in managed networks and advanced reliability features like link-layer retry (LLR) and credit-based flow control (CBFC) for lossless networking. Topology-aware routing, which enables optimized packet paths in complex HPC networks, is also implemented.
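
The emphasis on packet rate rather than raw bandwidth can be made concrete with simple line-rate arithmetic; the figures below are a generic estimate, not vendor spec-sheet numbers:

```python
# Rough line-rate arithmetic: packets per second needed to keep 51.2 Tb/s busy
# with minimum-size frames (illustrative only).
switch_bps = 51.2e12
frame_bytes = 64            # minimum Ethernet frame
overhead_bytes = 20         # preamble + inter-frame gap
wire_bits = (frame_bytes + overhead_bytes) * 8

pps = switch_bps / wire_bits
print(f"~{pps/1e9:.0f} billion packets/s at 64B line rate")
# Small HPC/AI messages push a switch toward this packet-processing limit,
# which is why the chip is optimized for packets per second, not just bandwidth.
```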

The speaker emphasized that the Tomahawk Ultra aims to provide an open and standards-based approach to high-performance networking, adhering to Ethernet standards for compatibility and ease of management. It utilizes standard Ethernet tools for configuration and monitoring, with features like LLR automatically negotiating between the switch and endpoints. Broadcom has contributed the Scale Up Ethernet (SUE) specification to OCP to encourage an open ecosystem. The Tomahawk Ultra is positioned as an end-to-end solution for high performance, offering an alternative to technologies like NVLink in scale-up architectures while ensuring compatibility and openness.


Broadcom Tomahawk 6 Scaling AI Networks with the World’s First 102.4 Tbps Ethernet Switch

Event: AI Infrastructure Field Day 3

Appearance: Broadcom presents at AI Infrastructure Field Day 3

Company: Broadcom

Video Links:

Personnel: Pete Del Vecchio

Tomahawk 6 is the world’s first 102.4 Tbps Ethernet switch, designed to meet the demands of massive AI infrastructure. In this session, we’ll show how it enables both scale-up and scale-out networking, with unmatched bandwidth, energy efficiency, and congestion control. We’ll also debunk common myths about AI networking and explain why Ethernet has become the fabric of choice for the world’s largest GPU clusters.

Pete Del Vecchio introduced Broadcom’s Tomahawk 6, a 102.4 Tbps Ethernet switch designed for both scale-up and scale-out AI networking. Tomahawk 6 doubles the bandwidth and SERDES speed of its predecessor, Tomahawk 5, and incorporates features to enhance load balancing and congestion control. Del Vecchio emphasized that Ethernet has become the dominant choice for AI scale-out networks and is gaining traction in scale-up environments due to its open ecosystem and the ability to partition large clusters for different customers.

Tomahawk 6 comes in two versions: one with 512 lanes of 200G SERDES and another with 1,024 lanes of 100G SERDES. The presenter highlighted that Tomahawk 6 is built on a multi-die implementation, with a central core for packet processing and chiplets for I/O. The chip is designed for both scale-up and scale-out applications. For scale-up applications, Tomahawk 6 can support 512 XPUs in a single-hop network.
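
The two SERDES configurations are simply different ways of carving up the same 102.4 Tbps of switching capacity; the short calculation below shows the equivalence and a couple of resulting front-panel port counts (illustrative arithmetic only):

```python
# Illustrative capacity arithmetic for a 102.4 Tb/s switch.
total_tbps = 102.4

for lanes, lane_gbps in [(512, 200), (1024, 100)]:
    print(f"{lanes} lanes x {lane_gbps}G = {lanes * lane_gbps / 1000:.1f} Tb/s")

print("800GbE ports:", int(total_tbps * 1000 / 800))    # 128
print("1.6TbE ports:", int(total_tbps * 1000 / 1600))   # 64
```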

The presentation also touched upon power efficiency, emphasizing that Tomahawk 6 enables two-tier network designs, which significantly reduce the number of optics required compared to three-tier networks, leading to lower power consumption and reduced latency. Del Vecchio also discussed advanced features like cognitive routing with global load balancing, telemetry, and diagnostics for proactive link management. Broadcom emphasized that Tomahawk 6 is an open, interoperable solution that works with any endpoint and offers flexibility in telemetry, load balancing, and congestion control.


Accelerated Mainframe App Delivery Demonstration Using PopUp Mainframe

Event: Tech Field Day Extra at SHARE Cleveland 2025

Appearance: PopUp Mainframe Presents at Tech Field Day Extra at SHARE Cleveland 2025

Company: PopUp Mainframe

Video Links:

Personnel: Gary Thornhill

PopUp Mainframe enables accelerated application delivery on the mainframe through modern DevOps practices and automation. In this demonstration, we will showcase how to quickly set up a fully functional mainframe development and test environment using PopUp Mainframe, demonstrate a CI/CD pipeline that includes code modifications, testing, and rollback capabilities, and provide insight into how this empowers development teams and simplifies operations with tools like Ansible.

In the demonstration at Tech Field Day Extra at SHARE Cleveland 2025, Gary Thornhill from PopUp Mainframe showcased how their virtualized mainframe solution can dramatically streamline mainframe application development and delivery. The demo illustrated how PopUp Mainframe enables a modern CI/CD pipeline using open-source tools and IBM’s deployment technologies, exemplified through a sample application called NextGen Bank. It emphasized the speed at which development environments can be created and destroyed—minutes instead of hours—as well as the use of snapshot and rewind capabilities that allow developers to easily roll back environments to previous states. Code changes were made using both VS Code and IDz, committed to GitHub, and deployed via automated pipelines that performed building, unit testing with COBOL Check, and integration testing with Galasa. Issues were discovered via failing tests, the environment was rolled back instantly, and fixes were redeployed in a seamless fashion.

Further, the presentation covered how PopUp Mainframe integrates with Ansible for simplified automation and self-service operations, including user provisioning and environment management. Thornhill stressed the value of using pop-up environments for early and frequent testing—reducing cost and dependence on physical mainframe MSUs—and how PopUp enables developers to work independently without interrupting central production or test environments. He addressed cultural resistance within mainframe teams, comparing it to the adoption challenges faced during the rise of server virtualization. By showing tangible benefits and empowering both younger developers and experienced professionals, he argued, PopUp Mainframe serves as a bridge to modernize legacy environments. The tool also supports sustainability goals by allowing environments to be shut down when not in use, reducing cloud costs and mainframe license impacts.

In conclusion, Thornhill emphasized that PopUp Mainframe offers a breakthrough opportunity in the mainframe paradigm by enabling faster delivery, easier access for non-traditional mainframe users, and flexible test environments that mirror real-world production. The technology not only simplifies and accelerates app delivery but also supports risk-free experimentation, training, and modernization efforts. He cited real-world client success stories with up to 400% improvement in time to market, and reiterated that this tool aligns with organizational goals for agility, cost reduction, and environmental responsibility.


Harnessing IFL and Linux for Accelerated Mainframe Delivery with PopUp Mainframe

Event: Tech Field Day Extra at SHARE Cleveland 2025

Appearance: PopUp Mainframe Presents at Tech Field Day Extra at SHARE Cleveland 2025

Company: PopUp Mainframe

Video Links:

Personnel: Gary Thornhill

Today’s enterprises span mission-critical industries such as financial services, logistics, retail, and government, many of which continue to rely on the IBM Mainframe for its unparalleled resilience and processing power—especially in light of advancements like the AI-enabled IBM Z17. However, software delivery on the mainframe is often slowed by environment availability, siloed team structures, and lack of flexible tooling. Gary Thornhill, Founder and CEO of PopUp Mainframe, presented at Tech Field Day Extra at SHARE Cleveland 2025 to address these issues. His solution centers on PopUp Mainframe, a platform designed to provision virtual mainframe environments quickly across a range of hardware—from IFLs and LinuxONE on IBM Z systems to cloud-hosted and x86 environments—making mainframe development and testing faster, more agile, and accessible.

The PopUp Mainframe platform emulates full Z environments for non-production use and helps alleviate bottlenecks in development cycles by enabling organizations to replicate environments in under 10 minutes. It supports automated provisioning, rollback, and snapshot capabilities through its FastTrack feature, ensuring rapid, repeatable testing and development. Thornhill emphasized the platform’s flexibility to run where customers already have compute capacity—including air-gapped environments or the cloud—supported by a floating license model that allows dormant environments to be spun up only when needed. Critically, running on IFLs offers a 5-10x performance gain versus x86 platforms, leveraging the power of z/Architecture while maintaining total separation from production workloads. Licensing restricts PopUp’s use solely for dev/test purposes, addressing IBM’s compliance policies while giving developers hands-on access to real data (masking supported) and real workloads without regulatory or security compromises.

Security and compatibility are key components of the PopUp proposition. The platform is pre-integrated with IBM, BMC, and open-source tools and supports broader ecosystems including Broadcom and Rocket Software solutions. PopUp maintains full z/OS compatibility regardless of code age or complexity, from modern subsystems to legacy stacks. Mainframe teams gain the ability to serialize and isolate development pipelines and shift left without incurring infrastructure procurement overhead. Integration with tools like VS Code, Galasa, and COBOL Check helps extend modern DevOps practices to the mainframe, enabling continuous integration and delivery. Users can securely push clones into any environment, even cloud, backed by persistent ZFS snapshots, which achieve 90% disk compression and utilize commodity storage in place of expensive DASD. As Thornhill explained, PopUp allows organizations to quadruple delivery speed, reduce emissions, lower cost, and improve compliance by centralizing masking and maintaining data governance entirely on-Z. PopUp is not just an emulator—it’s a strategic enabler for agile mainframe modernization.


Bringing the Mainframe to the Whole Enterprise with Broadcom WatchTower

Event: Tech Field Day Extra at SHARE Cleveland 2025

Appearance: Broadcom Presents at Tech Field Day Extra at SHARE Cleveland 2025

Company: Broadcom

Video Links:

Personnel: Angelika Heinrich

In this presentation at Tech Field Day Extra at SHARE Cleveland 2025, Angelika Heinrich, Product Manager for Broadcom’s WatchTower real-time streaming capability, discusses the growing imperative for organizations to improve customer satisfaction and operational efficiency through enhanced observability and service reliability. As digital interactions increasingly define customer experiences, enterprises must quickly detect and respond to issues that impact application performance—especially ones involving critical back-end systems like the mainframe. Heinrich emphasizes that WatchTower was developed to align IT visibility with these business needs, enabling technical teams to better understand and respond to end user experiences in real time.

Through the session, Heinrich explains how the traditional siloing of mainframe systems poses challenges in unified observability. While modern DevOps and SRE teams rely on platforms like New Relic, Datadog, and Grafana for monitoring, these tools often lack native support for mainframe environments. WatchTower bridges this gap by streaming mainframe telemetry—encompassing traces, metrics, and logs—in the OpenTelemetry format, a widely adopted standard across observability platforms. This allows SREs, without deep mainframe expertise, to access actionable data insights and correlate performance metrics across distributed and mainframe systems. By doing so, teams can analyze trace-level data, understand latency issues in applications such as CICS transactions, and relate these to environmental conditions or infrastructure constraints.
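
Because the stream uses OpenTelemetry, downstream tools see mainframe activity as ordinary spans, metrics, and logs. The snippet below uses the standard opentelemetry-python SDK only to illustrate what such a trace looks like to a consuming tool; the span and attribute names are invented for the example and are not Broadcom’s actual schema.

```python
# Illustrative: emitting a mainframe-style span with the OpenTelemetry SDK.
# Span and attribute names are made up; WatchTower's real schema may differ.
from opentelemetry import trace
from opentelemetry.sdk.trace import TracerProvider
from opentelemetry.sdk.trace.export import BatchSpanProcessor, ConsoleSpanExporter

provider = TracerProvider()
provider.add_span_processor(BatchSpanProcessor(ConsoleSpanExporter()))
trace.set_tracer_provider(provider)

tracer = trace.get_tracer("mainframe.example")
with tracer.start_as_current_span("cics.transaction") as span:   # hypothetical name
    span.set_attribute("transaction.id", "PAY1")                 # hypothetical attribute
    span.set_attribute("cics.region", "CICSPRD1")                # hypothetical attribute
```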

Heinrich further demonstrates how this unified observability framework empowers operational roles to detect and respond to service degradation more effectively using service level objectives (SLOs). Rather than relying on static thresholds, SLOs allow enterprises to evaluate “good” versus “bad” events through dynamic measurement of user experiences. With WatchTower, contextualized insights—including detailed trace information, correlated logs, and relevant error documentation—all become readily accessible within mainstream observability tools. This not only facilitates faster root cause analysis by SREs but also enables mainframe SMEs to prioritize their efforts on platform innovation rather than acting solely as data interpreters. Ultimately, Broadcom’s WatchTower creates a modern, integrated approach to mainframe observability, democratizing access to critical insights across the entire enterprise.
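
As a rough illustration of the SLO approach Heinrich described, the sketch below classifies individual requests as good or bad against a latency target and compares the result with the objective; the numbers are invented:

```python
# Toy SLO calculation: classify events as good/bad and check the error budget.
latency_ms = [120, 95, 400, 130, 2200, 110, 90, 150, 180, 105]  # invented samples
target_ms = 500          # a request is "good" if it completes under this
slo = 0.99               # objective: 99% good events over the window

good = sum(1 for x in latency_ms if x <= target_ms)
ratio = good / len(latency_ms)
error_budget_left = (1 - slo) - (1 - ratio)   # negative means the SLO is breached
print(f"good ratio {ratio:.2%}, error budget remaining {error_budget_left:.2%}")
```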


Greater Insights into Application Performance Management with Broadcom WatchTower

Event: Tech Field Day Extra at SHARE Cleveland 2025

Appearance: Broadcom Presents at Tech Field Day Extra at SHARE Cleveland 2025

Company: Broadcom

Video Links:

Personnel: Petr Klomfar

Broadcom’s presentation, delivered by Petr Klomfar at Tech Field Day Extra at SHARE Cleveland 2025, explored the emerging necessity for sophisticated Application Performance Management (APM) solutions to address modern IT challenges. Titled “Greater Insights into Application Performance Management with Broadcom WatchTower,” the session demonstrated how reduced visibility into mainframe applications, increasing resource costs, and complex modernization efforts are creating new demands for observability, optimization, and operational efficiency. Klomfar introduced Broadcom WatchTower and complementary tools like Application Performance for Z (AP4Z) and Mainframe Application Tuner (MAT), emphasizing their value in providing continuous, scalable, and low-overhead performance data to drive smarter and faster decision-making.

Klomfar began with a historical overview of computing trends, highlighting the shift from early, resource-constrained environments that necessitated deep optimization to an era where increased computing power led to more complacent development practices. However, with today’s economic pressures and pricing models like IBM’s Tailored Fit Pricing, there’s a renewed emphasis on efficiency. Mainframes now require high levels of visibility and analytical depth to optimize and modernize legacy environments. Klomfar identified challenges such as the “black box” nature of mainframe applications, difficulty in identifying ROI for performance improvements, and the complexity of root cause analysis. To address these, Broadcom’s AP4Z continuously monitors system performance with negligible (often 0.1%) overhead, supporting optimization, modernization planning, and troubleshooting with real-time, context-rich data.

A major focus of the presentation was the synergy between WatchTower and performance tools like AP4Z and MAT. AP4Z provides application profiling that can identify hot spots using Pareto principles, support modernization pathfinding by highlighting application components, and flag anomalies for root cause investigations. Meanwhile, MAT performs deeper, targeted analysis with slightly higher overhead (typically around 3–4%) and feeds into operational workflows via WatchTower alerts. This integrated architecture empowers organizations to decrease mean time to resolution (MTTR), improve SLAs, and make measurable reductions in costs through smarter resource usage. Klomfar concluded by championing the benefits of clear and accessible performance data for diverse roles, from system engineers to developers, affirming Broadcom’s commitment to continual improvement and client collaboration.
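
The Pareto-style hot-spot analysis can be pictured in a few lines of code: rank components by CPU consumption and keep the smallest set that explains most of the total. The sketch below uses invented module names and sample counts purely for illustration:

```python
# Illustrative Pareto hot-spot selection over made-up per-module CPU samples.
samples = {"MOD0417": 5400, "MOD0023": 2100, "MODBILL": 900,
           "MODAUTH": 420, "MODLOGR": 180, "MODMISC": 60}

total = sum(samples.values())
running, hotspots = 0, []
for module, cpu in sorted(samples.items(), key=lambda kv: kv[1], reverse=True):
    hotspots.append(module)
    running += cpu
    if running / total >= 0.80:     # stop once ~80% of CPU time is explained
        break

print(hotspots)   # the few modules worth profiling and tuning first
```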


Expanding Value with Broadcom WatchTower’s Metric Analysis

Event: Tech Field Day Extra at SHARE Cleveland 2025

Appearance: Broadcom Presents at Tech Field Day Extra at SHARE Cleveland 2025

Company: Broadcom

Video Links:

Personnel: Machhindra Nale, Tom Quinn

At Tech Field Day Extra at SHARE Cleveland 2025, Broadcom presented their advancements in mainframe observability through the WatchTower platform, an evolution designed to improve how performance and optimization data is collected, correlated, and acted upon. The core problem addressed is that traditional domain-specific monitoring tools, like SysView for CICS or NetMaster for networks, operate in functional silos, leading to inefficiencies when diagnosing cross-domain performance issues. WatchTower eliminates this obstacle by aggregating and contextualizing alert data across different mainframe subsystems. As a result, users gain broader visibility through Alert and Contextual Insights—a holistic view that improves incident resolution speed and accuracy while making it easier to distinguish between isolated and system-wide failures.

One of the hallmark features of WatchTower is its intelligent dashboards, which provide both real-time and historical views across previously siloed data domains. These dashboards tap directly into established data collectors from key Broadcom tools such as Vantage, SysView, and NetMaster, ensuring that users can visualize telemetry data from CICS, DB2, MQ, IMS, and network components in one cohesive interface. The dashboards emphasize usability, allowing users to create custom views without writing complex queries. This flexibility accommodates both infrastructure-centric roles, like a CICS administrator, and higher-level application-focused personas. Complementing this visual capability is embedded machine learning (ML), which not only detects anomalies but tracks gradual performance degradations—informing users of potential issues before service level agreements (SLAs) are breached.

Broadcom’s ML Insights extends the usefulness of WatchTower by delivering proactive anomaly detection built on each organization’s unique dataset, rather than relying on static or generic thresholds. This approach tailors models dynamically based on ongoing operations, learning from historical patterns and adjusting its behavior to recognize events like seasonal peaks or daily load cycles. While immediate automation is not triggered by the platform directly, Broadcom allows integration with OpsMVS for customers who wish to design their own event-based playbooks or remediation rules. Although historical data cannot be retroactively loaded into the system at this time, WatchTower begins modeling from initial deployment onward, allowing it to curate dynamic baselines over time. These ML models, however, currently lack a formal feedback loop, though this capability is under consideration for future updates to further enhance adaptive learning.
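
While Broadcom’s models are not public, the baseline-relative idea itself can be sketched simply: learn a rolling baseline from recent history and flag values that deviate from it, rather than comparing against a fixed threshold. A deliberately simplified stand-in:

```python
# Simplified stand-in for baseline-relative anomaly detection (not Broadcom's
# actual model): flag points more than `sigmas` standard deviations away from
# a rolling baseline built over the last `window` observations.
from collections import deque
from statistics import mean, stdev

def anomalies(series, window=30, sigmas=3.0):
    history = deque(maxlen=window)
    flagged = []
    for i, value in enumerate(series):
        if len(history) == window:
            mu, sd = mean(history), stdev(history)
            if sd > 0 and abs(value - mu) > sigmas * sd:
                flagged.append((i, value))
        history.append(value)
    return flagged

# Example: a steady metric with one spike at index 40.
metric = [100.0 + (i % 5) for i in range(60)]
metric[40] = 250.0
print(anomalies(metric))   # -> [(40, 250.0)]
```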


Building Incremental Observability with Broadcom’s WatchTower Platform

Event: Tech Field Day Extra at SHARE Cleveland 2025

Appearance: Broadcom Presents at Tech Field Day Extra at SHARE Cleveland 2025

Company: Broadcom

Video Links:

Personnel: Michael Kiehl

Broadcom’s WatchTower Platform™ is an observability platform designed to provide a unified view of mainframe and distributed systems. This platform empowers users to identify and resolve issues more swiftly. It seamlessly integrates data from diverse sources into a single interface, leveraging capabilities such as alerting, machine learning, application profiling, and real-time streaming. These features enable enhanced troubleshooting, performance optimization, and overall operational efficiency.

In their presentation at Tech Field Day Extra at SHARE Cleveland 2025, Broadcom showcased how they are building on their longstanding mainframe tools by layering the WatchTower observability platform over them to streamline diagnostics and integrate across the enterprise. The goal is to relieve operators and subject matter experts from navigating disparate systems by centralizing data for problem detection and resolution. The platform caters to both traditional mainframe operators and modern Site Reliability Engineers (SREs), enabling end-to-end visibility from applications through to mainframe systems, and integrates with popular observability tools like Datadog and Splunk.

WatchTower expands on legacy products such as SysView, NetMaster, OpsMVS, and MAT, offering new capabilities like data streaming, data visualization, role-based access, and topology mapping. These innovations allow for automated correlation of events, targeted alerting with contextual insight, and easier collaboration between SREs and performance analysts. Importantly, these enhancements are available without additional licensing for customers already using Broadcom’s core products. Broadcom emphasizes incremental value by progressively adding toolsets that improve real-time observability and reduce mean time to resolution, while still allowing experts to dive into native interfaces for deeper analysis when needed.


Next-Level Security and Resilience with VMware Cloud Foundation 9.0

Event:

Appearance: VMware Cloud Foundation 9.0 Showcase – Modern Private Cloud

Company: Broadcom

Video Links:

Personnel: Bob Plankers

When you think about cloud infrastructure security, there are three main goals you are trying to achieve. First, you want to be secure quickly and stay that way. Second, you want to drive trust in your infrastructure. Third, you want to be resilient, easily. Broadcom’s Bob Plankers will take you through the latest security innovations in VMware Cloud Foundation 9.0 for providing next-level security, trust and resilience, empowering IT operations amidst regulatory complexities and geopolitical uncertainty.

The presentation focused on security and trust in VCF 9.0, emphasizing a “security first” approach, prioritizing ongoing security practices over infrequent compliance audits. A key theme was enabling customers to be secure faster, recognizing that security is a means to delivering services and running workloads. Plankers highlighted the importance of resilience, referencing features like vMotion and the EU’s Digital Operational Resilience Act, addressing both tactical and strategic scenarios such as failed application upgrades and disaster recovery.

The core differentiator of VCF 9.0 is inherent trust in the stack, moving towards less trust and more continuous verification. This includes verifying the platform’s security state, data sovereignty, and controlled access. The discussion covered lifecycle patching enhancements with Lifecycle Manager, aiming to simplify updates and manage multi-vendor cluster images. Features like live patching, custom EVC profiles, and improved GPU usage were also discussed as facilitating easier maintenance and patching, reducing friction.

The presentation provided a deep dive into security enhancements inside the hypervisor, including code signing, secure boot, and sandboxing. Confidential computing with AMD SEV-ES and Intel SGX technologies was explored, along with the introduction of a user-level monitor that reduces the privileges available to a VM escape. Workload security improvements encompass secure boot, hardened virtual USB, TPM 2.0 updates, and forensic snapshots. Cryptographic enhancements included TLS 1.3 by default, cipher suite selection, and key wrapping. Centralized password management, unified security operations, and standardized APIs for role-based access control further enhance security and automation.
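
As a generic illustration of the “TLS 1.3 by default” posture described above (not VCF configuration), a client can verify the same policy against an endpoint with Python’s standard library; the hostname is a placeholder:

```python
# Generic check that an endpoint negotiates TLS 1.3 and a modern cipher suite.
# The hostname below is a placeholder, not a real VCF component address.
import socket
import ssl

ctx = ssl.create_default_context()
ctx.minimum_version = ssl.TLSVersion.TLSv1_3   # refuse anything older than TLS 1.3

host = "vcenter.example.internal"
with socket.create_connection((host, 443), timeout=5) as sock:
    with ctx.wrap_socket(sock, server_hostname=host) as tls:
        print(tls.version(), tls.cipher())     # e.g. TLSv1.3 and the negotiated suite
```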


Unpacking Storage In VMware Cloud Foundation 9.0

Event:

Appearance: VMware Cloud Foundation 9.0 Showcase – Modern Private Cloud

Company: Broadcom

Video Links:

Personnel: John Nicholson

VMware vSAN as a part of VMware Cloud Foundation 9.0 brings new functionality that not only extends its capabilities in ways never seen before, but also integrates into VCF in a manner that makes it a natural and cohesive extension. vSAN is clearly the premier storage solution for VMware Cloud Foundation. Broadcom’s John Nicholson will take you through the latest storage innovations, and how they deliver enhanced TCO and flexibility, secure and resilient storage, multi-site operations, and a storage platform for all workloads.

John Nicholson from Broadcom detailed the storage enhancements in VMware Cloud Foundation 9.0, focusing on improvements to operations, disaster recovery, performance, and security. He highlighted new operational consoles and tools for multi-site management, including diagnostics capabilities and IO Insight for workload analysis. The presentation included a demo showcasing the IO Trip Analyzer for end-to-end IO path troubleshooting and discussed the overhead of VSCSI tracing.

A key feature discussed was the new cluster-wide global deduplication for vSAN ESA, which uses a 4K fixed block granularity and is performed asynchronously to minimize impact on write performance. Nicholson addressed concerns about encrypted storage, emphasizing that vSAN offers data-at-rest and data-in-transit encryption to meet compliance requirements while still enabling compression and deduplication where possible. The presentation also covered support for multiple vSAN deployment types, including single-site clusters, disaggregated storage clusters, and imported clusters, along with the ability to split networking for vSAN storage clusters.
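
To make the fixed-block idea concrete, here is a toy sketch of 4 KiB block deduplication by content hash; real vSAN ESA deduplication is cluster-wide, asynchronous, and far more sophisticated, so this only illustrates why a fixed granularity makes duplicate detection cheap:

```python
# Toy fixed-block deduplication: split data into 4 KiB blocks, fingerprint each
# block, and keep only one physical copy per fingerprint.
import hashlib

BLOCK = 4096
store = {}   # fingerprint -> block (the deduplicated physical pool)
refs = []    # per-block references that make up the logically written object

def write(data: bytes):
    for off in range(0, len(data), BLOCK):
        block = data[off:off + BLOCK].ljust(BLOCK, b"\0")
        fp = hashlib.sha256(block).hexdigest()
        store.setdefault(fp, block)      # store each unique block once
        refs.append(fp)

write(b"A" * 16384 + b"B" * 4096)        # 5 logical blocks, only 2 unique
print(len(refs), "logical blocks ->", len(store), "physical blocks")
```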

Nicholson also presented vSAN to vSAN replication, enhancing data protection by integrating with VMware Live Recovery (formerly Site Recovery Manager). He showed how this combined solution supports replication, disaster recovery, and ransomware protection, all managed through a single appliance. He also covered improvements in stretch cluster support, like site-based maintenance mode and forced recovery takeover. The presentation concluded with a discussion about the current state of storage technology, highlighting the cost-effectiveness and scalability of NVMe drives and the benefits of vSAN within the VMware ecosystem.


VMware Cloud Foundation 9.0 – A Unified Platform for All Applications

Event:

Appearance: VMware Cloud Foundation 9.0 Showcase – Modern Private Cloud

Company: Broadcom

Video Links:

Personnel: Katarina Brookfield

As modern applications continue to evolve, so must the platforms that support them. VMware Cloud Foundation (VCF) is uniquely positioned as a single platform that seamlessly runs both VMs and containers – bridging the gap between traditional workloads and modern, cloud-native applications. In this session, Broadcom’s Katarina Brookfield will explore the latest innovations in vSphere Supervisor, the integrated Kubernetes-based declarative API layer that’s become foundational to the private cloud experience in VCF. Learn how these enhancements accelerate Kubernetes operations while preserving the control and consistency enterprises demand. We’ll dive into the latest capabilities that elevate flexibility, isolation, and operational efficiency – highlighting enhancements like Management and Workload Zone separation, modular enablement of the Supervisor, namespace isolation integrated with VPCs, and significant improvements to the VM Service, including support for importing existing VMs. We’ll also showcase updates to the vSphere Kubernetes Service (VKS), offering a powerful, built-in Kubernetes runtime optimized for VCF environments.

The VCF 9.0 presentation highlighted its unified platform approach, leveraging the vSphere Supervisor declarative API to manage both VMs and containers, providing a cloud-like experience within a private cloud environment. The core idea is extensibility, allowing users to select capabilities from a catalog and introduce new functionalities, while abstracting away the underlying infrastructure complexities like compute, storage, and networking. Katarina Brookfield demonstrated deploying a virtual machine and a Kubernetes cluster through a single user interface, emphasizing new features in VCF 9.0 such as deploying VMs from ISO images, enhanced network configuration with VPC integration, and guided CloudInit inputs, plus improvements to customization of VMs, all handled through a curated interface by administrators.

A significant portion of the presentation focused on vSphere Kubernetes Service (VKS), showcasing its ease of operation and extensive functionality. Users can customize Kubernetes clusters, mixing operating systems and adding labels. The VCF CLI facilitates managing these clusters, allowing users to register clusters, create contexts, and manage packages, including Istio support. Brookfield demonstrated how cloud admins can update the VKS service version, unlocking new Kubernetes releases for consumer deployment, ensuring governance remains with the cloud admin while empowering consumers with the flexibility to update their clusters.

The presentation concluded with a demonstration of GitOps patterns using Argo CD service, a new addition that enables continuous delivery of applications. Katarina Brookfield showed how to deploy an Argo CD instance and integrate it with a GitHub repository containing YAML files for both Kubernetes clusters and virtual machines. The talk also touched on how the Supervisor layer is decoupled to expedite release of new features. Broadcom emphasized that the latest functionalities are best experienced by making VCF Automation the single point of entry to the whole ecosystem.


VMware Cloud Foundation’s Shift to Self-Service Private Cloud Consumption

Event:

Appearance: VMware Cloud Foundation 9.0 Showcase – Modern Private Cloud

Company: Broadcom

Video Links:

Personnel: Vincent Riccio

Unlock the next era of private cloud innovation and discover how VCF Automation within VMware Cloud Foundation is shaping the future of cloud infrastructure. This session explores how VCF Automation facilitates modern private cloud operations, enabling quick provisioning and simplified scaling in multi-tenant environments through self-service IaaS. Gain the speed to bring applications to market faster—without compromising control, thanks to policy-based governance designed for the modern enterprise. Broadcom’s Vincent Riccio will take you on a technical deep dive into the innovations powering this shift: a Modern Cloud Interface that delivers public cloud-like IaaS straight out-of-the-box, advanced tenant management, centralized content control, policy as code, and more. See how your organization can build, run, and manage diverse workloads—faster, smarter, and more securely—as you step into a future-ready private cloud.

Vincent Riccio’s presentation at the VCF 9.0 Showcase focused on Broadcom’s efforts to automate private cloud investments within VCF9. He emphasized the shift towards a self-service consumption model, enabling business units to deploy applications and services with greater agility. Key components of this automation include improved tenant management through the introduction of “organizations,” centralized content control via content libraries, and policy-as-code capabilities for governance. Riccio also highlighted the integration of vSphere services, such as the VM service and VKS service, into the automation framework, enabling users to deploy VMs and Kubernetes clusters more easily.

The presentation also delved into the architecture of the new solution, emphasizing the importance of the supervisor in vCenter for enabling the “all-apps” experience. Riccio explained how regions, comprised of one or more supervisors, abstract resources across the VCF fleet for consumption. He introduced the concept of projects within organizations, enabling further isolation and management of users and namespaces. The presentation concluded with a demonstration of the new features, including the deployment of VMs and Kubernetes clusters using the services UI and the exploration of the catalog for more curated, “anything as a service” type deployments.


VMware Cloud Foundation 9.0 – The Smarter Way to Operate Private Cloud

Event:

Appearance: VMware Cloud Foundation 9.0 Showcase – Modern Private Cloud

Company: Broadcom

Video Links:

Personnel: Kelcey Lemon, Kyle Gleed

Effectively managing a large-scale private cloud environment demands robust operational strategies. VMware Cloud Foundation Operations delivers these capabilities, enabling IT teams to ensure consistent access, security, and lifecycle management, as well as performance, cost efficiency, resource utilization, and infrastructure and application health. Whether you’re an experienced VCF administrator or beginning your VCF journey, you’ll leave this session equipped to confidently operate and optimize VMware Cloud Foundation at enterprise scale.

The presentation highlights the new features available with VCF Operations in VMware Cloud Foundation 9.0, focusing on fleet management and chargeback capabilities. Fleet management includes a unified single sign-on (SSO) capability via the VCF Identity Broker (VIDB), centralized certificate and password management, configuration drift updates, and simplified lifecycle management for applying patches and upgrades. The goal is to reduce the number of UIs and management points, providing a seamless operating experience from VCF Operations, along with automation and API improvements.

The chargeback feature streamlines FinOps processes by integrating financial management with operational processes, enabling cost transparency and accountability. Key capabilities include defining rate cards for compute, storage, and networking, generating bills on demand or via scheduling, and sharing bills with tenants who can view detailed cost breakdowns within the automation console of VCF. The chargeback feature complements VCF’s showback capabilities, which provide visibility into the total cost of ownership, potential savings opportunities, and resource optimization. The demonstration illustrates the cost-saving opportunities through resource reclamation, rightsizing, and transparent resource cost.
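
The chargeback flow reduces to applying a rate card to metered usage per tenant; the sketch below uses invented rates and consumption figures simply to show the mechanics:

```python
# Toy chargeback calculation: rate card applied to per-tenant metered usage.
rate_card = {"vcpu_hour": 0.04, "gb_ram_hour": 0.005, "gb_storage_month": 0.08}

usage = {
    "tenant-a": {"vcpu_hour": 2400, "gb_ram_hour": 9600, "gb_storage_month": 500},
    "tenant-b": {"vcpu_hour": 800,  "gb_ram_hour": 3200, "gb_storage_month": 2000},
}

for tenant, metered in usage.items():
    bill = sum(rate_card[item] * qty for item, qty in metered.items())
    print(f"{tenant}: ${bill:.2f}")   # the per-tenant bill shared via the console
```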


Best Practices for Adopting & Deploying VMware Cloud Foundation 9.0

Event:

Appearance: VMware Cloud Foundation 9.0 Showcase – Modern Private Cloud

Company: Broadcom

Video Links:

Personnel: Jared Burns

VMware Cloud Foundation 9.0 introduces significant architectural enhancements that impact how modern private clouds are built and managed. This session provides some of the real-world upgrade pathways for both existing VCF 5.x users and the broader base of non-VCF customers looking to adopt the platform. From greenfield deployments to brownfield upgrades, Broadcom’s Jared Burns walks through the best practices, key considerations, and deployment strategies that align with diverse IT environments and business needs. This session is designed for IT professionals, cloud architects, and decision-makers who want to understand VCF 9’s transformative architecture and gain actionable insights into a smooth upgrade.

Jared Burns of Broadcom highlights the new VCF 9 architecture, centered on the concept of a “VCF fleet.” A fleet consists of one or more VCF instances, along with the operations and automation components that run across them. The fleet enables centralized management across multiple VCF instances, and multiple VCF fleets can be grouped into a VMware Cloud Foundation private cloud. Key design considerations include centralized operations management, initial deployment based on the VCF fleet deployment basic design, and flexibility with multiple clusters within a single domain. Four deployment designs are presented: Basic, Site High Availability, Disaster Recovery, and a combined HA/DR approach, with the Basic design serving as the foundation for the others.

A key shift in VCF 9 is the increased flexibility in storage options. While vSAN remains supported, Fibre Channel and NFS are now supported out of the box for the management domain, offering more choices for Greenfield deployments. The presentation outlines detailed design decisions for Greenfield deployments, including considerations for fault domains, operations placement, scale, and organizational separation. Two deployment models for operations, Simple and High Availability, are discussed, along with scalability options.  Additional considerations include vCenter limits, host limits, HCL compliance, and IP/DNS requirements. The upgrade process emphasizes the importance of design planning and performing all prerequisites, due to changes like the removal of Enhanced Linked Mode and VMware Update Manager.

For upgrades from vSphere environments, a nine-step process is outlined, emphasizing the shift to keyless licensing and the move to vSphere Lifecycle Manager. The VCF installer now handles the conversion process, simplifying upgrades compared to previous versions. Customers are expected to be able to perform these upgrades themselves with the help of available upgrade guides. Significant changes include the replacement of Enhanced Linked Mode with VMware Cloud Foundation Operations and VMware Identity Broker, along with new IP address requirements and licensing procedures. Various import scenarios for workload domains are supported, including NSX-attached domains and standalone hosts. Two distinct depots must be configured: SDDC Manager’s depot and the VCF Operations Fleet Manager depot.


What’s New in VMware Cloud Foundation 9.0

Event:

Appearance: VMware Cloud Foundation 9.0 Showcase – Modern Private Cloud

Company: Broadcom

Video Links:

Personnel: Sabina Anja

Step into the future of infrastructure modernization with VMware Cloud Foundation 9.0, the next evolution of VCF. In this session, Broadcom’s Sabina Anja will walk you through new innovations that will redefine how your private cloud operates. Explore features in lifecycle management, fleet management, virtual private cloud networking, and hyper-converged infrastructure (HCI) storage. Discover how these advancements will simplify deployment, streamline operational efficiency, and elevate infrastructure performance. Gain insights into the strategic implications of enhanced capabilities and learn how they empower your organization to build and manage a resilient, future-ready private cloud infrastructure.


Edge to Cloud Security: Harnessing NAC, SASE and ZTNA

Event: Networking Field Day 38

Appearance: HPE Aruba Networking Presents at Networking Field Day 38

Company: HPE Aruba Networking

Video Links:

Personnel: Adam Fuoss, Mathew George

See new Central cloud native NAC; SASE with SSE, SD-WAN & NAC; new ZTNA natively in SD-WAN Gateways. Adam Fuoss, VP of Product for EdgeConnect SD-WAN, outlined HPE Aruba Networking’s integrated SASE portfolio, comprising SSE (Security Service Edge) for cloud-based security focused on ZTNA (Zero Trust Network Access), EdgeConnect SD-WAN for connecting diverse locations, and ClearPass/NAC (Network Access Control). He highlighted the challenge of traditional ZTNA connectors, which often rely on virtual machines in data centers, leading to inefficient traffic hair-pinning when applications reside in branches. To address this, HPE Aruba Networking has integrated the SSE connector as a container directly into the EdgeConnect SD-WAN appliance, allowing users to connect to cloud security services and then directly to branch applications without backhauling traffic, significantly improving efficiency for distributed applications and remote contractors.

Mathew George, a Technical Marketing Engineer, then provided an overview of Central NAC, HPE Aruba Networking’s cloud-native NAC offering. This solution aims to simplify user and device connectivity by leveraging cloud-based identity sources like Google Workspace, Microsoft Entra, and Okta for authentication and authorization. Central NAC uses Client Insights for advanced device profiling, combining fingerprints with traffic flow information and AI/ML models for accurate classification. It integrates with third-party systems like MDM and EDR solutions to pull compliance attributes, which are then used in NAC policies. Central NAC also supports certificate-based authentication (including “Bring Your Own Certificate” with external PKI), MPSK (Multi-Pre-Shared Key) for user-based or admin-based device authentication, and various guest workflows. A key feature demonstrated was the real-time re-authentication and policy enforcement based on changes in the Identity Provider (IdP), showcasing true Zero Trust in action.

The presentation underscored HPE Aruba Networking’s commitment to a unified Zero Trust posture across their entire portfolio. The vision is for a single policy engine to enforce security from Wi-Fi and IoT devices all the way through switches, access points, gateways, and the SSE cloud. This includes multi-vendor support, allowing for VLAN enforcement on third-party switches like Cisco. While Central NAC streamlines simpler use cases, ClearPass continues to address more complex, on-premise requirements. The overall message emphasized leveraging telemetry-based networking and AI-driven insights to enhance security, improve endpoint experiences, and provide engineers with the necessary data to maintain optimal network performance, ultimately enabling a truly integrated security and networking approach from edge to cloud.