This video is part of the appearance “Southworks Presents at Tech Field Day at KubeCon North America 2025”. It was recorded at Tech Field Day at KubeCon North America 2025 from 16:00 to 17:30 on November 11, 2025.
A Cloud Scout is a forward-deployed engineer who joins the product team to co-own reliability, scalability, and evolution. Drawing from the Forward-Deployed Engineer for SR and AI-Managed DevCrew models, Scouts act as both architectural advisors and implementers, blending human judgment with AI-driven companions to build, test, and tune cloud-native systems. We walk through how this embedded approach fosters continuous improvement, strengthens technical decision-making, and creates a shared sense of accountability between Dev, Ops, and AI.
Johnny Halife of SOUTHWORKS presented an example from their work with a European streaming service whose electronic program guide (EPG) was failing. The EPG, built on Node.js, Lambda, S3, BigQuery, and XML, was showing blank displays to viewers because of ingestion problems. The issue was traced to an unexpected HTTP 413 error (request entity too large) tied to image transformation failures.
To address this, SOUTHWORKS employed a Cloud Scout, using tools such as GitHub Copilot and their own MCP servers, which are connected to AWS CloudWatch. The process began with the Scout prompting GitHub Copilot to create a Jira ticket, which was then assigned. The agent analyzed the error by querying the CloudWatch MCP server, finding related logs, and placing them in the context of the solution codebase. This analysis revealed a missing validation and a data conflict between files, with evidence to back each finding. The agent then proposed solutions, including code changes, which were compiled into a pull request.
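The summary does not say what the missing validation looked like. As a minimal sketch only, assuming it was a size check on image payloads before they reach the transformation step, a guard of the following shape would reject oversized files locally instead of letting them surface as a 413. The function name, the result type, and the 5 MiB limit are all hypothetical and are not taken from the presentation.

```typescript
// Hypothetical limit: the source does not state the actual size cap.
const MAX_IMAGE_BYTES = 5 * 1024 * 1024;

type ValidationResult =
  | { ok: true }
  | { ok: false; reason: string };

// Checks an image payload before it is sent to a (hypothetical)
// transformation step, so that an empty or oversized file is rejected
// up front rather than failing downstream with a 413.
function validateImagePayload(
  payload: Uint8Array,
  maxBytes: number = MAX_IMAGE_BYTES,
): ValidationResult {
  if (payload.byteLength === 0) {
    return { ok: false, reason: "empty payload" };
  }
  if (payload.byteLength > maxBytes) {
    return {
      ok: false,
      reason: `payload is ${payload.byteLength} bytes, limit is ${maxBytes}`,
    };
  }
  return { ok: true };
}

// Usage: one payload within the limit, one over a deliberately small limit.
const withinLimit = validateImagePayload(new Uint8Array(1024));
const overLimit = validateImagePayload(new Uint8Array(2048), 1024);
console.log(withinLimit, overLimit);
```

Returning a result object rather than throwing keeps the check usable inside an ingestion loop, where one bad image should be logged and skipped without aborting the whole batch.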
The final step was a code review by the Scout, along with the organization's standard pre- and post-requisites, including SonarQube and linting. A process that previously took days was reduced to a few hours. With this AI-assisted approach, the streaming service saw faster issue resolution, fewer noisy alerts, and predictive scoring for deployments, resulting in a significant reduction in recovery time. It also let them move from a defensive strategy of adding monitoring and tooling to a proactive one that aims to prevent issues before they arise by analyzing past incidents and identifying potential risks.
Personnel: Johnny Halife