Jan 23, 2025

The Unofficial Guide to AI at KubeCon EU 2025

Here's my unofficial guide to exploring everything AI at KubeCon London 2025!

Wilson Spearman

Co-founder

AI at KubeCon 2025

KubeCon Europe 2025 is coming up soon, this year in London! My biggest takeaway from last year’s KubeCon is that planning is key: there are so many talks and events. To help you decide how best to spend your time, I’ve curated a list of the talks I’m most excited about. The list focuses on the intersection of AI and cloud native tech, which comes down to two categories: Kubernetes for AI (think HPC orchestration, scaling models and training, etc.) and AI for Kubernetes (think LLMs for incident response, cluster management, etc.).

Parity is also excited to have a big presence at KubeCon this year! We’ll have a booth in the Solutions Hall, coffee chats throughout the week, and an invite-only dinner (details to come).

AI for Kubernetes

Superpowers for Humans of Kubernetes: How K8sGPT Is Transforming Enterprise Ops

🚨 This is my top recommended event! Alex is one of the creators of k8sGPT and has a great perspective on leveraging AI in k8s 🚨

Time: Wednesday April 2, 2025 14:30 - 15:00 BST
Location: Level 1 | Hall Entrance S10 | Room C
Speaker: Alex Jones, AWS & Anais Urlichs, JP Morgan Chase

Description: Humans cannot scale like software, and our ability to diagnose and triage is finite. Imagine the burden of operating dozens of tenants across multiple clusters. It’s going to take a team; no lone hero can keep the lights on and the customers happy.

Until now.

Register for the k8sGPT talk here!

Keynote: AI Enabled Observability Explainers - We Actually Did Something With AI!

Time: Wednesday April 2, 2025 09:48 - 10:03 BST
Location: Level 0 | ICC Auditorium
Speaker: Vijay Samuel, Principal MTS, Architect, eBay

Description: If folks think that this will be yet another hand-wavy AI talk, prepare to be disappointed! Over the last few quarters, the Observability platform team at eBay has embarked on the journey of building "Explainers" for telemetry signals. "So, you are just shoving data into an LLM, big deal!" - one might say. The approach that we took was slightly different. Yes, an LLM does know how to interpret an OTEL trace waterfall, but does it do it predictably? No! For various reasons. This is where AI and Engineering have a beautiful marriage. For each signal, we have carefully married crafty algorithms and LLMs to create more predictable and accurate AI-enabled experiences, some of which include explaining traces, metrics and logs.

Learn more about this keynote here

AI Beyond Autocomplete: Using LLMs To Create 1000 Kubernetes Controllers

Time: Friday April 4, 2025 15:15 - 15:45 BST
Location: Level 0 | ICC Capital Hall | Room 1
Speaker: Justin Santa Barbara & Walter Fender, Google

Description: LLMs can generate React apps, poems, and even music. But can they rise to the ultimate challenge: writing reliable Kubernetes controllers? The Config Connector team says "yes!" We are successfully using AI to write production controllers for a thousand Google Cloud resources.

Sign up to attend here

Autonomous AI Agents in Production: Slashing Cloud Cost Root Cause Analysis From Hours To Minutes

Time: Thursday April 3, 2025 15:00 - 15:30 BST
Location: Level 1 | Hall Entrance S10 | Room B
Speaker: Ilya Lyamkin, Spotify

Description: As cloud infrastructures scale, traditional cost monitoring struggles to identify root causes of spending anomalies. This technical deep-dive shows how autonomous AI agents transformed our cost observability pipeline, reducing root cause analysis time from hours to under 5 minutes. We'll examine the agent architecture including deployment patterns, distributed cost tracing, and automated analysis workflows. Learn how we engineered AI agents to correlate cost signals across cloud services, implemented real-time pattern recognition with ML models, and built resilient feedback loops. Through production examples, we'll share our journey from manual investigation to automated root cause identification, including challenges in scaling agent intelligence. Attendees will gain practical insights into building their own AI-powered cost analysis system that scales with their infrastructure.

Register for Ilya's talk here


Kubernetes for AI

Keynote: LLM-Aware Load Balancing in Kubernetes: A New Era of Efficiency

Time: Friday April 4, 2025 09:06 - 09:21 BST
Location: Level 0 | ICC Auditorium
Speaker: Clayton Coleman, Distinguished Engineer, Google & Jiaxin Shan, Software Engineer, Bytedance

Description: Traditional load balancing approaches, including round robin or those relying on metrics like QPS, are often ineffective when applied to LLM serving. LLM requests vary significantly in computational demands due to prompt length, model differences and their autoregressive nature, leading to unpredictable request running times. Moreover, the emergence of model multiplexing techniques (e.g., LoRA) introduces new complexities that necessitate LLM-aware load balancing strategies. In this talk, we introduce a new set of Kubernetes APIs for routing to LLM workloads that allow configuration of serving objectives and priorities for each use case. These APIs integrate seamlessly with Gateway API, and an included extension means that support for these APIs can easily be plugged into many Gateway API implementations to enable turnkey LLM routing support.

Learn more about this keynote here
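
To make the abstract's point about round robin a bit more concrete, here's a toy sketch of my own. It is not the API from the talk. It assumes a made-up cost model (prompt tokens plus requested output tokens) and hypothetical names like `Replica` and `estimated_cost`, and it simply compares round robin with a "send it to the least-loaded replica" policy when request sizes vary wildly.

```python
from dataclasses import dataclass
from itertools import cycle


@dataclass
class Replica:
    """A hypothetical model server; pending_tokens is a rough proxy for queued work."""
    name: str
    pending_tokens: int = 0


def estimated_cost(prompt_tokens: int, max_new_tokens: int) -> int:
    # Assumed cost model for illustration only: prefill tokens + decode tokens.
    return prompt_tokens + max_new_tokens


def route_round_robin(requests, replicas):
    """Assign requests to replicas in strict rotation, ignoring request size."""
    rotation = cycle(replicas)
    for prompt_tokens, max_new_tokens in requests:
        replica = next(rotation)
        replica.pending_tokens += estimated_cost(prompt_tokens, max_new_tokens)


def route_least_loaded(requests, replicas):
    """Assign each request to the replica with the least estimated pending work."""
    for prompt_tokens, max_new_tokens in requests:
        replica = min(replicas, key=lambda r: r.pending_tokens)
        replica.pending_tokens += estimated_cost(prompt_tokens, max_new_tokens)


def max_load(replicas):
    """Return the pending work on the busiest replica."""
    return max(r.pending_tokens for r in replicas)


# Alternating large and tiny requests: (prompt_tokens, max_new_tokens).
REQUESTS = [(4000, 500), (50, 20), (3000, 800), (40, 10), (3500, 600), (60, 30)]

if __name__ == "__main__":
    rr = [Replica("a"), Replica("b")]
    route_round_robin(REQUESTS, rr)
    print("round robin, busiest replica:", max_load(rr))    # 12400

    ll = [Replica("a"), Replica("b")]
    route_least_loaded(REQUESTS, ll)
    print("least loaded, busiest replica:", max_load(ll))   # 8020
```

With this (deliberately unlucky) request pattern, round robin piles every large request onto one replica, while the size-aware policy spreads the work more evenly. Real LLM routing has far more to consider, such as KV cache state and LoRA adapters, which is what the keynote covers.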

Deep Dive To AI Agent Observability

Time: Wednesday April 2, 2025 15:15 - 15:45 BST
Location: Level 1 | Hall Entrance N10 | Room E
Speaker: Guangya Liu, IBM & Karthik Kalyanaraman, Langtrace AI

Description: OpenTelemetry has emerged as a powerful framework for observability in cloud-native applications, but how does it apply to the intricate needs of AI Agent observability? This session explores the journey of leveraging OpenTelemetry to monitor, trace, and analyze AI Agents. We’ll cover key challenges such as capturing metrics for multi-agent systems, tracing inference workflows, and correlating AI-specific data like model performance and decision latency.

Sign up to attend here

Asimov's Zeroth Law of Robotics: Observability for AI

Time: Wednesday April 2, 2025 16:15 - 16:45 BST
Location: Level 1 | Hall Entrance N10 | Room E
Speaker: Nicole van der Hoeven, Senior Developer Advocate, Grafana Labs

Description: A robot may not harm humans. A robot must obey humans. A robot must protect its own existence. These are Isaac Asimov's three Laws of Robotics, created to govern the ethical programming of artificial intelligences. From the Butlerian Jihad to Skynet to Cylons, we've been immortalizing our collective nightmares about artificial intelligence for years. But there's an unmentioned law that comes as a prerequisite to all of that: a robot must be observable.

Register for Nicole's talk here

Keynote: Into the Black Box: Observability in the Age of LLMs

Time: Wednesday April 2, 2025 09:26 - 09:41 BST
Location: Level 0 | ICC Auditorium
Speaker: Christine Yen, CEO/Cofounder, Honeycomb

Description: LLMs can provide a quick injection of magic into an existing product (or product concept)! Most of us looking to build on LLMs aren't ML engineers or AI experts, after all, and this new wave of LLM offerings makes it easy for any of us to build something delightful.

But once that product or feature is shipped, in production, in front of users, the problems all collapse back into something that feels awfully familiar: performance challenges, questionable accuracy, and unhappy or confused users.

This talk will assert that building on LLMs is just like building on top of any other sort of black box in our architecture (APIs, DBs, etc.); this one just happens to be inherently unpredictable and probabilistic.

Learn more about this keynote here

Advancements in AI/ML Inference Workloads on Kubernetes

Time: Wednesday April 2, 2025 11:15 - 11:45 BST
Location: Level 3 | ICC Capital Suite 7-9
Speaker: Yuan Tang, Principal Software Engineer, Red Hat & Eduardo Arango Gutierez DE, Senior Systems Software Engineer, NVIDIA

Description: The emergence of Generative AI (GenAI) has introduced new challenges and demands in AI/ML inference, necessitating advanced solutions for efficient serving infrastructures. The Kubernetes Working Group Serving (WG Serving) is dedicated to enhancing serving workloads on K8s, especially for hardware-accelerated AI/ML inference. This group prioritizes compute-intensive inference scenarios using specialized accelerators, benefiting various serving workloads such as web services and stateful databases.

Sign up to attend here

Cloud Native + Kubernetes AI Day

There is a dedicated co-located event for AI in the Kubernetes and cloud native ecosystem this year! It requires the all-access pass and takes place on April 1, the day before KubeCon starts.

Parity at KubeCon

We're excited to be hosting a booth in KubeCon's Solutions Hall this year. Come by to say hello and pick up some Parity swag. We'll also be hosting coffee chats and a dinner this year; details to come!

Revolutionize Your Incident Response

Transform your on-call experience with Parity's AI SRE. Parity works alongside your engineers to resolve incidents.

Subscribe

2025 • Parity • San Francisco
