Welcome to the session program for OSA CON 2025.

Tuesday, November 4th, 2025

Opening remarks
11/04/2025 4:00 PM - 4:15 PM UTC

Break
11/04/2025 5:20 PM - 5:30 PM UTC

Don't Fire Your Developers and Other Lessons for the AI Revolution
By Robert Hodges
11/04/2025 4:20 PM - 4:45 PM UTC

AI is going to alter the world. Will it lead to a golden age, human extinction, or just one more technical advance? History tells us that the winners in the race will be those who best merge human capabilities with the power of AI. We’ll look at how this is already playing out in analytics and open source software. Some of the changes will surprise you.

https://us.airmeet.com/e/69f1f9b0-2f11-11ef-82f4-1d5f1667121e
Panel: Sustaining Open-Source Success
By Peculiar C Umeh, Anastasiia Zvenigorodskaia & Avi Press
11/04/2025 4:50 PM - 5:25 PM UTC

Open source thrives on passion — but it also takes more. This panel brings together leaders from across the ecosystem to explore how open-source projects can stay healthy, grow contributors, and even turn sustainability into profitability. From practical frameworks like the CHAOSS Practitioner Guides to lessons in building successful businesses around open code and data-driven insights from massive usage analytics, our panelists will share real-world tactics for keeping open source vibrant for the long term.

https://us.airmeet.com/e/69f1f9b0-2f11-11ef-82f4-1d5f1667121e
OLAP in your App: Integrating realtime & agentic analytics into your app
By Chris Crane
11/04/2025 5:30 PM - 6:00 PM UTC

Modern analytics experiences demand “conversation-fast” backends—systems that serve the requests of an agent or LLM in real time at the speed of a natural conversation.

In this talk, we’ll get deep into an open-source reference architecture for powering conversational AI and real-time analytics in user-facing applications. We’ll get hands-on in the code, and explore practical patterns for integrating streaming and analytical infrastructure into your web application, including AI chat systems.

We’ll focus on a reference architecture that integrates a transactional engine (Postgres) and an analytical engine (ClickHouse®), along with Redpanda for streaming, a React web app frontend, and MCP interfaces for enabling LLM chat within the app.

The whole analytics stack is defined in code using the MooseStack open source framework. https://docs.fiveonefour.com/moose

This architecture ensures the low-latency responsiveness users expect with analytics visualizations and conversational AI, and a modern, local-first developer experience that integrates naturally into your stack.

https://us.airmeet.com/e/69f1f9b0-2f11-11ef-82f4-1d5f1667121e
Streaming Analytics in Action: Real-World Case Studies from Uber, Razorpay, and Stripe
By Jayesh Asrani
11/04/2025 5:30 PM - 6:00 PM UTC

Discover the transformative power of streaming analytics featuring groundbreaking case studies from some of the most innovative companies in the world. Explore how Uber, Razorpay, and Stripe leverage next-gen streaming architectures to power their real-time decision-making, improve user experiences, and drive operational excellence. These case studies will offer a rare glimpse into the advanced technologies and strategies behind these leading-edge systems, showcasing real-world applications of streaming analytics that are as inspiring as they are practical.

Learn how these organisations tackle the complexities of real-time data processing to unlock speed, scale, and insight—and leave with actionable ideas to jumpstart your own journey into the world of streaming analytics. We will share reference architectures and the resulting technical and business outcomes.

https://us.airmeet.com/e/69f1f9b0-2f11-11ef-82f4-1d5f1667121e
Agnostic is the Only Constant: Embracing the Lakehouse Paradigm Without Lock-In
By Viktor Kessler
11/04/2025 6:00 PM - 6:30 PM UTC

As the Lakehouse paradigm rises in popularity, so does the risk of being locked into a single vendor’s ecosystem. But what if you could have all the benefits of a unified architecture—without giving up control?

In this session, we introduce Lakekeeper, an open-source Apache Iceberg catalog that makes it possible to build Lakehouse architectures that are truly portable: across clouds, compute engines, and storage layers.

This talk speaks directly to data professionals looking to stay ahead of the curve by exploring:

How open formats like Iceberg and open catalogs like Lakekeeper can break down silos

How to build cloud-neutral, compute-agnostic analytics pipelines

Why metadata and catalogs are the new control plane for governance and orchestration

Real-world examples of Lakehouse deployments using Trino, Spark, and DuckDB

A vision for future-proofed architectures built entirely on open standards

Whether you’re modernizing your stack or starting fresh, this session offers practical insight and fresh ideas for staying flexible, scalable, and free from lock-in — while staying fully open-source.

https://us.airmeet.com/e/69f1f9b0-2f11-11ef-82f4-1d5f1667121e
Open Analytics in Action
By sri Rama Satya Prasanth
11/04/2025 6:00 PM - 6:30 PM UTC

This session explores how open-source analytics technologies are transforming the public sector through the lens of Electronic Income Verification (EIV) systems—platforms that process over 850,000 real-time verifications daily, integrate 40+ data sources, and maintain 99.95% uptime to support equitable, efficient public benefit delivery.

We’ll dive into the open-source stack behind these systems: event streaming with Apache Kafka, data orchestration with Airflow, analytics with Apache Superset and DuckDB, and ML-powered fraud detection using tools like scikit-learn and Hugging Face NLP. You’ll learn how public agencies are building scalable, secure, and cost-effective solutions by leveraging community-driven technologies and standards.

Topics include building modular data pipelines, real-time dashboards, anomaly detection, and managing governance and compliance (GDPR, HIPAA) in open environments. The talk will also highlight DevOps practices such as IaC, GitOps, and monitoring with Prometheus and Grafana to maintain visibility, security, and auditability in high-trust systems.

Ideal for data engineers, open-source practitioners, and civic tech innovators, this session offers a real-world case study on how open analytics infrastructure can power large-scale, high-impact digital public services.

https://us.airmeet.com/e/69f1f9b0-2f11-11ef-82f4-1d5f1667121e
From Custom-Facing to Agent-Facing: Empowering Real-Time Analytics by Apache Doris
By Mingyu Chen
11/04/2025 6:30 PM - 7:00 PM UTC

In this talk, I will introduce how Apache Doris, a real-time analytical database, extends from customer-facing business scenarios to agent-facing ones.

I will cover the technical details behind high-concurrency, low-latency query analytics, as well as capabilities supporting AI scenarios such as hybrid search, agent observability, and collaboration between Doris MCP Server and large language models (LLMs).

This will help the audience understand how Doris empowers enterprises to perform real-time data exploration in the AI era.

https://us.airmeet.com/e/69f1f9b0-2f11-11ef-82f4-1d5f1667121e
Open Source Healthcare Analytics: 20% ICU Transfer Reduction at Scale
By Rakshit Khare
11/04/2025 6:30 PM - 7:00 PM UTC

This presentation showcases an open source healthcare analytics platform that reduced ICU transfers by 20% through real-time patient risk prediction. Built entirely with open source technologies, the system demonstrates how healthcare organizations can leverage community-driven tools to achieve clinical impact without vendor lock-in.

The architecture combines Apache Kafka for real-time EMR streaming, Apache Spark for ML model training, and PostgreSQL with TimescaleDB for time-series clinical data. Docker containerization ensures reproducible deployments across environments, while Kubernetes orchestrates auto-scaling during patient admission surges. The ML pipeline uses scikit-learn and XGBoost models trained on anonymized historical cohorts, with MLflow tracking experiments and model versioning.

Key open source components include Apache Airflow for workflow orchestration, Grafana for clinical dashboards, and Apache Superset for analytics visualization. The platform implements FHIR standards through the HAPI FHIR server, ensuring interoperability with existing hospital systems.

Critical lessons learned include: designing privacy-preserving analytics with differential privacy libraries, implementing federated learning across hospital networks, and maintaining sub-second latency for critical alerts using Redis caching. The session covers practical deployment strategies, cost optimization techniques, and governance frameworks for open source healthcare analytics.

Attendees will learn to build production-ready healthcare analytics platforms using exclusively open source tools, complete with code examples, architecture patterns, and regulatory compliance strategies that deliver measurable patient outcomes.

https://us.airmeet.com/e/69f1f9b0-2f11-11ef-82f4-1d5f1667121e
AI Your OTel
By Mya Jaye
11/04/2025 7:00 PM - 7:10 PM UTC

New employee onboarding often involves navigating a sea of information, which can delay full productivity. This session will explore how AI can personalize information discovery, helping new hires integrate more quickly and engage effectively.

We’ll detail an architecture that uses OpenTelemetry to transmit metrics and Google’s Model Context Protocol (MCP) Toolbox for Databases to connect AI agents with a high-performance ClickHouse® data lake. This setup allows for dynamic, real-time access to relevant company knowledge.

Discover how to visualize these queries using modern dashboards like Grafana or Perses, creating an AI-enhanced system. The talk will cover strategies to improve new hire time-to-value and foster a more data-informed organizational culture.

https://us.airmeet.com/e/69f1f9b0-2f11-11ef-82f4-1d5f1667121e
Building Open Source AI Compliance Analytics: From Data Processing to Regulatory Intelligence
By Anvesh Reddy
11/04/2025 7:00 PM - 7:10 PM UTC

The regulatory compliance landscape generates massive datasets—over 300 million pages of regulatory documents globally with 200+ daily changes—creating both analytical challenges and opportunities for the open source community. This presentation demonstrates how open source analytics tools and frameworks can be leveraged to build sophisticated regulatory compliance systems that rival proprietary enterprise solutions.

This session showcases practical implementations using open source technologies including Apache Spark for large-scale document processing, Elasticsearch for regulatory search and retrieval, Apache Airflow for compliance workflow orchestration, and Hugging Face transformers for natural language understanding. We’ll explore how organizations across financial services, healthcare, and energy sectors have built cost-effective compliance analytics platforms using entirely open source stacks.

The presentation includes hands-on demonstrations of processing regulatory datasets, implementing change detection algorithms, and building interpretable ML models for compliance risk assessment. Attendees will see real architectures processing documents from regulatory bodies like CFTC and ESMA, with techniques for handling multi-jurisdictional requirements and maintaining audit trails using tools like Apache Kafka and PostgreSQL.

Key technical topics include distributed text processing strategies, implementing semantic search for regulatory content, building alerting systems for regulatory changes, and creating compliance dashboards using open source visualization tools like Apache Superset and Grafana. We’ll address common challenges including data quality, model interpretability requirements, and scaling analytics workloads for enterprise compliance needs.

The session emphasizes community-driven approaches to regulatory analytics, including contributing to open source compliance frameworks, sharing anonymized datasets for research, and collaborative development of regulatory parsing libraries. Practical takeaways include reference architectures, code samples, and deployment strategies that attendees can immediately apply in their organizations.

As regulatory technology spending approaches $204 billion by 2026, this presentation demonstrates how open source analytics can democratize advanced compliance capabilities, making sophisticated regulatory intelligence accessible to organizations of all sizes while fostering innovation through community collaboration.

https://us.airmeet.com/e/69f1f9b0-2f11-11ef-82f4-1d5f1667121e
Smarter Analytics: AI-Driven Intelligence in Modern Databases
By Peter Zaitsev
11/04/2025 7:15 PM - 7:45 PM UTC

What happens when databases don’t just store data—but help analyze it intelligently? This talk explores emerging trends in AI-powered database intelligence, from schema optimization and real-time query tuning to transforming unstructured content using techniques like sentiment analysis and entity recognition. We’ll also dive into the future of self-driving databases and multi-modal analytics that integrate text, images, and more. Attendees will leave with a forward-looking view of how AI is reshaping database engines into context-aware, insight-generating platforms—laying the foundation for the next generation of open-source analytics systems.

https://us.airmeet.com/e/69f1f9b0-2f11-11ef-82f4-1d5f1667121e
What the Spec?!: New Features in Apache Iceberg™ Table Format V3
By Russell Spitzer
11/04/2025 7:15 PM - 7:45 PM UTC

Apache Iceberg™ made great advancements going from Table Format V1 to Table Format V2, introducing features like position deletes, advanced metrics, and cleaner metadata abstractions. But with Table Format V3 on the horizon, Iceberg users have even more to look forward to.

In this session, we’ll explore some of the exciting new user-facing features that V3 Iceberg is about to introduce and see how they’ll make working with Open Data Formats easier than ever! We’ll go through the high-level details of the new functionality that will be available in V3. Then we’ll dive deep into some of the most impactful features. You’ll learn what Variant types have to offer your semi-structured data, how Row Lineage can enhance CDC capabilities, and more.

The community has come together to build yet another great release of the Iceberg spec, so attend and learn about all of the changes coming and how you can take advantage of them in your teams.

https://us.airmeet.com/e/69f1f9b0-2f11-11ef-82f4-1d5f1667121e
AI-Native Analytics: Building the Next Generation of Intelligent, Conversational Decision Systems
By Rajesh Sura
11/04/2025 7:45 PM - 8:15 PM UTC

As enterprises grapple with an explosion of data and increasing pressure to make rapid, informed decisions, traditional Business Intelligence (BI) tools are reaching their limits. Static dashboards and complex query interfaces often exclude non-technical users, creating friction between data and action. Enter AI-native analytics—a transformative approach that integrates natural language interfaces (NLIs) with scalable machine learning (ML) to deliver intelligent, conversational decision systems.

This keynote explores how organizations can reimagine their analytics infrastructure by embedding AI into the very fabric of user interaction. Drawing on real-world implementations and cutting-edge research, we’ll unpack the architectural foundations needed to operationalize NLIs at scale—spanning natural language understanding (NLU), data context alignment, governance, and high-performance compute. We’ll address the core challenges of conversational systems in the enterprise: query ambiguity, semantic grounding, explainability, and scalability under dynamic workloads.

With AI-driven interfaces, business users can shift from passively consuming reports to actively engaging with data through dialogue—unlocking faster insight discovery and empowering decision-making at all levels. Attendees will leave with a strategic framework for building next-generation analytics platforms that are intelligent, adaptive, and truly human-centric.

https://us.airmeet.com/e/69f1f9b0-2f11-11ef-82f4-1d5f1667121e
Open Source Event-Driven Analytics for Real-Time Retail Inventory Management
By Nidhin Jose
11/04/2025 7:45 PM - 8:15 PM UTC

The retail sector’s shift toward omnichannel fulfillment and instant availability demands has exposed critical limitations in traditional batch-processed inventory systems. This presentation demonstrates how open source Event-Driven Architecture (EDA) tools are transforming retail inventory analytics, enabling continuous real-time processing that delivers superior accuracy, automated insights, and scalable supply chain responsiveness.

Open source analytics platforms are proving their value in retail operations, with adopters experiencing up to 30% fewer stockouts and 15% improved inventory accuracy compared to proprietary batch systems. Apache Kafka serves as the foundational streaming platform, processing millions of inventory events per second during peak periods, while Apache Flink provides the real-time analytics engine for instant inventory calculations and decision automation. These open source solutions dramatically improve customer experience while reducing operational costs and optimizing stock levels.

The integration of open source machine learning frameworks with EDA further enhances analytical capabilities, enabling demand forecasting with 20% greater accuracy than traditional methods. By combining Python-based ML libraries, real-time streaming analytics, and automated inventory algorithms, retailers can align their strategies with live demand signals, reducing waste and improving profitability through data-driven decision making.

Transitioning to open source event-driven inventory analytics presents implementation challenges including legacy system integration, data quality assurance, and organizational change management. However, the open source ecosystem provides cost-effective, flexible solutions for overcoming these obstacles while maintaining full control over analytical processes and data.

This session will explore proven open source architectures, demonstrate measurable impacts on retail supply chains through live analytics dashboards, and outline how open source event-driven systems enable more intelligent, sustainable retail operations. Attendees will gain practical insights into implementing these solutions and leveraging the broader open source analytics ecosystem for competitive advantage.

https://us.airmeet.com/e/69f1f9b0-2f11-11ef-82f4-1d5f1667121e
Apache Superset Extensions - Taking Open Source BI to the Next Level
By Evan Rusackas, Michael Molina & Ville Brofeldt
11/04/2025 8:15 PM - 8:45 PM UTC

Apache Superset has always been leading the charge on open-source BI, but now it’s getting ready to truly take over the BI world. Learn all about Superset’s new extensions architecture that will allow users and developers to more rapidly expand and improve the product’s capabilities, while simplifying life for both developers and maintainers.

https://us.airmeet.com/e/69f1f9b0-2f11-11ef-82f4-1d5f1667121e
RAG Without the Hassle: Building AI-Powered Analytics Apps on OceanBase Vector Database
By Peng Wang
11/04/2025 8:15 PM - 8:45 PM UTC

Retrieval-Augmented Generation (RAG) is transforming analytics applications — but implementing it often means managing multiple systems: OLTP, vector DBs, and orchestration tools.

In this session, we’ll show how OceanBase simplifies this stack by supporting both structured and vector data natively, enabling developers to build real-time RAG pipelines using just one open-source database.

We’ll walk through a working demo that combines OceanBase with OpenAI and popular Python frameworks like LangChain, demonstrating how to perform vector search and retrieval directly using SQL.

Unlike traditional setups that require combining a relational database and a separate vector database, OceanBase handles both transactional queries and semantic search in a single engine, with consistency, availability, and simplicity.

https://us.airmeet.com/e/69f1f9b0-2f11-11ef-82f4-1d5f1667121e
2025-11-04T16:00:00.000Z - 2025-11-04T16:15:00.000Z.
Opening remarks
2025-11-04T17:20:00.000Z - 2025-11-04T17:30:00.000Z.
Break
2025-11-04T16:20:00.000Z - 2025-11-04 16:45:00 +0000 UTC.
By Robert Hodges
AI is going to alter the world. Will it lead to a golden age, human extinction, or just one more technical advance? History tells us that the winners in the race will be those who best merge human capabilities with the power of AI. We’ll look at how this is already playing out in analytics and open source software. Some of the changes will surprise you.
2025-11-04T16:50:00.000Z - 2025-11-04 17:25:00 +0000 UTC.
By Peculiar C Umeh, Anastasiia Zvenigorodskaia & Avi Press
Open source thrives on passion — but it also takes more. This panel brings together leaders from across the ecosystem to explore how open-source projects can stay healthy, grow contributors, and even turn sustainability into profitability. From practical frameworks like the CHAOSS Practitioner Guides to lessons in building successful businesses around open code and data-driven insights from massive usage analytics, our panelists will share real-world tactics for keeping open source vibrant for the long term.
2025-11-04T17:30:00.000Z - 2025-11-04 18:00:00 +0000 UTC.
By Chris Crane
Modern analytics experiences demand “conversation-fast” backends—systems that serve the requests of an agent or LLM in real time at the speed of a natural conversation. In this talk, we’ll get deep into an open-source reference architecture for powering conversational AI and real-time analytics in user-facing applications. We’ll get hands-on in the code, and explore practical patterns for integrating streaming and analytical infrastructure into your web application, including AI chat systems.
2025-11-04T17:30:00.000Z - 2025-11-04T18:00:00.000Z.
By Jayesh Asrani
Discover the transformative power of streaming analytics featuring groundbreaking case studies from some of the most innovative companies in the world. Explore how Uber, Razorpay, and Stripe leverage next-gen streaming architectures to power their real-time decision-making, improve user experiences, and drive operational excellence. These case studies will offer a rare glimpse into the advanced technologies and strategies behind these leading-edge systems, showcasing real-world applications of streaming analytics that are as inspiring as they are practical.
2025-11-04T18:00:00.000Z - 2025-11-04T18:30:00.000Z.
By Sri Rama Satya Prasanth
This session explores how open-source analytics technologies are transforming the public sector through the lens of Electronic Income Verification (EIV) systems—platforms that process over 850,000 real-time verifications daily, integrate 40+ data sources, and maintain 99.95% uptime to support equitable, efficient public benefit delivery. We’ll dive into the open-source stack behind these systems: event streaming with Apache Kafka, data orchestration with Airflow, analytics with Apache Superset and DuckDB, and ML-powered fraud detection using tools like scikit-learn and Hugging Face NLP.
2025-11-04T18:00:00.000Z - 2025-11-04T18:30:00.000Z.
By Viktor Kessler
As the Lakehouse paradigm rises in popularity, so does the risk of being locked into a single vendor’s ecosystem. But what if you could have all the benefits of a unified architecture—without giving up control? In this session, we introduce Lakekeeper, an open-source Apache Iceberg catalog that makes it possible to build Lakehouse architectures that are truly portable: across clouds, compute engines, and storage layers. This talk speaks directly to data professionals looking to stay ahead of the curve by exploring:
2025-11-04T18:30:00.000Z - 2025-11-04T19:00:00.000Z.
By Rakshit Khare
This presentation showcases an open source healthcare analytics platform that reduced ICU transfers by 20% through real-time patient risk prediction. Built entirely with open source technologies, the system demonstrates how healthcare organizations can leverage community-driven tools to achieve clinical impact without vendor lock-in. The architecture combines Apache Kafka for real-time EMR streaming, Apache Spark for ML model training, and PostgreSQL with TimescaleDB for time-series clinical data. Docker containerization ensures reproducible deployments across environments, while Kubernetes orchestrates auto-scaling during patient admission surges.
2025-11-04T18:30:00.000Z - 2025-11-04T19:00:00.000Z.
By Mingyu Chen
In this talk, I will explain how Apache Doris, as a real-time analytical database, extends from customer-facing business scenarios to agent-facing ones. I will cover the technical details behind high-concurrency, low-latency query analytics, as well as capabilities supporting AI scenarios such as hybrid search, agent observability, and collaboration between Doris MCP Server and large language models (LLMs). This will help the audience understand how Doris empowers enterprises to perform real-time data exploration in the AI era.
2025-11-04T19:00:00.000Z - 2025-11-04T19:10:00.000Z.
By Anvesh Reddy
The regulatory compliance landscape generates massive datasets—over 300 million pages of regulatory documents globally with 200+ daily changes—creating both analytical challenges and opportunities for the open source community. This presentation demonstrates how open source analytics tools and frameworks can be leveraged to build sophisticated regulatory compliance systems that rival proprietary enterprise solutions. This session showcases practical implementations using open source technologies including Apache Spark for large-scale document processing, Elasticsearch for regulatory search and retrieval, Apache Airflow for compliance workflow orchestration, and Hugging Face transformers for natural language understanding.
2025-11-04T19:00:00.000Z - 2025-11-04T19:10:00.000Z.
By Mya Jaye
New employee onboarding often involves navigating a sea of information, which can delay full productivity. This session will explore how AI can personalize information discovery, helping new hires integrate more quickly and engage effectively. We’ll detail an architecture that uses OpenTelemetry to transmit metrics and Google’s Model Context Protocol (MCP) Toolbox for Databases to connect AI agents with a high-performance ClickHouse® data lake. This setup allows for dynamic, real-time access to relevant company knowledge.
2025-11-04T19:15:00.000Z - 2025-11-04T19:45:00.000Z.
By Peter Zaitsev
What happens when databases don’t just store data—but help analyze it intelligently? This talk explores emerging trends in AI-powered database intelligence, from schema optimization and real-time query tuning to transforming unstructured content using techniques like sentiment analysis and entity recognition. We’ll also dive into the future of self-driving databases and multi-modal analytics that integrate text, images, and more. Attendees will leave with a forward-looking view of how AI is reshaping database engines into context-aware, insight-generating platforms—laying the foundation for the next generation of open-source analytics systems.
2025-11-04T19:15:00.000Z - 2025-11-04T19:45:00.000Z.
By Russell Spitzer
Apache Iceberg™ made great advancements going from Table Format V1 to Table Format V2, introducing features like position deletes, advanced metrics, and cleaner metadata abstractions. But with Table Format V3 on the horizon, Iceberg users have even more to look forward to. In this session, we’ll explore some of the exciting new user-facing features that V3 Iceberg is about to introduce and see how they’ll make working with Open Data Formats easier than ever!
2025-11-04T19:45:00.000Z - 2025-11-04T20:15:00.000Z.
By Rajesh Sura
As enterprises grapple with an explosion of data and increasing pressure to make rapid, informed decisions, traditional Business Intelligence (BI) tools are reaching their limits. Static dashboards and complex query interfaces often exclude non-technical users, creating friction between data and action. Enter AI-native analytics—a transformative approach that integrates natural language interfaces (NLIs) with scalable machine learning (ML) to deliver intelligent, conversational decision systems. This keynote explores how organizations can reimagine their analytics infrastructure by embedding AI into the very fabric of user interaction.
2025-11-04T19:45:00.000Z - 2025-11-04T20:15:00.000Z.
By Nidhin Jose
The retail sector’s shift toward omnichannel fulfillment and instant availability demands has exposed critical limitations in traditional batch-processed inventory systems. This presentation demonstrates how open source Event-Driven Architecture (EDA) tools are transforming retail inventory analytics, enabling continuous real-time processing that delivers superior accuracy, automated insights, and scalable supply chain responsiveness. Open source analytics platforms are proving their value in retail operations, with adopters experiencing up to 30% fewer stockouts and 15% improved inventory accuracy compared to proprietary batch systems.
2025-11-04T20:15:00.000Z - 2025-11-04T20:45:00.000Z.
By Peng Wang
Retrieval-Augmented Generation (RAG) is transforming analytics applications — but implementing it often means managing multiple systems: OLTP, vector DBs, and orchestration tools. In this session, we’ll show how OceanBase simplifies this stack by supporting both structured and vector data natively, enabling developers to build real-time RAG pipelines using just one open-source database. We’ll walk through a working demo that combines OceanBase with OpenAI and popular Python frameworks like LangChain, demonstrating how to perform vector search and retrieval directly using SQL.
2025-11-04T20:15:00.000Z - 2025-11-04T20:45:00.000Z.
By Evan Rusackas, Michael Molina & Ville Brofeldt
Apache Superset has always been leading the charge on open-source BI, but now it’s getting ready to truly take over the BI world. Learn all about Superset’s new extensions architecture that will allow users and developers to more rapidly expand and improve the product’s capabilities, while simplifying life for both developers and maintainers.

Wednesday, November 5th, 2025

2025-11-05T16:00:00.000Z
2025-11-05T16:35:00.000Z
2025-11-05T17:15:00.000Z
2025-11-05T17:30:00.000Z
2025-11-05T18:00:00.000Z
2025-11-05T18:30:00.000Z
2025-11-05T19:00:00.000Z
2025-11-05T19:30:00.000Z
2025-11-05T20:00:00.000Z
By Maxime Beauchemin
As AI transforms how we build, scale, and interact with software, one thing is becoming clear: open source isn’t just keeping up—it’s leading. In this keynote, Max Beauchemin, creator of Apache Superset and Apache Airflow, unpacks why open source is uniquely positioned to dominate in the age of AI. From training data to developer velocity, open projects have structural advantages that proprietary vendors simply can’t replicate. We’ll explore how LLMs “know” open source deeply, how AI-native workflows amplify OSS contributions, and why communities—not corporations—are becoming the new centers of gravity for software innovation.
11/05/2025 4:00 PM 11/05/2025 4:30 PM UTC OSACon: Open Source’s Massive Unfair Advantage in the AI Era Presented by Maxime Beauchemin.

As AI transforms how we build, scale, and interact with software, one thing is becoming clear: open source isn’t just keeping up—it’s leading. In this keynote, Max Beauchemin, creator of Apache Superset and Apache Airflow, unpacks why open source is uniquely positioned to dominate in the age of AI. From training data to developer velocity, open projects have structural advantages that proprietary vendors simply can’t replicate.

We’ll explore how LLMs “know” open source deeply, how AI-native workflows amplify OSS contributions, and why communities—not corporations—are becoming the new centers of gravity for software innovation. Whether you’re a maintainer, contributor, or startup founder, this talk will reframe how you think about OSS and help you ride the wave instead of getting swamped by it.

The future of software is AI-native and open by default. Let’s talk about what that really means—and how.

https://us.airmeet.com/e/69f1f9b0-2f11-11ef-82f4-1d5f1667121e
By Heather Meeker
Commercial Open Source Software (COSS) is a burgeoning business sector. This talk will focus on how the demand for COSS will be driven by the advance of LLM and AI.
11/05/2025 4:35 PM 11/05/2025 5:10 PM UTC OSACon: How Open Source Businesses Will Thrive in the Age of AI Presented by Heather Meeker.

Commercial Open Source Software (COSS) is a burgeoning business sector. This talk will focus on how the demand for COSS will be driven by the advance of LLM and AI.

https://us.airmeet.com/e/69f1f9b0-2f11-11ef-82f4-1d5f1667121e
By Lorenzo Mangani
Airport is a DuckDB community extension developed by Query.Farm that adds Apache Arrow Flight support to DuckDB. In plain terms: 👉 It lets DuckDB query, modify, and store data through Arrow Flight servers — remote, high-performance data endpoints that speak gRPC and Arrow IPC. You can think of it as “network tables for DuckDB.” Instead of reading files from disk, you can query live data over the network just like local tables.
11/05/2025 5:15 PM 11/05/2025 5:25 PM UTC OSACon: Making DuckDB with Arrow Flight and the Airport Extension Presented by Lorenzo Mangani.

Airport is a DuckDB community extension developed by Query.Farm that adds Apache Arrow Flight support to DuckDB.

In plain terms: 👉 It lets DuckDB query, modify, and store data through Arrow Flight servers — remote, high-performance data endpoints that speak gRPC and Arrow IPC.

You can think of it as “network tables for DuckDB.” Instead of reading files from disk, you can query live data over the network just like local tables.

https://us.airmeet.com/e/69f1f9b0-2f11-11ef-82f4-1d5f1667121e
By Josh Lee
Joseph Campbell’s story circle describes the journeys of epic heroes from the known into the unknown in search of rewards. Maybe it’s not so different from the Gartner Hype Cycle, which describes the journeys of innovations. As individual developers, we also go through a journey of discovery when adopting new tools and technologies.
11/05/2025 5:15 PM 11/05/2025 5:25 PM UTC OSACon: The Open Source Hero's Journey Presented by Josh Lee.

Joseph Campbell’s story circle describes the journeys of epic heroes from the known into the unknown in search of rewards. Maybe it’s not so different from the Gartner Hype Cycle, which describes the journeys of innovations. As individual developers, we also go through a journey of discovery when adopting new tools and technologies.

https://us.airmeet.com/e/69f1f9b0-2f11-11ef-82f4-1d5f1667121e
By Shivji Kumar Jha & Anurag Pandey
Distributed systems are full of surprises—and ClickHouse® is no exception. In this talk, Shivji (Nutanix) and Anurag (Incerto) share real-world war room stories from designing, testing, and running ClickHouse® across both cloud and on-prem environments, where debugging production issues often felt more like unraveling plot twists in a thriller than routine operations. We’ll walk through some of the toughest incidents we’ve faced: what broke, what we thought was wrong, what actually was wrong, and how we got to the root cause.
11/05/2025 5:30 PM 11/05/2025 6:00 PM UTC OSACon: ClickHouse® Chronicles: Real-World War Rooms with Human and AI Agents Presented by Shivji Kumar Jha & Anurag Pandey.

Distributed systems are full of surprises—and ClickHouse® is no exception. In this talk, Shivji (Nutanix) and Anurag (Incerto) share real-world war room stories from designing, testing, and running ClickHouse® across both cloud and on-prem environments, where debugging production issues often felt more like unraveling plot twists in a thriller than routine operations.

We’ll walk through some of the toughest incidents we’ve faced: what broke, what we thought was wrong, what actually was wrong, and how we got to the root cause. Along the way, we’ll introduce a practical framework for tackling such issues—combining human intuition, AI assistance, and the messy negotiations that often define real-world problem-solving.

Because in production, solutions aren’t always about perfect optimizations—they’re about trade-offs, context, and sometimes, knowing when to compromise.

Join us for a behind-the-scenes look into ClickHouse® in the wild—lessons learned, patterns observed, and a few fun battle scars.

https://us.airmeet.com/e/69f1f9b0-2f11-11ef-82f4-1d5f1667121e
By Christopher Bergh
Data teams continue to face long-standing challenges: their customers often distrust their results, data providers frequently ignore their existence, and teams spend more time firefighting than creating insights. The demand for AI just makes it more complicated: no wonder many data teams experience PTSD. The solution is simple: identify problems before they reach your customer. You need to implement data quality tests—lots of them. Check every table and column. See if anything is incorrect.
11/05/2025 5:30 PM 11/05/2025 6:00 PM UTC OSACon: Garbage Data = Garbage AI: An Open Source Data Quality Framework for Teams With No Time Presented by Christopher Bergh.

Data teams continue to face long-standing challenges: their customers often distrust their results, data providers frequently ignore their existence, and teams spend more time firefighting than creating insights. The demand for AI just makes it more complicated: no wonder many data teams experience PTSD.

The solution is simple: identify problems before they reach your customer. You need to implement data quality tests—lots of them. Check every table and column. See if anything is incorrect. Run these tests in production and incorporate them into the development process. Use the results to obtain data quality scores and implement improvements to your source systems.

Data teams work with hundreds or thousands of tables and often lack sufficient time to achieve data test coverage. That’s why, after decades of data engineering, we released an open-source tool that handles this for them.

DataKitchen’s open-source data quality and data observability tools aim to help data teams automatically generate 80% of the necessary data tests with just a few clicks, while providing an easy-to-use UI for collaborating on the remaining 20% of tests that are unique to their organization.
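
To make "check every table and column" concrete, here is a minimal sketch of one generated test, a NULL count per column. It is not DataKitchen's tool or its API; the `customers` table, its rows, and the `null_counts` helper are hypothetical, and SQLite stands in for a real warehouse.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Hypothetical table and rows, invented for illustration.
conn.execute("CREATE TABLE customers (id INTEGER, email TEXT)")
conn.executemany(
    "INSERT INTO customers VALUES (?, ?)",
    [(1, "a@example.com"), (2, None), (3, "c@example.com")],
)


def null_counts(conn):
    """Count NULLs in every column of every user table."""
    results = {}
    tables = [
        row[0]
        for row in conn.execute(
            "SELECT name FROM sqlite_master WHERE type = 'table'"
        )
    ]
    for table in tables:
        # PRAGMA table_info returns one row per column; index 1 is the name.
        columns = [
            row[1] for row in conn.execute(f'PRAGMA table_info("{table}")')
        ]
        for column in columns:
            count = conn.execute(
                f'SELECT COUNT(*) FROM "{table}" WHERE "{column}" IS NULL'
            ).fetchone()[0]
            results[(table, column)] = count
    return results


report = null_counts(conn)
# Keep only the columns that violate a "no NULLs" expectation.
failures = {key: n for key, n in report.items() if n > 0}
print(failures)
```

The same loop could emit other generic checks (uniqueness, value ranges, freshness), leaving only organization-specific rules to be written by hand.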

https://us.airmeet.com/e/69f1f9b0-2f11-11ef-82f4-1d5f1667121e
By Ray Paik & Fatih Degirmenci
Recently, several open source companies attracted a lot of attention after their announcements of license changes. Not surprisingly, these shifts sparked backlash from open source enthusiasts, prompting some to create community-driven forks under open source foundations. Now there is growing skepticism toward single-company-backed open source projects, with many arguing that open source projects should be run by neutral foundations to prevent future bait-and-switch tactics. But is foundation backing really the answer?
11/05/2025 6:00 PM 11/05/2025 6:30 PM UTC OSACon: Companies vs. Foundations: Who Should Steer Your Open Source Project? Presented by Ray Paik & Fatih Degirmenci.

Recently, several open source companies attracted a lot of attention after their announcements of license changes. Not surprisingly, these shifts sparked backlash from open source enthusiasts, prompting some to create community-driven forks under open source foundations.

Now there is growing skepticism toward single-company-backed open source projects, with many arguing that open source projects should be run by neutral foundations to prevent future bait-and-switch tactics. But is foundation backing really the answer?

Drawing on over a decade of experience in both open source foundations and companies, Fatih and Ray will compare foundation-backed and company-backed projects across key areas such as governance, roadmap planning, community, and funding. They’ll explore real-world examples of successful—and not-so-successful—projects in both models.

Finally, Fatih and Ray will discuss why funding models should be just one of several factors in assessing the long-term viability of open source projects. They’ll offer a holistic approach for evaluating open source projects, helping developers and decision-makers make informed choices about which projects to adopt, support, or contribute to.

https://us.airmeet.com/e/69f1f9b0-2f11-11ef-82f4-1d5f1667121e
By Udi Rot
After years of pushing ClickHouse® to its outer limits in real-world observability workloads, we’ve learned a lot, sometimes the hard way, about getting the most out of your analytics system. But before you dive into inverted indexes, object storage, and terabyte-scale performance tuning, it’s critical to get the basics right. This talk starts at the beginning, walking through the fundamentals that make ClickHouse® such a powerful engine for analytical workloads: the performance advantages of columnar storage, how its architecture supports horizontal scaling, and why it’s ideal for high-throughput, low-latency queries.
11/05/2025 6:00 PM 11/05/2025 6:30 PM UTC OSACon: Everything I Learned About ClickHouse® was From Real Workloads Presented by Udi Rot.

After years of pushing ClickHouse® to its outer limits in real-world observability workloads, we’ve learned a lot, sometimes the hard way, about getting the most out of your analytics system. But before you dive into inverted indexes, object storage, and terabyte-scale performance tuning, it’s critical to get the basics right.

This talk starts at the beginning, walking through the fundamentals that make ClickHouse® such a powerful engine for analytical workloads: the performance advantages of columnar storage, how its architecture supports horizontal scaling, and why it’s ideal for high-throughput, low-latency queries. We’ll share how our observability platform ingests and queries billions of logs, traces, and metrics using these core principles (and yours can too!).

Then we’ll dive into the deep end, covering some advanced and novel techniques we’ve developed over time. You’ll learn how we use custom inverted indexes for rare-event querying, materialized views for real-time aggregations, secondary index tuning, and cost-efficient object storage integration. Along the way, we’ll highlight performance optimization strategies grounded in real-world data scale.

https://us.airmeet.com/e/69f1f9b0-2f11-11ef-82f4-1d5f1667121e
By Peter Corless
Observability (O11y) is the practice of collecting, analyzing and acting upon system telemetry to ensure optimum performance and reliability. It is a real-time use case that every tech-driven organization faces. However, the landscape of observability is rapidly changing driven by a number of factors: • The adoption of OpenTelemetry (OTel) at the agent and collection layers • The rise of disaggregated observability stacks, combining best-of-breed solutions for each layer, built on open source technologies • The evolution of “Observability 2.
11/05/2025 6:30 PM 11/05/2025 7:00 PM UTC OSACon: Emerging Architectures for Real-Time Observability at Scale Presented by Peter Corless.

Observability (O11y) is the practice of collecting, analyzing and acting upon system telemetry to ensure optimum performance and reliability. It is a real-time use case that every tech-driven organization faces. However, the landscape of observability is rapidly changing, driven by a number of factors:

  • The adoption of OpenTelemetry (OTel) at the agent and collection layers
  • The rise of disaggregated observability stacks, combining best-of-breed solutions for each layer, built on open source technologies
  • The evolution of “Observability 2.0” which combines all types of telemetry into a single common data model
  • The advent of AI within observability systems, and, conversely, the need for observability of AI systems

This talk will bring attendees up to speed on the rapidly changing observability landscape, and how data streaming, stream processing, and real-time analytics technologies play critical roles in emerging disaggregated observability stacks.

https://us.airmeet.com/e/69f1f9b0-2f11-11ef-82f4-1d5f1667121e
By Karthickram Vailraj
Financial analytics platforms face unprecedented data challenges, processing millions of transactions while delivering real-time insights to traders, risk managers, and compliance teams. Open source distributed databases have emerged as the backbone of modern FinTech analytics, enabling organizations to scale cost-effectively while maintaining full control over their data architecture. This presentation explores how open source technologies like Apache Cassandra, PostgreSQL with Citus, and ClickHouse® power financial analytics through strategic sharding, replication, and consistency models.
11/05/2025 6:30 PM 11/05/2025 7:00 PM UTC OSACon: Open Source Database Architectures for High Volume Financial Analytics Presented by Karthickram Vailraj.

Financial analytics platforms face unprecedented data challenges, processing millions of transactions while delivering real-time insights to traders, risk managers, and compliance teams. Open source distributed databases have emerged as the backbone of modern FinTech analytics, enabling organizations to scale cost-effectively while maintaining full control over their data architecture.

This presentation explores how open source technologies like Apache Cassandra, PostgreSQL with Citus, and ClickHouse® power financial analytics through strategic sharding, replication, and consistency models. We’ll examine how leading financial institutions leverage these tools to handle 65,000+ transactions per second during peak trading periods while maintaining sub-millisecond query performance for real-time risk assessment and fraud detection.

Key topics include horizontal scaling strategies for multi-terabyte financial datasets, implementing cross-region replication for regulatory compliance, and balancing consistency requirements between transactional accuracy and analytical responsiveness. Through real-world case studies, attendees will discover how open source database architectures enable sophisticated financial analytics—from high-frequency trading algorithms to regulatory reporting pipelines—while reducing infrastructure costs and avoiding vendor lock-in.

Whether you’re building trading platforms, risk management systems, or compliance dashboards, this talk provides practical insights into architecting scalable, fault-tolerant analytics infrastructure using proven open source technologies that power today’s most demanding financial applications.

https://us.airmeet.com/e/69f1f9b0-2f11-11ef-82f4-1d5f1667121e
By Marc-Antoine Desroches
How we built an open source observability stack that can track every frame of our game. https://github.com/madesroches/micromegas/ When every frame lasting 1/60th of a second can record thousands of events, traditional time series databases just won’t do.
11/05/2025 7:00 PM 11/05/2025 7:30 PM UTC OSACon: Micromegas - unified observability for video games Presented by Marc-Antoine Desroches.

How we built an open source observability stack that can track every frame of our game.

https://github.com/madesroches/micromegas/

When every frame lasting 1/60th of a second can record thousands of events, traditional time series databases just won’t do.

https://us.airmeet.com/e/69f1f9b0-2f11-11ef-82f4-1d5f1667121e
By Chris Dabatos
I was fed up with FAQ search that spits back junk or nothing at all. TSIA says over sixty percent of support tickets could be solved by our own docs, so why isn’t that happening? In this talk I’ll show how I built a simple semantic search app with just three hundred lines of Python. I’ll demo it live answering three differently worded questions, all in under a hundred milliseconds, using TiDB Open-Source and Amazon Bedrock embeddings.
11/05/2025 7:00 PM 11/05/2025 7:30 PM UTC OSACon: Why Your FAQ Search Sucks and How I Fixed It Presented by Chris Dabatos.

I was fed up with FAQ search that spits back junk or nothing at all. TSIA says over sixty percent of support tickets could be solved by our own docs, so why isn’t that happening? In this talk I’ll show how I built a simple semantic search app with just three hundred lines of Python. I’ll demo it live answering three differently worded questions, all in under a hundred milliseconds, using TiDB Open-Source and Amazon Bedrock embeddings. At the end, I’ll provide the steps you need so you can clone the repo and run it on your docs tonight.

https://us.airmeet.com/e/69f1f9b0-2f11-11ef-82f4-1d5f1667121e
By Alkin Tezuysal & Boris Tyshkevich
Manual analysis of thousands of daily database alerts is impossible—but AI changes everything. This talk demonstrates using modern AI tools to analyze internal alert databases and uncover critical patterns in ClickHouse® deployments. Through live demos, we’ll show how to: identify the most critical and common alert patterns using AI-assisted SQL generation; correlate application alerts with ClickHouse® system tables (query_log, part_log, asynchronous_metric_log); automate root cause analysis and predict alert escalation paths.
11/05/2025 7:30 PM 11/05/2025 8:00 PM UTC OSACon: AI-Powered Alert Analysis: Uncovering Critical Patterns in ClickHouse® Databases Presented by Alkin Tezuysal & Boris Tyshkevich.

Manual analysis of thousands of daily database alerts is impossible—but AI changes everything. This talk demonstrates using modern AI tools to analyze internal alert databases and uncover critical patterns in ClickHouse® deployments.

Through live demos, we’ll show how to:

  1. Identify the most critical and common alert patterns using AI-assisted SQL generation

  2. Correlate application alerts with ClickHouse® system tables (query_log, part_log, asynchronous_metric_log)

  3. Automate root cause analysis and predict alert escalation paths

  4. Transform raw alert data into actionable insights for proactive monitoring

See how combining ClickHouse®’s analytical power with AI tools creates a modern framework for database monitoring that prevents production outages before they happen.

https://us.airmeet.com/e/69f1f9b0-2f11-11ef-82f4-1d5f1667121e
By Georg Heiler
Magenta Telekom ingests many terabytes of new data every day, and every downstream consumer wants it immediately. The real bottleneck turned out not to be hardware but humans wrestling with hidden, hard-wired dependencies in hundreds of heterogeneous pipelines and sometimes tool silos. Our fix was to treat every data asset as a node in a data-dependency graph and every transformation as an edge. Ingestion, Transformation, AI and BI are all part of the same executable graph.
11/05/2025 7:30 PM 11/05/2025 8:00 PM UTC OSACon: Scaling Data Pipelines @ Magenta Telekom Presented by Georg Heiler.

Magenta Telekom ingests many terabytes of new data every day, and every downstream consumer wants it immediately. The real bottleneck turned out not to be hardware but humans wrestling with hidden, hard-wired dependencies in hundreds of heterogeneous pipelines and sometimes tool silos.

Our fix was to treat every data asset as a node in a data-dependency graph and every transformation as an edge. Ingestion, Transformation, AI and BI are all part of the same executable graph. By using suitable abstractions and dependency injection, less technical people are empowered to contribute business logic, which can be operationalized efficiently.

This talk covers:

  • Unified asset graph: ingest → transforms → reports → ML, all in one lineage-aware DAG.
  • Event-based pipelines: events propagate state changes across the enterprise via the edges in the graph in near real time.
  • Dependency injection: with the right abstractions, more users along the data value chain can contribute.

Operational challenges are handled by the abstractions, so analysts can focus only on the business logic.
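
The asset-graph idea can be illustrated with a minimal sketch using only the Python standard library (3.9+). This is not Magenta Telekom's implementation; the asset names and the helper functions `materialization_order` and `downstream_of` are hypothetical.

```python
from graphlib import TopologicalSorter

# Hypothetical assets: each key is a data asset (a node), and its value
# lists the upstream assets it is derived from (the incoming edges).
ASSET_GRAPH = {
    "raw_events": [],
    "cleaned_events": ["raw_events"],
    "daily_report": ["cleaned_events"],
    "churn_model": ["cleaned_events"],
}


def materialization_order(graph):
    """Return an order in which every asset comes after its upstream assets."""
    return list(TopologicalSorter(graph).static_order())


def downstream_of(graph, changed):
    """Return all assets that must be refreshed when `changed` is updated."""
    affected = set()
    frontier = [changed]
    while frontier:
        current = frontier.pop()
        for asset, upstream in graph.items():
            if current in upstream and asset not in affected:
                affected.add(asset)
                frontier.append(asset)
    return affected


order = materialization_order(ASSET_GRAPH)
stale = downstream_of(ASSET_GRAPH, "raw_events")
print(order)
print(sorted(stale))
```

Here an update to `raw_events` marks the cleaned data, the report, and the model as stale, which is the kind of state change an event-based pipeline would propagate along the edges.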

https://us.airmeet.com/e/69f1f9b0-2f11-11ef-82f4-1d5f1667121e
By Ron Kapoor
This talk dives into technical optimizations that deliver low-latency, high-concurrency queries on Apache Iceberg without sacrificing openness. Together, we’ll examine what kills performance when querying Iceberg, highlight best practices that make queries faster, and evaluate query engine optimizations for Iceberg—including handling position and equality delete tables, distributed metadata parsing, and more. You’ll hear real-world stories from leading enterprises who have used these lessons to optimize Apache Iceberg performance at scale and walk away with actionable techniques for making your Iceberg lakehouse faster than ever.
11/05/2025 8:00 PM 11/05/2025 8:30 PM UTC OSACon: Real-Time Customer-Facing Analytics: From Pain to Production Presented by Ron Kapoor.

This talk dives into technical optimizations that deliver low-latency, high-concurrency queries on Apache Iceberg without sacrificing openness. Together, we’ll examine what kills performance when querying Iceberg, highlight best practices that make queries faster, and evaluate query engine optimizations for Iceberg—including handling position and equality delete tables, distributed metadata parsing, and more. You’ll hear real-world stories from leading enterprises who have used these lessons to optimize Apache Iceberg performance at scale and walk away with actionable techniques for making your Iceberg lakehouse faster than ever.

https://us.airmeet.com/e/69f1f9b0-2f11-11ef-82f4-1d5f1667121e
By David Stokes
Structured Query Language’s Window Functions are a powerful tool for analytics. They let you get more granular insight than a GROUP BY clause. But the syntax is obtuse, the terms used are nebulous (UNBOUNDED PRECEDING, anyone?), and the results can be much less insightful than expected. This session is a quick introduction to and explanation of how to use Window Functions efficiently to better investigate your data.
11/05/2025 8:00 PM 11/05/2025 8:30 PM UTC OSACon: SQL Window Functions In Five Easy Steps Presented by David Stokes.

https://us.airmeet.com/e/69f1f9b0-2f11-11ef-82f4-1d5f1667121e
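
As a rough illustration of the contrast the window functions abstract draws with GROUP BY, the sketch below runs both kinds of query against an in-memory SQLite database from Python. The `sales` table and its values are made up for the example and are not from the session; it also assumes the bundled SQLite is version 3.25 or newer, which is when window function support was added.

```python
import sqlite3

# Hypothetical sample data, invented for this illustration.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE sales (region TEXT, day INTEGER, amount INTEGER)")
conn.executemany(
    "INSERT INTO sales VALUES (?, ?, ?)",
    [("east", 1, 10), ("east", 2, 20), ("east", 3, 30),
     ("west", 1, 5), ("west", 2, 15)],
)

# GROUP BY collapses each region into a single summary row.
grouped = conn.execute(
    "SELECT region, SUM(amount) FROM sales GROUP BY region ORDER BY region"
).fetchall()

# A window function keeps every row and adds a per-region running total.
# The frame runs from the first row of the partition (UNBOUNDED PRECEDING)
# up to and including the current row.
running = conn.execute(
    """
    SELECT region, day, amount,
           SUM(amount) OVER (
               PARTITION BY region
               ORDER BY day
               ROWS BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW
           ) AS running_total
    FROM sales
    ORDER BY region, day
    """
).fetchall()

print(grouped)
for row in running:
    print(row)
```

The grouped query returns one row per region, while the windowed query returns all five input rows, each carrying the total accumulated so far within its region.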
2025-11-05T16:00:00.000Z - 2025-11-05 16:30:00 +0000 UTC.
By Maxime Beauchemin
As AI transforms how we build, scale, and interact with software, one thing is becoming clear: open source isn’t just keeping up; it’s leading. In this keynote, Max Beauchemin, creator of Apache Superset and Apache Airflow, unpacks why open source is uniquely positioned to dominate in the age of AI. From training data to developer velocity, open projects have structural advantages that proprietary vendors simply can’t replicate. We’ll explore how LLMs “know” open source deeply, how AI-native workflows amplify OSS contributions, and why communities, not corporations, are becoming the new centers of gravity for software innovation.
2025-11-05T16:35:00.000Z - 2025-11-05 17:10:00 +0000 UTC.
By Heather Meeker
Commercial Open Source Software (COSS) is a burgeoning business sector. This talk will focus on how advances in LLMs and AI will drive demand for COSS.
2025-11-05T17:15:00.000Z - 2025-11-05 17:25:00 +0000 UTC.
By Josh Lee
Joseph Campbell’s story-circle describes the journeys of epic heroes from the known into the unknown in search of rewards. Maybe it’s not so different from the Gartner Hype Cycle that describes the journeys of innovations. And as individual developers, we also go through a journey of discovery when adopting new tools and technologies.
2025-11-05T17:15:00.000Z - 2025-11-05 17:25:00 +0000 UTC.
By Lorenzo Mangani
Airport is a DuckDB community extension developed by Query.Farm that adds Apache Arrow Flight support to DuckDB. In plain terms: 👉 It lets DuckDB query, modify, and store data through Arrow Flight servers — remote, high-performance data endpoints that speak gRPC and Arrow IPC. You can think of it as “network tables for DuckDB.” Instead of reading files from disk, you can query live data over the network just like local tables.
2025-11-05T17:30:00.000Z - 2025-11-05 18:00:00 +0000 UTC.
By Christopher Bergh
Data teams continue to face long-standing challenges: their customers often distrust their results, data providers frequently ignore their existence, and teams spend more time firefighting than creating insights. The demand for AI just makes it more complicated: no wonder many data teams experience PTSD. The solution is simple: identify problems before they reach your customer. You need to implement data quality tests—lots of them. Check every table and column. See if anything is incorrect.
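
One possible reading of "check every table and column" is sketched below: a minimal NULL-count check that walks every table and column in a SQLite database. It is not the speaker's tooling; the helper names `null_counts` and `check_all_tables` and the `orders` sample table are hypothetical, and a real test suite would add many more checks (ranges, uniqueness, freshness, and so on).

```python
import sqlite3


def null_counts(conn, table):
    """Return {column: number of NULL values} for every column of `table`."""
    # PRAGMA table_info yields one row per column; index 1 is the column name.
    columns = [row[1] for row in conn.execute(f'PRAGMA table_info("{table}")')]
    counts = {}
    for col in columns:
        (n,) = conn.execute(
            f'SELECT COUNT(*) FROM "{table}" WHERE "{col}" IS NULL'
        ).fetchone()
        counts[col] = n
    return counts


def check_all_tables(conn):
    """Run the NULL check against every table listed in the schema."""
    tables = [
        row[0]
        for row in conn.execute(
            "SELECT name FROM sqlite_master WHERE type = 'table'"
        )
    ]
    return {table: null_counts(conn, table) for table in tables}


# Hypothetical sample data with two deliberately missing values.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE orders (id INTEGER, customer TEXT, amount INTEGER)")
conn.executemany(
    "INSERT INTO orders VALUES (?, ?, ?)",
    [(1, "a", 10), (2, None, 20), (3, "c", None)],
)

report = check_all_tables(conn)
print(report)
```

Any non-zero count in the report flags a column to investigate before the data reaches a customer.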
2025-11-05T17:30:00.000Z - 2025-11-05 18:00:00 +0000 UTC.
By Shivji Kumar Jha & Anurag Pandey
Distributed systems are full of surprises—and ClickHouse® is no exception. In this talk, Shivji (Nutanix) and Anurag (Incerto) share real-world war room stories from designing, testing, and running ClickHouse® across both cloud and on-prem environments, where debugging production issues often felt more like unraveling plot twists in a thriller than routine operations. We’ll walk through some of the toughest incidents we’ve faced: what broke, what we thought was wrong, what actually was wrong, and how we got to the root cause.
2025-11-05T18:00:00.000Z - 2025-11-05 18:30:00 +0000 UTC.
By Ray Paik & Fatih Degirmenci
Recently, several open source companies attracted a lot of attention after their announcements of license changes. Not surprisingly, these shifts sparked backlash from open source enthusiasts, prompting some to create community-driven forks under open source foundations. Now there is growing skepticism toward single-company-backed open source projects, with many arguing that open source projects should be run by neutral foundations to prevent future bait-and-switch tactics. But is foundation backing really the answer?
2025-11-05T18:00:00.000Z - 2025-11-05 18:30:00 +0000 UTC.
By Udi Rot
After years of pushing ClickHouse® to its outer limits in real-world observability workloads, we’ve learned a lot - sometimes the hard way - about getting the most out of your analytics system. But before you dive into inverted indexes, object storage, and terabyte-scale performance tuning, it’s critical to get the basics right. This talk starts at the beginning, walking through the fundamentals that make ClickHouse® such a powerful engine for analytical workloads, the performance advantages of columnar storage, how its architecture supports horizontal scaling, and why it’s ideal for high-throughput, low-latency queries.
2025-11-05T18:30:00.000Z - 2025-11-05 19:00:00 +0000 UTC.
By Karthickram Vailraj
Financial analytics platforms face unprecedented data challenges, processing millions of transactions while delivering real-time insights to traders, risk managers, and compliance teams. Open source distributed databases have emerged as the backbone of modern FinTech analytics, enabling organizations to scale cost-effectively while maintaining full control over their data architecture. This presentation explores how open source technologies like Apache Cassandra, PostgreSQL with Citus, and ClickHouse® power financial analytics through strategic sharding, replication, and consistency models.
2025-11-05T18:30:00.000Z - 2025-11-05 19:00:00 +0000 UTC.
By Peter Corless
Observability (O11y) is the practice of collecting, analyzing and acting upon system telemetry to ensure optimum performance and reliability. It is a real-time use case that every tech-driven organization faces. However, the landscape of observability is rapidly changing, driven by a number of factors:
• The adoption of OpenTelemetry (OTel) at the agent and collection layers
• The rise of disaggregated observability stacks, combining best-of-breed solutions for each layer, built on open source technologies
• The evolution of “Observability 2.
2025-11-05T19:00:00.000Z - 2025-11-05 19:30:00 +0000 UTC.
By Chris Dabatos
I was fed up with FAQ search that spits back junk or nothing at all. TSIA says over sixty percent of support tickets could be solved by our own docs, so why isn’t that happening? In this talk I’ll show how I built a simple semantic search app with just three hundred lines of Python. I’ll demo it live answering three differently worded questions, and all in under a hundred milliseconds using TiDB Open-Source and Amazon Bedrock embeddings.
2025-11-05T19:00:00.000Z - 2025-11-05 19:30:00 +0000 UTC.
By Marc-Antoine Desroches
How we built an open source observability stack that can track every frame of our game. https://github.com/madesroches/micromegas/ When every frame lasting 1/60th of a second can record thousands of events, traditional time series databases just won’t do.
2025-11-05T19:30:00.000Z - 2025-11-05 20:00:00 +0000 UTC.
By Georg Heiler
Magenta Telekom ingests many terabytes of new data every day, and every downstream consumer wants it immediately. The real bottleneck turned out not to be hardware but humans wrestling with hidden, hard-wired dependencies in hundreds of heterogeneous pipelines and sometimes tool silos. Our fix was to treat every data asset as a node in a data-dependency graph and every transformation as an edge. Ingestion, Transformation, AI and BI are all part of the same executable graph.
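
The idea of one executable data-dependency graph can be sketched with Python's standard-library `graphlib` module (Python 3.9+). The asset names below are invented for illustration and are not Magenta Telekom's pipelines, and the talk's actual orchestration tooling is not named in this abstract; the sketch only shows how declaring dependencies explicitly lets a valid run order be derived rather than hard-wired.

```python
from graphlib import TopologicalSorter

# Hypothetical assets: each key is a data asset, and its set lists the
# upstream assets it is derived from (the edges of the dependency graph).
dependencies = {
    "raw_events": set(),
    "clean_events": {"raw_events"},
    "daily_usage": {"clean_events"},
    "churn_model": {"daily_usage"},
    "bi_dashboard": {"daily_usage", "churn_model"},
}

# static_order() yields every asset after all of its upstream assets,
# so ingestion, transformation, AI and BI steps run in a consistent order.
order = list(TopologicalSorter(dependencies).static_order())


def downstream(asset):
    """Return every asset that directly or indirectly depends on `asset`."""
    affected = set()
    frontier = [asset]
    while frontier:
        current = frontier.pop()
        for name, upstream in dependencies.items():
            if current in upstream and name not in affected:
                affected.add(name)
                frontier.append(name)
    return affected


print(order)
print(sorted(downstream("clean_events")))
```

Because the dependencies are explicit, the same structure answers both "in what order do we build?" and "what must be rebuilt when this asset changes?".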
2025-11-05T19:30:00.000Z - 2025-11-05 20:00:00 +0000 UTC.
By Alkin Tezuysal & Boris Tyshkevich
Manual analysis of thousands of daily database alerts is impossible, but AI changes everything. This talk demonstrates using modern AI tools to analyze internal alert databases and uncover critical patterns in ClickHouse® deployments. Through live demos, we’ll show how to:
• Identify the most critical and common alert patterns using AI-assisted SQL generation
• Correlate application alerts with ClickHouse® system tables (query_log, part_log, asynchronous_metric_log)
• Automate root cause analysis and predict alert escalation paths
2025-11-05T20:00:00.000Z - 2025-11-05 20:30:00 +0000 UTC.
By David Stokes
Structured Query Language’s Window Functions are a powerful tool for analytics. They let you get more granular insight than a GROUP BY clause. But the syntax is obtuse, the terms used are nebulous (unbounded preceding, anyone?), and the results can be much less insightful than expected. This session is a quick introduction to and explanation of how to use Window Functions efficiently to better investigate your data.
2025-11-05T20:00:00.000Z - 2025-11-05 20:30:00 +0000 UTC.
By Ron Kapoor
This talk dives into technical optimizations that deliver low-latency, high-concurrency queries on Apache Iceberg without sacrificing openness. Together, we’ll examine what kills performance when querying Iceberg, highlight best practices that make queries faster, and evaluate query engine optimizations for Iceberg—including handling position and equality delete tables, distributed metadata parsing, and more. You’ll hear real-world stories from leading enterprises who have used these lessons to optimize Apache Iceberg performance at scale and walk away with actionable techniques for making your Iceberg lakehouse faster than ever.