
Agnostic is the Only Constant: Embracing the Lakehouse Paradigm Without Lock-In

by Viktor Kessler
As the Lakehouse paradigm rises in popularity, so does the risk of being locked into a single vendor’s ecosystem. But what if you could have all the benefits of a unified architecture—without giving up control? In this session, we introduce Lakekeeper, an open-source Apache Iceberg catalog that makes it possible to build Lakehouse architectures that are truly portable: across clouds, compute engines, and storage layers. This talk speaks directly to data professionals looking to stay ahead of the curve by exploring:

AI-Native Analytics: Building the Next Generation of Intelligent, Conversational Decision Systems

by Rajesh Sura
As enterprises grapple with an explosion of data and increasing pressure to make rapid, informed decisions, traditional Business Intelligence (BI) tools are reaching their limits. Static dashboards and complex query interfaces often exclude non-technical users, creating friction between data and action. Enter AI-native analytics—a transformative approach that integrates natural language interfaces (NLIs) with scalable machine learning (ML) to deliver intelligent, conversational decision systems. This keynote explores how organizations can reimagine their analytics infrastructure by embedding AI into the very fabric of user interaction.

Apache Superset Extensions - Taking Open Source BI to the Next Level

by Evan Rusackas, Michael Molina & Ville Brofeldt
Apache Superset has always been leading the charge on open-source BI, but now it’s getting ready to truly take over the BI world. Learn all about Superset’s new extensions architecture that will allow users and developers to more rapidly expand and improve the product’s capabilities, while simplifying life for both developers and maintainers.

Companies vs. Foundations: Who Should Steer Your Open Source Project?

by Ray Paik & Fatih Degirmenci
Recently, several open source companies attracted a lot of attention after announcing license changes. Not surprisingly, these shifts sparked backlash from open source enthusiasts, prompting some to create community-driven forks under open source foundations. Now there is growing skepticism toward single-company-backed open source projects, with many arguing that open source projects should be run by neutral foundations to prevent future bait-and-switch tactics. But is foundation backing really the answer?

Everything I Learned About ClickHouse was From Real Workloads

by Udi Rot
After years of pushing ClickHouse to its outer limits in real-world observability workloads, we’ve learned a lot - sometimes the hard way - about getting the most out of your analytics system. But before you dive into inverted indexes, object storage, and terabyte-scale performance tuning, it’s critical to get the basics right. This talk starts at the beginning, walking through the fundamentals that make ClickHouse such a powerful engine for analytical workloads: the performance advantages of columnar storage, how its architecture supports horizontal scaling, and why it’s ideal for high-throughput, low-latency queries.

How Open Source Businesses Will Thrive in the Age of AI

by Heather Meeker
Commercial Open Source Software (COSS) is a burgeoning business sector. This talk will focus on how demand for COSS will be driven by advances in LLMs and AI.

Micromegas - unified observability for video games

by Marc-Antoine Desroches
How we built an open source observability stack that can track every frame of our game. When every frame lasting 1/60th of a second can record thousands of events, traditional time series databases just won’t do. Project repository: https://github.com/madesroches/micromegas/

Open Analytics in Action

by Sri Rama Satya Prasanth
This session explores how open-source analytics technologies are transforming the public sector through the lens of Electronic Income Verification (EIV) systems—platforms that process over 850,000 real-time verifications daily, integrate 40+ data sources, and maintain 99.95% uptime to support equitable, efficient public benefit delivery. We’ll dive into the open-source stack behind these systems: event streaming with Apache Kafka, data orchestration with Airflow, analytics with Apache Superset and DuckDB, and ML-powered fraud detection using tools like scikit-learn and Hugging Face NLP.

Open Source Database Architectures for High Volume Financial Analytics

by Karthickram Vailraj
Financial analytics platforms face unprecedented data challenges, processing millions of transactions while delivering real-time insights to traders, risk managers, and compliance teams. Open source distributed databases have emerged as the backbone of modern FinTech analytics, enabling organizations to scale cost-effectively while maintaining full control over their data architecture. This presentation explores how open source technologies like Apache Cassandra, PostgreSQL with Citus, and ClickHouse power financial analytics through strategic sharding, replication, and consistency models.

Open Source Event-Driven Analytics for Real-Time Retail Inventory Management

by Nidhin Jose
The retail sector’s shift toward omnichannel fulfillment and instant availability demands has exposed critical limitations in traditional batch-processed inventory systems. This presentation demonstrates how open source Event-Driven Architecture (EDA) tools are transforming retail inventory analytics, enabling continuous real-time processing that delivers superior accuracy, automated insights, and scalable supply chain responsiveness. Open source analytics platforms are proving their value in retail operations, with adopters experiencing up to 30% fewer stockouts and 15% improved inventory accuracy compared to proprietary batch systems.

Open Source’s Massive Unfair Advantage in the AI Era

by Maxime Beauchemin
As AI transforms how we build, scale, and interact with software, one thing is becoming clear: open source isn’t just keeping up—it’s leading. In this keynote, Max Beauchemin, creator of Apache Superset and Apache Airflow, unpacks why open source is uniquely positioned to dominate in the age of AI. From training data to developer velocity, open projects have structural advantages that proprietary vendors simply can’t replicate. We’ll explore how LLMs “know” open source deeply, how AI-native workflows amplify OSS contributions, and why communities—not corporations—are becoming the new centers of gravity for software innovation.

RAG Without the Hassle: Building AI-Powered Analytics Apps on OceanBase Vector Database

by Peng Wang
Retrieval-Augmented Generation (RAG) is transforming analytics applications — but implementing it often means managing multiple systems: OLTP, vector DBs, and orchestration tools. In this session, we’ll show how OceanBase simplifies this stack by supporting both structured and vector data natively, enabling developers to build real-time RAG pipelines using just one open-source database. We’ll walk through a working demo that combines OceanBase with OpenAI and popular Python frameworks like LangChain, demonstrating how to perform vector search and retrieval directly using SQL.

Scaling data pipelines @ Magenta Telekom

by Georg Heiler
Magenta Telekom ingests many terabytes of new data every day, and every downstream consumer wants it immediately. The real bottleneck turned out not to be hardware but humans wrestling with hidden, hard-wired dependencies in hundreds of heterogeneous pipelines and sometimes tool silos. Our fix was to treat every data asset as a node in a data-dependency graph and every transformation as an edge. Ingestion, Transformation, AI and BI are all part of the same executable graph.
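
The dependency-graph idea can be illustrated with a minimal Python sketch. This is not Magenta Telekom's implementation or tooling: the asset names and both helper functions are hypothetical, and only the standard-library graphlib module is used. Each asset is a node, and each dependency on an upstream asset stands in for the transformation that produces it.

```python
from graphlib import TopologicalSorter

# Hypothetical assets: each key is a data asset (node), each value is the set
# of upstream assets it is derived from (one edge per transformation input).
dependencies = {
    "raw_events": set(),
    "cleaned_events": {"raw_events"},
    "churn_features": {"cleaned_events"},
    "churn_model": {"churn_features"},
    "bi_dashboard": {"cleaned_events", "churn_model"},
}


def execution_order(deps):
    """Return one valid build order: every asset appears after its upstreams."""
    # static_order() raises graphlib.CycleError if the graph contains a cycle.
    return list(TopologicalSorter(deps).static_order())


def downstream_of(asset, deps):
    """Return every asset that directly or transitively depends on `asset`."""
    affected = set()
    frontier = [asset]
    while frontier:
        current = frontier.pop()
        for name, upstream in deps.items():
            if current in upstream and name not in affected:
                affected.add(name)
                frontier.append(name)
    return affected


order = execution_order(dependencies)
print(order)
print(sorted(downstream_of("cleaned_events", dependencies)))
```

Making the dependencies explicit like this means the build order and the set of assets affected by an upstream change can be computed rather than remembered, which is the kind of hidden, hard-wired knowledge the abstract describes as the bottleneck.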

Smarter Analytics: AI-Driven Intelligence in Modern Databases

by Peter Zaitsev
What happens when databases don’t just store data—but help analyze it intelligently? This talk explores emerging trends in AI-powered database intelligence, from schema optimization and real-time query tuning to transforming unstructured content using techniques like sentiment analysis and entity recognition. We’ll also dive into the future of self-driving databases and multi-modal analytics that integrate text, images, and more. Attendees will leave with a forward-looking view of how AI is reshaping database engines into context-aware, insight-generating platforms—laying the foundation for the next generation of open-source analytics systems.

Streaming Analytics in Action: Real-World Case Studies from Uber, Razorpay, and Stripe

by Jayesh Asrani
Discover the transformative power of streaming analytics featuring groundbreaking case studies from some of the most innovative companies in the world. Explore how Uber, Razorpay, and Stripe leverage next-gen streaming architectures to power their real-time decision-making, improve user experiences, and drive operational excellence. These case studies will offer a rare glimpse into the advanced technologies and strategies behind these leading-edge systems, showcasing real-world applications of streaming analytics that are as inspiring as they are practical.

What the Spec?!: New Features in Apache Iceberg™ Table Format V3

by Danica Fine
Apache Iceberg™ made great advancements going from Table Format V1 to Table Format V2, introducing features like position deletes, advanced metrics, and cleaner metadata abstractions. But with Table Format V3 on the horizon, Iceberg users have even more to look forward to. In this session, we’ll explore some of the exciting new user-facing features that V3 Iceberg is about to introduce and see how they’ll make working with Open Data Formats easier than ever!

Why Your FAQ Search Sucks and How I Fixed It

by Chris Dabatos
I was fed up with FAQ search that spits back junk or nothing at all. TSIA says over sixty percent of support tickets could be solved by our own docs, so why isn’t that happening? In this talk I’ll show how I built a simple semantic search app with just three hundred lines of Python. I’ll demo it live answering three differently worded questions, all in under a hundred milliseconds, using TiDB Open-Source and Amazon Bedrock embeddings.