2023 sessions

Title	Speaker(s)	Recording	Slides
`New` Workflow Orchestrator in town: Apache Airflow 2.x	Jarek Potiuk
A Guide to Responsible Data Collection In Open Source	Avi Press
An Overview of DuckDB	Gabor Szarnyas
Apache Pulsar: Finally an Alternative to Kafka?	Julien Jakubowski
Build a fully-managed OSS compatible lakehouse with BigLake Managed Tables	Jeffrey Nelson
Building a ChatGPT Data Pipeline with RisingWave Stream Processor and Astra Vector Search	Mary Grygleski & Karin Wolok
CICD Pipelines for dbt: DIY or DIWhy?	Cameron Cyr
Data Alchemy: Transforming Raw Data to Gold with Apache Hudi and DBT	Nadine Farah
Data as Code: Project Nessie brings a Git-like experience for Apache Iceberg Tables	Alex Merced
Data on GKE	Akshay Ram
ETL with Meltano + Singer in the LLM era	Pat Nadolny
From Click to Insight: Transforming Streams with Apache Flink	Andrey Gusarov
From Zero to Superset Hero: Data visualisation as a code with Terraform	Viktoria Ondrejova
Getting Started with Polars	Matt Harrison
Going beyond Observability: Grafana for Analytics	Kyle Cunningham
How to implement Data Contracts with DataHub	Shirshanka Das
Leveraging object storage: Tiered Storage for ClickHouse®	Arthur Ansquer
Make data movement limitless and secure with Open Source	Michel Tricot
Many Faces of Real-time Analytics	Dunith Dhanushka
Maximizing Query Speed and Minimizing Costs in Data Lakes with Open-Source Caching	Beinan Wang
Maybe The Real Modern Data Stack Was the Open Source Tools We Got Along The Way	Pedram Navid
Most "Open Source" AI Isn't. And What We Can Do About That.	Chris Hazard
Navigating the Landscape of a Fully Open Source Data Stack in 2023	Maxime Beauchemin
Open Formats: The Happy Accident Disrupting the Data Industry	Ryan Blue
Open Source BI FTW - Building Compelling Dashboards with Apache Superset	Evan Rusackas
Open Source Project Report: Evidence - Business Intelligence as Code	Sean Hughes
Panel Discussion on Growing a Healthy Open Source Community	Multiple speakers
Panel: Open Source means Open! Or Does it? The State of Licensing in 2023	Multiple speakers
Prestissimo : The new generation Presto	Aditi Pandit
Proton : A single binary to tackle streaming and historical analytics	Ken Chen
Query Live Data Using Open Source SQL Engines	Jove Zhong & Gang Tao
QuestDB: The building blocks of a fast open-source time-series database	Javier Ramirez
Real-Time Revolution: Kickstarting Your Journey in Streaming Data	Zander Matheson
Reducing complexity and increasing performance with Trino	Cole Bowden
Reinventing Kafka in the Data Streaming Era	Jun Rao
StarRocks: Fast Real-Time Analytics for User-Facing Applications	Albert Wong
The Future of Analytics is Open Source and Cloud Native	Robert Hodges
The Need for an Open Standard for Semantic Layer	Brian Bickell
Unlocking Advanced Log Analytics With ClickHouse® and Kafka	Arul Jegadish
Unlocking Financial Data with Real-Time Pipelines	Timothy Spann
Unlocking Scalable and Efficient Data Storage with Apache Ozone	Uma Maheswara Rao Gangumalla
Unveiling the Power of dbt and DuckDB: Hype vs. Reality	Cameron Cyr
What the Duck?	Jordan Tigani
Where the Modern Data Stack has Failed and why Engineering-centric Tools will Reshape the Data World	Nick Schrock
Who needs ChatGPT? Rock solid AI pipelines with Hugging Face and Kedro	Juan Luis Cano
You put OLTP in my OLAP! Analytics and Real-time Converged	Felipe Mendes
