These sessions were presented at OSA Con 2023 on December 12-14, 2023.

Title Speaker(s) Recording Slides

"New" Workflow Orchestrator in town: Apache Airflow 2.x

Jarek Potiuk

A Guide to Responsible Data Collection In Open Source

Avi Press

An Overview of DuckDB

Gabor Szarnyas

Apache Pulsar: Finally an Alternative to Kafka?

Julien Jakubowski

Build a fully-managed OSS compatible lakehouse with BigLake Managed Tables

Jeffrey Nelson

Building a ChatGPT Data Pipeline with RisingWave Stream Processor and Astra Vector Search

Mary Grygleski & Karin Wolok

CICD Pipelines for dbt: DIY or DIWhy?

Cameron Cyr

Data Alchemy: Transforming Raw Data to Gold with Apache Hudi and DBT

Nadine Farah

Data as Code: Project Nessie brings a Git-like experience for Apache Iceberg Tables

Alex Merced

Data on GKE

Akshay Ram

ETL with Meltano + Singer in the LLM era

Pat Nadolny

From Click to Insight: Transforming Streams with Apache Flink

Andrey Gusarov

From Zero to Superset Hero: Data visualisation as a code with Terraform

Viktoria Ondrejova

Getting Started with Polars

Matt Harrison

Going beyond Observability: Grafana for Analytics

Kyle Cunningham

How to implement Data Contracts with DataHub

Shirshanka Das

Leveraging object storage: Tiered Storage for ClickHouse

Arthur Ansquer

Make data movement limitless and secure with Open Source

Michel Tricot

Many Faces of Real-time Analytics

Dunith Dhanushka

Maximizing Query Speed and Minimizing Costs in Data Lakes with Open-Source Caching

Beinan Wang

Maybe The Real Modern Data Stack Was the Open Source Tools We Got Along The Way

Pedram Navid

Most "Open Source" AI Isn't. And What We Can Do About That.

Chris Hazard

Navigating the Landscape of a Fully Open Source Data Stack in 2023

Maxime Beauchemin

Open Formats: The Happy Accident Disrupting the Data Industry

Ryan Blue

Open Source BI FTW - Building Compelling Dashboards with Apache Superset

Evan Rusackas

Open Source Project Report: Evidence - Business Intelligence as Code

Sean Hughes

Panel Discussion on Growing a Healthy Open Source Community

Multiple speakers

Panel: Open Source means Open! Or Does it? The State of Licensing in 2023

Multiple speakers

Prestissimo : The new generation Presto

Aditi Pandit

Proton : A single binary to tackle streaming and historical analytics

Ken Chen

Query Live Data Using Open Source SQL Engines

Jove Zhong & Gang Tao

QuestDB: The building blocks of a fast open-source time-series database

Javier Ramirez

Real-Time Revolution: Kickstarting Your Journey in Streaming Data

Zander Matheson

Reducing complexity and increasing performance with Trino

Cole Bowden

Reinventing Kafka in the Data Streaming Era

Jun Rao

StarRocks: Fast Real-Time Analytics for User-Facing Applications

Albert Wong

The Future of Analytics is Open Source and Cloud Native

Robert Hodges

The Need for an Open Standard for Semantic Layer

Brian Bickell

Unlocking Advanced Log Analytics With ClickHouse and Kafka

Arul Jegadish

Unlocking Financial Data with Real-Time Pipelines

Timothy Spann

Unlocking Scalable and Efficient Data Storage with Apache Ozone

Uma Maheswara Rao Gangumalla

Unveiling the Power of dbt and DuckDB: Hype vs. Reality

Cameron Cyr

What the Duck?

Jordan Tigani

Where the Modern Data Stack has Failed and why Engineering-centric Tools will Reshape the Data World

Nick Schrock

Who needs ChatGPT? Rock solid AI pipelines with Hugging Face and Kedro

Juan Luis Cano

You put OLTP in my OLAP! Analytics and Real-time Converged

Felipe Mendes