ACloudDM 1: Workshop on Advances in Cloud Data Management 1
Time: Tuesday, 04/Mar/2025, 9:00am - 10:30am
Session Chairs: Jana Giceva (Technical University of Munich), Martin Hentschel (IT University of Copenhagen)
Location: WE5/00.019 (Lecture Hall 2)
Session Abstract
09:00 – 09:45  Peter Boncz (CWI Amsterdam), Keynote
09:50 – 10:10  Viktor Leis (TU München)
10:10 – 10:30  Benjamin Wagner (Firebolt)
Presentations
MotherDuck: DuckDB backed by the cloud
Peter Boncz
CWI, The Netherlands
MotherDuck is a new service that connects DuckDB to the cloud. It introduces the concept of "dual query processing": the ability to execute queries partly on the client and partly in the cloud. The talk covers the motivation for MotherDuck and some of its use cases, as well as the main characteristics of its system architecture, which heavily uses DuckDB's extension mechanisms. To provide context, the talk also gives a brief overview of the DuckDB architecture, and closes with ongoing research work related to MotherDuck in the areas of caching and query optimization.
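The "dual query processing" idea can be illustrated with a toy planner that assigns each table scan to whichever side stores the table, and picks a site for the final operator. This is a hypothetical sketch for intuition only, not MotherDuck's actual planner or API; the table names and the placement rule are invented.

```python
# Toy sketch of "dual query processing": place each scan on the side
# (client or cloud) that holds the table, then choose where to run the
# final join. Illustration only -- not MotherDuck's real planner.

LOCAL_TABLES = {"sessions"}      # hypothetical tables on the client
CLOUD_TABLES = {"clickstream"}   # hypothetical tables in the cloud

def place_scans(tables):
    """Map each scanned table to the side that stores it."""
    placement = {}
    for t in tables:
        if t in LOCAL_TABLES:
            placement[t] = "client"
        elif t in CLOUD_TABLES:
            placement[t] = "cloud"
        else:
            raise KeyError(f"unknown table: {t}")
    return placement

def plan(tables):
    """Return (scan placement, site of the final operator)."""
    placement = place_scans(tables)
    sides = set(placement.values())
    # If all inputs live on one side, run everything there; for mixed
    # inputs this sketch always ships client rows to the cloud.
    join_site = sides.pop() if len(sides) == 1 else "cloud"
    return placement, join_site

placement, site = plan(["sessions", "clickstream"])
print(placement, site)
# -> {'sessions': 'client', 'clickstream': 'cloud'} cloud
```

A real planner would also weigh data sizes, filters, and caching when choosing the split point; the point here is only that one query plan can span both execution sites.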
The Future of the Cloud and the Future of Cloud Databases
Viktor Leis
Technische Universität München, Germany
Cloud computing is transforming the technology landscape, with database systems at the forefront of this change. A striking example is an online bookstore that has grown to dominate the database market. The appeal of cloud computing for IT users lies in several key factors: a reduced total cost of ownership through economies of scale and advanced services that minimize the burden of "undifferentiated heavy lifting". More broadly, cloud computing reflects a civilizational trend toward increased technological and economic specialization.
However, the current state of cloud computing often falls short of these promises. Hyperscalers are evolving into vertically integrated oligopolies, controlling everything from basic server rentals to high-level services. This trend is only accelerating, potentially leading to a future where hyperscalers establish software standards and design their own hardware, making it impossible to compete. Moreover, despite differences in branding, the major cloud providers are fundamentally similar, lacking interoperability and fostering vendor lock-in. As a result, we risk returning to the monopolistic conditions of the IBM and Wintel eras and ultimately technological stagnation due to limited competition.
Yet there is cause for optimism. Great technology can still succeed, as the multi-cloud data warehouse Snowflake has shown. The rise of data lakes and open standards, such as Parquet and Iceberg, further underscores the potential for interoperability and innovation. Additionally, there are orders-of-magnitude gaps between the price of existing cloud services and what is theoretically achievable, creating opportunities for disruption. These price gaps persist because cloud services are inherently complex to build, requiring redundant efforts and leading to high barriers to entry. For example, a DBMS might need a highly available control plane, a write-ahead log service, and distributed storage servers. None of these abstractions is available as a ready-to-use service, which makes it difficult to enter the cloud database market. The current cloud landscape is more a result of historical circumstances than optimal design, leaving ample room for disruption.
In this talk, I will outline a blueprint for reinventing the cloud by focusing on three key areas: First, we need a unified multi-cloud abstraction over virtualized hardware. Second, we should establish new open standards for existing low-level cloud services. Third, we need abstractions that simplify the creation of new cloud services, such as reusable control planes and foundational components like log services and page servers. Together, this will make it significantly easier to build, deploy, and monetize new cloud services. Increased competition would commoditize foundational services and spur technological innovation.
Firebolt Transactions: Consistency, Performance and Availability - Pick All Three
Benjamin Wagner
Firebolt
Firebolt is a data warehouse built for data-intensive applications. To support these workloads, our metadata services enable:
- An unlimited number of concurrent writers across a region
- Strong consistency with snapshot isolation
- Low overhead for read-only transactions (~2ms) on petabytes of data
- Powerful metadata operations such as zero-copy cloning and time travel
This talk provides a deep-dive into how we built Firebolt’s metadata services on top of FoundationDB. We focus on how to leverage the underlying key-value space in a way that supports low-latency transactions. Based on this, we describe our internal API design as well as dependent services such as metadata snapshot compaction and garbage collection. Finally, we describe how we deploy our service on AWS to minimize network latency.
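The core idea of low-overhead, snapshot-isolated reads over a key-value space can be sketched with a tiny versioned store: a read-only transaction pins a commit version and sees only writes at or below it. This is a hypothetical illustration, not Firebolt's actual FoundationDB key layout; all names are invented.

```python
# Toy versioned key-value store illustrating snapshot-isolated reads,
# in the spirit of metadata services built on a transactional KV store.
# Hypothetical sketch only -- not Firebolt's real FoundationDB schema.

class VersionedStore:
    def __init__(self):
        self.data = {}            # key -> list of (commit_version, value)
        self.commit_version = 0

    def write(self, updates):
        """Commit a batch of key -> value updates at a fresh version."""
        self.commit_version += 1
        for key, value in updates.items():
            self.data.setdefault(key, []).append((self.commit_version, value))
        return self.commit_version

    def snapshot(self):
        """A read-only transaction pins the current commit version."""
        return Snapshot(self, self.commit_version)

class Snapshot:
    def __init__(self, store, version):
        self.store, self.version = store, version

    def read(self, key):
        # Newest version at or below the snapshot version wins.
        for ver, val in reversed(self.store.data.get(key, [])):
            if ver <= self.version:
                return val
        return None

store = VersionedStore()
store.write({"table/t1": "schema-v1"})
snap = store.snapshot()                   # pins version 1
store.write({"table/t1": "schema-v2"})    # commits version 2
print(snap.read("table/t1"))              # -> schema-v1 (later write invisible)
print(store.snapshot().read("table/t1"))  # -> schema-v2
```

In this model, reading at an older pinned version is, in effect, time travel, and a zero-copy clone can be pictured as a new name that references the same pinned version rather than copied data; the talk's actual mechanisms on FoundationDB are of course far richer.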