Yesterday Snowflake released its Q4 FY 2024 earnings and announced that Frank Slootman is retiring. The stock proceeded to drop by ~20% after hours. Losing Frank Slootman as CEO is obviously of huge significance, but the earnings call revealed other worrying trends. One of them is Apache Iceberg.
Iceberg was mentioned no fewer than 18 times during the company’s earnings call, second only to “AI” (28), as per my very non-scientific analysis of the call. Now, why would Snowflake’s CEOs, both incoming and outgoing, alongside their CFO, mention a somewhat obscure Apache project on the call? The answer is simple: Iceberg is moving data out of Snowflake.
“We are forecasting increased revenue headwinds associated with product efficiency gains, tiered storage pricing, and the expectation that some of our customers will leverage Iceberg tables for their storage.” Source: Snowflake (SNOW) Q4 2024 Earnings Call Transcript
The impact is not limited to storage revenue; the more valuable compute revenue is at risk too.
“So, the amount of revenue associated with storage is coming down. But on top of that, we do expect a number of our large customers are going to adopt Iceberg formats and move their data out of Snowflake where we lose that storage revenue and also the compute revenue associated with moving that data into Snowflake.” Source: Snowflake (SNOW) Q4 2024 Earnings Call Transcript
Apache Iceberg is not only moving data out of Snowflake; it is also letting customers process and analyze that data in other systems. Both storage and the highly lucrative compute revenue suffer.
So what is Apache Iceberg, and why is it enabling this data migration off the world’s most popular data warehouse?
A brief tour of Iceberg
“Iceberg is a high-performance format for huge analytic tables. Iceberg brings the reliability and simplicity of SQL tables to big data, while making it possible for engines like Spark, Trino, Flink, Presto, Hive and Impala to safely work with the same tables, at the same time.” Source: Apache Iceberg
In non-data-nerd terms, Iceberg is a technology that lets you create tables with the same transactional guarantees that databases offer, but on storage systems that do not natively support tables or transactions, like AWS S3. Not only that, but the data in Iceberg tables is stored in open, non-proprietary file formats: Parquet, ORC and Avro.
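To make the "transactions on top of plain object storage" idea a bit more concrete, here is a deliberately tiny toy model in Python. It is not Iceberg's code or API. The class names (`Snapshot`, `ToyTable`), the method names and the file-naming scheme are all made up for illustration. It only sketches the general pattern Iceberg relies on: data files are written once and never modified, a snapshot lists which files make up the table, and a commit succeeds only if the table's "current snapshot" pointer has not moved since the writer started.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Snapshot:
    """An immutable list of the data files that make up the table at one point in time."""
    snapshot_id: int
    data_files: tuple


class ToyTable:
    """Hypothetical stand-in for a table format layered on an object store."""

    def __init__(self):
        self.store = {}                 # stands in for S3: key -> list of rows
        self.current = Snapshot(0, ())  # stands in for the catalog's pointer

    def commit_append(self, base, rows):
        """Write a new data file, then publish it only if nobody committed since `base`."""
        key = f"data/{len(self.store):05d}.parquet"  # made-up naming scheme
        self.store[key] = list(rows)    # written once, never modified afterwards
        if self.current != base:        # another writer won the race
            return False                # the new file is never referenced by any snapshot
        self.current = Snapshot(base.snapshot_id + 1, base.data_files + (key,))
        return True

    def scan(self, snapshot=None):
        """Read every row visible in the given snapshot (default: the current one)."""
        snap = self.current if snapshot is None else snapshot
        return [row for key in snap.data_files for row in self.store[key]]


table = ToyTable()
base = table.current

ok_a = table.commit_append(base, [("alice", 10)])  # first writer commits
first = table.current
ok_b = table.commit_append(base, [("bob", 20)])    # second writer used a stale base
retry = table.commit_append(table.current, [("bob", 20)])  # retry on the fresh base

rows_now = table.scan()
rows_then = table.scan(first)  # reading an older snapshot still works
```

In this sketch the second writer is rejected rather than silently overwriting the first, and readers always see a complete snapshot. Real Iceberg adds a lot more (manifest files, schema and partition evolution, a real catalog), so treat this as an intuition pump only.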
Iceberg is cheaper and tremendously more flexible than data warehouses like Snowflake. Data stored in Iceberg tables can be queried with different query engines like Dremio, Spark, Trino and others. In short, Iceberg allows you to create the backend of a data warehouse at a fraction of the cost of a data warehouse like Snowflake, and gives you the freedom to store your data in different formats and use the query engines of your own choice. Iceberg gives you data and compute choices.