A Tabular newsletter revisiting the last month in Apache Iceberg
❤️ Apache Iceberg? Spread the word by giving it a ⭐ on the apache/iceberg repo!
Project updates
Iceberg Java
- Spark: Added support for enabling executor cache locality. When enabled, this property increases the probability of cache hits on records in a partition. It works by co-locating tasks with the cached data for the partition they read from.
- The new docs site is live! This will make it easier for the community to maintain docs and simplify the release process going forward. Check out the README for information on how to get started contributing.
- More documentation on sharing credentials via REST was added, covering S3, GCP, and Azure. This gives more flexibility to teams using service or utility accounts.
- Spark: Fixed aggregate pushdown for nested fields. You’ll see fewer errors when performing aggregation pushdowns on struct fields.
- Spark: Added support for setting/unsetting properties on views.
- Spark: Ensured that a view representation isn’t lost when describing or showing views from other engines.
- Added view support to the JDBC catalog.
- Added FileIO and APIs for encrypting metadata, in preparation for the upcoming v3 encryption support.
PyIceberg, iceberg-go, and iceberg-rust
- PyIceberg 0.6.0 is now available and includes write support! For more details and insights, check out this discussion with contributors Fokko Driesprong and Kevin Liu, hosted by Brian (Bits) Olsen.
- Iceberg-rust made its debut with its first release, version 0.2.0! This was a huge undertaking, with reviews and guidance from Fokko Driesprong and heavy contributions from Renjie Liu, Xuanwo, JanKaul, and many other passionate contributors. Congratulations to everyone who was involved!
- Iceberg-go added REST and Glue catalog support.
Iceberg Summit 2024
The Apache Software Foundation and the Apache Iceberg PMC have agreed to allow Tabular and Dremio to jointly organize the first Iceberg Summit as a fully virtual event on May 14-15. The event will host dozens of technical talks covering the experiences of data practitioners and developers working with Apache Iceberg as their table format.
If you are interested in speaking at Iceberg Summit 2024, you can submit your talk proposal here. The call for presentations will be open until April 12. The event is free to attend. You can register to attend here.
Bergy Blogs
- Introducing write support in PyIceberg 0.6.0
- Recipe: PyIceberg writes
- Accelerating queries on Iceberg tables with materialized views
- Implement an aggregation index for Apache Iceberg
- Introduction to Apache Iceberg
- Registering S3 files into Apache Iceberg tables - without the rewrites
- A deep dive into the world of Apache Iceberg catalogs
- Building a data lakehouse with Apache Iceberg, a primer
Podcasts / Videos
- Bergy Bits: Apache Iceberg view support
- Seven best practices for a successful Iceberg implementation
- Using Trino and Iceberg as the foundation of your data lakehouse
- Bergy Bits: Take care of that snake write
- What is Apache Iceberg – Nice lightboard introduction to Iceberg.
Ecosystem Updates
- Big Data Formats: An in-depth exploration
- The data lakehouse is on the horizon, but it’s not smooth sailing yet
- Lakehouse at Fortune 1 scale
Vendor Updates
- RealTime Lakehouse with Apache Flink and Apache Iceberg
- How to implement CDC using Debezium, Kafka, and Starburst Galaxy
- Exploring Global Internet Speeds using Apache Iceberg and ClickHouse
- An Overview of Snowflake Apache Iceberg Tables
- Building Modern Data Architectures with Iceberg, Tabular and MinIO
- Amazon is using Open Cybersecurity Schema Framework and Iceberg tables in Security Lake.
- Creating Snowflake-managed Iceberg tables
- A new Iceberg connector for RisingWave.
Iceberg Resources
🏁 Get Started with Apache Iceberg.
🧑‍🍳 Get cookin’ with recipes from the Apache Iceberg Cookbook.
👩‍🏫 Learn more about Apache Iceberg on the official Apache site.
📺 Watch and subscribe to the Iceberg YouTube Channel.
📰 Read up on some community blog posts.
🫴🏾 Contribute to Iceberg.
👥 `SELECT * FROM you JOIN iceberg_community`.
📬 Subscribe to the Apache Iceberg mailing list.