StarTree To Support Apache Iceberg In A Bid To Expand Lakehouse Use Cases

StarTree Inc., which sells a real-time analytics platform and cloud service based on the Apache Pinot open-source online analytical processing database, today becomes the latest data analytics provider to announce full support for Apache Iceberg.

The StarTree Cloud managed service can now act as the analytic and serving layer on top of Iceberg-based data lakehouses, effective today. The company said the move creates new use cases for Iceberg in real-time applications requiring high concurrency across thousands of simultaneous users. In particular, it enables Iceberg to be more easily applied to customer-facing scenarios where organizations want to expose data externally without relying on complex, multi-step pipelines.

Iceberg is a management layer that sits atop data files in cloud storage to improve consistency, manageability and query performance. It has been rapidly gaining acceptance as a de facto table standard, replacing an assortment of proprietary alternatives.

Iceberg provides transactional access to structured files in formats such as Parquet, a columnar storage file format optimized for efficient read/write access to large analytical datasets. However, Iceberg lacks native capabilities to process low-latency, high-concurrency queries.

For this reason, organizations have typically extracted Iceberg data into separate systems, such as key-value stores or proprietary formats, to achieve subsecond responsiveness. Those approaches require engineering-intensive pipelines and data duplication, and they limit flexibility.

Query complexity

“Not only are you duplicating data, you’re amplifying the data itself because you have to materialize all combinations of your dimensions and metrics to make it easy to query in a key-value store-like fashion,” said Chinmay Soman, StarTree’s head of product.
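The amplification Soman describes can be illustrated with some rough arithmetic. The sketch below is not StarTree's method; it assumes three hypothetical dimensions with made-up cardinalities and counts the upper bound on keys a key-value store would need if every combination of dimensions were pre-aggregated.

```python
from itertools import combinations
from math import prod

# Hypothetical dimension cardinalities, chosen only for illustration.
cardinalities = {"merchant": 1000, "country": 50, "day": 365}


def materialized_key_count(cards):
    """Upper bound on keys when every subset of dimensions is pre-aggregated.

    Each subset of dimensions (including the empty subset, which is the
    grand total) contributes one key per combination of its values.
    """
    names = list(cards)
    total = 0
    for size in range(len(names) + 1):
        for subset in combinations(names, size):
            total += prod(cards[name] for name in subset)
    return total


if __name__ == "__main__":
    print(materialized_key_count(cardinalities))
```

The count equals the product of (cardinality + 1) across dimensions, so each added dimension multiplies the number of keys to maintain. Real data is usually sparser than this upper bound, but the growth pattern is the same.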

StarTree said it enables direct querying of Iceberg tables without the need to move or transform the underlying data. The integration supports open formats and leverages performance-enhancing features, including Pinot indexing and materialization, local caching and intelligent prefetching.

“Data products today increasingly rely on historical data from lakehouses, but the serving layer has been missing,” said Chief Marketing Officer Chad Meley. “By querying Iceberg directly with subsecond latency, we’re eliminating the need for intermediate pipelines, duplicate storage and external databases.”

Executives said Iceberg support expands StarTree’s addressable market beyond its original focus on streaming and low-latency analytics. “This is certainly a new use case for us,” Meley said. “The primary challenge we’re solving is no longer just about data freshness. It’s about helping customers build scalable data products without all the bloat and complexity.”

StarTree enables various indexes and pre-aggregated materializations to be defined directly on Iceberg tables. Indexes for numerical data, text, JavaScript Object Notation, geospatial data and other types can be distributed locally on compute nodes or stored in object storage.

Soman said the integration is based on work StarTree had already done to query Parquet files and S3-based object storage. “Parquet is not designed for random read access, but we’ve adapted Pinot to use it as a forward index,” he said. “Combining that with our understanding of Iceberg manifests and metadata gave us the building blocks we needed.”
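To make the forward-index idea concrete, here is a toy sketch of looking up a single value by row number in a column that is stored as a sequence of pages. It is an illustration only, not Pinot's or Parquet's actual implementation; the `PagedColumn` class and its methods are hypothetical names, and real columnar files add encoding and compression that make random reads more expensive than shown here.

```python
from bisect import bisect_right


class PagedColumn:
    """Toy column split into pages, loosely analogous to pages in a columnar file."""

    def __init__(self, pages):
        self.pages = pages
        # Record the first row number held by each page.
        self.starts = []
        row = 0
        for page in pages:
            self.starts.append(row)
            row += len(page)
        self.num_rows = row

    def value_at(self, row_id):
        """Forward-index lookup: row number in, column value out."""
        if not 0 <= row_id < self.num_rows:
            raise IndexError(row_id)
        # Find the last page whose first row number is <= row_id.
        page_idx = bisect_right(self.starts, row_id) - 1
        return self.pages[page_idx][row_id - self.starts[page_idx]]


if __name__ == "__main__":
    column = PagedColumn([[10, 20], [30], [40, 50, 60]])
    print(column.value_at(4))
```

The point of the sketch is that a random read first has to locate the right page and then the offset within it, which is the kind of access pattern a scan-oriented file format is not primarily built for.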

Data stays in place

The company emphasized that its query engine still uses proprietary indexing strategies to achieve performance, but that the data itself remains in open formats. “We’re not moving data from Iceberg into StarTree’s proprietary format,” Meley said. “The only thing proprietary in this case would be the index.”

Support for Iceberg enables customers such as financial technology firms to use StarTree to power merchant-facing dashboards that report historical cash flow or cohort revenue metrics. Transportation and logistics organizations are building interactive dashboards to review delivery performance, error rates and route efficiency over time. In both cases, data doesn’t need to be real-time, but must still be served with strict service level agreements to large user bases.

Paul Nashawaty, principal analyst at theCUBE Research, SiliconANGLE’s sister market research firm, said the approach addresses a growing gap in modern data architecture. “Iceberg adoption is accelerating, but most query engines can’t meet the performance SLAs of customer-facing applications,” he said. “StarTree’s ability to serve Iceberg data at high concurrency without duplication is a timely advancement.”

Soman said there are minor performance tradeoffs in using Iceberg instead of Pinot’s proprietary native format, but that Pinot is still capable of handling hundreds of queries per second with subsecond latencies.

Meley said that the decision to support Iceberg reflects both market momentum and practical customer needs. “All of our customers are asking about Iceberg,” he said. “It’s becoming the standard for lakehouse storage, and this allows us to support that natively while simplifying the architecture for serving data products.”

Photo: Pixabay
