Zendikt

Data Lakehouse

Independent 2026 ranking of data lakehouse platforms and open table formats. Databricks vs Snowflake, Iceberg vs Delta vs Hudi, honest pricing, real residency picks.

Products tracked: 10
Last verified: 2026-05-27
Re-verified every 90 days
Editorial verdict

Databricks and Snowflake are the two enterprise-grade lakehouse platforms, now both pretending to be format-neutral after Databricks acquired Tabular (the Iceberg startup) for a reported $1B+ in Jun 2024 and Snowflake announced the open-source Polaris Catalog the same month. Apache Iceberg is winning the open-table-format war on hyperscaler buy-in (AWS, GCP, Microsoft, Snowflake); Delta Lake remains strongest inside Databricks, with Delta UniForm providing interop; Apache Hudi is the streaming-first specialist most common in Uber-origin shops. AWS Lake Formation, Google BigLake, and Microsoft Fabric OneLake are the hyperscaler-native lakehouse offerings, each tied most tightly to its own object store. Dremio and Starburst are the query-engine specialists for buyers separating storage from compute. The honest 2026 view: pick Iceberg as your open table format unless you are deep on Databricks, and treat catalog choice (Polaris, Unity Catalog, Glue, Nessie) as the lock-in decision that matters more than the engine.
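
The format guidance in the verdict can be written down as a small decision helper. This is a minimal sketch, not a Zendikt tool: the `Workload` fields, the function name, and the order of the checks are assumptions made for illustration, and the Hudi branch reflects its streaming-first niche rather than a blanket recommendation.

```python
from dataclasses import dataclass


@dataclass
class Workload:
    # Both fields are hypothetical inputs invented for this sketch.
    primary_engine: str           # e.g. "databricks", "snowflake", "athena"
    streaming_upsert_heavy: bool  # CDC or frequent upserts dominate the workload


def pick_table_format(w: Workload) -> str:
    """Return a starting-point table format following the editorial verdict."""
    # Deep on Databricks: Delta is the native format there.
    if w.primary_engine.strip().lower() == "databricks":
        return "delta"
    # Streaming-first, update-heavy shops are where Hudi keeps its niche.
    if w.streaming_upsert_heavy:
        return "hudi"
    # Default: Iceberg, on the strength of hyperscaler buy-in.
    return "iceberg"


if __name__ == "__main__":
    print(pick_table_format(Workload("athena", False)))
```

The helper deliberately ignores catalog choice, which the verdict treats as the larger lock-in decision and which does not reduce to a two-field rule.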

All 10 products, ranked

  1. #1

    Databricks Lakehouse Platform

    G2 4.5 (580)

    Delta Lake-native lakehouse with Unity Catalog and Mosaic AI; Iceberg-aware after Tabular acquisition.

    Databricks is the enterprise lakehouse leader, unifying data engineering, analytics, and ML/AI training on Delta Lake + Unity Catalog. The Jun 2024 acquisition of Tabular (the Iceberg-creator-led startup) for a reported $1B+ creates obvious tension because Databricks is the lead maintainer of Delta Lake, the rival format to Iceberg; the public position is that Databricks will support both via Delta UniForm and through ongoing Iceberg contribution. Its private valuation was $43B in Sept 2023 and a reported $62B in a later round, with a 2026 IPO widely expected but not confirmed. Trade-offs: DBU pricing is complex, and SQL-only buyers often find Snowflake simpler.

    Pricing: ◐ Partial
    Vendor trust: 7.7/10
    Best fit: 200-100,000+
    Reviews analyzed: -
  2. #2

    Snowflake + Polaris Catalog

    G2 4.5 (680)

    Cloud-neutral managed lakehouse with native Iceberg and open-sourced Polaris Catalog.

    Snowflake (NYSE:SNOW) made a genuine strategic shift toward open lakehouse architecture in 2024: native Iceberg tables reached read/write parity with internal tables, and the Polaris Catalog, an implementation of the Apache Iceberg REST catalog, was announced in Jun 2024 and released as open source shortly afterward. The honest reading is that this is a real bet on Iceberg interop, partly defensive against Databricks-on-Delta and partly offensive into the open-format buyer segment. The trade-off: whether enterprise customers actually benefit depends on which catalog they pick, and Snowflake's credit-based pricing remains easy to overspend on without governance. Best fit for SQL-first enterprises wanting an open format with managed SaaS.

    Pricing: ◐ Partial
    Vendor trust: 7.8/10
    Best fit: 200-100,000+
    Reviews analyzed: -
  3. #3

    AWS Lake Formation + Iceberg

    G2 4.2 (140)

    AWS-native lakehouse: Glue Catalog, Lake Formation governance, and S3 Tables for Iceberg.

    AWS Lake Formation is the AWS-native lakehouse governance layer over S3, with AWS Glue Data Catalog as the metadata store and Lake Formation managing fine-grained access controls. The S3 Tables announcement at re:Invent 2024 made Iceberg a first-class S3 bucket type, removing the need for a separate Iceberg metastore for many AWS-native pipelines. The lakehouse engines on top are Athena, EMR, Redshift Spectrum, and Glue ETL. Strengths: deep AWS integration, IAM-native access, and Iceberg-native S3. Trade-offs: the fit narrows sharply for teams not anchored on AWS, governance UX is more workmanlike than Unity Catalog, and pricing fragments across Glue, Lake Formation, S3 Tables, and the chosen query engine.

    Pricing: ● Transparent
    Vendor trust: 8.3/10
    Best fit: 50-100,000+
    Reviews analyzed: -
  4. #4

    Google BigLake

    G2 4.4 (110)

    BigQuery engine over open table formats: Iceberg, Hudi, and Delta on Cloud Storage.

    BigLake is Google Cloud's lakehouse layer that lets BigQuery (and other GCP engines including Dataproc Spark and Dataflow) query Apache Iceberg, Apache Hudi, and Delta Lake tables on Cloud Storage with the same governance model as native BigQuery tables. The fit: GCP-anchored teams who already use BigQuery as the analytics engine and want to add lakehouse semantics over open formats without operating a separate platform. Strengths: tightest integration with BigQuery, Looker, and Vertex AI; native Iceberg, Hudi, and Delta support; and serverless query economics. Trade-offs: the fit narrows sharply for teams not anchored on GCP, and cross-cloud egress economics favor staying inside GCP.

    Pricing: ● Transparent
    Vendor trust: 8.8/10
    Best fit: 50-100,000+
    Reviews analyzed: -
  5. #5

    Microsoft Fabric OneLake

    G2 4.4 (380)

    Microsoft's unified lakehouse store: Delta-native, with Iceberg via shortcuts and Power BI bundle economics.

    OneLake is the unified data lake layer underneath Microsoft Fabric, announced with Fabric in May 2023 and using Delta Lake as its native open format. The 2024-2025 additions of OneLake shortcuts to Iceberg tables (in S3, ADLS, and elsewhere) and the broader Fabric Iceberg interop make OneLake the closest thing to a multi-format lakehouse store from Microsoft. The honest framing: OneLake wins deals through Power BI Premium bundle pricing and Microsoft 365 procurement leverage, not because the underlying engine is best-in-class. Capacity Unit (CU) pricing complexity remains the main cost-forecasting issue.

    Pricing: ◐ Partial
    Vendor trust: 8.3/10
    Best fit: 500-100,000+
    Reviews analyzed: -
  6. #6

    Apache Iceberg

    G2 4.6 (90)

    The winning open table format of 2025-2026 by hyperscaler buy-in.

    Apache Iceberg is the open table format originated at Netflix in 2017, donated to the Apache Software Foundation, and now the de facto winner of the open-table-format war in 2025-2026 on the strength of hyperscaler buy-in. AWS (S3 Tables, Athena, EMR, Redshift), Google (BigLake, BigQuery), Microsoft (Fabric via shortcuts), and Snowflake all support Iceberg as a first-class format. Databricks acquired Tabular (the company founded by Iceberg creators Ryan Blue and Daniel Weeks) in Jun 2024 for a reported $1B+, which brought core Iceberg engineering talent into the Delta Lake-stewarding company; the public position is dual-format support. The honest read: pick Iceberg unless you are deep on Databricks.

    Pricing: ● Transparent
    Vendor trust: 8.9/10
    Best fit: 50-100,000+
    Reviews analyzed: -
  7. #7

    Delta Lake

    G2 4.5 (60)

    Databricks-led open table format with Iceberg interop via Delta UniForm.

    Delta Lake is the open table format created at Databricks, open-sourced under the Linux Foundation in 2019, and the native format for the Databricks Lakehouse Platform. It remains strong inside Databricks (Unity Catalog assumes Delta as the default) and has hedged against the Iceberg-dominant 2025-2026 landscape via Delta UniForm (2024), which writes Iceberg metadata alongside the Delta log so external engines can read Delta tables as if they were Iceberg. The honest framing: if Databricks is your primary engine, Delta is the right format; if you want format neutrality across hyperscalers, Iceberg is winning. Microsoft Fabric OneLake also uses Delta natively, which keeps Delta relevant outside Databricks.

    Pricing: ● Transparent
    Vendor trust: 8.8/10
    Best fit: 50-100,000+
    Reviews analyzed: -
  8. #8

    Apache Hudi + Onehouse

    G2 4.3 (35)

    Streaming-first open table format from Uber, with Onehouse as commercial managed offering.

    Apache Hudi is the open table format originated at Uber in 2016-2017 and donated to the Apache Software Foundation, designed from day one for streaming-first and record-update-heavy workloads (CDC, real-time ingestion, frequent upserts). Onehouse is the commercial managed offering founded by Hudi creator Vinoth Chandar in 2021, with a multi-format strategy (Hudi, Iceberg, Delta via Apache XTable). The honest framing: Hudi has lost the broader open-table-format war to Iceberg on hyperscaler buy-in, but retains a defensible niche in streaming-first and CDC-heavy workloads where its incremental processing model is genuinely differentiating. Best fit for Uber-origin shops and streaming-heavy data engineering teams.

    Pricing: ◐ Partial
    Vendor trust: 8.6/10
    Best fit: 50-50,000+
    Reviews analyzed: -
  9. #9

    Dremio

    G2 4.4 (95)

    Lakehouse-native query engine on Iceberg with Project Nessie Git-for-data catalog.

    Dremio is the lakehouse-native query engine purpose-built for SQL on Apache Iceberg tables in S3/ADLS/GCS, with Project Nessie as the Git-for-data catalog. The fit: teams that want to separate their storage vendor from their compute vendor, keep their data in Iceberg in their own object store, and use Dremio as the engine without committing to Databricks or Snowflake compute. A $160M Series E in Jan 2022 valued the company at $2B and brought total funding to roughly $410M; no significant funding rounds have been publicly disclosed since. Strengths: Iceberg-first engineering, Nessie data versioning, and reflections (an acceleration layer) for sub-second BI. Trade-offs: smaller market presence than Databricks/Snowflake and a narrower ecosystem.

    Pricing: ◐ Partial
    Vendor trust: 7.8/10
    Best fit: 100-5,000+
    Reviews analyzed: -
  10. #10

    Starburst

    G2 4.4 (115)

    Managed Trino with multi-format lakehouse support and Stargate federation.

    Starburst is the commercial company behind Trino (the open-source distributed SQL query engine, formerly PrestoSQL). It offers Trino as Starburst Galaxy (SaaS) and Starburst Enterprise (self-hosted), with multi-format lakehouse support (Iceberg, Delta, Hudi) and Stargate federation across data sources. Series D $250M raised in Feb 2022 at $3.35B valuation; no major funding round has been publicly disclosed since. Strengths: federated query across lakehouse plus operational data sources (Postgres, MySQL, Mongo, etc.), Trino community heritage, and multi-format support. Trade-offs: smaller than Databricks/Snowflake, and its primary value is federation rather than being a one-stop lakehouse.

    Pricing: ◐ Partial
    Vendor trust: 7.8/10
    Best fit: 100-10,000+
    Reviews analyzed: -
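
The Delta Lake entry above describes Delta UniForm as writing Iceberg metadata so external engines can read a Delta table as Iceberg. As a rough illustration, the sketch below builds the kind of `ALTER TABLE` statement used to turn that on. The three table property names are given as we recall them from the Delta Lake documentation and should be checked against your runtime version; the helper name and the table name `main.sales.orders` are hypothetical.

```python
def uniform_alter_statement(table_name: str) -> str:
    """Build a SQL statement that enables Iceberg metadata generation on a Delta table."""
    # Property names are assumptions recalled from the Delta Lake docs;
    # verify them against the Delta or Databricks runtime you actually run.
    props = {
        "delta.columnMapping.mode": "name",
        "delta.enableIcebergCompatV2": "true",
        "delta.universalFormat.enabledFormats": "iceberg",
    }
    body = ",\n  ".join(f"'{key}' = '{value}'" for key, value in props.items())
    return f"ALTER TABLE {table_name} SET TBLPROPERTIES (\n  {body}\n)"


if __name__ == "__main__":
    # Hypothetical three-level table name, for illustration only.
    print(uniform_alter_statement("main.sales.orders"))
```

The helper only produces the statement text; running it still requires a Spark or Databricks SQL session with Delta Lake available.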

How we rank data lakehouse products

We evaluated 14 lakehouse platforms and open table formats across six weighted factors: open-format support and neutrality (25%), governance and catalog quality (20%), value and pricing transparency (15%), scalability (15%), ease of use (15%), and ecosystem integrations (10%). Pricing was verified Mar-May 2026 from vendor public pricing pages, AWS/GCP/Azure marketplace listings, and primary press releases. Open-source projects (Apache Iceberg, Delta Lake, Apache Hudi) were evaluated on GitHub commit activity, contributor diversity, and downstream production references. Customer counts and valuations are cited only where publicly disclosed; we deliberately avoid inventing query latency or customer-count numbers. Radar scores are normalized within category, not cross-category.
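
The six weights above sum to 100%, so a composite score is a plain weighted average of the per-factor scores. The sketch below shows that arithmetic only: the factor keys, the 0-10 scale, and the function name are assumptions for illustration, and the page does not publish the per-factor scores behind each rank.

```python
import math

# Weights as stated in the methodology; the key names are invented for this sketch.
WEIGHTS = {
    "open_format": 0.25,
    "governance": 0.20,
    "value": 0.15,
    "scalability": 0.15,
    "ease_of_use": 0.15,
    "ecosystem": 0.10,
}

# Sanity check: the published percentages must add up to 100%.
assert math.isclose(sum(WEIGHTS.values()), 1.0)


def weighted_score(factor_scores: dict[str, float]) -> float:
    """Combine per-factor scores (assumed to share one scale, e.g. 0-10) into one number."""
    missing = WEIGHTS.keys() - factor_scores.keys()
    if missing:
        raise ValueError(f"missing factor scores: {sorted(missing)}")
    return sum(WEIGHTS[name] * factor_scores[name] for name in WEIGHTS)


if __name__ == "__main__":
    # Made-up scores for a hypothetical product, not data from this ranking.
    example = {
        "open_format": 9.0,
        "governance": 8.0,
        "value": 6.0,
        "scalability": 9.0,
        "ease_of_use": 7.0,
        "ecosystem": 8.0,
    }
    print(round(weighted_score(example), 2))
```

Because open-format support carries the largest weight, a one-point change there moves the composite by 0.25, versus 0.10 for ecosystem integrations.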

What you get in this category
  • 10 products with full intelligence profile
  • Verified pricing crowdsourced from real buyers
  • Vendor trust scores independent of product quality
  • Review patterns from G2, Capterra, Reddit, Trustpilot
  • Quarterly re-verification of all data