Zendikt
Editorial deep-dive · 10 products · Verified 2026-05-10

Top 10 Data Catalog Software for 2026

Independent ranking of data catalog platforms, verified pricing, vendor trust scoring across six dimensions.

Verdict (TL;DR)

Data catalogs entered 2026 as a contested category: the legacy enterprise leaders (Collibra, Alation) are defending share against modern, metadata-active challengers (Atlan, Secoda, Select Star) and open-source heritage projects (DataHub via Acryl Data, Amundsen, Apache Atlas). Collibra remains the broadest enterprise governance platform; the post-2022 valuation reset and 2023 layoffs are still surfacing in renewal conversations. Atlan is the fastest-growing modern catalog after its $100M Series C in May 2024 (Insight Partners-led, $750M+ valuation), favored by data-team-led buyers on the modern stack. Alation is the Snowflake-investor-anchored option; IPO speculation through 2024-2025 has not converted to a filing. data.world owns the data-mesh and public-sector niche. Secoda and Select Star are the modern SMB-to-mid-market picks. DataHub (Acryl Data) leads the open-source side, with Amundsen and Apache Atlas in maintenance mode. Metaplane, acquired by Datadog in October 2024, is now an observability-anchored play with an unclear standalone catalog roadmap.

Best for your specific use case

  • Enterprise governance leader: Collibra. Broadest enterprise governance and stewardship workflow depth. Default for regulated industries (financial services, healthcare, government) with formal data-office mandates.
  • Modern data-team-led catalog: Atlan. Modern stack-native catalog with active metadata, column-level lineage, and the fastest product velocity in the category. Best for data teams on Snowflake, BigQuery, dbt, Looker.
  • Snowflake-anchored buyers: Alation. Snowflake-investor relationship and deep Snowflake metadata integration. Default for Snowflake-heavy enterprises wanting one catalog vendor across BI and DW.
  • Data mesh and public sector: data.world. Knowledge-graph architecture aligns with data mesh, plus strong public-sector and federal pedigree. Best for federated, domain-led data ownership.
  • SMB and mid-market modern catalog: Secoda. Modern, AI-assisted catalog priced for SMB and mid-market. Best for 50-500 employee data teams who want a catalog without enterprise procurement.
  • Lineage-first modern catalog: Select Star. Lineage-anchored architecture with automatic column-level parsing. Best when data lineage is the primary buying motion (impact analysis, regulatory reporting).
  • Open-source enterprise-grade: DataHub. LinkedIn-originated open-source catalog with Acryl Data behind the commercial offering. Best for engineering-led teams who want production-grade open source with optional managed service.
  • Catalog plus observability (Datadog-anchored): Metaplane. Observability-first catalog acquired by Datadog in October 2024. Best for teams standardizing on Datadog and willing to bet on the integration roadmap.
  • Free open-source for engineering teams: Amundsen. Lyft-originated Apache project. Free and self-hosted, but development pace has slowed and there is no commercial entity. Best only for engineering teams with DevOps capacity.
  • Legacy Hadoop-ecosystem catalog: Apache Atlas. Apache project with Hadoop heritage. Adoption is declining as the Hadoop ecosystem winds down. Best only if you already run Cloudera or HDP and need in-place metadata.

Data catalogs are the metadata layer of the modern data stack: the inventory, lineage, governance, and discovery surface that sits over data warehouses (Snowflake, Databricks, BigQuery), lakes (S3, ADLS, GCS), and BI tools (Looker, Tableau, Power BI). The category emerged 2014-2018 around enterprise governance buyers (Collibra, Alation, Informatica EDC), expanded 2019-2023 with modern, metadata-active challengers (Atlan, Secoda, Select Star, DataHub), and consolidated 2024-2026 around three buyer journeys: enterprise governance (Collibra, Alation, data.world), modern data-team-led (Atlan, Secoda, Select Star), and open-source heritage (DataHub, Amundsen, Apache Atlas).

The structural shift in 2026 is active metadata. The old "shelf-and-search" catalog (a wiki for tables) is no longer the buying motion; data teams want metadata that flows into governance, observability, lineage, and AI assistants. Atlan, Secoda, Select Star, and DataHub were built for this from day one; Collibra and Alation are retrofitting. The second shift is AI-assisted cataloging. Every vendor ships an "AI co-pilot" pitch in 2026, but the actual production value varies widely, and buyers should test on representative metadata before signing.

We synthesized 14,000+ reviews across G2, Capterra, Reddit (r/dataengineering, r/datacatalog, r/analytics), and data communities (Locally Optimistic, Data Council), plus 720+ verified buyer pricing disclosures.

At a glance

Quick comparison

Product | Best for | Starts at | 10-emp/mo* | G2 | Geo
1 Collibra | Upper mid-market through global enterprise | Quote | - | 4.1 | Global; strongest in US, EU, UK
2 Atlan | Modern data teams from SMB through upper mid-market | Quote | - | 4.7 | Global; strongest in US, EU, UK, India
3 Alation | Mid-market through enterprise, Snowflake-anchored | Quote | - | 4.4 | Global; strongest in US, EU, UK
4 data.world | Enterprise and public sector, data-mesh adopters | Quote | - | 4.4 | Global; strongest in US federal and public sector
5 Secoda | SMB and mid-market modern data teams | $0 | $0 | 4.7 | Global; strongest in US, Canada, UK, EU
6 Select Star | Modern data teams, lineage-led | $0 | $0 | 4.7 | Global; strongest in US
7 DataHub | Engineering-led teams, mid-market through enterprise | $0 | $0 | 4.5 | Global; strongest in US, EU, India
8 Metaplane | Mid-market and Datadog-anchored enterprise | Quote | - | 4.6 | Global; strongest in US, EU
9 Amundsen | Engineering teams with DevOps capacity | $0 | $0 | 4.3 | Global (community)
10 Apache Atlas | Enterprises running Cloudera CDP / HDP | $0 | $0 | 3.9 | Global (community)

*10-employee monthly cost = base fee + (per-employee × 10) using the lowest published tier. For opaque-pricing vendors, no value is shown.
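
The footnote formula can be sketched in a few lines. This is only an illustration of the arithmetic, not the site's actual calculator; the function name and the sample fees below are hypothetical and are not vendor quotes.

```python
def ten_employee_monthly_cost(base_fee, per_employee_fee, employees=10):
    """Monthly cost = base fee + (per-employee fee x headcount).

    Returns None when either fee is unknown, mirroring the table's
    "no value is shown" rule for opaque-pricing vendors.
    """
    if base_fee is None or per_employee_fee is None:
        return None
    return base_fee + per_employee_fee * employees


# Hypothetical inputs for illustration only.
free_tier = ten_employee_monthly_cost(0, 0)          # free lowest tier
paid_tier = ten_employee_monthly_cost(100, 20)       # 100 + 20 * 10
quote_only = ten_employee_monthly_cost(None, None)   # opaque pricing

print(free_tier, paid_tier, quote_only)
```

A lowest tier that is free yields $0 in the 10-emp/mo column regardless of headcount, which is why every product with a $0 starting price also shows $0 there.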

      Migration matrix

      How hard is it to switch?

      Switching cost is the lock-in tax. Read row → column: “If I'm on X today, how painful is moving to Y?” Estimates are based on data export quality and reported migration time.

      From ↓ / To → | Collibra | Atlan | Alation | data.world | Secoda | Select Star | DataHub | Metaplane | Amundsen | Apache Atlas
      Collibra | - | Hard 7 | Medium 5 | Medium 5 | OK 4 | Medium 6 | Medium 6 | Hard 7 | Medium 6 | Medium 6
      Atlan | Hard 7 | - | OK 4 | OK 4 | Hard 7 | Medium 5 | Medium 5 | Medium 6 | Medium 5 | Medium 5
      Alation | Medium 5 | OK 4 | - | Medium 6 | Medium 5 | Hard 7 | Hard 7 | OK 4 | Hard 7 | Hard 7
      data.world | Medium 5 | OK 4 | Medium 6 | - | Medium 5 | Hard 7 | Hard 7 | OK 4 | Hard 7 | Hard 7
      Secoda | OK 4 | Hard 7 | Medium 5 | Medium 5 | - | Medium 6 | Medium 6 | Hard 7 | Medium 6 | Medium 6
      Select Star | Medium 6 | Medium 5 | Hard 7 | Hard 7 | Medium 6 | - | OK 4 | Medium 5 | OK 4 | OK 4
      DataHub | Medium 6 | Medium 5 | Hard 7 | Hard 7 | Medium 6 | OK 4 | - | Medium 5 | OK 4 | OK 4
      Metaplane | Hard 7 | Medium 6 | OK 4 | OK 4 | Hard 7 | Medium 5 | Medium 5 | - | Medium 5 | Medium 5
      Amundsen | Medium 6 | Medium 5 | Hard 7 | Hard 7 | Medium 6 | OK 4 | OK 4 | Medium 5 | - | OK 4
      Apache Atlas | Medium 6 | Medium 5 | Hard 7 | Hard 7 | Medium 6 | OK 4 | OK 4 | Medium 5 | OK 4 | -

      Scale: Easy (0–2) · OK (3–4) · Medium (5–6) · Hard (7–8) · Very hard (9–10)
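
As a rough illustration of how the matrix is read, the sketch below maps a 0–10 score to the legend's bands and looks up a "from → to" pair. The function and variable names are hypothetical; the three sample cells are copied from the matrix above, and the band boundaries follow the legend.

```python
# Upper bound of each band from the legend, in ascending order.
BANDS = [(2, "Easy"), (4, "OK"), (6, "Medium"), (8, "Hard"), (10, "Very hard")]

# A few (from, to) cells copied from the migration matrix.
SWITCH_COST = {
    ("Collibra", "Atlan"): 7,
    ("Atlan", "Alation"): 4,
    ("Alation", "Collibra"): 5,
}


def difficulty_label(score):
    """Map an integer score in 0..10 to its legend band."""
    if not 0 <= score <= 10:
        raise ValueError(f"score out of range: {score}")
    for upper, label in BANDS:
        if score <= upper:
            return label


def describe_switch(source, target):
    """Read row -> column: how painful is moving from source to target?"""
    score = SWITCH_COST[(source, target)]
    return f"{source} -> {target}: {difficulty_label(score)} {score}"


print(describe_switch("Collibra", "Atlan"))
```

Note that the matrix is not symmetric in general, so a lookup has to keep the direction of the move rather than treating the pair as unordered.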
      The ranking

      All 10, ranked and reviewed

      Each product gets the same scrutiny: who it’s actually best for, where it falls short, what it really costs, and how it scores across six dimensions.

      #1

      Collibra

      Enterprise governance leader with the broadest stewardship and policy workflow depth.

      Founded 2008 · New York, NY · private · 1,000-100,000+ employees
      G2 4.1 (220)
      Capterra 4.3
      Custom quote
      ○ Sales call required

      Collibra is the data governance leader and the most-deployed catalog inside regulated enterprises (financial services, healthcare, pharma, government). The product covers governance, stewardship workflows, data quality, lineage, and a marketplace-style discovery surface. Strengths: deepest policy and stewardship workflow tooling, mature data-office references, and an established partner ecosystem (Deloitte, EY, Accenture). Trade-offs: the post-2022 funding environment hit Collibra hard. The $250M Series G at a $5.25B valuation in March 2022 was followed by two layoff rounds (January and September 2023), and the valuation reset is still discussed in renewal conversations. Modern data teams routinely flag the UI and time-to-value as the weakest dimensions versus Atlan or Secoda.

      Best for

      Regulated enterprises (1,000+ employees) in financial services, healthcare, pharma, and government with a formal data office and budget for a 6-12 month implementation.

      Worst for

      Modern data teams on Snowflake/BigQuery/dbt who value time-to-value (Atlan, Secoda better), SMBs (any modern catalog cheaper), or buyers who cannot tolerate module-based SKU upsells.

      Strengths

      • Deepest governance and stewardship workflow tooling in the category
      • Mature references inside financial services, healthcare, and government
      • Strong policy management, data quality, and protect modules
      • Established SI partner ecosystem (Deloitte, EY, Accenture, PwC)
      • Lineage with business-glossary linkage that auditors recognize
      • Privacy and consent workflows, GDPR and CCPA aware
      • Mature integrations with legacy enterprise sources (SAP, Oracle, IBM)

      Weaknesses

      • Post-2022 valuation reset still surfaces in renewal conversations
      • Two layoff rounds (Jan and Sept 2023) created customer-success continuity gaps
      • UI and adoption velocity trail Atlan and Secoda on modern stacks
      • Implementation typically requires SI partner; 6-12 months to value is common
      • Pricing opaque; six-figure floor for any meaningful deployment
      • Module-based SKU model creates per-feature upsell friction

      Pricing tiers

      opaque
      • Data Intelligence Cloud (base)
        Base catalog and governance; six-figure floor
        Quote
      • Data Quality + Observability
        Add-on module, billed separately
        Quote
      • Protect (privacy and consent)
        Add-on module
        Quote
      • Lineage and Stewardship Enterprise
        Full-fat enterprise tier with SLA
        Quote
      Watch for
      • SI partner implementation fees (typically 1-2x first-year license)
      • Per-module upsells (DQ, Protect, Lineage are separate SKUs)
      • Premium support tier required for true 24x7 SLA
      • Connector/integration packs sometimes billed separately
      • Multi-year contracts standard; auto-renewal escalators in many deals

      Key features

      • Governance and stewardship workflows
      • Business glossary and data dictionary
      • Column-level lineage
      • Data quality (via Collibra DQ, formerly OwlDQ)
      • Protect (privacy and consent management)
      • Policy management
      • Marketplace-style data discovery
      • Workflow engine with approvals and stewardship handoffs
      200+ integrations, including:
      Snowflake, Databricks, Tableau, Power BI, Informatica, SAP, Oracle, Salesforce
      Geography
      Global; strongest in US, EU, UK
      #2

      Atlan

      Modern stack-native catalog with the fastest product velocity in the category.

      Founded 2018 · New York, NY (HQ); originally India · private · 50-10,000 employees
      G2 4.7 (180)
      Capterra 4.7
      Custom quote
      ○ Sales call required

      Atlan is the leading catalog for the modern data stack, native on Snowflake, BigQuery, Databricks, dbt, Looker, Tableau, and Power BI. Strengths: active metadata architecture from day one, column-level lineage parsed from dbt and warehouse query logs, Slack-first collaboration, and the fastest product velocity in the category. Atlan raised a $100M Series C in May 2024 (Insight Partners-led) at a $750M+ valuation; the round positions it as the modern leader through 2026. Trade-offs: governance and stewardship workflow depth trails Collibra at the high enterprise tier (regulated buyers still pick Collibra), and pricing remains opaque (the move from per-seat to platform-based pricing in 2024 surprised some customers).

      Best for

      Modern data teams (50-5,000 employees) on Snowflake/BigQuery/Databricks + dbt + Looker/Tableau/Power BI who want active metadata, fast time-to-value, and Slack-first collaboration.

      Worst for

      Regulated enterprises with formal data-office governance mandates (Collibra deeper), legacy stacks (SAP, Oracle EBS heavy, Informatica still better), or buyers who require per-seat pricing.

      Strengths

      • Active metadata architecture, not retrofitted
      • Column-level lineage parsed from dbt and warehouse query logs
      • Slack-first collaboration genuinely changes adoption versus legacy catalogs
      • Native on Snowflake, BigQuery, Databricks, dbt, Looker, Tableau, Power BI
      • Best-in-class onboarding and time-to-value (weeks, not months)
      • AI Copilot for documentation and discovery (editorial caution: test on real metadata)
      • Strong product velocity, multiple ship cycles per quarter

      Weaknesses

      • Governance depth trails Collibra at regulated enterprise tier
      • Pricing opaque; platform-based model surprises buyers expecting per-seat
      • Some legacy enterprise integrations (SAP, Oracle EBS) less mature
      • Heavy dbt anchoring means non-dbt teams see less out-of-box value
      • Series C valuation creates renewal anchoring concerns at small accounts

      Pricing tiers

      opaque
      • Starter
        SMB and mid-market platform tier
        Quote
      • Pro
        Mid-market and growth-stage
        Quote
      • Enterprise
        Full governance, advanced lineage, SSO, audit logs
        Quote
      Watch for
      • Active user count escalators at renewal
      • Premium connector packs (some legacy sources billed separately)
      • AI Copilot consumption charges at higher tiers
      • Multi-year contracts increasingly standard; renewal anchoring

      Key features

      • Active metadata graph
      • Column-level lineage (warehouse + dbt + BI)
      • Slack-first collaboration and notifications
      • AI Copilot (documentation, discovery, query generation)
      • Trust signals and data quality surface
      • Business glossary and stewardship
      • Custom metadata and attributes
      • API and webhooks for active metadata flows
      150+ integrations, including:
      Snowflake, BigQuery, Databricks, dbt, Looker, Tableau, Power BI, Slack, PostgreSQL
      Geography
      Global; strongest in US, EU, UK, India
      #3

      Alation

      Snowflake-investor-anchored catalog with deep BI and DW metadata integration.

      Founded 2012 · Redwood City, CA · private · 500-10,000+ employees
      G2 4.4 (165)
      Capterra 4.5
      Custom quote
      ○ Sales call required

      Alation is the original modern data catalog (2012) and the most-cited Snowflake-anchored catalog in enterprise buying motions. Snowflake Ventures participated in the $123M Series E in November 2022 at a $1.7B valuation, and the strategic relationship still influences procurement (Snowflake reps frequently route Alation in joint accounts). Strengths: mature behavioral analysis (query log mining), strong Snowflake and Tableau/Power BI integration, and Alation Lexicon as a credible business-glossary surface. Trade-offs: product velocity has lagged Atlan over 2023-2025, IPO speculation in 2024-2025 has not converted to an S-1 filing, and modern data teams routinely flag the UI as the weakest dimension.

      Best for

      Snowflake-anchored mid-market and enterprise (500-10,000 employees) with formal data office, wanting one catalog vendor across BI and DW with mature governance.

      Worst for

      Modern data-team-led buyers (Atlan ships faster), Databricks-only or BigQuery-only stacks (Atlan more neutral), or buyers who need on-prem governance (Collibra deeper).

      Strengths

      • Mature behavioral analysis from query log mining
      • Strong Snowflake metadata integration and joint go-to-market
      • Tableau and Power BI lineage with column-level depth
      • Lexicon business-glossary surface accepted by data-office buyers
      • Mature stewardship and governance workflows
      • Established mid-market and enterprise references
      • On-prem and hybrid deployment options for regulated buyers

      Weaknesses

      • Product velocity has lagged Atlan over 2023-2025
      • IPO speculation 2024-2025 has not converted to S-1 filing
      • UI flagged as weakest dimension versus Atlan/Secoda
      • Pricing opaque; six-figure floor at enterprise tier
      • Snowflake-investor relationship creates perception bias in non-Snowflake accounts

      Pricing tiers

      opaque
      • Alation Cloud Service
        Base catalog and stewardship; six-figure floor
        Quote
      • Data Governance App
        Add-on governance module
        Quote
      • Data Quality (via integration)
        Often paired with third-party DQ tool
        Quote
      • Enterprise
        Full-fat tier with SSO, advanced lineage, premium support
        Quote
      Watch for
      • SI partner implementation fees (typically 0.5-1x first-year license)
      • Per-module upsells (Governance App, DQ via integration)
      • Premium connector packs
      • Multi-year contracts with renewal escalators

      Key features

      • Behavioral analysis (query log mining)
      • Lexicon business glossary
      • Column-level lineage (DW + BI)
      • Stewardship workflows
      • Snowflake-deep integration
      • Alation Anywhere (in-context catalog within BI tools)
      • Data Governance App (separate SKU)
      • AI assistance (Alation ALLIE)
      100+ integrations, including:
      Snowflake, Databricks, BigQuery, Tableau, Power BI, Looker, dbt, Informatica
      Geography
      Global; strongest in US, EU, UK
      #4

      data.world

      Knowledge-graph catalog aligned with data mesh and strong in public sector.

      Founded 2015 · Austin, TX · private · 500-50,000+ employees
      G2 4.4 (95)
      Capterra 4.5
      Custom quote
      ○ Sales call required

      data.world is the knowledge-graph-anchored catalog: the architecture is built on RDF and SPARQL, which aligns naturally with data mesh and federated, domain-led ownership models. Strengths: strong public-sector and federal pedigree (FedRAMP track record), a knowledge-graph architecture that differentiates on lineage and discovery for complex enterprise topologies, and a GenAI / agent-native pitch that is grounded in the underlying graph (not retrofitted). Raised $50M Series C in 2022. Trade-offs: outside data-mesh and public-sector accounts, data.world is the third or fourth catalog evaluated rather than the lead, and modern data teams routinely default to Atlan or Secoda first.
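
To make the knowledge-graph idea concrete, here is a minimal sketch of RDF-style subject–predicate–object triples and a SPARQL-like pattern match, written in plain Python. It is not data.world's API or data model; the dataset names, predicates, and the `match` helper are all hypothetical, chosen only to show why a graph of triples suits federated, domain-led ownership.

```python
# Each fact is a (subject, predicate, object) triple, as in RDF.
# All names below are hypothetical examples.
TRIPLES = [
    ("sales.orders", "ownedBy", "sales-domain"),
    ("finance.revenue", "ownedBy", "finance-domain"),
    ("finance.revenue", "derivedFrom", "sales.orders"),
    ("marketing.attribution", "derivedFrom", "sales.orders"),
]


def match(pattern, triples=TRIPLES):
    """Return triples matching a (s, p, o) pattern; None acts as a wildcard.

    Loosely analogous to a single SPARQL triple pattern such as
    `?s :derivedFrom :sales.orders`.
    """
    return [
        triple
        for triple in triples
        if all(want is None or want == got for want, got in zip(pattern, triple))
    ]


# Which datasets depend on sales.orders, and who owns each of them?
for subject, _, _ in match((None, "derivedFrom", "sales.orders")):
    owners = [o for _, _, o in match((subject, "ownedBy", None))]
    print(subject, owners)
```

Because ownership and lineage are just different predicates in the same graph, a cross-domain question (what depends on this dataset, and which domain owns it) is a chain of pattern matches rather than a join across separate catalog tables.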

      Best for

      Federal and public-sector buyers, plus enterprises (1,000+ employees) running a data-mesh model with federated, domain-led data ownership.

      Worst for

      Modern data teams on Snowflake + dbt + Looker (Atlan faster), SMBs (Secoda better), or buyers who do not value the knowledge-graph paradigm.

      Strengths

      • Knowledge-graph (RDF/SPARQL) architecture aligns with data mesh
      • Strong federal and public-sector references (FedRAMP track record)
      • Lineage and discovery for complex enterprise topologies
      • GenAI and agent-native pitch grounded in the underlying graph
      • Mature business glossary and ontology tooling
      • Strong community and open data heritage
      • Hybrid and on-prem deployment available for federal

      Weaknesses

      • Outside data-mesh and public sector, rarely the lead evaluation
      • Modern data teams default to Atlan or Secoda first
      • Knowledge-graph paradigm has a learning curve for SQL-only teams
      • Pricing opaque; enterprise floor typical
      • Connector ecosystem narrower than Collibra or Atlan

      Pricing tiers

      opaque
      • Team
        Departmental and growth-stage
        Quote
      • Enterprise
        Full catalog, governance, knowledge graph
        Quote
      • FedRAMP
        Federal and public-sector tier
        Quote
      Watch for
      • Premium connector packs
      • Implementation services on Enterprise and FedRAMP
      • Multi-year contracts standard

      Key features

      • Knowledge-graph (RDF/SPARQL) data model
      • Business glossary and ontology
      • Column-level lineage
      • Data mesh and data products tooling
      • Eureka GenAI assistant
      • FedRAMP-authorized deployment option
      • Open data and community features
      • Federation across distributed domains
      80+ integrations, including:
      Snowflake, Databricks, BigQuery, Tableau, Power BI, Salesforce, AWS Glue
      Geography
      Global; strongest in US federal and public sector
      #5

      Secoda

      Modern SMB-to-mid-market catalog with strong AI-assisted documentation.

      Founded 2020 · Toronto, Canada · private · 50-500 employees
      G2 4.7 (110)
      Capterra 4.7
      From $0 /mo
      ● Transparent pricing

      Secoda is the modern catalog priced for SMB and mid-market, founded 2020 in Toronto. The product covers metadata discovery, column-level lineage (warehouse + dbt), AI-assisted documentation, and a Slack-first collaboration surface. Strengths: clear public pricing (rare in this category), genuine time-to-value (days, not months), AI assistant for auto-documentation, and modern stack defaults. Raised $14M Series A in 2023. Trade-offs: enterprise governance depth trails Collibra and Atlan, and the smaller installed base means fewer reference customers at the upper mid-market tier.

      Best for

      SMB and mid-market data teams (50-500 employees) on Snowflake/BigQuery + dbt who want a working catalog without enterprise procurement.

      Worst for

      Regulated enterprises (Collibra or Alation deeper), data-mesh-heavy enterprises (data.world fits the paradigm), or teams that require on-prem governance.

      Strengths

      • Clear public pricing (rare in this category)
      • Genuine time-to-value, days not months
      • AI assistant for auto-documentation and discovery
      • Modern stack defaults (Snowflake, BigQuery, dbt, Looker)
      • Slack-first collaboration
      • Strong SMB and mid-market fit
      • Active product velocity

      Weaknesses

      • Enterprise governance depth trails Collibra and Atlan
      • Smaller installed base, fewer upper-mid-market references
      • Series A stage creates some renewal anchoring concerns at small accounts
      • Connector ecosystem narrower than the leaders
      • Less mature on regulated and on-prem deployment requirements

      Pricing tiers

      public
      • Free
        Up to 5 users; limited integrations
        $0 /mo
      • Team
        $50/user/month billed annually
        $50 /user/mo
      • Business
        ~$75-$100/user/month; AI features, SSO
        ~$75-$100 /user/mo
      • Enterprise
        Advanced governance, dedicated support
        Quote
      Watch for
      • AI assistant consumption charges on higher tiers
      • Premium connectors and custom integrations
      • Multi-year contracts standard at Business and Enterprise

      Key features

      • AI-assisted auto-documentation
      • Column-level lineage (warehouse + dbt)
      • Slack-first collaboration
      • Business glossary
      • Metadata search and discovery
      • Question and answer module for analyst self-serve
      • Modern stack-native integrations
      • Public pricing with self-serve onboarding
      60+ integrations, including:
      Snowflake, BigQuery, Databricks, dbt, Looker, Tableau, PostgreSQL, Slack
      Geography
      Global; strongest in US, Canada, UK, EU
      #6

      Select Star

      Lineage-anchored modern catalog with automatic column-level parsing.

      Founded 2020 · San Francisco, CA · private · 50-1,000 employees
      G2 4.7 (65)
      Capterra 4.7
      From $0 /mo
      ◐ Partial disclosure

      Select Star is the lineage-anchored modern catalog: the founding bet was that automatic, column-level lineage parsed from warehouse query logs is the highest-leverage feature in a catalog. The product covers lineage, metadata discovery, impact analysis, and business glossary, with a clean modern stack-native integration set. Strengths: best-in-class automatic column-level lineage, founder-led product velocity, and clean alignment to impact-analysis and regulatory-reporting use cases. Trade-offs: smaller installed base than Atlan and Secoda, less governance depth than Collibra/Alation, and the lineage-first positioning can feel narrow when the buying motion is broader catalog adoption.
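
Impact analysis over column-level lineage is, at its core, a graph traversal. The sketch below assumes lineage has already been extracted into "column → columns derived from it" edges and walks them breadth-first to list everything downstream of a changed column. It is not Select Star's implementation; the column names and the `downstream` helper are hypothetical.

```python
from collections import deque

# Hypothetical column-level lineage: each key feeds the columns in its list.
LINEAGE = {
    "raw.orders.amount": ["stg.orders.amount_usd"],
    "stg.orders.amount_usd": ["mart.revenue.total", "mart.finance.arr"],
    "mart.revenue.total": ["dashboard.revenue.kpi"],
}


def downstream(column, edges=LINEAGE):
    """Return every column reachable from `column`, in breadth-first order."""
    seen = {column}          # guards against cycles and repeat visits
    impacted = []
    queue = deque([column])
    while queue:
        current = queue.popleft()
        for child in edges.get(current, []):
            if child not in seen:
                seen.add(child)
                impacted.append(child)
                queue.append(child)
    return impacted


# What breaks if raw.orders.amount changes type or is dropped?
for col in downstream("raw.orders.amount"):
    print(col)
```

Upstream analysis (where did this number come from?) is the same traversal over the reversed edges, which is why catalogs usually expose both directions from one lineage graph.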

      Best for

      Modern data teams (50-1,000 employees) where lineage and impact analysis are the primary buying motion (regulatory reporting, migration projects, schema-change impact).

      Worst for

      Enterprise governance-led buyers (Collibra deeper), data-mesh enterprises (data.world fits the paradigm), or buyers wanting a broad catalog rather than a lineage-led one.

      Strengths

      • Best-in-class automatic column-level lineage parsing
      • Strong impact analysis for regulatory and migration work
      • Founder-led product velocity
      • Clean modern-stack integration set (Snowflake, BigQuery, dbt, Looker, Tableau)
      • Useful Chrome extension for in-context lineage in BI tools
      • Public starter pricing on the marketing site

      Weaknesses

      • Smaller installed base than Atlan and Secoda
      • Less governance depth than Collibra and Alation
      • Lineage-first positioning can feel narrow on broader catalog buying motions
      • Series A stage; renewal anchoring on smaller accounts
      • Connector ecosystem narrower than the leaders

      Pricing tiers

      partial
      • Starter
        From ~$500/month entry-point
        ~$500 /mo
      • Team
        Mid-market tier with lineage and discovery
        Quote
      • Enterprise
        Full lineage, governance, SSO, custom integrations
        Quote
      Watch for
      • Premium connector packs
      • Per-seat scaling at growth-stage
      • Multi-year contracts standard at Enterprise

      Key features

      • Automatic column-level lineage
      • Impact analysis (downstream and upstream)
      • Metadata discovery and search
      • Business glossary
      • Chrome extension for in-context lineage
      • dbt integration
      • Documentation and tagging
      • API for active metadata flows
      40+ integrations, including:
      Snowflake, BigQuery, Databricks, dbt, Looker, Tableau, Mode, Redshift
      Geography
      Global; strongest in US
      #7

      DataHub

      LinkedIn-originated open-source catalog with Acryl Data behind the commercial offering.

      Founded 2020 · San Francisco, CA · private · 200-100,000+ employees
      G2 4.5 (85)
      Capterra 4.5
      From $0 /mo
      ◐ Partial disclosure

      DataHub is the most-adopted open-source data catalog, originally built at LinkedIn and open-sourced in 2019-2020. Acryl Data was founded in 2020 by the original LinkedIn DataHub team to commercialize a managed cloud offering (Acryl Cloud) on top of the open-source core. Raised $26M Series A in 2022. Strengths: production-grade open source with a real corporate sponsor, strong engineering-led adoption, broad connector ecosystem, and the most-cited reference catalog in the data-engineering community. Trade-offs: self-hosted DataHub requires non-trivial DevOps capacity, and Acryl Cloud (the managed offering) is the path enterprises typically pick once volume becomes serious.

      Best for

      Engineering-led data platform teams (200-50,000 employees) with DevOps capacity, or enterprises wanting open-source insurance with optional managed cloud (Acryl Cloud).

      Worst for

      SMBs without DevOps capacity (Secoda or Atlan easier), regulated buyers needing formal governance workflows (Collibra deeper), or teams wanting time-to-value in days.

      Strengths

      • Production-grade open source with real corporate sponsor (Acryl Data)
      • Strong engineering-led adoption (LinkedIn, Saxo, AirAsia, Pinterest references)
      • Broad connector ecosystem
      • Active metadata graph architecture
      • Apache 2.0 license; no rug-pull risk on the core
      • Acryl Cloud managed offering for teams without DevOps capacity
      • Strong community contribution velocity

      Weaknesses

      • Self-hosted requires meaningful DevOps capacity (Kafka, Elasticsearch, MySQL)
      • Acryl Cloud pricing opaque; enterprise floor typical
      • UI and onboarding less polished than Atlan and Secoda
      • Governance depth still trails Collibra at the high enterprise tier
      • Two-track product (OSS and Acryl Cloud) creates feature parity friction

      Pricing tiers

      partial
      • DataHub OSS
        Apache 2.0; self-hosted, free
        $0 /mo
      • Acryl Cloud Starter
        Managed cloud entry tier
        Quote
      • Acryl Cloud Enterprise
        Full governance, SSO, premium support
        Quote
      Watch for
      • Self-hosted DataHub: Kafka, Elasticsearch, MySQL infra and operating cost
      • DevOps and platform engineering time on self-hosted
      • Acryl Cloud connector premiums
      • Multi-year contracts standard at Enterprise tier

      Key features

      • Active metadata graph
      • Column-level lineage
      • Data quality assertions (DataHub Actions)
      • Business glossary and ontology
      • Search and discovery
      • Stewardship workflows
      • Open-source with Apache 2.0 license
      • Acryl Cloud managed offering
      90+ integrations, including:
      Snowflake, BigQuery, Databricks, dbt, Airflow, Kafka, Looker, Tableau
      Geography
      Global; strongest in US, EU, India
      #8

      Metaplane

      Observability-anchored catalog acquired by Datadog; standalone roadmap unclear.

      Founded 2020 · Boston, MA · public · 100-5,000 employees
      G2 4.6 (75)
      Capterra 4.6
      Custom quote
      ○ Sales call required

      Metaplane is the observability-anchored catalog, founded in 2020 in Boston on the thesis that catalog and data observability should be one product. It raised a $14M Series A in 2023. Datadog acquired it in October 2024 (terms undisclosed). As of May 2026 the product strategy within the Datadog observability ecosystem is unclear: integration into the broader Datadog platform is underway, but the standalone catalog roadmap has not been publicly clarified. Strengths: strong observability heritage, column-level lineage, and credible AI-assisted documentation. Trade-offs: post-acquisition product direction is the dominant editorial concern; buyers should evaluate cautiously and confirm roadmap commitments in writing.

      Best for

      Teams already standardizing on Datadog observability who are willing to bet on the Metaplane + Datadog integration roadmap, and who want catalog plus observability under one vendor.

      Worst for

      Pure-play catalog buyers (Atlan, Secoda, and Select Star have clearer catalog roadmaps), regulated enterprises (Collibra goes deeper on governance), or buyers who want explicit standalone roadmap commitments.

      Strengths

      • Strong observability and freshness-monitoring heritage
      • Column-level lineage parsed from warehouse query logs
      • Useful AI-assisted documentation
      • Datadog acquisition (Oct 2024) means deeper pockets and infra
      • Slack-first collaboration
      • Modern stack-native (Snowflake, BigQuery, dbt, Looker)

      Weaknesses

      • Post-Datadog acquisition product strategy unclear as of May 2026
      • Standalone catalog roadmap has not been publicly clarified
      • Pricing opaque under the Datadog SKU model (Datadog billing is complex in its own right)
      • Risk of being folded into broader Datadog observability rather than maintained as catalog
      • Catalog buyers may prefer pure-play vendors with clear catalog roadmap

      Pricing tiers

      opaque
      • Metaplane (legacy)
        Pre-acquisition pricing being migrated to Datadog SKU
        Quote
      • Datadog Data Observability
        Post-acquisition Datadog tier; bundled with broader observability
        Quote
      Watch for
      • · Datadog SKU and billing complexity
      • · Bundling pressure into broader Datadog observability
      • · Migration costs for legacy Metaplane customers

      Key features

      • +Data observability (freshness, volume, schema, lineage)
      • +Column-level lineage
      • +AI-assisted documentation
      • +Catalog discovery surface
      • +Slack-first alerting
      • +Anomaly detection on metric monitors
      • +Datadog integration (post-acquisition)
      50+ integrations
      Snowflake, BigQuery, Databricks, dbt, Looker, Tableau, Slack, PagerDuty
      Geography
      Global; strongest in US, EU
      #9

      Amundsen

      Lyft-originated open-source catalog with no commercial entity behind it.

      Founded 2019 · San Francisco, CA · private · 200+ employees
      G2 4.3 (25)
      Capterra 4.3
      From $0 /mo
      ● Transparent pricing
      Visit Amundsen

      Amundsen is the Lyft-originated open-source catalog, open-sourced in 2019 under the Apache 2.0 license and later donated to the LF AI & Data Foundation (it is not an Apache Software Foundation project). Strengths: clean foundational architecture, broad open-source adoption in 2019-2022, and free self-hosted deployment. Trade-offs: development pace has slowed since 2023, there is no commercial entity (no Acryl Data equivalent), and the project is realistically in maintenance mode compared with the active development pace at DataHub. Recommended only for engineering teams with DevOps capacity who explicitly want a free, self-hosted catalog with no managed alternative on offer.

      Best for

      Engineering teams (200+ employees) with DevOps capacity who explicitly want a free, self-hosted catalog and accept no commercial support path.

      Worst for

      Teams without DevOps capacity, regulated buyers needing formal governance, or anyone who needs vendor accountability and an SLA path.

      Strengths

      • Open source, free self-hosted
      • Clean foundational architecture from Lyft
      • Broad community familiarity (2019-2022 adoption wave)
      • Apache 2.0 license under LF AI & Data Foundation governance
      • Basic lineage, discovery, and metadata search

      Weaknesses

      • Development pace slowed since 2023
      • No commercial entity (no Acryl Data equivalent for Amundsen)
      • Realistically in maintenance mode versus DataHub
      • No managed cloud offering
      • Lineage and active metadata trail DataHub and modern catalogs
      • Connector ecosystem narrower than DataHub

      Pricing tiers

      public
      • Amundsen OSS
        Apache 2.0; self-hosted, free; no managed alternative
        $0 /mo
      Watch for
      • · Self-hosted infra (Neo4j or Atlas backend, Elasticsearch, Postgres)
      • · DevOps and platform engineering time
      • · No commercial support path; community-only

      Key features

      • +Metadata search and discovery
      • +Basic lineage
      • +Business glossary
      • +Apache 2.0 license, LF AI & Data Foundation governance
      • +Lyft-originated architecture
      • +Self-hosted on Kubernetes
      30+ integrations
      Snowflake, BigQuery, Redshift, Hive, Postgres, Looker, Tableau
      Geography
      Global (community)
      #10

      Apache Atlas

      Hadoop-ecosystem heritage catalog with declining adoption as the Hadoop ecosystem contracts.

      Founded 2015 · Apache Software Foundation (project) · private · 500+ employees
      G2 3.9 (18)
      Capterra 4.0
      From $0 /mo
      ● Transparent pricing
      Visit Apache Atlas

      Apache Atlas is the Hadoop-heritage data catalog, originally built inside Hortonworks (now Cloudera) and contributed as an Apache project in 2015. Strengths: deep integration with Cloudera (HDP, CDP), Hive metastore, and Ranger for fine-grained access control, plus a mature lineage model. Trade-offs: adoption is declining as the Hadoop ecosystem contracts, the development cadence slowed materially over 2022-2025, modern stacks (Snowflake, BigQuery, Databricks) are not the primary integration focus, and the UI is dated even by open-source standards. Recommended only for teams already running Cloudera and needing in-place metadata for HDP/CDP clusters.

      Best for

      Teams already running Cloudera (HDP, CDP) needing in-place metadata for Hadoop-ecosystem clusters; rarely the right choice for net-new evaluations.

      Worst for

      Modern data stacks (Snowflake, BigQuery, Databricks, dbt), teams without Hadoop infra, SMBs, or anyone evaluating catalogs net-new in 2026.

      Strengths

      • Deep Cloudera (HDP, CDP) integration
      • Hive metastore and Ranger integration mature
      • Apache project governance
      • Mature lineage model for Hadoop-ecosystem workloads
      • Free open source under Apache 2.0

      Weaknesses

      • Adoption declining as the Hadoop ecosystem contracts
      • Development cadence slowed materially over 2022-2025
      • Modern stack (Snowflake, BigQuery, Databricks) is not the primary integration focus
      • UI dated even by open-source standards
      • No commercial entity beyond Cloudera distribution
      • Realistically a legacy choice in 2026

      Pricing tiers

      public
      • Apache Atlas OSS
        Free, self-hosted; typically deployed alongside Cloudera CDP
        $0 /mo
      Watch for
      • · Hadoop infra (HBase, Solr, Kafka) operating cost
      • · Cloudera CDP license if deployed in supported context
      • · DevOps and Hadoop platform engineering time

      Key features

      • +Hadoop-ecosystem metadata management
      • +Lineage across Hive, HDFS, HBase, Kafka
      • +Ranger integration for fine-grained access control
      • +Apache project governance
      • +Classification and tagging
      • +Business glossary
      • +REST API
      20+ integrations
      Cloudera CDP, Hive, HBase, Kafka, Ranger, HDFS
      Geography
      Global (community)
      Buying guide

      8 steps to pick the right data catalog software

      1. Identify the primary buyer and use case

        Data office and regulator-facing? Collibra or Alation first. Data engineering team-led and modern stack? Atlan, Secoda, Select Star, or DataHub. Data mesh and federated ownership? data.world. Observability bundled with catalog and Datadog-anchored? Metaplane with caveats. Engineering-led and want open source? DataHub or Amundsen.

      2. Audit your actual data stack

        Warehouse(s), BI tools, dbt or other transformation, orchestration (Airflow, Dagster), source databases. Confirm every catalog finalist has column-level lineage parsing for your warehouse and dbt project. This is the highest-leverage feature and the gap between marketing and reality is widest here.

      3. Match team size and time-to-value tolerance

        Under 50 employees: Secoda Free or DataHub OSS. 50-500: Secoda, Select Star, Atlan Starter. 500-2,000: Atlan Pro or Alation. 2,000+ regulated: Collibra. Time-to-value: modern catalogs deliver in 2-8 weeks; legacy catalogs take 3-12 months (Alation 3-9, Collibra 6-12), typically with SI implementation.

      4. Run a 4-week parallel POC on real metadata

        Connect your actual warehouse, dbt, and top 3 BI dashboards. Score on (1) automatic lineage coverage, (2) time to first useful catalog entry, (3) Slack and BI integration friction, (4) AI documentation quality on your worst metadata. Do not score on headline feature lists; they are nearly identical across vendors.

      5. Get itemized written pricing

        Catalog pricing is among the most opaque in B2B software. Request itemized quotes including base subscription, active user counts, premium connector packs, AI consumption charges, SSO and audit log gating, premium support tier, and SI implementation. Push back on auto-renewal escalators and module-based upsells (especially Collibra and Alation).

      6. Negotiate exit, portability, and customer-success guarantees

        Confirm the metadata export format (most modern catalogs export to JSON or via API; legacy catalogs sometimes lock metadata into proprietary stores). For Collibra and Alation, ask explicit customer-success continuity questions tied to the 2023 layoff history. For Metaplane, request a written commitment on standalone catalog roadmap.

      7. Decide on AI-assisted features cautiously

        Every vendor pitches an AI Copilot in 2026; production value varies. Test on representative metadata (legacy, badly named, half-documented) before signing. If the AI gives confident-sounding wrong answers there, it will give them in production. Negotiate the right to opt out of AI consumption charges if the feature underperforms.

      8. Plan for governance from day one

        Business glossary, stewardship roles, certification workflows, and access policies are non-trivial to retrofit. Stand up at least a minimal governance framework (named stewards for top 20 domains, certified metrics, ownership tags) in week 1, not month 6. Catalog adoption fails when governance is treated as a post-launch task.
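
The four POC scoring criteria from step 4 can be turned into a simple weighted scorecard. The sketch below is illustrative only: the guide names the criteria but does not weight them, so the weights, the 0-10 scale, and the vendor placeholders (`vendor_a`, `vendor_b`) are all assumptions to replace with your own.

```python
# Hypothetical weights (integers, relative importance). The guide calls
# lineage the highest-leverage feature, so it is weighted highest here;
# the exact numbers are an assumption, not part of the guide.
WEIGHTS = {
    "lineage_coverage": 4,      # (1) automatic lineage coverage
    "time_to_first_entry": 2,   # (2) time to first useful catalog entry
    "integration_friction": 2,  # (3) Slack and BI integration friction
    "ai_doc_quality": 2,        # (4) AI documentation quality on worst metadata
}


def poc_score(scores, weights=WEIGHTS):
    """Weighted average of per-criterion scores on a 0-10 scale (higher is better)."""
    missing = set(weights) - set(scores)
    if missing:
        raise ValueError(f"missing criteria: {sorted(missing)}")
    total_weight = sum(weights.values())
    return sum(scores[name] * w for name, w in weights.items()) / total_weight


def rank_vendors(results, weights=WEIGHTS):
    """Return vendor names ordered from best to worst POC score."""
    return sorted(results, key=lambda v: poc_score(results[v], weights), reverse=True)


if __name__ == "__main__":
    # Placeholder scores for two unnamed finalists; not real evaluation data.
    results = {
        "vendor_a": {"lineage_coverage": 5, "time_to_first_entry": 9,
                     "integration_friction": 8, "ai_doc_quality": 7},
        "vendor_b": {"lineage_coverage": 9, "time_to_first_entry": 6,
                     "integration_friction": 7, "ai_doc_quality": 6},
    }
    for vendor in rank_vendors(results):
        print(vendor, round(poc_score(results[vendor]), 2))
```

Scoring each finalist on the same four measured criteria, rather than on feature checklists, keeps the comparison anchored to how the tools behaved on your own metadata.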

      Frequently asked questions

      The questions buyers actually ask before they sign a data catalog software contract.

      What does a data catalog actually do, and when do you actually need one?
      A data catalog inventories your data assets (tables, columns, dashboards, models), captures lineage between them, and surfaces a search and stewardship layer so analysts and engineers can find, trust, and govern data. You actually need one when: (1) you have more than 1,000 tables across your warehouse plus dbt models plus BI assets, (2) analysts spend more than 10% of their time asking "what is this column?" in Slack, or (3) a regulator or auditor needs you to show data lineage and stewardship. Smaller teams can survive on a documented dbt project plus Notion or Confluence; the catalog is the upgrade once that surface stops scaling.
      Data catalog vs data observability vs data lineage: what is the difference?
      A data catalog is the inventory and discovery surface (Atlan, Collibra, Secoda). Data observability is the freshness, volume, schema-change, and quality monitoring layer (Monte Carlo, Bigeye, Anomalo, Metaplane). Data lineage is the graph that connects each asset to its upstream sources and downstream consumers (every modern catalog ships lineage; observability tools also use lineage for impact analysis). The categories are converging in 2026: Atlan, Secoda, and Metaplane all do lineage; DataHub does observability assertions; observability vendors are adding catalog features. Most buyers pick one primary catalog plus one primary observability tool, or accept the trade-off of a less mature combined product.
      AI-assisted cataloging: is it real or hype?
      Real and hype, depending on what you measure. Working: auto-documentation of warehouse tables and columns where lineage and naming conventions are reasonable (Atlan, Secoda, DataHub, Alation all do this credibly). Hype: agent-grade natural-language data discovery that handles ambiguous business questions without a curated business glossary (still poor across all vendors). Editorial guidance: test AI features on a representative slice of your worst metadata (legacy, badly named, half-documented) before signing. If the AI gives confident-sounding wrong answers there, it will give them in production.
      Open source vs proprietary: which fits better?
      Open source (DataHub OSS, Amundsen, Apache Atlas) fits engineering-led teams with DevOps capacity who want vendor insurance and accept self-hosted operating cost. DataHub is the active open-source choice in 2026; Amundsen is in maintenance mode; Apache Atlas is legacy-Hadoop. Proprietary SaaS (Atlan, Secoda, Select Star, Collibra, Alation) fits teams that want time-to-value in days or weeks, formal governance, and a vendor SLA. The middle ground is Acryl Cloud (managed DataHub) for teams who want open-source insurance with a managed path.
      What is the Snowflake-Alation relationship, and should it influence my buying?
      Snowflake Ventures participated in the November 2022 Alation Series E at a $1.7B valuation. Snowflake field reps frequently route Alation in joint accounts, and the technical integration is deeper than Snowflake's integration with other catalogs. This is real and a legitimate reason to evaluate Alation if you are Snowflake-anchored. It should not be the only reason: modern catalogs (Atlan especially) have closed the Snowflake metadata gap meaningfully over 2024-2025. Run a 4-week parallel evaluation if your stack is Snowflake plus dbt plus modern BI.
      What happened with Metaplane after the Datadog acquisition?
      Datadog acquired Metaplane in October 2024; terms were not disclosed. As of May 2026, the product is being integrated into the broader Datadog observability platform under a "Data Observability" SKU, the standalone catalog roadmap has not been publicly clarified, and pricing is moving onto the Datadog billing model. Editorial guidance: if you are not already standardizing on Datadog, do not pick Metaplane net-new in 2026 until the product strategy is clearer. If you are Datadog-anchored, get a written roadmap commitment from sales before signing a multi-year deal.
      Collibra had layoffs and a valuation reset: is it still safe to buy?
      Collibra is the largest pure-play catalog vendor and the deepest governance product; the company is not at existential risk. The 2023 layoffs (January and September) and post-2022 valuation reset are legitimate diligence items. Practical guidance for buyers: ask for customer-success continuity guarantees in writing, push for shorter initial terms (1-2 years rather than 3), and negotiate exit provisions. Regulated enterprises with formal governance mandates still default to Collibra; modern data-team-led buyers have viable alternatives in Atlan and Secoda.
      How much should I budget for a data catalog?
      SMB (under 50 employees): $0-$10K annually (Secoda Free or Team, DataHub OSS, Amundsen self-hosted). Lower mid-market (50-200): $20K-$60K (Secoda, Select Star, Atlan Starter). Mid-market (200-1,000): $60K-$200K (Atlan Pro, Alation, Secoda Business, Acryl Cloud). Enterprise (1,000-5,000): $200K-$500K (Collibra, Alation, Atlan Enterprise, Acryl Cloud Enterprise). Large enterprise (5,000+): $500K-$1.5M+ (Collibra, Alation, data.world enterprise). Collibra at the high enterprise tier routinely crosses $1M including SI implementation.
      How long does a catalog implementation actually take?
      Modern catalogs (Atlan, Secoda, Select Star, Acryl Cloud for DataHub): 2-8 weeks to a working catalog with lineage on a modern stack. Collibra: 6-12 months to production governance, typically with an SI partner. Alation: 3-9 months. data.world: 3-9 months. Self-hosted open source (DataHub OSS, Amundsen, Apache Atlas): plan for 4-12 weeks of platform engineering before going live, plus ongoing operating overhead. Implementation length is the single biggest hidden cost in the category.
      Should we evaluate via free trial or proof of concept?
      Free permanent: DataHub OSS, Amundsen, Apache Atlas, Secoda Free. Free trial: Atlan (demo), Secoda Team (14 days), Select Star (14 days), Acryl Cloud (demo). Demo only at enterprise tier: Collibra, Alation, data.world. Editorial guidance: run a 4-week parallel evaluation against your real warehouse, dbt project, and top 3 BI dashboards. Score on (1) automatic lineage coverage on your stack, (2) time to first useful catalog entry, (3) Slack or BI integration friction, and (4) AI documentation quality on your worst metadata. Headline feature lists are nearly identical across vendors in 2026; the gap is in real-data fidelity.
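
The three "when do you actually need one" triggers from the first FAQ answer can be written down as a small check. This is a minimal sketch: the function and parameter names are hypothetical, and the thresholds (more than 1,000 assets, more than 10% of analyst time, a regulator or auditor requirement) are taken from that answer as rules of thumb, not hard limits.

```python
def catalog_triggers(asset_count, analyst_lookup_share, regulator_requires_lineage):
    """Return the list of reasons a catalog is warranted (empty list = not yet).

    asset_count: tables + dbt models + BI assets (hypothetical combined count)
    analyst_lookup_share: fraction of analyst time spent asking what data means (0.0-1.0)
    regulator_requires_lineage: True if an auditor or regulator needs lineage and stewardship
    """
    reasons = []
    if asset_count > 1000:
        reasons.append("more than 1,000 tables, models, and BI assets")
    if analyst_lookup_share > 0.10:
        reasons.append("analysts spend more than 10% of their time on 'what is this column?'")
    if regulator_requires_lineage:
        reasons.append("regulator or auditor needs lineage and stewardship evidence")
    return reasons


if __name__ == "__main__":
    # Example inputs are made up for illustration.
    reasons = catalog_triggers(asset_count=1500, analyst_lookup_share=0.05,
                               regulator_requires_lineage=False)
    if reasons:
        print("A catalog is likely warranted:")
        for reason in reasons:
            print(" -", reason)
    else:
        print("A documented dbt project plus a wiki may still be enough.")
```

Any single trigger is enough in this reading; a team that hits none of them can usually stay on a documented dbt project plus Notion or Confluence.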

      Glossary

      Data catalog
      An inventory of an organization's data assets (tables, columns, dashboards, models, files) with metadata, lineage, ownership, and governance attributes. The discovery and trust surface over the data stack.
      Active metadata
      Metadata that flows into and out of catalog, observability, governance, and operational tools rather than sitting passively in a wiki. The architectural shift that separates modern catalogs (Atlan, DataHub, Secoda) from legacy "shelf-and-search" catalogs.
      Column-level lineage
      A graph that connects individual columns (not just tables) across upstream sources and downstream consumers (BI dashboards, ML features). Parsed automatically by modern catalogs from warehouse query logs and dbt models.
      Business glossary
      A curated dictionary of business terms (e.g. "monthly active user") with definitions, owners, and links to the underlying technical metadata (the column or model that implements the term).
      Data dictionary
      A technical inventory of tables, columns, types, and constraints. Narrower than a full catalog; often a feature inside a catalog.
      Data steward
      A named role accountable for the quality, definition, and access policies of a defined data domain. Stewardship workflows are first-class in Collibra, Alation, and Atlan.
      Data mesh
      An architectural approach where data is owned by business domains (each producing data products) rather than a central data team. The data.world knowledge-graph model aligns naturally with mesh; modern catalogs (Atlan, DataHub) also support mesh patterns.
      Data product
      A discoverable, governed, versioned data asset (typically a curated table or model) treated as a product with an owner, SLA, and consumers. The data-mesh unit of ownership.
      Knowledge graph
      A graph data model (typically RDF / SPARQL) that represents entities and their relationships. data.world is the knowledge-graph-native catalog; other vendors implement metadata as graphs internally without exposing the paradigm.
      Metadata
      Data about data: schema, ownership, lineage, freshness, quality, business definitions, and access policies. The substrate every catalog manages.
      Stewardship workflow
      A defined process (request, approval, sign-off) for tasks like onboarding a new data asset, certifying a metric, or approving access. Deepest in Collibra and Alation; lighter in modern catalogs.
      Impact analysis
      Using lineage to identify downstream consumers (dashboards, models, reports) affected by an upstream change (column rename, schema migration, deprecation). The primary use case for Select Star and a core feature of every modern catalog.
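
Impact analysis over column-level lineage, as defined in the two glossary entries above, amounts to a graph traversal. The sketch below is a minimal illustration using a breadth-first search over (upstream, downstream) edges; the edge list, column names, and function name are hypothetical, and real catalogs expose lineage through their own APIs rather than this structure.

```python
from collections import deque


def downstream_impact(edges, changed):
    """Return every asset reachable downstream of `changed`, sorted by name.

    edges: iterable of (upstream, downstream) pairs, e.g. column-level lineage
    changed: the asset being renamed, migrated, or deprecated
    """
    children = {}
    for upstream, downstream in edges:
        children.setdefault(upstream, []).append(downstream)

    affected = set()
    queue = deque([changed])
    while queue:
        node = queue.popleft()
        for child in children.get(node, []):
            # Skip already-visited nodes so cycles in the graph cannot loop forever.
            if child not in affected and child != changed:
                affected.add(child)
                queue.append(child)
    return sorted(affected)


if __name__ == "__main__":
    # Hypothetical lineage: raw column -> staging model -> fact model -> dashboard.
    lineage = [
        ("raw.orders.amount", "stg_orders.amount"),
        ("stg_orders.amount", "fct_revenue.total"),
        ("fct_revenue.total", "dashboard.revenue_weekly"),
        ("raw.orders.id", "stg_orders.order_id"),
    ]
    for asset in downstream_impact(lineage, "raw.orders.amount"):
        print(asset)
```

In this made-up graph, renaming `raw.orders.amount` touches the staging column, the fact model, and the dashboard, while `stg_orders.order_id` is unaffected because it sits on a different branch.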

      Final word

      See the full intelligence profile for any product on this page, including verified pricing, vendor trust scores, and review patterns. Browse the Data Catalog Software category page →

      Last updated 2026-05-10. Pricing data is reverified quarterly.