Best Machine Learning Data Catalog Software

Researched and written by Shalaka Joshi

Machine learning data catalogs allow companies to categorize, access, interpret, and collaborate around company data across multiple data sources, while maintaining a high level of governance and access management. Artificial intelligence is key to many features of machine learning data catalogs, enabling functionality such as machine learning recommendations, natural language querying, and dynamic data masking for enhanced security.

Companies can use machine learning data catalogs to maintain data sets in a single location, making it simple for everyday business users and analysts alike to search for and discover data. Users can comment on, share, and recommend data sets so colleagues immediately understand what they are querying. Additionally, IT administrators can set up user provisioning to ensure that unauthorized employees cannot access sensitive data.

Machine learning data catalogs are most frequently implemented by companies that have multiple data sources, are searching for one source of truth, and are attempting to scale data usage company-wide. These products are generally administered by IT departments, which maintain organization and security, while the data can be accessed by data scientists, analysts, and the average business user. The data can then be transformed, modeled, and visualized either directly in the machine learning data catalog or through an integration with business intelligence software.

Not all machine learning data catalogs provide data preparation capabilities; those that lack them may require an integration with a business intelligence platform. Additionally, these tools differ from master data management software in their enhanced governance, collaboration, and machine learning functionality.

To qualify for inclusion in the Machine Learning Data Catalog category, a product must:

  • Organize and consolidate data from all company sources in a single repository
  • Provide user access management for security and data governance purposes
  • Allow business users to search and access the data from within the catalog
  • Offer collaboration features around data sets, including categorizing, commenting, and sharing
  • Give intelligent recommendations based on machine learning for quicker access to relevant data

Best Machine Learning Data Catalog Software At A Glance

Best for Mid-Market:
Best for Enterprise:
Highest User Satisfaction:
Best Free Software:

G2 takes pride in showing unbiased reviews on user satisfaction in our ratings and reports. We do not allow paid placements in any of our ratings, rankings, or reports. Learn about our scoring methodologies.

81 Listings in Machine Learning Data Catalog Available
AWS Glue
4.2 out of 5 (193 reviews)
2nd Easiest To Use in Machine Learning Data Catalog software
  • Product Description
    This description is provided by the seller.

    AWS Glue is a serverless data integration service that makes it easier for analytics users to discover, prepare, move, and integrate data from multiple sources for analytics, machine learning, and app

    Users
    • Data Engineer
    • DevOps Engineer
    Industries
    • Information Technology and Services
    • Computer Software
    Market Segment
    • 48% Enterprise
    • 28% Mid-Market
  • AWS Glue Pros and Cons
    Pros and Cons are compiled from review feedback and grouped into themes to provide an easy-to-understand summary of user reviews.
    Pros: Ease of Use (21), Data Integration (13), ETL Process (11), ETL Efficiency (10), ETL Solutions (10)
    Cons: Feature Limitations (8), Expensive (7), Limited Functionality (7), Complexity (6), Data Limitations (5)
  • AWS Glue features and usability ratings that predict user satisfaction
    Ease of Use: 8.4 (Average: 8.6)
    Business and Data Glossary: 8.9 (Average: 8.5)
    Metadata Management: 8.6 (Average: 8.5)
    Data Lineage: 8.7 (Average: 8.7)
  • Seller Details
    Year Founded: 2006
    HQ Location: Seattle, WA
    Twitter: @awscloud (2,230,610 Twitter followers)
    LinkedIn® Page: www.linkedin.com (136,383 employees on LinkedIn®)
    Ownership: NASDAQ: AMZN
Atlan
4.5 out of 5 (116 reviews)
Optimized for quick response
3rd Easiest To Use in Machine Learning Data Catalog software
  • Product Description
    This description is provided by the seller.

    Built by a data team, for data teams, Atlan is THE Active Metadata platform for enterprises to find, trust, and govern AI-ready data, and a leader in The Forrester Wave™: Enterprise Data Catalogs, Q3

    Users
    No information available
    Industries
    • Financial Services
    • Information Technology and Services
    Market Segment
    • 54% Mid-Market
    • 40% Enterprise
  • Atlan Pros and Cons
    Pros: Ease of Use (32), User Interface (25), Features (20), Data Lineage (19), User Experience (18)
    Cons: Limited Functionality (11), Missing Features (11), Lacking Features (10), Integration Issues (7), Data Lineage Issues (6)
  • Atlan features and usability ratings that predict user satisfaction
    Ease of Use: 9.0 (Average: 8.6)
    Business and Data Glossary: 9.2 (Average: 8.5)
    Metadata Management: 9.4 (Average: 8.5)
    Data Lineage: 9.3 (Average: 8.7)
  • Seller Details
    Seller: Atlan
    Year Founded: 2019
    HQ Location: New York, US
    Twitter: @AtlanHQ (9,421 Twitter followers)
    LinkedIn® Page: in.linkedin.com (434 employees on LinkedIn®)

Cloudera Data Platform
  • Product Description
    This description is provided by the seller.

    Cloudera Navigator is a complete data governance solution for Hadoop, offering critical capabilities such as data discovery, continuous optimization, audit, lineage, metadata management, and policy en

    Users
    No information available
    Industries
    • Information Technology and Services
    Market Segment
    • 48% Enterprise
    • 38% Small-Business
  • Cloudera Data Platform Pros and Cons
    Pros: Data Management (6), Ease of Use (6), Efficiency Improvement (5), Performance (5), User Interface (5)
    Cons: Expensive (6), Complex Setup (3), Difficult Learning (3), Integration Issues (3), Not User-Friendly (3)
  • Cloudera Data Platform features and usability ratings that predict user satisfaction
    Ease of Use: 8.1 (Average: 8.6)
    Business and Data Glossary: 8.9 (Average: 8.5)
    Metadata Management: 9.1 (Average: 8.5)
    Data Lineage: 8.8 (Average: 8.7)
  • Seller Details
    Seller: Cloudera
    Year Founded: 2008
    HQ Location: Palo Alto, CA
    Twitter: @cloudera (109,180 Twitter followers)
    LinkedIn® Page: www.linkedin.com (3,226 employees on LinkedIn®)
    Phone: 888-789-1488
Google Cloud Data Catalog
4.4 out of 5 (28 reviews)
  • Product Description
    This description is provided by the seller.

    A fully managed and highly scalable data discovery and metadata management service.

    Users
    No information available
    Industries
    • Computer Software
    Market Segment
    • 46% Small-Business
    • 29% Mid-Market
  • Google Cloud Data Catalog features and usability ratings that predict user satisfaction
    Ease of Use: 8.7 (Average: 8.6)
    Business and Data Glossary: 8.5 (Average: 8.5)
    Metadata Management: 9.1 (Average: 8.5)
    Data Lineage: 7.8 (Average: 8.7)
  • Seller Details
    Seller: Google
    Year Founded: 1998
    HQ Location: Mountain View, CA
    Twitter: @google (32,520,271 Twitter followers)
    LinkedIn® Page: www.linkedin.com (301,875 employees on LinkedIn®)
    Ownership: NASDAQ:GOOG
Common Voice dataset
  • Product Description
    This description is provided by the seller.

    Each entry in the dataset consists of a unique MP3 and corresponding text file. Many of the 1,368 recorded hours in the dataset also include demographic metadata like age, sex, and accent that can hel

    Users
    No information available
    Industries
    No information available
    Market Segment
    • 64% Small-Business
    • 27% Mid-Market
  • Common Voice dataset Pros and Cons
    Pros: Customer Support (3), Ease of Use (2), Efficiency Improvement (2), Data Management (1), Product Improvement (1)
    Cons: Bug Issues (1), Inaccuracy Issues (1)
  • Common Voice dataset features and usability ratings that predict user satisfaction
    Ease of Use: 8.2 (Average: 8.6)
    Business and Data Glossary: 6.8 (Average: 8.5)
    Metadata Management: 8.2 (Average: 8.5)
    Data Lineage: 6.8 (Average: 8.7)
  • Seller Details
    Seller: Mozilla
    Year Founded: 2005
    HQ Location: San Francisco, CA
    Twitter: @mozilla (273,534 Twitter followers)
    LinkedIn® Page: www.linkedin.com (1,795 employees on LinkedIn®)
Appen
4.2 out of 5 (29 reviews)
  • Product Description
    This description is provided by the seller.

    Appen collects and labels images, text, speech, audio, video, and other data to create training data used to build and continuously improve the world’s most innovative artificial intelligence systems.

    Users
    No information available
    Industries
    • Information Technology and Services
    Market Segment
    • 55% Small-Business
    • 28% Mid-Market
  • Appen Pros and Cons
    Pros: Ease of Use (8), User Experience (5), Customer Support (4), AI Integration (3), Flexibility (3)
    Cons: Low Compensation (4), Limited Functionality (3), Poor Customer Support (3), Complexity (2), Connectivity Issues (2)
  • Appen features and usability ratings that predict user satisfaction
    Ease of Use: 8.1 (Average: 8.6)
    Business and Data Glossary: 8.5 (Average: 8.5)
    Metadata Management: 8.3 (Average: 8.5)
    Data Lineage: 7.8 (Average: 8.7)
  • Seller Details
    Seller: Appen
    Year Founded: 1996
    HQ Location: Kirkland, Washington, United States
    LinkedIn® Page: www.linkedin.com (19,133 employees on LinkedIn®)
    Ownership: ASX:APX
    Total Revenue (USD mm): $244,900
decube
Entry Level Price: Free
  • Product Description
    This description is provided by the seller.

    Decube is the all-in-one Data Trust Platform designed for the modern data stack. Our mission is to make your data reliable, easily discoverable, and constantly monitored across your entire organizatio

    Users
    No information available
    Industries
    • Information Technology and Services
    Market Segment
    • 33% Enterprise
    • 33% Mid-Market
  • decube Pros and Cons
    Pros: User Interface (7), User Experience (5), UX Design (5), Data Cataloging (4), Ease of Use (4)
    Cons: Limited Functionality (2), Missing Features (2), Monitoring Issues (2), API Limitations (1), Connector Issues (1)
  • decube features and usability ratings that predict user satisfaction
    Ease of Use: 9.4 (Average: 8.6)
    Business and Data Glossary: 9.7 (Average: 8.5)
    Metadata Management: 9.7 (Average: 8.5)
    Data Lineage: 9.6 (Average: 8.7)
  • Seller Details
    Year Founded: 2022
    HQ Location: Kuala Lumpur
    Twitter: @decube_data (114 Twitter followers)
    LinkedIn® Page: www.linkedin.com (40 employees on LinkedIn®)
Secoda
4.5 out of 5 (48 reviews)
5th Easiest To Use in Machine Learning Data Catalog software
Entry Level Price: Contact Us
  • Product Description
    This description is provided by the seller.

    Secoda is the fastest way to explore, understand, and use data. Companies like Chipotle, Cardinal Health, Kaufland, and Remitly use Secoda to get visibility into the health of their entire stack, red

    Users
    No information available
    Industries
    • Computer Software
    • Information Technology and Services
    Market Segment
    • 65% Mid-Market
    • 21% Small-Business
  • Secoda Pros and Cons
    Pros: Ease of Use (28), Features (23), Customer Support (19), Integrations (14), Data Lineage (12)
    Cons: Bug Issues (10), Bugs (10), Technical Issues (9), Missing Features (5), Learning Curve (4)
  • Secoda features and usability ratings that predict user satisfaction
    Ease of Use: 8.3 (Average: 8.6)
    Business and Data Glossary: 9.2 (Average: 8.5)
    Metadata Management: 9.4 (Average: 8.5)
    Data Lineage: 8.8 (Average: 8.7)
  • Seller Details
    Seller: Secoda
    Year Founded: 2021
    HQ Location: Toronto, CA
    Twitter: @SecodaHQ (898 Twitter followers)
    LinkedIn® Page: www.linkedin.com (43 employees on LinkedIn®)
CastorDoc
4.7 out of 5 (62 reviews)
1st Easiest To Use in Machine Learning Data Catalog software
  • Product Description
    This description is provided by the seller.

    CastorDoc is a collaborative, automated data discovery & catalog tool. We believe that data people spend way too much time trying to find and understand their data. CastorDoc redesigns how dat

    Users
    No information available
    Industries
    • Financial Services
    • Information Technology and Services
    Market Segment
    • 60% Mid-Market
    • 26% Enterprise
  • CastorDoc Pros and Cons
    Pros: Centralized Management (1), Collaboration (1), Data Governance (1), Data Lineage (1), Data Quality (1)
    Cons: This product has not yet received any negative sentiments.
  • CastorDoc features and usability ratings that predict user satisfaction
    Ease of Use: 9.6 (Average: 8.6)
    Business and Data Glossary: 10.0 (Average: 8.5)
    Metadata Management: 9.9 (Average: 8.5)
    Data Lineage: 9.9 (Average: 8.7)
  • Seller Details
    Seller: Castor
    Year Founded: 2020
    HQ Location: New York City, New York
    Twitter: @castordoc_data (476 Twitter followers)
    LinkedIn® Page: www.linkedin.com (55 employees on LinkedIn®)
data.world
4.2 out of 5 (12 reviews)
  • Product Description
    This description is provided by the seller.

    data.world is the most-adopted data catalog and governance platform on the market. Built on a unique knowledge graph foundation, data.world seamlessly integrates with your existing systems. We set

    Users
    No information available
    Industries
    No information available
    Market Segment
    • 67% Small-Business
    • 25% Mid-Market
  • data.world Pros and Cons
    Pros: Ease of Use (4), Integrations (3), Data Discovery (2), Easy Integrations (2), User Interface (2)
    Cons: Data Duplication (1), Data Inaccuracy (1), Data Quality (1), Learning Curve (1), Missing Features (1)
  • data.world features and usability ratings that predict user satisfaction
    Ease of Use: 8.8 (Average: 8.6)
    Business and Data Glossary: 9.2 (Average: 8.5)
    Metadata Management: 8.8 (Average: 8.5)
    Data Lineage: 9.3 (Average: 8.7)
  • Seller Details
    Year Founded: 2016
    HQ Location: Austin, Texas
    Twitter: @datadotworld (5,645 Twitter followers)
    LinkedIn® Page: www.linkedin.com (217 employees on LinkedIn®)
Collibra
4.3 out of 5 (86 reviews)
Optimized for quick response
6th Easiest To Use in Machine Learning Data Catalog software
  • Product Description
    This description is provided by the seller.

    Do more with trusted data. Collibra unites your entire organization with trusted data that's easy to find, understand and access so you can do more with your data. And with new artificial intelligence

    Users
    No information available
    Industries
    • Financial Services
    • Pharmaceuticals
    Market Segment
    • 72% Enterprise
    • 20% Mid-Market
  • Pros and Cons
    Expand/Collapse Pros and Cons
  • Collibra Pros and Cons
    How are these determined?Information
    Pros and Cons are compiled from review feedback and grouped into themes to provide an easy-to-understand summary of user reviews.
    Pros
    Ease of Use
    10
    Features
    8
    Data Management
    6
    User Interface
    6
    Customization
    5
    Cons
    Limited Functionality
    5
    Feature Limitations
    4
    Improvement Needed
    4
    Missing Features
    4
    Time-Consuming
    4
  • User Satisfaction
    Expand/Collapse User Satisfaction
  • Collibra features and usability ratings that predict user satisfaction
    8.1
    Ease of Use
    Average: 8.6
    8.1
    Business and Data Glossary
    Average: 8.5
    7.7
    Metadata Management
    Average: 8.5
    7.7
    Data Lineage
    Average: 8.7
  • Seller Details
    Expand/Collapse Seller Details
  • Seller Details
    Seller
    Collibra
    Company Website
    Year Founded
    2008
    HQ Location
    New York, New York
    Twitter
    @collibra
    5,789 Twitter followers
    LinkedIn® Page
    www.linkedin.com
    1,014 employees on LinkedIn®
  • Overview
    Product Description
    This description is provided by the seller.

    Coginiti is a SQL-first collaborative data operations platform that empowers teams to build, publish, and consume quality data products, streamlining the data analytics lifecycle from inception to ins

    Users: No information available
    Industries: No information available
    Market Segment: 66% Enterprise, 28% Mid-Market
  • Coginiti Pros and Cons
    Pros and Cons are compiled from review feedback and grouped into themes to provide an easy-to-understand summary of user reviews.
    Pros: Ease of Use (5), User Experience (5), User Interface (4), Data Management (3), Efficiency Improvement (3)
    Cons: Poor Documentation (3), Search Functionality (2), Limited Customization (1), Slow Performance (1), Training Required (1)
  • User Satisfaction
    Coginiti features and usability ratings that predict user satisfaction
    Ease of Use: 9.4 (Average: 8.6)
    Business and Data Glossary: 8.9 (Average: 8.5)
    Metadata Management: 8.8 (Average: 8.5)
    Data Lineage: 8.7 (Average: 8.7)
  • Seller Details
    Year Founded: 2020
    HQ Location: Atlanta, GA
    Twitter: @coginiti (69 Twitter followers)
    LinkedIn® Page: www.linkedin.com (28 employees on LinkedIn®)
  • Overview
    Product Description
    This description is provided by the seller.

    A machine-learning-based data catalog that allows to classify and organize data assets across cloud, on-premises, and big data. It provides maximum value and reuse of data across enterprise.

    Users: No information available
    Industries: Information Technology and Services, Computer Software
    Market Segment: 48% Enterprise, 24% Mid-Market
  • Informatica Enterprise Data Catalog Pros and Cons
    Pros and Cons are compiled from review feedback and grouped into themes to provide an easy-to-understand summary of user reviews.
    Pros: Data Cataloging (1), Data Governance (1), Data Lineage (1), Metadata Management (1)
    Cons: Expensive (1), Lineage Limitations (1)
  • User Satisfaction
    Informatica Enterprise Data Catalog features and usability ratings that predict user satisfaction
    Ease of Use: 7.7 (Average: 8.6)
    Business and Data Glossary: 7.7 (Average: 8.5)
    Metadata Management: 8.0 (Average: 8.5)
    Data Lineage: 8.3 (Average: 8.7)
  • Seller Details
    Year Founded: 1993
    HQ Location: Redwood City, CA
    Twitter: @Informatica (102,081 Twitter followers)
    LinkedIn® Page: www.linkedin.com (5,576 employees on LinkedIn®)
    Ownership: NYSE: INFA
  • Overview
    Product Description
    This description is provided by the seller.

    Alation is the data intelligence company. Nearly 600 global enterprises, including 40% of the Fortune 100, rely on Alation to realize value from their data and AI initiatives. Customers such as Cisc

    Users: No information available
    Industries: Financial Services, Computer Software
    Market Segment: 66% Enterprise, 25% Mid-Market
  • Alation Pros and Cons
    Pros and Cons are compiled from review feedback and grouped into themes to provide an easy-to-understand summary of user reviews.
    Pros: Ease of Use (10), Customer Support (5), Data Cataloging (5), Integrations (5), User Interface (5)
    Cons: Lineage Limitations (4), Data Quality (3), Difficult Learning (3), Expensive (3), Missing Features (3)
  • User Satisfaction
    Alation features and usability ratings that predict user satisfaction
    Ease of Use: 8.4 (Average: 8.6)
    Business and Data Glossary: 9.0 (Average: 8.5)
    Metadata Management: 8.3 (Average: 8.5)
    Data Lineage: 7.1 (Average: 8.7)
  • Seller Details
    Seller: Alation
    Year Founded: 2012
    HQ Location: Redwood City, CA
    Twitter: @Alation (3,609 Twitter followers)
    LinkedIn® Page: www.linkedin.com (668 employees on LinkedIn®)
  • Overview
    Product Description
    This description is provided by the seller.

    IBM Watson® Knowledge Catalog is a unified data catalog that can help your data users quickly find, curate, categorize and share data, analytical models and their relationships with other members of y

    Users: No information available
    Industries: No information available
    Market Segment: 42% Enterprise, 32% Small-Business
  • User Satisfaction
    IBM Knowledge Catalog features and usability ratings that predict user satisfaction
    Ease of Use: 8.7 (Average: 8.6)
    Business and Data Glossary: 7.5 (Average: 8.5)
    Metadata Management: 7.5 (Average: 8.5)
    Data Lineage: 8.3 (Average: 8.7)
  • Seller Details
    Seller: IBM
    Year Founded: 1911
    HQ Location: Armonk, NY
    Twitter: @IBM (711,154 Twitter followers)
    LinkedIn® Page: www.linkedin.com (317,108 employees on LinkedIn®)
    Ownership: SWX:IBM

Learn More About Machine Learning Data Catalog Software

What is a Machine Learning Data Catalog?

A machine learning data catalog (MLDC) is an automated data catalog that carries out tasks such as crawling metadata, cataloging data assets, and classifying personally identifiable information (PII). It organizes the dataset inventory using metadata.

Data catalogs show companies where their data is stored, which reduces the time needed to locate data and makes it readily accessible for analytics. They are inventories of assets such as tables, schemas, files, and charts, and they help address a company's data discovery, quality, and governance challenges.
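
As a rough illustration of the PII classification mentioned above, the sketch below tags a column when most of its values match a known pattern. It uses plain regular expressions in place of a trained model, and the pattern set, the `classify_column` helper, and the 0.8 threshold are all hypothetical choices made for this example rather than any vendor's method.

```python
import re

# Hypothetical patterns for illustration; real catalogs rely on far richer
# rules and trained classifiers.
PII_PATTERNS = {
    "email": re.compile(r"^[\w.+-]+@[\w-]+\.[\w.-]+$"),
    "us_phone": re.compile(r"^\(?\d{3}\)?[-. ]?\d{3}[-. ]?\d{4}$"),
}


def classify_column(values, threshold=0.8):
    """Return the first PII label matching at least `threshold` of the
    non-empty values, or None when no pattern matches often enough."""
    non_empty = [v for v in values if v]
    if not non_empty:
        return None
    for label, pattern in PII_PATTERNS.items():
        hits = sum(1 for v in non_empty if pattern.match(v))
        if hits / len(non_empty) >= threshold:
            return label
    return None


# Sample columns (made-up data).
columns = {
    "contact": ["ana@example.com", "li@example.org", "sam@example.net"],
    "city": ["Austin", "Atlanta", "Armonk"],
}
tags = {name: classify_column(vals) for name, vals in columns.items()}
```

Here `tags` marks `contact` as an email column and leaves `city` untagged, which is the kind of label a catalog could then use to restrict access.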

What does MLDC Stand For?

MLDC is an acronym for Machine Learning Data Catalog. 

What are the Common Features of Machine Learning Data Catalogs?

Machine learning data catalogs simplify the manual functions of a data catalog. A data catalog is an essential part of the data management strategy of any organization. Some of the features of machine learning data catalogs are:

Data ingestion and discovery: Machine learning data catalogs need prebuilt adapters to connect to company systems such as applications, databases, files, and external APIs. These adapters discover metadata from those systems, such as table names, attribute names, and constraints. The feature provides native connectivity to data sources, business intelligence (BI) solutions, and data science tools.
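
To make metadata discovery concrete, here is a minimal sketch of an adapter that reads table names, attribute names, and constraints from a database. It assumes a SQLite source purely because SQLite ships with Python; the `crawl_metadata` function and the sample `customers` table are hypothetical, and a commercial adapter would cover many more systems and metadata types.

```python
import sqlite3


def crawl_metadata(conn):
    """Collect table names, column names, types, and basic constraints."""
    catalog = {}
    tables = conn.execute(
        "SELECT name FROM sqlite_master WHERE type = 'table' ORDER BY name"
    ).fetchall()
    for (table,) in tables:
        # PRAGMA table_info rows: (cid, name, type, notnull, dflt_value, pk)
        cols = conn.execute(f'PRAGMA table_info("{table}")').fetchall()
        catalog[table] = [
            {
                "name": c[1],
                "type": c[2],
                "not_null": bool(c[3]),
                "primary_key": bool(c[5]),
            }
            for c in cols
        ]
    return catalog


# In-memory database standing in for a real company system.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE customers (id INTEGER PRIMARY KEY, email TEXT NOT NULL)")
catalog = crawl_metadata(conn)
```

The resulting `catalog` dictionary is the raw inventory that search, glossary, and lineage features would build on.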

Business glossary: Storing large amounts of data in a repository is not enough; users also need to understand what that data means. The glossary feature links data to business terms, giving it more meaning.

Automated data labeling: Data labeling is a prerequisite for supervised machine learning algorithms. Automated data labeling can be more consistent than manual labeling because it reduces human error. Data labeling usually involves annotators identifying objects in images to build quality artificial intelligence (AI) training data. Automated labeling relieves teams of tedious annotation cycles.

Data lineage: Data lineage records who changed the data, and why, when, and where the changes were made. It is a part of metadata management, and MLDCs automate it. Data lineage helps determine when new or changed data requires retraining machine learning models. MLDCs usually parse the query logs of data lakes and other data sources automatically to create a data lineage map.
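
The idea of deriving a lineage map from query logs can be sketched as follows. This is a toy that assumes simple `INSERT INTO ... SELECT` and `CREATE TABLE ... AS SELECT` statements and uses regular expressions instead of a real SQL parser; the `build_lineage` and `upstream` helpers and the sample log are hypothetical.

```python
import re
from collections import defaultdict

# Naive patterns: they handle only the simple statement shapes in the sample log.
TARGET = re.compile(r"\b(?:INSERT\s+INTO|CREATE\s+TABLE)\s+(\w+)", re.IGNORECASE)
SOURCE = re.compile(r"\b(?:FROM|JOIN)\s+(\w+)", re.IGNORECASE)


def build_lineage(query_log):
    """Map each written table to the set of tables it was read from."""
    lineage = defaultdict(set)
    for query in query_log:
        target = TARGET.search(query)
        if not target:
            continue  # read-only queries create no lineage edge
        for source in SOURCE.findall(query):
            lineage[target.group(1)].add(source)
    return lineage


def upstream(lineage, table):
    """Return every table that feeds `table`, directly or transitively."""
    seen, stack = set(), [table]
    while stack:
        for src in lineage.get(stack.pop(), ()):
            if src not in seen:
                seen.add(src)
                stack.append(src)
    return seen


log = [
    "INSERT INTO daily_sales SELECT * FROM orders JOIN customers ON orders.cid = customers.id",
    "CREATE TABLE features AS SELECT region, SUM(total) FROM daily_sales GROUP BY region",
    "SELECT * FROM features",
]
lineage = build_lineage(log)
```

Calling `upstream(lineage, "features")` returns the tables whose changes could call for retraining a model built on `features`.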

Data quality monitoring and anomaly detection: Data quality monitoring helps users understand whether data came from a trusted source. The machine learning data catalog also uses machine learning algorithms to identify sudden changes in data, and it alerts users immediately when it detects a change or anomaly.
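
A simple way to picture detecting a sudden change is a z-score check on a tracked metric, such as a table's daily row count. The sketch below uses that basic statistical test as a stand-in for the machine learning algorithms a catalog would apply; the `is_anomalous` helper, the threshold of 3.0, and the sample counts are assumptions for illustration.

```python
from statistics import mean, stdev


def is_anomalous(history, latest, z_threshold=3.0):
    """Flag `latest` when it lies more than `z_threshold` sample standard
    deviations from the mean of `history` (needs at least two points)."""
    mu, sigma = mean(history), stdev(history)
    if sigma == 0:
        return latest != mu
    return abs(latest - mu) / sigma > z_threshold


# Daily row counts for one table over the past week (made-up numbers).
row_counts = [1000, 1020, 980, 1010, 990, 1005, 995]

drop_detected = is_anomalous(row_counts, 400)     # sudden drop in volume
normal_day = is_anomalous(row_counts, 1015)       # within normal variation
```

With these numbers the drop to 400 rows is flagged while 1015 rows is not, so only the first would trigger an alert.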

Semantic search for data sets: Machine learning data catalogs provide visual, intuitive search similar to a web search engine. Almost everyone in an organization is a data user, but not everyone can write SQL queries. The semantic search feature makes it easier for all users to discover data sets.

Compliance capabilities: This feature ensures that sensitive data is not exposed and that the user can trust the data. It further helps keep data governance policies in place and strengthen data management in the organization. Data stewards can identify low-quality data and restrict access to sensitive data, thus helping comply with regulations such as the General Data Protection Regulation (GDPR).

Data profiling: Data profiling examines the data in a data source and collects information about it. This makes data quality issues easier to identify and the data management process more efficient.
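
As one possible illustration of data profiling, the sketch below collects a few basic statistics per column: row count, null count, distinct values, minimum, and maximum. The `profile_column` helper and the sample rows are hypothetical, and real profilers gather far more (value distributions, formats, outliers).

```python
def profile_column(values):
    """Summarize one column; None is treated as a missing value."""
    present = [v for v in values if v is not None]
    return {
        "count": len(values),
        "nulls": len(values) - len(present),
        "distinct": len(set(present)),
        "min": min(present) if present else None,
        "max": max(present) if present else None,
    }


# Sample rows standing in for data read from a source system.
rows = [
    {"age": 34, "country": "US"},
    {"age": None, "country": "DE"},
    {"age": 51, "country": "US"},
]
profile = {col: profile_column([r[col] for r in rows]) for col in rows[0]}
```

A high null count or an unexpected minimum in `profile` is the kind of signal that points a data steward to a quality issue.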

What are the Benefits of Machine Learning Data Catalogs?

A machine learning data catalog provides several benefits to different types of users in the organization. These include:

Ease in data curation: Data curation is the process of collecting, organizing, labeling, and cleaning data. Machine learning data catalogs validate metadata and organize insights into the correct repositories using machine learning algorithms.

Ease of search: Semantic search makes it easier for non-technical users to find and discover data, since they do not have to write SQL queries every time they need access.

Ease in data collaboration: Machine learning data catalogs help users collaborate on, use, and share data sets because they make siloed data easier to find and store.

Who Uses Machine Learning Data Catalogs?

Machine learning data catalogs centralize metadata for various data assets. By organizing the metadata, MLDCs help organizations to govern data access.

Data analysts: Data analysts use MLDCs to discover, classify, and manipulate data for their analytics processes. They can also discover AI or machine learning models, understand how they work, and import them into their BI tools. Data catalogs help data analysts turn companies into self-service organizations, which matters for any organization that wants to be driven by insights. Machine learning data catalogs give users the means to find, understand, and trust data.

Marketers: Marketing teams use machine learning data catalogs for commercial purposes, drawing insights from them to make better decisions.

Data scientists: Data scientists often publish their models for reuse and look for a single platform that centralizes data for different projects.

Challenges with Machine Learning Data Catalogs

Although machine learning data catalogs help solve major challenges of traditional data catalogs, such as data discovery and data lineage, MLDCs come with challenges of their own.

Scalability: Supporting a huge volume of metadata is difficult for MLDCs. Data catalogs sometimes break down from performance issues when overloaded with enormous amounts of metadata. In the past, data was stored in the company's mainframe data center. With today's big data, however, machine learning data catalogs must keep track of data in both the cloud and data lakes.

Fragmentation in evaluating a product: A data catalog that is too bulky fragments the user's journey when evaluating a product. Too much data pushes users toward too many tools, breaking a seamless experience into fragments.

How to Buy Machine Learning Data Catalogs

Requirements Gathering (RFI/RFP) for Machine Learning Data Catalogs

Machine learning data catalogs offer many features to help users identify usable data, and a buyer can choose the right MLDC software based on the organization's needs. Requests for information (RFIs) and requests for proposal (RFPs) help the organization gather details on pricing, product features, and guidelines.

Compare Machine Learning Data Catalog Products

Create a long list

The first step is to identify all the possible players in the space. This makes it possible to evaluate vendors on price, product features, and customer service.

Create a short list

After evaluating the potential vendors, the company can narrow the list to those that meet all of its requirements.

Conduct demos

Demos help buyers understand the product as a whole. A team of IT professionals and data scientists should join these demos to understand the product's functionality, while the marketing team can join to assess how the software would be used in its projects.

Selection of Machine Learning Data Catalogs

Choose a selection team

A selection team of marketing professionals, data scientists, and IT professionals can raise any questions about the MLDC product with the vendors. A data scientist would be more interested in the technical features of the software. A marketing manager would want to know how the marketing team could use the MLDC in its projects. An IT professional would want to understand the software installation procedure.

Negotiation

Once the vendor quotes a price, negotiations begin. The final price depends on the cost of similar products on the market and on how well the product solves the buyer's challenges.

Final decision

The final decision is based on agreements between the vendor and the buyer.