Keboola Features
Data Integration (6)
1st-Party Data Integration
Integrates with ad servers, DSPs, SSPs, and other data sources for inbound and outbound data transfer.
2nd-Party Data Integration
Allows users to combine the first-party data they collect with first-party data from strategic partners.
3rd-Party Data Integration
Breadth of standard integrations to third-party data providers.
Offline Data Integration
Able to easily integrate data from offline sources.
Mobile Data Integration
Able to easily integrate mobile data within the DMP platform.
Data Export Tools
Able to extract data from the DMP in bulk through a structured file.
Data Analysis & Optimization (4)
Audience Segmentation
Ease of segmenting user profiles into groups.
Recommendations
Analyzes data to find and recommend the highest value customer segmentations.
Standard Dashboards
Provides pre-built dashboards to aggregate and display data.
Custom Reports
Allows users to easily build customized reports and dashboards.
Platform (5)
Data Permissions
Able to set data permissions for strategic partners.
User, Role, and Access Management
Grants access to selected data, features, and objects based on user, user role, or group.
Performance and Reliability
Software is consistently available (uptime) and allows users to complete tasks quickly because they are not waiting for the software to respond.
Enterprise Scalability
Provides features to allow scaling for large organizations.
Internationalization
Enables users to use the DMP in multiple languages.
Data Transformation (2)
Real-Time Analytics
Facilitates analysis of high-volume, real-time data.
Data Querying
Allows user to query data through query languages like SQL.
Connectivity (4)
Hadoop Integration
Aligns processing and distribution workflows on top of Apache Hadoop.
Spark Integration
Aligns processing and distribution workflows on top of Apache Spark.
Multi-Source Analysis
Integrates data from multiple external databases.
Data Lake
Facilitates the dissemination of collected big data throughout parallel computing clusters.
Operations (5)
Data Visualization
Processes data and represents interpretations in a variety of graphic formats.
Data Workflow
Strings together specific functions and datasets to automate analytics iterations.
Governed Discovery
Isolates certain datasets and facilitates management of data access.
Embedded Analytics
Allows the big data tool to run and record data within external applications.
Notebooks
Use notebooks for tasks such as creating dashboards with predefined, scheduled queries and visualizations.
Functionality (11)
Diverse Extraction Points
Pull any required data from a variety of sources, including email, web pages, PDFs, and other documents.
Data Structuring
Organize extracted data into a more easily digestible structure.
Consolidation
Amass extracted data in a variety of data formats like spreadsheets and .csv.
Data Cleaning
Clean extracted data by removing duplicates, clearing excess characters, grouping by characteristic, and more.
Cloud Extraction
Stores data in cloud storage for access at any point.
Visualization
Generate visual data representations from extracted data.
Extraction
Extract data from the designated source(s) like relational databases, JSON files, and XML files.
Transformation
Cleanse and re-format extracted data to the needed target format.
Loading
Load reformatted data into target database, data warehouse, or other storage location.
Automation
Arrange ETL processes to occur automatically on needed time schedule (e.g., daily, weekly, monthly).
Scalability
Capable of scaling processing power up or down based on ETL volume.
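The extraction, cleaning, transformation, and loading features listed above can be pictured with a minimal sketch. It uses only the Python standard library and is not Keboola's API; the sample data, column names, and the `contacts` table are hypothetical and exist only for illustration.

```python
import csv
import io
import sqlite3

# Hypothetical raw export; the columns and values are invented for illustration.
RAW_CSV = """id,email,country
1, alice@example.com ,US
2,bob@example.com,DE
2,bob@example.com,DE
3,  carol@example.com,US
"""

def extract(text):
    """Extraction: read rows from a CSV source into dictionaries."""
    return list(csv.DictReader(io.StringIO(text)))

def transform(rows):
    """Transformation and cleaning: strip excess whitespace, drop duplicate rows."""
    seen = set()
    cleaned = []
    for row in rows:
        record = (int(row["id"]), row["email"].strip(), row["country"].strip())
        if record not in seen:
            seen.add(record)
            cleaned.append(record)
    return cleaned

def load(records, conn):
    """Loading: write the cleaned records into a target table."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS contacts (id INTEGER, email TEXT, country TEXT)"
    )
    conn.executemany("INSERT INTO contacts VALUES (?, ?, ?)", records)
    conn.commit()

def run_pipeline(text):
    """Run extract, transform, and load against an in-memory SQLite database."""
    conn = sqlite3.connect(":memory:")
    load(transform(extract(text)), conn)
    return conn

conn = run_pipeline(RAW_CSV)
# Grouping by a characteristic (here, country) after loading.
counts = conn.execute(
    "SELECT country, COUNT(*) FROM contacts GROUP BY country ORDER BY country"
).fetchall()
print(counts)
```

In a real deployment the scheduling and scaling features would wrap a pipeline like this one; the sketch covers only a single in-process run.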
Management (6)
Reporting
View ETL process data via reports and visualizations like charts and graphs.
Auditing
Record ETL historical data for auditing and potential data correction needs.
Real-Time Analytics
Track data and system backups, replication, and failover events.
Solution Integration
Integrate with other management, resilience, or backup solutions.
Infrastructure Compatibility
Be compatible with a variety of cloud and/or virtual infrastructure.
Monitoring
Monitor the replication environment via a GUI.
Solution Provision (2)
System Failover
Provide as-before failover capabilities for cloud environments.
Pay by Usage
Services are offered under a pay-as-you-go or utilization-based purchase model.
Data Management (10)
Data Integration
Consolidates, cleanses, and normalizes data from multiple disparate sources.
Data Compression
Helps save storage capacity and improves query performance.
Data Quality
Eliminates data inconsistencies and duplicates to ensure data integrity.
Built-In Data Analytics
Provides SQL-based analytics functions such as time series, pattern matching, and geospatial analytics.
In-Database Machine Learning
Provides built-in capabilities such as machine learning algorithms, data preparation functions, and model evaluation and management.
Data Lake Analytics
Allows querying across data formats such as Parquet, ORC, and JSON, and analyzing complex data types on HDFS.
Data Integration
Integrates data and data-related technologies into a single environment.
Metadata
Provides metadata management capabilities.
Self-service
Empowers the user via a self-service capability to manage data workflows.
Automated workflows
Completely automates end-to-end data workflows across the data integration lifecycle.
Integration (3)
AI/ML Integration
Integrates with data science workflows and machine learning and artificial intelligence (AI) capabilities.
BI Tool Integration
Integrates with BI tools to transform data into actionable insights.
Data Lake Integration
Provides speed in data processing and capturing unstructured, semi-structured and streaming data.
Deployment (2)
On-Premise
Provides on-premise deployment options.
Cloud
Provides cloud deployment options (private or public cloud, hybrid cloud).
Performance (1)
Scalability
Manages large volumes of data and scales up or down with demand.
Security (5)
Data Governance
Policies, procedures and standards to manage and access data.
Data Security
Restricts data access at the cell level, masks or hides parts of cells, and encrypts data at rest and in motion.
Data Encryption
Employs data encryption both at rest and in transit.
Security Standards
Complies with key industry standards such as SOC 2, ISO 27001, PCI DSS, and HIPAA to protect and safeguard data.
Communication Protocol
Supports secure communication protocols such as FTPS and SFTP.
Development (7)
Real-Time Integration
Supports event- or transaction-based integrations that react to changes in real time.
API Designer
Provides a web-based interface for designing, documenting, and testing APIs.
Flow Designer
Allows integration development by visually building integration logic flows with a drag-and-drop user interface.
Pre-Built Connectors
Facilitates API development and integrations with prebuilt connectors, templates, and examples.
Custom Connectors
Provides the ability to create connectors from existing services and APIs in the catalog.
Reusable connectors
Provides prebuilt, reusable connectors and workflows for all user integration requirements.
Multi-tenant architecture
Enables multiple tenants (customers) to securely share physical computing resources.
Management (5)
Monitoring & Notification
Console for monitoring resource utilization, system health, ability to start and stop processes, etc.
Routing And Orchestration
Enables configuration-based data routing and management of complex workflows through an orchestration engine.
Data Mapping
Facilitates bidirectional data mapping between applications and web services according to the data model.
Data Transformation
Provides standard tools and functions to convert data values from the data format of a source system into the data format of a destination system.
API Management
Supports and oversees the management of API products across their full lifecycle.
Integration Options (4)
Data Virtualization
Integrates data from disparate sources without physical data movement.
Managed File Transfers
Provides one or more methods to securely transfer data from one location to another through a network.
Big Data Processing
Provides integration with big data sources such as Hadoop and NoSQL sources (MongoDB, Cassandra, and HBase).
EDI
Provides integrations to EDI service providers.
Deployment (1)
Cloud to Cloud
Supports deployment of integration apps to a managed cloud environment.
Analytics (2)
Analytics capabilities
Provides a high-performance, flexible analytics platform to support data management and data-driven decision making.
Dashboard visualizations
Collects and displays metrics across the data integration process via a dashboard.
Monitoring and Management (2)
Data Observability
Dedicated to monitoring data pipelines, sending alerts, and troubleshooting data issues.
Testing capabilities
Deploys testing capabilities such as report testing, big data testing, cloud data migration testing, ETL and data warehouse testing.
Cloud Deployment (2)
Hybrid cloud support
Supports analytical platforms and data pipelines across complex hybrid environments.
Cloud migration capabilities
Supports migration of component or pipeline to different cloud environments.
Diverse Extraction Points (1)
Diverse Extraction Points
Pull any required data from a variety of sources, including email, web pages, PDFs, and other documents.
Generative AI (2)
AI Text Generation
Allows users to generate text based on a text prompt.
AI Text Summarization
Condenses long documents or text into a brief summary.
Agentic AI - DataOps Platforms (5)
Autonomous Task Execution
Capability to perform complex tasks without constant human input.
Multi-step Planning
Ability to break down and plan multi-step processes.
Cross-system Integration
Works across multiple software systems or databases.
Adaptive Learning
Improves performance based on feedback and experience.
Decision Making
Makes informed choices based on available data and objectives.
Agentic AI - Data Management Platform (DMP) (6)
Autonomous Task Execution
Capability to perform complex tasks without constant human input.
Multi-step Planning
Ability to break down and plan multi-step processes.
Cross-system Integration
Works across multiple software systems or databases.
Adaptive Learning
Improves performance based on feedback and experience.
Proactive Assistance
Anticipates needs and offers suggestions without prompting.
Decision Making
Makes informed choices based on available data and objectives.