

Spark SQL lets you query structured data inside Spark programs, using either SQL or a familiar DataFrame API. It is usable from Java, Scala, Python and R.

Spark Streaming brings Apache Spark's language-integrated API to stream processing, letting you write streaming jobs the same way you write batch jobs. It supports Java, Scala and Python. Spark Streaming recovers both lost work and operator state (e.g. sliding windows) out of the box, without any extra code on your part.
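To make "operator state" such as a sliding window concrete, here is a minimal plain-Python sketch. It does not use the Spark Streaming API; the function name `sliding_window_counts`, the list-of-batches input, and the `window` and `slide` parameters are all assumptions made for illustration.

```python
from collections import Counter, deque


def sliding_window_counts(batches, window, slide):
    """Yield word counts over the last `window` batches, once every `slide` batches.

    Hypothetical helper: it mimics the idea of a sliding window only,
    not the behavior or API of any Spark component.
    """
    recent = deque(maxlen=window)  # operator state: the most recent batches
    for i, batch in enumerate(batches, start=1):
        recent.append(batch)       # the oldest batch falls out automatically
        if i % slide == 0:
            counts = Counter()
            for b in recent:
                counts.update(b)
            yield dict(counts)


# Four micro-batches of words; window of 3 batches, evaluated every 2 batches.
batches = [["a", "b"], ["a"], ["c"], ["a", "c"]]
results = list(sliding_window_counts(batches, window=3, slide=2))
```

The `recent` deque is the state that would be lost if the process died mid-stream, which is the kind of state the description above says is recovered without extra code.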

Apache OFBiz is an open source product for the automation of enterprise processes. It includes framework components and business applications for ERP (Enterprise Resource Planning), CRM (Customer Relationship Management), E-Business / E-Commerce, SCM (Supply Chain Management), MRP (Manufacturing Resource Planning), MMS/EAM (Maintenance Management System/Enterprise Asset Management), and POS (Point of Sale).

Writer has everything you would expect from a modern, fully equipped word processor. It is simple enough for a quick memo, yet powerful enough to create complete books with contents, diagrams, indexes, etc. You're free to concentrate on your ideas while Writer makes them look great.

Cassandra's data model offers the convenience of column indexes with the performance of log-structured updates, strong support for denormalization and materialized views, and powerful built-in caching.

Apache Sqoop is a tool designed for efficiently transferring bulk data between Apache Hadoop and structured datastores such as relational databases.

Apache Arrow is a cross-language development platform designed for in-memory data processing and efficient data interchange. It provides a standardized, language-independent columnar memory format that supports both flat and hierarchical data structures. This format is optimized for analytical operations on modern hardware, including CPUs and GPUs, facilitating high-performance data analytics and seamless integration across various data processing systems.

Key Features and Functionality:

- Columnar Memory Format: Arrow's in-memory columnar format is tailored for efficient analytic operations, enabling vectorized computations that leverage modern processor capabilities.
- Zero-Copy Data Sharing: The platform allows for zero-copy reads, enabling rapid data access without the overhead of serialization and deserialization, thus enhancing performance in data-intensive applications.
- Multi-Language Support: Arrow offers libraries in multiple programming languages, including C++, Java, Python, R, and more, ensuring broad compatibility and ease of integration into diverse development environments.
- Interoperability with Data Formats: It provides tools for reading and writing various file formats such as CSV, Apache Parquet, and Apache ORC, facilitating smooth data interchange between different systems.
- In-Memory Analytics and Query Processing: Arrow includes components for in-memory analytics and query processing, supporting efficient data manipulation and analysis directly in memory.

Primary Value and Problem Solved:

Apache Arrow addresses the challenges associated with processing large datasets by offering a unified, efficient in-memory data representation. By standardizing the columnar memory format and providing zero-copy data sharing, it significantly reduces the computational overhead typically involved in data serialization and deserialization. This leads to faster data processing and analytics, enabling developers to build high-performance applications that can handle complex data structures across various programming languages and platforms. Arrow's interoperability with existing data formats and its support for multiple languages make it a versatile tool for developers aiming to optimize data workflows and enhance the performance of data-driven applications.
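The two central ideas above, a columnar layout and zero-copy access, can be sketched with the Python standard library alone. This is an illustration of the concepts, not the Arrow format or the Arrow library: the sample records and the names `ids`, `scores`, `view`, and `tail` are assumptions chosen for the example.

```python
from array import array

# Row-oriented layout: each record keeps its fields together.
rows = [(1, 9.5), (2, 7.25), (3, 8.0)]

# Column-oriented layout: each field lives in its own contiguous typed buffer.
ids = array("q", (r[0] for r in rows))     # 64-bit signed integers
scores = array("d", (r[1] for r in rows))  # 64-bit floats

# A memoryview exposes the column's buffer without copying it,
# and slicing the view does not copy either.
view = memoryview(scores)
tail = view[1:]

# Mutating the underlying buffer is visible through the view,
# which shows that both names refer to the same memory.
scores[2] = 10.0
total = sum(tail)  # 7.25 + 10.0
```

Because `tail` shares memory with `scores`, a consumer can read a slice of the column without a serialization or copy step, which is the property the zero-copy bullet describes.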

Apache Geronimo is an open source server runtime that integrates open source projects to create Java/OSGi server runtimes designed to meet the needs of enterprise developers and system administrators.

Aurora runs applications and services across a shared pool of machines, and is responsible for keeping them running, forever. When machines experience failure, Aurora intelligently reschedules those jobs onto healthy machines.
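The rescheduling behavior described above can be pictured with a toy sketch. It is not Aurora's scheduler or API: the `reschedule` function, the job and machine names, and the least-loaded placement rule are all assumptions made for illustration.

```python
def reschedule(assignments, healthy):
    """Move jobs off machines that are not in `healthy`.

    Hypothetical policy: place each displaced job on the healthy machine
    that currently runs the fewest jobs, breaking ties by machine name.
    """
    if not healthy:
        raise ValueError("no healthy machines available")

    # Count the jobs already running on each healthy machine.
    load = {machine: 0 for machine in healthy}
    for machine in assignments.values():
        if machine in load:
            load[machine] += 1

    result = {}
    for job, machine in sorted(assignments.items()):
        if machine not in load:  # the job's machine has failed
            machine = min(sorted(load), key=load.get)
            load[machine] += 1
        result[job] = machine
    return result


# Machine m2 has failed; its two jobs move to the healthy machines.
assignments = {"web": "m1", "db": "m2", "cache": "m2"}
new_assignments = reschedule(assignments, healthy={"m1", "m3"})
```

A real scheduler would also weigh resources, constraints, and quotas; the sketch only shows the core loop of detecting jobs on failed machines and placing them elsewhere.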



Community-led development since 1999. We consider ourselves not simply a group of projects sharing a server, but rather a community of developers and users.