Apache Spark 1.5.0, recently released, is the sixth release on the 1.x line. This release represents 1400+ patches from 230+ contributors and 80+ institutions. Apache Spark is a fast and general engine for large-scale data processing. Spark has an advanced DAG execution engine that supports acyclic data flow and in-memory computing, and it runs on Hadoop, Mesos, standalone, or in the cloud. It can access diverse data sources including HDFS, Cassandra, HBase, and S3.
The Apache Software Foundation has announced the release of Apache HBase v1.0. “Apache HBase v1.0 marks a major milestone in the project’s development,” said Michael Stack, Vice President of Apache HBase. “It is a monumental moment that the army of contributors who have made this possible should all be proud of. The result is a thing of collaborative beauty that also happens to power key, large-scale Internet platforms.” Dubbed the “Hadoop Database”, HBase is used on top of Apache Hadoop and HDFS (Hadoop Distributed File System) for random, real-time read/write access for Big Data (billions of rows X millions of columns) across clusters of commodity hardware.