Aster nCluster: In-Database MapReduce
The "Big Data" Analysis Challenge
Exponential data growth, coupled with the desire to do more with data, is forcing organizations to seek new methods of analysis. Because SQL offers limited procedural functionality, companies have traditionally adopted architectures in which most data analysis runs in the application tier, in programming languages such as Java, Python, C++, and R. At current volumes, such architectures are no longer feasible, because transferring large data sets between the data warehouse and the application tier does not scale.
The Power of MapReduce with the Rich Functionality of an RDBMS
Aster nCluster provides a first in the database world: Aster In-Database MapReduce. MapReduce is a programming model that was invented at Google in 2003 to process large unstructured data sets distributed across thousands of nodes. In-Database MapReduce enables enterprises to harness the power of MapReduce while managing their data in Aster nCluster, a highly scalable relational database for frontline data warehousing.
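For readers new to the programming model, the following is a minimal single-process sketch of the map, shuffle, and reduce steps, using a word count as the example. It is plain Python and does not use any Aster nCluster API; the function names (`map_phase`, `shuffle`, `reduce_phase`, `run_mapreduce`) are hypothetical and chosen only for illustration. A real MapReduce system would run the map and reduce steps in parallel across many nodes.

```python
from collections import defaultdict


def map_phase(record):
    """Emit one (key, value) pair per word in a single input record."""
    for word in record.lower().split():
        yield word, 1


def shuffle(pairs):
    """Group values by key, as a framework would do between the two phases."""
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return groups


def reduce_phase(key, values):
    """Collapse all values for one key into a single result."""
    return key, sum(values)


def run_mapreduce(records):
    """Run map, shuffle, and reduce sequentially in one process."""
    pairs = (pair for record in records for pair in map_phase(record))
    return dict(reduce_phase(k, v) for k, v in shuffle(pairs).items())


counts = run_mapreduce(["big data", "Big analysis", "data data"])
print(counts)
```

Because each call to `map_phase` sees only one record and each call to `reduce_phase` sees only one key, both steps can in principle be distributed without changing the functions themselves.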
Alongside its massively parallel execution environment for standard SQL queries, Aster nCluster now adds the ability to implement flexible MapReduce functions for parallel data analysis and transformation inside the database. Aster nCluster In-Database MapReduce functions are simple to write and integrate seamlessly into SQL statements. They rely on SQL queries to select the underlying data and supply it as input. The functions then manipulate that input procedurally and produce output that can be consumed by further SQL queries or written into tables within the database.
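The flow described above (SQL supplies the input, a procedural function transforms it, and SQL consumes the output) can be sketched as follows. This is an illustration under stated assumptions, not Aster nCluster code: Python's built-in `sqlite3` module stands in for the database, and the `sessionize` function, the `clicks` and `sessions` tables, and the 60-second timeout are all hypothetical.

```python
import sqlite3
from itertools import groupby


def sessionize(rows, timeout=60):
    """Hypothetical procedural function: assign a session id per user.

    Expects (user_id, ts) rows sorted by user_id, then ts.
    Yields (user_id, ts, session_id) rows.
    """
    for user_id, user_rows in groupby(rows, key=lambda r: r[0]):
        session_id, last_ts = 0, None
        for _, ts in user_rows:
            # A gap longer than the timeout starts a new session.
            if last_ts is not None and ts - last_ts > timeout:
                session_id += 1
            last_ts = ts
            yield user_id, ts, session_id


# sqlite3 is only a stand-in for the database in this sketch.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE clicks (user_id TEXT, ts INTEGER)")
conn.executemany(
    "INSERT INTO clicks VALUES (?, ?)",
    [("a", 0), ("a", 30), ("a", 200), ("b", 10), ("b", 500)],
)

# Step 1: a SQL query selects and orders the input rows.
rows = conn.execute(
    "SELECT user_id, ts FROM clicks ORDER BY user_id, ts"
).fetchall()

# Step 2: the function processes the rows; its output is written to a table.
conn.execute(
    "CREATE TABLE sessions (user_id TEXT, ts INTEGER, session_id INTEGER)"
)
conn.executemany("INSERT INTO sessions VALUES (?, ?, ?)", sessionize(rows))

# Step 3: ordinary SQL consumes the function's output.
session_counts = conn.execute(
    "SELECT user_id, COUNT(DISTINCT session_id) FROM sessions "
    "GROUP BY user_id ORDER BY user_id"
).fetchall()
print(session_counts)
```

In this sketch the rows leave the database, are processed, and are written back. The point of an in-database approach is that the function runs where the data lives, so that round trip is avoided.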
In-Database MapReduce unites MapReduce with functionally-rich SQL
Aster nCluster understands the input and output data characteristics to automatically optimize SQL processing and provide fault tolerance, load balancing and workload management for MapReduce functions.
Aster In-Database MapReduce provides:
- Expressive Flexibility - write polymorphic SQL MapReduce (SQL/MR) functions in any popular language, including Java, Python, Perl, and more.
- Reusability of SQL/MR Components - SQL/MR functions, once developed, can be reused by analysts as simple SQL extensions or through standard BI tools. SQL/MR unleashes the power of MapReduce to the entire enterprise.
- High Performance - eliminates I/O bottlenecks by moving computation into the database; dynamically optimizes execution of SQL/MR queries via a cost-based optimizer.
- High Availability & Resource Management - maximizes availability through fault isolation of SQL/MR functions; manages the memory resources consumed by SQL/MR functions.