Aster nCluster: In-Database MapReduce

The "Big Data" Analysis Challenge

Exponential data growth coupled with the desire to do more with data is forcing organizations to seek newer methods of analyzing data. The limited functionality of SQL has traditionally forced companies toward architectures that required the majority of data analysis to be done in the application-tier using programming languages such as Java, Python, C++, R, etc. At current volumes, such architectures are no longer feasible, as transferring big data volumes between the data warehouse and application-tier does not scale.

The Power of MapReduce with the Rich Functionality of an RDBMS

Aster nCluster provides a first in the database world: Aster In-Database MapReduce. MapReduce is a programming model that was invented at Google in 2003 to process large unstructured data sets distributed across thousands of nodes. In-Database MapReduce enables enterprises to harness the power of MapReduce while managing their data in Aster nCluster, a highly scalable relational database for frontline data warehousing.
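For readers new to the programming model, the sketch below shows the general shape of a MapReduce job in plain Python: a map step emits key/value pairs, the pairs are grouped by key, and a reduce step collapses each group. It runs on a single machine and is not Aster nCluster code; the names `map_words`, `reduce_counts`, and `run_mapreduce` are hypothetical and chosen only for illustration.

```python
from collections import defaultdict


def map_words(line):
    """Map step: emit a (word, 1) pair for every word in one input record."""
    for word in line.lower().split():
        yield word, 1


def reduce_counts(word, counts):
    """Reduce step: collapse all values seen for one key into a single result."""
    return word, sum(counts)


def run_mapreduce(records, mapper, reducer):
    """Single-process stand-in for the framework: map, group by key, reduce."""
    groups = defaultdict(list)
    for record in records:
        for key, value in mapper(record):
            groups[key].append(value)
    return dict(reducer(key, values) for key, values in groups.items())


lines = ["big data big analysis", "data in the database"]
word_counts = run_mapreduce(lines, map_words, reduce_counts)
print(word_counts)
```

In a distributed setting the grouping step is where data would be repartitioned across nodes, so that each reducer sees every value for the keys it owns.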

In addition to its massively parallel execution environment for standard SQL queries, Aster nCluster now lets users implement flexible MapReduce functions for parallel data analysis and transformation inside the database. Aster nCluster In-Database MapReduce functions are simple to write and are seamlessly integrated within SQL statements. They rely on SQL queries to manipulate the underlying data and provide input. The functions can procedurally manipulate that input and produce output that can be consumed by further SQL queries or written into tables within the database.
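The idea of procedural functions that take their input from SQL and hand their output back to SQL can be illustrated by analogy with Python's standard `sqlite3` module, which also lets user-written functions be called from inside a query. This is only an analogy on a single-node embedded database: it does not use Aster nCluster or SQL/MR syntax, it is not parallel, and the table `clicks` and the functions `domain_of` and `distinct_count` are hypothetical names made up for the example.

```python
import sqlite3


def domain_of(url):
    """Row-wise ("map"-like) step: extract the host part of a URL."""
    return url.split("/")[2] if "://" in url else url.split("/")[0]


class DistinctCount:
    """Group-wise ("reduce"-like) step: count distinct values in each group."""

    def __init__(self):
        self.seen = set()

    def step(self, value):
        self.seen.add(value)

    def finalize(self):
        return len(self.seen)


conn = sqlite3.connect(":memory:")
# Register the procedural code so that SQL statements can call it by name.
conn.create_function("domain_of", 1, domain_of)
conn.create_aggregate("distinct_count", 1, DistinctCount)

conn.execute("CREATE TABLE clicks (user_id TEXT, url TEXT)")
conn.executemany(
    "INSERT INTO clicks VALUES (?, ?)",
    [
        ("u1", "http://a.example/x"),
        ("u2", "http://a.example/y"),
        ("u1", "http://b.example/z"),
        ("u1", "http://a.example/w"),
    ],
)

# SQL supplies the input rows; the functions' output is written to a table.
conn.execute(
    """
    CREATE TABLE domain_visitors AS
    SELECT domain_of(url) AS domain, distinct_count(user_id) AS visitors
    FROM clicks
    GROUP BY domain_of(url)
    """
)

# The result is ordinary relational data that later queries can consume.
rows = conn.execute(
    "SELECT domain, visitors FROM domain_visitors ORDER BY domain"
).fetchall()
print(rows)
```

Once registered, the functions are used like any other SQL expression, which is the property that lets analysts reuse them without writing procedural code themselves.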

In-Database MapReduce unites MapReduce with functionally-rich SQL


Aster nCluster understands the input and output data characteristics to automatically optimize SQL processing and provide fault tolerance, load balancing and workload management for MapReduce functions. Aster In-Database MapReduce provides:

  • Expressive Flexibility - polymorphic SQL MapReduce (SQL/MR) functions can be written in any popular language, including Java, Python, Perl, and more.
  • Reusability of SQL/MR Components - SQL/MR functions that are developed can be reused by analysts as simple SQL extensions or through standard BI tools. SQL/MR unleashes the power of MapReduce to the entire enterprise.
  • High Performance - eliminates I/O bottlenecks by moving computation into the database; dynamically optimizes execution of SQL/MR queries via a cost-based optimizer.
  • High Availability & Resource Management - maximizes availability through fault isolation of SQL/MR functions; manages memory resources consumed by SQL/MR functions.

Read much more on In-Database MapReduce on our blog

Download Now:
Scaling Up to Support Large-Scale Reporting and Analytics
Beyond Reporting: Requirements for Large-Scale Analytics
Technical Details: In-Database MapReduce
Overview/Demo: In-Database MapReduce
"We're excited about In-Database MapReduce and the promise it offers of scalable execution of advanced statistics without having to move data to a separate statistics platform."

Scott Becker, CTO of Invite Media

"Aster is already pushing the envelope for in-database analytics … through its unique integration of SQL with MapReduce … that scales to potentially thousands of nodes and petabytes of data."

Wayne Eckerson, Director of Research for TDWI