Aster MapReduce Analytics Portfolio:
Supercharge Analytics with SQL-MapReduce®
Analytic applications as varied as digital marketing optimization, social network analysis, fraud detection, and machine data analysis require massively parallel processing of very large data volumes. Until recently, massively parallel processing (MPP) of data for these types of rich analytics required extremely specialized programming skills: a combination of deep SQL skills and parallel programming expertise.
MapReduce, an emerging standard for advanced analytics and data science, allows for parallel processing of terabytes to petabytes of data. However, in its raw form, MapReduce programming is a significant hurdle for many organizations. Aster Data makes MapReduce accessible to every enterprise by coupling MapReduce with standard SQL to deliver analytics through the patented Aster Data SQL-MapReduce® framework. Now even business analysts can leverage the power of MapReduce through the familiarity of SQL.
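For readers new to the model, the following is a minimal, single-process sketch of the map, shuffle, and reduce phases in plain Python. It is not Aster Data code and does not use the SQL-MapReduce API; the clickstream rows and function names are hypothetical and exist only for illustration.

```python
from collections import defaultdict

# Hypothetical clickstream rows: (user_id, page). Illustrative data only.
CLICKS = [
    ("u1", "home"), ("u2", "home"), ("u1", "search"),
    ("u3", "home"), ("u2", "checkout"), ("u1", "checkout"),
]


def map_phase(row):
    """Map: emit one (key, value) pair for each input row."""
    _user_id, page = row
    return (page, 1)


def shuffle(pairs):
    """Shuffle: group values by key so each key can be reduced on its own."""
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return groups


def reduce_phase(key, values):
    """Reduce: collapse all values for one key into a single result."""
    return key, sum(values)


def page_view_counts(rows):
    """Run the three phases in sequence and return page -> view count."""
    mapped = map(map_phase, rows)
    grouped = shuffle(mapped)
    return dict(reduce_phase(key, values) for key, values in grouped.items())


if __name__ == "__main__":
    print(page_view_counts(CLICKS))
```

Each call to `map_phase` depends on one row only, and each call to `reduce_phase` depends on one key only, which is why a parallel engine can spread both phases across many nodes.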
The Aster MapReduce Analytics Portfolio
Aster MapReduce Analytics Portfolio, formerly called the Aster Data Analytic Foundation, provides a suite of ready-to-use SQL-MapReduce modules to accelerate data science application development. Pattern, time series, market basket, graph, and advanced statistical analysis are as simple as writing a single SQL statement to call the appropriate pre-packaged module embedded within the Aster MapReduce Platform, which incorporates new multi-structured data sources and types. Parallel performance, big data scale and analytic richness are only a SQL statement away.
Advantages of the Aster MapReduce Analytics Portfolio for customers include:
- High performance on large data sets: A clickstream analysis that took 6 minutes to run in SQL now runs in 77 seconds with SQL-MapReduce
- Fast development: A 7-step, 350-line SQL query for marketing analysis is now delivered in under 20 lines of SQL-MapReduce
- Richer analytics: Doubling the scope of a market basket analysis requires only a single parameter change in SQL-MapReduce, which any business analyst who knows SQL can make
Aster MapReduce Analytics Portfolio:
A suite of business-ready analytic modules powered by SQL-MapReduce® that run fully embedded in the Aster MapReduce Platform
To enable big data analytics in the enterprise, Aster MapReduce Analytics Portfolio delivers a unique framework that provides:
- Automatic parallelization for applications running in the Aster Database
- 100% embedded processing so that analytic processing is collocated in-database with the data – no sampling required
- Extensive suite of pre-built analytics that are MapReduce-enabled, e.g. pattern, time-series, clustering, graph, market basket, statistical analysis, etc.
- Easily usable by business analysts by coupling SQL with MapReduce for ultra-simple formulation of advanced queries
Aster Database combines MapReduce and SQL within an embedded analytic engine. Aster Data's patented SQL-MapReduce analytics framework, available only on the Teradata Aster MapReduce Platform, enables analytic applications to be automatically parallelized upon deployment into the Aster Data analytic platform. SQL-MapReduce provides:
- Powerful Expressiveness – Dramatic reduction in SQL code complexity and the procedural flexibility to express any ad hoc query using the language of choice (Java, C/C++, Python, Perl, R, etc.).
- Seamless SQL Integration – Users (data scientists, SQL developers, data miners, business users via BI tool) simply plug the SQL-MapReduce function into arbitrarily composable SQL code that they already know.
- Re-Usability – Save significant development time by not having to rewrite a function every time the output changes. Late binding support enables SQL-MapReduce to adapt at the last possible moment (run-time).
- Scalable Performance – Distributed query planning and optimizations apply equally across your complete analytic application, including custom analytic code and SQL-MapReduce programs, as well as standard SQL, enabling high-speed parallel processing and complete distributed application optimization.
- Fault Isolation – Sandboxed containers and process management ensure strict isolation to prevent any single SQL-MapReduce statement from taking down another (e.g. due to poorly written code).
Aster MapReduce Analytics Portfolio modules are processed 100% in-database, bringing analysis close to the data. This avoids forced data sampling and the massive data movement required by traditional systems. By pushing analytic processing down into the database, Aster Database is able to efficiently distribute both data and analytic computations across massively parallel processing nodes, ensuring the highest levels of performance and scalability.
Aster Data customers typically see 10x or greater performance improvement from in-database SQL-MapReduce implementations compared to running standard SQL on a traditional system. More importantly, performance is exceptional on terabytes to petabytes of data, which SQL-only systems cannot deliver.
Overcoming the barriers to advanced analytic adoption means making analytic applications easy to develop and easy to scale. Aster MapReduce Analytics Portfolio provides 40+ pre-built SQL-MapReduce modules that are fully parallelized and ready to deliver business value.
Aster MapReduce Analytics Portfolio includes many pre-built rich analytic packages that simplify usage of MapReduce. Examples include:
- Path and Pattern Analysis — Discover patterns in rows of sequential data
- Statistical Analysis — Process common statistical calculations with exceptionally high performance
- Relational Analysis — Discover important relationships between data points
- Text Analysis — Derive insights from lengthy descriptive fields and textual strings
- Clustering Analysis — Discover natural groupings of data points
- Data Transformation — Transform raw data for more advanced insights
- Data Parsers—Parse raw multi-structured data sources like Apache weblogs and heavily nested XML feeds
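To make the path and pattern category concrete, here is a rough sketch in plain Python of the general idea: partition rows by user, order each partition by time, then look for a sequence of steps. It is an illustration under stated assumptions, not the portfolio's module; the event rows, function names, and the consecutive-match rule are all hypothetical.

```python
from collections import defaultdict

# Hypothetical event rows: (user_id, timestamp, page). Illustrative data only.
EVENTS = [
    ("u1", 3, "checkout"), ("u1", 1, "home"), ("u1", 2, "search"),
    ("u2", 1, "home"), ("u2", 2, "checkout"),
    ("u3", 1, "search"), ("u3", 2, "home"),
    ("u3", 3, "search"), ("u3", 4, "checkout"),
]


def paths_by_user(events):
    """Partition rows by user and order each partition by timestamp."""
    partitions = defaultdict(list)
    for user_id, timestamp, page in events:
        partitions[user_id].append((timestamp, page))
    return {
        user: [page for _timestamp, page in sorted(rows)]
        for user, rows in partitions.items()
    }


def contains_sequence(path, pattern):
    """Return True if `pattern` appears as consecutive steps in `path`."""
    pattern = list(pattern)
    size = len(pattern)
    return any(
        path[start:start + size] == pattern
        for start in range(len(path) - size + 1)
    )


def users_matching(events, pattern):
    """Return the sorted user ids whose ordered path contains `pattern`."""
    return sorted(
        user
        for user, path in paths_by_user(events).items()
        if contains_sequence(path, pattern)
    )


if __name__ == "__main__":
    print(users_matching(EVENTS, ["search", "checkout"]))
```

Because each user's path is built and tested independently, the per-user work could in principle be distributed across nodes, one partition at a time.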
Aster MapReduce Analytics Portfolio modules can be used on their own or in conjunction with one another, with standard SQL, with custom SQL-MapReduce functions or with any analytic logic designed to run in-database in Aster Database.
In addition, Aster Data provides thousands of MapReduce-ready functions for the developer. An extensive library of Java and C packages is available out of the box to speed development of custom SQL-MapReduce analytic applications. Because these packages are written in widely used languages such as Java and C, developers do not have to learn a specialized, proprietary language. Sample packages for the power user include:
- Monte Carlo simulation
- Linear algebra
- And many more
Advanced analytics have become so critical to competing in today's business world that the barrier to using them must be lowered. Aster Data makes it easy to create advanced analytic data science applications by providing an environment and tools that let organizations go beyond the limitations and complexity of SQL analytics, reducing the time and effort needed to deliver deeper analytic insights such as pattern analysis, path analysis, graph analysis, and much more.
Analysts can easily incorporate SQL-MapReduce functions into complete analytic applications, and module parameters make it easy to expand the data scope of those functions. Take the Basket Generator function, for example. Implemented in standard SQL, a market basket analysis function requires tens of additional lines of code to increase the basket size for analysis. With the Aster Data business-ready SQL-MapReduce function, Basket Generator, a business analyst can increase basket size with a single parameter change. This dramatically simplifies and speeds application development, putting the power of iterative, ad hoc analysis directly in the hands of the data analyst.
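The single-parameter idea can be sketched in plain Python. This is a hypothetical illustration, not the Basket Generator module or its actual parameters: the order data, the `generate_baskets` function, and its `basket_size` argument are assumptions made for the example.

```python
from collections import Counter
from itertools import combinations

# Hypothetical transactions: order_id -> items purchased. Illustrative only.
ORDERS = {
    "o1": ["bread", "milk", "eggs"],
    "o2": ["bread", "milk"],
    "o3": ["milk", "eggs", "bread"],
}


def generate_baskets(orders, basket_size=2):
    """Count how often each combination of `basket_size` items is bought together."""
    counts = Counter()
    for items in orders.values():
        # Sort the distinct items so the same combination always has one key.
        unique_items = sorted(set(items))
        counts.update(combinations(unique_items, basket_size))
    return counts


if __name__ == "__main__":
    print(generate_baskets(ORDERS))                  # pairs of items
    print(generate_baskets(ORDERS, basket_size=3))   # triples of items
```

Moving from pairs to triples changes only the `basket_size` argument; the counting logic stays the same, which is the kind of scope change the paragraph above describes.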
Data Sheet: Aster Database
Data Sheet: Aster MapReduce Analytics Portfolio
Whitepaper: In-Database Analytics with R
Advanced In-Database Analytics Done Right
Research Report: MapReduce and the Data Scientist
Recent attempts to bring analytic logic into databases as user defined functions or stored procedures are a step in the right direction, but inherently limited because most databases aren't optimized for application logic. Aster Data has tackled this issue by embedding the equivalent of an application server in the database, such that application logic is fully parallelized for maximum speed and scalability with advanced data analytics.
Philip Russom, Senior Manager of TDWI Research