Back in 2005, when we first founded Aster Data, our vision was to take some of the latest technology innovations – including MPP shared-nothing architectures; Linux-based commodity hardware; and novel analytical interfaces like Google’s MapReduce – and bring them to mainstream enterprises. This vision translated into a strategy focused not only on big data innovations, but also on delivering technologies that make big data viable for enterprise environments. SQL-MapReduce®, our industry-leading patented technology that combines standard SQL processing with a native MapReduce execution environment, is one example of how we make big data enterprise ready.
Today we have completed another major milestone on providing value to our customers by announcing a major innovation: Aster SQL-H™, a seamless way to execute SQL & SQL-MapReduce on Apache™ Hadoop™ data.
This is a significant step forward from what was state-of-the-art until yesterday. What was missing? A common DBMS-Hadoop connector operating at the physical layer. This means that getting data from Hadoop to a database required a Hadoop expert in the middle to do the data cleansing and the data type translation. If the data was not 100% clean (which is the case in most circumstances) a developer was needed to get it to a consistent, proper form. Besides wasting the valuable time of that expert, this process meant that business analysts couldn’t directly access and analyze data in Hadoop clusters. Other database connectors require duplicating the data into HDFS by using proprietary formats; a cumbersome and expensive approach by any measure.
SQL-H, an industry-first, solves all those problems.
First, we have integrated Aster’s metadata engine with Hadoop’s emerging metadata standard, HCatalog. This means that data stored in Hadoop using Pig, Hive & HBase can be “seen” in an Aster system as if they are just another Aster view. The business implication is that a business analyst using standard SQL or a BI tool can have full and seamless access to Hadoop data through the Aster’s standard ODBC/JDBC connector and Aster’s SQL engine. There is no need to have a human in the middle to translate the data or ensure its consistency; and no need to file tickets or call up experts to get the data the business needs. Everything happens transparently, seamlessly, and instantly. This is an industry first, since today all available Hadoop tools either do not provide standard SQL interfaces that are well optimized, do not provide native BI compatibility, or require manual data translation and movement from Hadoop to a third party system. None of these approaches are viable options for SQL & BI execution on Hadoop data, thus making it hard for enterprises to get value from Hadoop.
Secondly, SQL-H provides a high-performance, type-safe data connector, that can take a SQL or SQL-MapReduce query that involves Hadoop data, automatically select the minimum subset of data in Hadoop that is required for execution of the query, and run the query on the Aster system. The performance of running SQL and SQL-MapReduce analytics in Aster is significantly higher than Hadoop because (a) Aster can optimize data partitioning and distribution, thus reducing network transfers and overhead; (b) Aster’s engine can keep statistics about the data and use that to optimize execution of both SQL & MapReduce; (c) Aster’s SQL queries are cost-based-optimized which means that it can handle very complex SQL, including SQL produced by BI tools, very efficiently.
In addition, one can take advantage of SQL-H to apply the 50+ pre-build SQL-MapReduce apps that Teradata Aster provides on Hadoop data, thus doing big data analytics that are impossible to do in every other database without having to write a single line of Java MapReduce code! These apps include functions for path & pattern analysis, statistics, graph, text analysis, and more.
Teradata Aster is committed to groundbreaking product innovation as the key strategy in maintaining our #1 position in the big analytics market. SQL-H is another important step that we expect will make Hadoop and big data analytics much more palatable for enterprise environments, allowing business analysts, SQL power-users & BI tool users to analyze Hadoop data without having to learn about Hadoop interfaces and code.
If you want to find out more we’ll be talking about SQL-H at Hadoop Summit, on webcast taking place June 21st, at the upcoming Big Analytics 2012 events in Chicago & New York, and at the annual Teradata Partners event.