About one year ago, Teradata Aster launched a powerful new way of integrating a database with Hadoop. With Aster SQL-H™, users of the Teradata Aster Discovery Platform got the ability to issue SQL and SQL-MapReduce® queries directly on Hadoop data as if that data had been in Aster all along. This level of simplicity and performance was unprecedented, and it enabled BI & SQL analysts who knew nothing about Hadoop to access Hadoop data and discover new information through Teradata Aster.
This innovation was not a one-off. Teradata has put forward the most complete vision for a data and analytics architecture in the 21st century. We call that the Unified Data Architecture™. The UDA combines Teradata, Teradata Aster & Hadoop into a best-of-breed, tightly integrated ecosystem of workload-specific platforms that provide customers the most powerful and cost-effective environment for their analytical needs. With Aster SQL-H™, Teradata provided a level of software integration between Aster & Hadoop that was, and still is, unchallenged in the industry.
Teradata Unified Data Architecture™
Today, Teradata makes another leap in making its Unified Data Architecture™ vision a reality. We are announcing SQL-H™ for Teradata, bringing the best SQL engine for data warehousing and analytics to Hadoop. From now on, Enterprises that use Hadoop to store large amounts of data will be able to utilize Teradata’s analytics and data warehousing capabilities to directly query Hadoop data securely through ANSI standard SQL and BI tools by leveraging the open source Hortonworks HCatalog project. This is fundamentally the best and tightest integration between a data warehouse engine and Hadoop that exists in the market today. Let me explain why.
It is interesting to consider Teradata’s approach versus the alternatives. If one wants to execute SQL on Hadoop with the intent of building Data Warehouses out of Hadoop data, there are not many realistic options. Most databases have very poor integration with Hadoop and require Hadoop experts to manage the overall system – not a viable option for most Enterprises due to cost. SQL-H™ removes this requirement for Teradata/Hadoop deployments. Another “option” is the crop of SQL-on-Hadoop tools that have started to emerge; unfortunately, they are about a decade away from becoming sufficiently mature to handle true Data Warehousing workloads. Finally, the approach of taking a database and shoving it inside Hadoop has significant issues, since it suffers from the worst of both worlds – Hadoop activity has to be limited so that it doesn’t disrupt the database, data is duplicated between HDFS and the database store, and the database’s performance is worse than that of a stand-alone version.
In contrast, a Teradata/Hadoop deployment with SQL-H™ offers the best of both worlds: unprecedented performance and reliability in the Teradata layer; seamless BI & SQL access to Hadoop data via SQL-H™; and Hadoop freed up to perform data processing tasks at full efficiency.
Teradata is committed to being the strategic advisor of the Enterprise when it comes to Data Warehousing and Big Data. Through its Unified Data Architecture™ and today’s announcement on Teradata SQL-H™, it provides even more performance, flexibility and cost-effective options to Enterprises eager to use data as a competitive advantage.
Today Aster took a significant step forward, making it easier for developers to build fraud detection, financial risk management, telco network optimization, customer targeting and personalization, and other advanced, interactive analytic applications.
Along with the release of Aster Data nCluster 4.5, we added a new Solution Partner level for systems integrators and developers.
Why is this relevant?
Recession or no recession, IT executives are constantly challenged. They are asked to execute strategies based on better analytics and information to improve the effectiveness of business processes (customer loyalty, inventory management, revenue optimization, etc.), while staying on top of technology-based disruptions and managing shrinking or flat IT budgets.
IT organizations have taken on the challenge by building analytics-based offerings, leveraging existing data management skills and increasingly taking advantage of MapReduce, a disruptive technology introduced by Google and now being rapidly adopted by mainstream enterprise IT shops in Finance, Telco, Life Sciences, Government, and other verticals.
As MapReduce and big data analytics go mainstream, our customers and ecosystem partners have asked us to make it easier for their teams to leverage MapReduce across enterprise application lifecycles, while harvesting existing IT skills in SQL, Java and other programming languages. The Aster development team that brought us the SQL-MapReduce® innovation has now delivered the market’s first integrated visual development environment for developing, deploying and managing MapReduce and SQL-based analytic applications.
Enterprise MapReduce developers and system integrators can now leverage the integrated Aster platform and deliver compelling business results in record time.
I am delighted with the rapid adoption of Aster Data’s platform by our partners and the strong continued interest from enterprise developers and system integrators in building big data applications using Aster. New partners are endorsing our vision and technical innovation as the future of advanced analytics for large data volumes.
Sign up today to be an Aster solution partner and join the revolution to deliver compelling information and analytics-driven solutions.
When you hear the word “warehouse,” you normally think of an oversized building with high ceilings and a ton of storage space. In the data warehousing world, it’s all too easy to fill that space faster than expected. Even companies with predictable data growth trajectories don’t want to pay for storage space they won’t need for months or even years out. For either type of company, the ability to scale on-demand, and to the appropriate degree, is critical.
That’s why I’m so excited about a webinar we are hosting next week with James Kobielus, Senior Analyst for Forrester Research. In case you haven’t read it, James recently released his report “Massive But Agile: Best Practices for Scaling the Next-Generation Data Warehouse.” In the report, James thoroughly addresses several issues around scalability for which Aster is well suited (parallelism, optimized storage, in-database analytics, etc.).
We’ll get into much more detail on these and other issues over the course of the webinar. If you haven’t had a chance yet, please register for the webinar to hear what James, a leader and visionary in the industry, has to say. And make sure to leave a comment below if there are any facets of data warehouse scalability that you would like us to cover.
I am excited to be announcing Aster’s Global Partner Program, which will be singularly focused on empowering our software and service provider partners to grow robust, profitable businesses solving rich, big data analytics challenges for end customers.
Aster is leading a revolution in frontline data warehousing and analytic solutions and has shown success with several marquee customers. Earlier this year we launched our channel efforts, and as I look ahead, channel partners will play a critical role in Aster’s strategy and success. Through industry domain expertise, specialized data management knowledge, and hands-on experience, our partners extend Aster’s offerings, helping our customers maximize their investment and benefit from innovative “Aster-powered” solutions.
In this blog, I want to focus on outlining the differentiating value Aster brings to partner service providers and independent software (application) vendors.
In a recent analyst briefing, after presenting Aster’s company and product differentiators, I was asked why a service provider (system integrator) gets excited about working with Aster.
I responded with my top 3 reasons:
- Gain access to new “big-data” (e.g. risk management, customer targeting, churn analysis, customer behavior analysis) projects at enterprises across vertical domains
- Deliver real economic benefits to enterprises by changing the business of enterprise data warehousing (challenging current norms of scale, performance and price)
- Opportunity to build a high-margin, competitive, domain-specific services practice working with the world-class Aster product team
As we continue to push the envelope of technical innovation with our application-friendly relational database, we are witnessing a surge of interest from application software vendors (and developers) who realize that analytics and big data management cannot be an afterthought.
Savvy application developers realize storing and analyzing structured and semi-structured user and usage data are critical to success. Being able to plan and provision a robust, proven internet-scale database for current and growing data needs is now a necessity.
With the rapid consolidation in the enterprise application market (thank you, Oracle) and constant pressure to harvest the economic value of business data, we are noticing a shift in application development:
- Application developers and vendors want a data platform that scales elastically with the business (and is not held captive to proprietary hardware vendor lock-in), while being flexible enough to be deployed on-premise or in the cloud
- Demand for a data platform that can seamlessly combine the power of relational (SQL) with modern frameworks for big data processing (MapReduce), with an overall lower total cost of ownership so developers can focus on applications (and not on the manageability, scalability, and reliability of their data)
Back in March 2005, I attended the AFCOM Data Center World Conference while working at NetApp. It was a great opportunity to learn about enterprise data center challenges and network with some very experienced folks. One thing that caught my attention was a recurring theme on growing power & cooling challenges in the data center.
Vendors, consultants, and end-user case study sessions trumpeted dire warnings that the proliferation of powerful 1U blade servers would result in power demands outstripping supply (for example, a typical 42U rack consumed 7-10 kW, while new-generation blade servers were said to exhibit peak rack heat loads of 15-25 kW). In fact, estimates were that HVAC cooling (to remove heat emissions) was an equally significant power consumer (i.e., for every watt you burn to power the hardware, you burn another watt to cool it down).
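That rack-power arithmetic can be sketched in a few lines. The figures below are the conference-era ballpark estimates quoted above, not measurements, and the 1-watt-of-cooling-per-watt-of-IT-load assumption is the rough rule of thumb just mentioned:

```python
# Back-of-the-envelope data center power math, using the rough
# figures above: ~1 W of cooling for every 1 W of IT load.

def total_rack_power_kw(it_load_kw, cooling_watts_per_watt=1.0):
    """IT load plus the HVAC power needed to remove its heat."""
    return it_load_kw * (1.0 + cooling_watts_per_watt)

legacy_rack = 8.5   # midpoint of the 7-10 kW quoted for a typical 42U rack
blade_rack = 20.0   # midpoint of the 15-25 kW quoted for dense blade racks

print(total_rack_power_kw(legacy_rack))  # 17.0 kW at the wall
print(total_rack_power_kw(blade_rack))   # 40.0 kW at the wall
```

At these estimates, a dense blade rack more than doubles the at-the-wall draw of a legacy rack once cooling is counted.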
Not coincidentally, 2005 marked the year when many server, storage, and networking vendors came out with “green” messaging. The idea was to promote technologies that reduce power consumption and heat emissions, saving both money and the environment. While some had credible stories (e.g., VMware), more often than not the result was me-too bland positioning or sheer hype (also known as “greenwashing”).
Luckily, Aster doesn’t suffer from this, as the architecture was designed for cost-efficiency (both people costs and facilities costs). Among many examples:
- Heterogeneous scaling: we use commodity hardware, but the real innovation is making new servers work with pre-existing older ones. This saves power & cooling costs because rather than having to create a new cluster from scratch (which requires new Queen nodes, new Loader nodes, more networking equipment, etc.), you can just plug in new-generation Worker nodes and scale out on the existing infrastructure.
- Multi-layer scaling: a related concept is that nCluster doesn’t require the same hardware for each “role” in the data warehousing lifecycle. This division-of-labor approach ensures cost-effective scaling and power efficiency. For example, Loader nodes are focused on ultra-fast partitioning and loading of data – since data doesn’t persist to disk, these servers contain minimal spinning disk drives to save power. At the opposite end, Backup nodes are focused on storing full/incremental backups for data protection – typically these nodes are “bottom-heavy” and contain lots of high-capacity SATA disks for power efficiency benefits (fewer servers, fewer disk drives, slower-spinning 7.2K RPM drives).
- Optimized partitioning: one of our secret-sauce algorithms maximizes the locality of joins via intelligent data placement. As a result, less data transfers over the network, which means IT orgs can stretch their existing network assets (without having to buy more networking gear and burn power).
- Compression: we love to compress things. Tables, cross-node transfers, backup & recovery, etc. all leverage compression algorithms to achieve 4x-12x compression ratios – this means fewer spinning disk drives to store data and lower power consumption.
…and others (too many to list in a short blog like this)
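The join-locality idea behind optimized partitioning can be illustrated with a toy sketch. This is a generic hash co-partitioning example, not Aster’s actual placement algorithm, and all table and worker names are hypothetical: when two tables are distributed by hashing the same join key over the same set of workers, matching rows always land on the same worker, so the join crosses no network link.

```python
# Toy illustration of join locality via hash co-partitioning.
# If both tables place rows by hashing the join key with the same
# function, rows that join always share a worker node.

N_WORKERS = 4

def worker_for(key):
    # Same placement function for both tables.
    return hash(key) % N_WORKERS

orders   = [(101, "ord-a"), (205, "ord-b"), (309, "ord-c")]
payments = [(101, "pay-a"), (205, "pay-b"), (309, "pay-c")]

# Distribute each row to a worker by hashing its join key (first column).
placement = {w: {"orders": [], "payments": []} for w in range(N_WORKERS)}
for key, val in orders:
    placement[worker_for(key)]["orders"].append((key, val))
for key, val in payments:
    placement[worker_for(key)]["payments"].append((key, val))

# Every matching pair of keys is co-located, so each join is node-local.
for key, _ in orders:
    w = worker_for(key)
    assert any(k == key for k, _ in placement[w]["payments"])
```

The same property is what lets a distributed database skip the expensive row-shuffling step for joins on the distribution key; joins on any other column would still require network transfer.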
I’d love to continue the conversation with IT folks passionate about power consumption…what are your top challenges today and what trends do you see in power consumption for different applications in the data center?
We are very excited that OnMedia has just announced Aster as one of the winners of the AlwaysOn OnMedia 100, a listing of the top 100 private, emerging technology companies in the advertising, publishing, marketing, branding and PR spaces. As a technology enabler for many media clients like MySpace, Invite Media, aCerno, Aggregate Knowledge (and others soon to be announced), we understand the pressures that media faces today. Our customers are a great testament to the fact that Aster has the best solution to meet the rapidly changing needs of media, to keep up with the huge amounts of data they are managing for themselves and their end clients. More info on Aster’s win here.
If you had told me that Aster Data Systems would be referenced in Gartner’s Magic Quadrant for Data Warehouse Database Management Systems report just 6 months after coming out of stealth mode, I would have said you were pretty ambitious. There are some pretty high criteria for being placed, such as having a generally available product for more than one year, as well as over 10 customers in production.
Well, we were included, even if it was only a mention. And I’m proud to say that, as of the publishing date, I believe the only criterion that held us back from being placed on it was having a product available for less than one year.
Although Aster’s high-performance analytic database, Aster nCluster, has been generally available only since May, we’re actually on our third major release tested in customer environments. We were busy building out its functionality and testing Aster nCluster in some pretty rigorous frontline data warehouses such as Aggregate Knowledge and MySpace, where we face strict requirements for scale (100+ terabytes) and uptime (no unplanned downtime for more than one year).
2008 was a break-out year for Aster. 2009 will be even more exciting as we continue to set new limits for what customers expect of a relational database management system.
I suppose the next thing you’ll tell me is that we should be placed in the Leaders or Visionaries Quadrants on the chart? That’s ambitious. But it’s just the sort of thing this team and product are capable of.
The Magic Quadrant is copyrighted 2008 by Gartner, Inc. and is reused with permission. The Magic Quadrant is a graphical representation of a marketplace at and for a specific time period. It depicts Gartner’s analysis of how certain vendors measure against criteria for that marketplace, as defined by Gartner. Gartner does not endorse any vendor, product or service depicted in the Magic Quadrant, and does not advise technology users to select only those vendors placed in the “Leaders” quadrant. The Magic Quadrant is intended solely as a research tool, and is not meant to be a specific guide to action. Gartner disclaims all warranties, express or implied, with respect to this research, including any warranties of merchantability or fitness for a particular purpose.
I recently wrote an article with Enterprise Systems Journal on how retailers can benefit from innovations in “always on” databases. The story is based on the real-life experiences of one of our customers during the last holiday shopping season. They saw a spike in traffic and quickly scaled the frontline data warehouse they had built with Aster Data Systems, allowing them to maintain their service-level agreements with the business for product cross-promotion without having to wait for an off-season upgrade.
After next week, let us know – how did your favorite e-tailers fare on ‘Black Friday’ and ‘Cyber Monday’? Did their recommendation engines deliver, or did they leave cash in your wallet that got spent elsewhere?
I was at Defrag 2008 yesterday and it was a wonderful, refreshing experience. A diverse group of Web 2.0 veterans and newcomers came together to accelerate the “Aha!” moment in today’s online world. The conference was very well organized and there were interesting conversations on and off the stage.
The key observation was that individuals, groups and organizations are struggling to discover, assemble, organize, act on, and gather feedback from data. Data itself is growing and fragmenting at an exponential pace. We as individuals feel overwhelmed by the slew of data (messages, emails, news, posts) in the microcosm, and we as organizations feel overwhelmed in the macrocosm.
The very real danger is that an individual or organization’s feeling of being constantly overwhelmed could result in the reduction of their “Aha!” moments – our resources will be so focused on merely keeping pace with new information that we won’t have the time or energy to connect the dots.
The goal then is to find tools and best practices to enable the “Aha!” moments – to connect the dots even as information piles up at our fingertips.
My thought going into the conference was that we need to understand what causes these “Aha!” moments. If we understand the cause, we can accelerate the “Aha!” even at scale.
Earlier this year, Janet Rae-Dupree published an insightful piece in the International Herald Tribune on Reassessing the Aha! Moment. Her thesis is that creativity and innovation – “Aha! Moments” – do not come in flashes of pure brilliance. Rather, innovation is a slow process of accretion, building small insight upon interesting fact upon tried-and-true process.
Building on this thesis, I focused my talk on using frontline data warehousing as an infrastructure piece that allows organizations to collect, store, analyze and act on market events. The incremental fresh data loads in a frontline data warehouse add up over time to build a stable historical context. At the same time, applications can contrast fresh data with historical data to build the small contrasts gradually until the contrasts become meaningful to act upon.
I’d love to hear back from you on how massive data can accelerate, rather than impede, the “Aha!” moment.
In a down economy, marketing and advertising are some of the first budgets to get cut. However, recessions are also a great time to gain market share from your competitors. If you take a pragmatic, data-driven approach to your marketing, you can be sure you’re getting the most ROI from every penny spent. It is not a coincidence that in the last recession, Google and Advertising.com came out stronger since they provided channels that were driven by performance metrics.