About one year ago, Teradata Aster launched a powerful new way of integrating a database with Hadoop. With Aster SQL-H™, users of the Teradata Aster Discovery Platform got the ability to issue SQL and SQL-MapReduce® queries directly on Hadoop data as if that data had been in Aster all along. This level of simplicity and performance was unprecedented, and it enabled BI & SQL analysts that knew nothing about Hadoop to access Hadoop data and discover new information through Teradata Aster.
This innovation was not a one-off. Teradata has put forward the most complete vision for a data and analytics architecture in the 21st century. We call that the Unified Data Architecture™. The UDA combines Teradata, Teradata Aster & Hadoop into a best-of-breed, tightly integrated ecosystem of workload-specific platforms that provide customers the most powerful and cost-effective environment for their analytical needs. With Aster SQL-H™, Teradata provided a level of software integration between Aster & Hadoop that was, and still is, unchallenged in the industry.
Figure: Teradata Unified Data Architecture™
Today, Teradata makes another leap in making its Unified Data Architecture™ vision a reality. We are announcing SQL-H™ for Teradata, bringing the best SQL engine for data warehousing and analytics to Hadoop. From now on, Enterprises that use Hadoop to store large amounts of data will be able to utilize Teradata’s analytics and data warehousing capabilities to directly query Hadoop data securely through ANSI standard SQL and BI tools by leveraging the open source Hortonworks HCatalog project. This is fundamentally the best and tightest integration between a data warehouse engine and Hadoop that exists in the market today. Let me explain why.
It is interesting to consider Teradata’s approach versus alternatives. If one wants to execute SQL on Hadoop, with the intent of building Data Warehouses out of Hadoop data, there are not many realistic options. Most databases have very poor integration with Hadoop, and require Hadoop experts to manage the overall system – not a viable option for most Enterprises due to cost. SQL-H™ removes this requirement for Teradata/Hadoop deployments. Another “option” is the set of SQL-on-Hadoop tools that have started to emerge; unfortunately, they are about a decade away from becoming sufficiently mature to handle true Data Warehousing workloads. Finally, the approach of taking a database and shoving it inside Hadoop has significant issues since it suffers from the worst of both worlds – Hadoop activity has to be limited so that it doesn’t disrupt the database, data is duplicated between HDFS and the database store, and the performance of the database is lower than that of a stand-alone version.
In contrast, a Teradata/Hadoop deployment with SQL-H™ offers the best of both worlds: unprecedented performance and reliability in the Teradata layer; seamless BI & SQL access to Hadoop data via SQL-H™; and it frees up Hadoop to perform data processing tasks at full efficiency.
Teradata is committed to being the strategic advisor of the Enterprise when it comes to Data Warehousing and Big Data. Through its Unified Data Architecture™ and today’s announcement on Teradata SQL-H™, it provides even more performance, flexibility and cost-effective options to Enterprises eager to use data as a competitive advantage.
Ever since Aster Data became part of Teradata a couple of years ago, we have been fortunate to have the resources and focus to accelerate our rate of product innovation. In the past 8 months alone, we have led the market in deploying big analytics on Hadoop and introducing an ultra-fast appliance for discovering big data insights. Our focus is to provide the market with the best big data discovery platform; that is, the most efficient, cost-effective, and enterprise-friendly way to extract valuable business insights from massive piles of structured and unstructured data.
Today I am excited to announce another significant innovation that extends our lead in this direction. For the first time, we are introducing in-database, SQL-MapReduce-based visualization functions, as part of the Teradata Aster Discovery Platform 5.10 software release. These are functions that take the output of an analytical process (either SQL or MapReduce) and create an interactive data visualization that can be accessed directly from our platform through any web browser. There are several functions that we are introducing with today’s announcement, including functions that let you visualize flows of people or events, graphs, and arbitrary patterns. These functions complement your existing BI solution by extending the types of information you can visualize without adding the complexity of another BI deployment.
It did take some significant engineering effort and innovation from our field in working with customers to make a discovery platform produce in-database, in-process visualizations. So, why bother? Because these functions have three powerful characteristics: they are beautiful, powerful, and instant. Let me elaborate in reverse order.
Instant: the goal of a discovery platform like Aster’s is to accelerate the hypothesis -> analysis -> validation iteration process. One of the major big data challenges is that the data is so complex that you don’t even know what questions to ask. So you start with 10s or 100s of possible questions that you need to quickly implement and validate until you find the couple of questions that extract the gold nuggets of information from the data. Besides analyzing the data, having access to instant visualizations can help data scientists and business analysts understand if they are on the right path to finding the insights they’re looking for. Being able to rapidly analyze and – now – visualize the insights in-process can rapidly accelerate the discovery cycle and cut an analyst’s time and cost by more than 80%, as has been recently validated.
Powerful: Aster comes with a broad library of pre-built SQL-MapReduce functions. Some of the most powerful, like nPath, crunch terabytes of customer or event data and produce patterns of activity that yield significant insights in a single pass of the data, regardless of the complexity of the pattern or history being analyzed. In the past, visualizing these insights required a lot of work, even after the insight was generated, because there were no specialized visualization tools that could consume the insight as-is to produce the visualizations. Abstracting the insights in order to visualize them is sub-optimal since it kills the ‘a-ha!’ moment. With today’s announcement, we provide analysts with the ability to natively visualize concepts such as a graph of interactions or patterns of customer behavior with no compromises and no additional effort!
Beautiful: We all know that numbers and data are only as good as the story that goes with them. By having access to instant, powerful and also aesthetically beautiful in-database visualizations, you can do justice to your insights and communicate them effectively to the rest of the organization, whether that means business clients, executives, or peer analysts.
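To make the idea of path and pattern analysis more concrete, here is a minimal Python sketch of the kind of question a function like nPath answers: which customers followed a given sequence of events? This is not Aster's nPath syntax or implementation. The event names, the sample rows, and the helper functions `build_paths` and `match_paths` are all assumptions invented for illustration, and the sketch runs in a single process rather than in-database.

```python
import re
from collections import defaultdict

# Hypothetical event log: (customer_id, timestamp, event). Invented sample data.
EVENTS = [
    ("c1", 1, "web"), ("c1", 2, "call_center"), ("c1", 3, "call_center"), ("c1", 4, "cancel"),
    ("c2", 1, "web"), ("c2", 2, "store"), ("c2", 3, "purchase"),
    ("c3", 1, "call_center"), ("c3", 2, "cancel"),
]

# One-letter symbols so a path can be matched with an ordinary regular expression.
SYMBOLS = {"web": "W", "call_center": "C", "store": "S", "purchase": "P", "cancel": "X"}


def build_paths(events):
    """Group events by customer and order them by time into a symbol string."""
    by_customer = defaultdict(list)
    for customer, ts, event in events:
        by_customer[customer].append((ts, SYMBOLS[event]))
    return {c: "".join(sym for _, sym in sorted(rows)) for c, rows in by_customer.items()}


def match_paths(paths, pattern):
    """Return customers whose ordered path contains the pattern, with the matched part."""
    compiled = re.compile(pattern)
    hits = {}
    for customer, path in paths.items():
        found = compiled.search(path)
        if found:
            hits[customer] = found.group(0)
    return hits


paths = build_paths(EVENTS)
# Pattern: one or more call-center contacts immediately followed by a cancellation.
churn_hits = match_paths(paths, r"C+X")
print(churn_hits)
```

Encoding each event as a single symbol keeps the pattern language familiar (a regular expression over an ordered path). The matched paths are the kind of output a flow or pattern visualization would then consume.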
In addition, with this announcement we are introducing four buckets of pre-built SQL-MapReduce functions, i.e., Java functions that can be accessed through a familiar SQL or BI interface. These buckets are Data Acquisition (connecting to external sources and acquiring data); Data Preparation (manipulating structured and unstructured data to quickly prepare for analysis); Data Analytics (everything from path and pattern analysis to statistics and marketing analytics); and Data Visualization (introduced today). This is the most powerful collection of big data tools available in the industry today, and we’re proud to provide them to our customers.
Figure: Teradata Aster Discovery Portfolio
Our belief is that our industry is still scratching the surface in terms of providing powerful analytical tools to enterprises that help them find more valuable insights, more quickly and more easily. With today’s launch, the Teradata Aster Discovery Platform reconfirms its lead as the most powerful and enterprise-friendly tool for big data analytics.
Last month in New York we completed the 4th and final event in the Big Analytics 2012 roadshow. This series of events shared ideas on practical ways to address the big data challenge in organizations and change the conversation from “technology” to “business value”. In New York alone, 500 people attended from across both business and IT, and we closed out the event with two speaker panels. The data science panel was, in my opinion, one of the most engaging and interesting panels I’ve ever seen at an event like this. The topic was whether organizations really need a data scientist (and what’s different about the skill set from other analytic professionals). Mike Gualtieri from Forrester Research did a great job leading & prodding the discussion.
Overall, these events were a great way to learn and network. The events had great speakers from cutting-edge companies, universities, and industry thought-leaders including LinkedIn, DJ Patil, Barnes & Noble, Razorfish, Gilt Groupe, eBay, Mike Gualtieri from Forrester Research, Wayne Eckerson, and Mohan Sawhney from Kellogg School of Management.
As an aside, I’ve long observed that there has been a historic disconnect between marketing groups and the IT organizations and data warehouses that support them. I noticed this first when I worked at Business Objects, where very few reporting applications ever included Web clickstream data. The marketing department always used a separate tool or application like WebSideStory (now part of Adobe) to handle this. There is a bridge being built to connect these worlds – both in terms of technology which can handle web clickstream and other customer interaction data, and in terms of new analytic techniques which make it easier for marketing/business analysts to understand their customers more intimately and better serve them a relevant experience.
We ran a survey at the events, and I wanted to share some top takeaways. The events were split into business and technical tracks with themes of “data science” and “digital marketing”. Thus, the survey data lets us compare the responses of those who were more interested in the technology content with the responses of those who were more interested in the business content. The survey data includes responses from 507 people in San Francisco, 322 in Boston, 441 in Chicago, and 894 in New York City for a total of 2164 respondents.
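As a quick sanity check on the respondent counts above, the short Python sketch below tallies the four cities and shows each city's share of the total. The only inputs are the four counts quoted in the text. The variable names are illustrative, and the rounding to one decimal place is an arbitrary choice.

```python
# Respondent counts per city, as quoted in the text above.
respondents = {
    "San Francisco": 507,
    "Boston": 322,
    "Chicago": 441,
    "New York City": 894,
}

# Total number of respondents across all four events.
total = sum(respondents.values())

# Each city's share of the total, in percent, rounded to one decimal place.
shares = {city: round(100 * n / total, 1) for city, n in respondents.items()}

print(total)
for city, pct in shares.items():
    print(f"{city}: {pct}%")
```

New York City accounts for roughly two fifths of all responses, which is worth keeping in mind when reading the cross-city averages.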
You can get the full set of graphs here, but here are a couple of my own observations / conclusions in looking at the data:
1) “Who is talking about big data analytics in your organization?” – IT and Marketing were by far the largest responses, with nearly 60% of IT organizations and 43% of marketing departments talking about it. New York had a slightly higher share of CIOs and CEOs talking about big data, at 23% and 21%, respectively.

2) “Where is big data analytics in your company?” – Across all cities, “customer interactions in Web/social/mobile” was 62% – the biggest area of big data analytics. With all the hype around machine/sensor data, it was surprisingly only being discussed in 20% of organizations. Since web servers and mobile devices are machines, it would have been interesting to see what the “machine generated data” responses would have been if we had taken the more specific example of customer interactions away.

3) This chart is a more detailed breakdown of the areas where big data analytics is found, broken down by city. NYC has a few more “other.” Some of the “other” answers in NYC included:
- Claims
- Client Data Cloud
- Development, and Data Center Systems
- Customer Solutions
- Data Protection
- Education
- Financial Transaction
- Healthcare data
- Investment Research
- Market Data
- Predictive Analytics (sales and servicing)
- Research
- Risk management /analytics
- Security

4) “What are the Greatest Big Analytics Application Opportunities for Businesses Today?” – on average, general “data discovery or data science” was highest at 72%, with “digital marketing optimization” second at just under 60% of respondents. In New York, “fraud detection and prevention” at 39% was slightly higher than in other cities, perhaps tied to the number of financial institutions in attendance.

In summary, there are lots of applications for big data analytics, but having a discovery platform which supports iterative exploration of ALL types of data and can support both business/marketing analysts as well as savvy data scientists is important. The divide between business groups like marketing and IT is closing. Marketers are more technically savvy and the most demanding when it comes to analytic solutions which can harness the deluge of customer interaction data. They need to partner closely with IT to architect the right solutions which tackle “big analytics” and provide the right toolsets to give them self-service access to this information without always requiring developer or IT support.
We are planning to sponsor the Big Analytics roadshow again in 2013 and take it international, as well. If you attended the event and have feedback or requests for topics, please let us know. I hear that there will be a “call for papers” going out soon. You can view the speaker bios & presentations from the Big Analytics 2012 events for ideas.
By MGoel in Analytic platform, Analytics, Availability, Blogroll, Business analytics, Database, Digital Marketing, MapReduce, Scalability, TCO, Teradata Aster on December 18, 2012
It’s been about two months since Teradata launched the Aster Big Analytics Appliance and since then we have had the opportunity to showcase the appliance to various customers, prospects, partners, analysts, journalists etc. We are pleased to report that since the launch the appliance has already received the “Ventana Big Data Technology of the Year” award and has been well received by industry experts and customers alike.
Over the past two months, starting with the launch tweetchat, we have received numerous inquiries about the appliance and think now is a good time to answer the top 10 most frequently asked questions about the new Teradata Aster offering. Without further ado, here are the top 10 questions and their answers:
WHAT IS THE TERADATA ASTER BIG ANALYTICS APPLIANCE?
The Aster Big Analytics Appliance is a powerful, ready-to-run platform that is pre-configured and optimized specifically for big data storage and analysis. A purpose-built, integrated hardware and software solution for analytics at big data scale, the appliance runs Teradata Aster patented SQL-MapReduce® and SQL-H™ technology on a time-tested, fully supported Teradata hardware platform. Depending on workload needs, it can be exclusively configured with Aster nodes, Hortonworks Data Platform (HDP) Hadoop nodes, or a mixture of Aster and Hadoop nodes. Additionally, integrated backup nodes are available for data protection and high availability.
WHO WILL BENEFIT MOST BY DEPLOYING THE APPLIANCE?
The appliance is designed for organizations looking for a turnkey integrated hardware and software solution to store, manage and analyze structured and unstructured data (i.e., multi-structured data formats). The appliance meets the needs of both departmental and enterprise-wide buyers and can scale linearly to support massive data volumes.
WHY DO I NEED THIS APPLIANCE?
This appliance can help you gain valuable insights from all of your multi-structured data. Using these insights, you can optimize business processes to reduce cost and better serve your customers. More importantly, these insights can help you innovate by identifying new markets, new products, new business models etc. For example, by using the appliance a telecommunications company can analyze multi-structured customer interaction data across multiple channels such as web, call center and retail stores to identify the path customers take to churn. This insight can be used proactively to increase customer retention and improve customer satisfaction.
WHAT’S UNIQUE ABOUT THE APPLIANCE?
The appliance is an industry first in tightly integrating SQL-MapReduce®, SQL-H™ and Apache Hadoop. The appliance delivers a tightly integrated hardware and software solution to store, manage and analyze big data. The appliance delivers integrated interfaces for analytics and administration, so all types of multi-structured data can be quickly and easily analyzed through SQL based interfaces. This means that you can continue to use your favorite BI tools and all existing skill sets while deploying new data management and analytics technologies like Hadoop and MapReduce. Furthermore, the appliance delivers enterprise class reliability to allow technologies like Hadoop to now be used for mission critical applications with stringent SLA requirements.
WHY DID TERADATA BRING ASTER & HADOOP TOGETHER?
With the Aster Big Analytics Appliance, we are not just putting Aster and Hadoop in the same box. The Aster Big Analytics Appliance is the industry’s first unified big analytics appliance, providing a powerful, ready to run big analytics and discovery platform that is pre-configured and optimized specifically for big data analysis. It provides intrinsic integration between the Aster Database and Apache Hadoop, and we believe that customers will benefit the most by having these two systems in the same appliance.
Teradata’s vision stems from the Unified Data Architecture. The Aster Big Analytics Appliance offers customers the flexibility to configure the appliance to meet their needs. Hadoop is best for capturing, storing and refining multi-structured data in batch, whereas Aster is a big analytics and discovery platform that helps derive new insights from all types of data. Depending on the customer’s needs, the appliance can be configured with all Aster nodes, all Hadoop nodes or a mix of the two.
WHAT SKILLS DO I NEED TO DEPLOY THE APPLIANCE?
The Aster Big Analytics Appliance is an integrated hardware and software solution for big data analytics, storage, and management. It is designed as a plug-and-play solution that does not require special skill sets.
DOES THE APPLIANCE MAKE DATA SCIENTISTS OR DATA ANALYSTS IRRELEVANT?
Absolutely not. By integrating the hardware and software in an easy to use solution and providing easy to use interfaces for administration and analytics, the appliance allows data scientists to spend more time analyzing data.
In fact, with this simplified solution, your data scientists and analysts are freed from the constraints of data storage and management and can now spend their time on value added insights generation that ultimately leads to a greater fulfillment of your organization’s end goals.
HOW IS THE APPLIANCE PRICED?
Teradata doesn’t disclose product pricing as part of its standard business operating procedures. However, independent research conducted by industry analyst Dr. Richard Hackathorn, president and founder, Bolder Technology Inc., confirms that on a TCO and Time-to-Value basis the appliance presents a more attractive option vs. commonly available do-it-yourself solutions. http://teradata.com/News-Releases/2012/Teradata-Big-Analytics-Appliance-Enables-New-Business-Insights-on–All-Enterprise-Data/
WHAT OTHER ASTER DEPLOYMENT OPTIONS ARE AVAILABLE?
Besides deploying via the appliance, customers can also acquire and deploy Aster as a software-only solution on commodity hardware or in a public cloud.
WHERE CAN I GET MORE INFORMATION?
You can learn more about the Big Analytics Appliance via http://asterdata.com/big-analytics-appliance/ – home to release information, news about the appliance, product info (data sheet, solution brief, demo) and Aster Express tutorials.
Join the conversation on Twitter for additional Q&A with our experts:
Manan Goel @manangoel | Teradata Aster @asterdata
For additional information please contact Teradata at http://www.teradata.com/contact-us/
What questions are important to you about Social Media?
John Lovett from Web Analytics Demystified just published a new white paper on Social Analytics. Lovett, who has written the book on Social Analytics (literally), lays out a compelling vision for Deeper Social Analytics for companies. He clearly presents the value to companies of going beyond surface-level analytics of likes, followers and friends, and challenges the CMO to ask deeper and more important questions.

I love the three key questions presented in the paper that really hit the C-suite. These are:
- What is the Audience?
- What is the Activity?
- What is the Action?
These 3 questions provide a framework to share social media initiatives with business leaders and strip away all the non-business related questions that become so distracting in understanding the impact of social media on the enterprise.
Although we are still in the early, black-and-white TV stage of social analytics, Teradata Aster has been heavily influenced by our customers’ needs in the social space. Customers like LinkedIn, Gilt Groupe, MySpace, and Myzinga have redefined how consumers interact with each other through music, shopping, and content. Attensity running on Aster promises to bring together big data and social analytics in a way that starts to deliver on Mr. Lovett’s proposition of deeper social analytics.
Teradata’s strategy to marry big data analytics and marketing applications with its industry-leading database solutions is steeped in the concept of deeper analytics. In social analytics, we have identified 10 key business questions that should be asked about every social post. In a market where posts can go viral and impact brand, customer perception, and revenue, being able to quickly and effectively navigate deeper social analytics becomes a mission-critical capability.
Beyond John’s questions, the 10 key questions are:
- What was said?
- What is it about? (i.e., product, service, brand, experience)
- Is it a common sentiment?
- What are the trends on this topic?
- Who said it?
- What is their value?
- How engaged are they?
- What is their influence?
- How do I respond?
- Was my response effective?
To effectively answer these questions, CMOs need a set of marketing technologies. These include:
- A social listening platform that analyzes social timelines for owned, embassy and public feeds. These tools identify what was said, what it is about and if it is a common sentiment.
- The second level demands a customer hub to understand who is posting and what the customer relationship is with them and to measure the customer value of that relationship.
- The third level requires social network analytics and the ability to find implicit and explicit social connections. This helps illuminate how engaged customers are and their level of influence – or who influences them.
- The fourth level is where integrated marketing management and customer facing marketing applications come in. Once you understand what was said, who said it and the potential impact – how do you respond? Is it a one-on-one conversation, a social discussion or was a bigger issue identified that may result in marketing campaigns?
Are these the right business questions to ask? What else do you want to know about social media posts in your business?
We live in interesting times!
In the past 30 years, data was used to record business events and report on business events. Over the last 5 years, data has gotten closer to business. Now data is being used to record business events, report on business events as well as influence business events. We now realize that the more data we record, the more comprehensively data can influence business events.
Hence the excitement of “big data” – it is a business opportunity for each line of business – to influence business events to have favorable outcomes.
The responsibility for technologists is to provide the right platforms and tools to make influencing business easy and simple.
There are TWO relentless forces that are playing out in the big data space to which technology has to respond.
The first force is the diversity of data. As we record more data, we end up having different formats of data to manage. About 20% is relational, but we also have text, emails, PDF, Twitter feeds, Facebook profiles, social graphs, CDRs, Apache logs, JSON formats, …
The second force is the richness of analytics. As we influence more business, we end up having richer analytics to perform. About 20% is SQL, but we also have time series analysis, statistical analysis, geo-spatial analysis, graph analysis, sentiment analysis, entity extraction, …
Note that MapReduce does not remove this diversity: it is a way of programming analysis, and time series, statistical, and geo-spatial analyses each require a different MapReduce program to be written.
Today, the platforms and tools for big data are very complex. They expect line of business owners to write programs to manage different forms of big data, to write sophisticated programs to analyze big data, to master the management and administration of big clusters and to be self-sustaining in managing data quality. This last point is very important – data values change over time. We have to keep values consistent, otherwise our analysis will be wrong and our influence on business will be negative – the garbage-in, garbage-out rule of computing.
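To illustrate what "a way of programming" means here, the following is a minimal, single-process Python sketch of the map, shuffle, and reduce steps. It is not Hadoop or SQL-MapReduce code. The log format, the sample lines, and every function name are assumptions made up for this example.

```python
from collections import defaultdict
from itertools import chain

# Hypothetical Apache-style log lines; the format is an assumption for illustration.
LOG_LINES = [
    "10.0.0.1 GET /home 200",
    "10.0.0.2 GET /cart 500",
    "10.0.0.1 GET /cart 200",
    "10.0.0.3 GET /home 404",
]


def map_status(line):
    """Map step for analysis A: emit (status_code, 1) for each log line."""
    return [(line.split()[-1], 1)]


def map_path(line):
    """Map step for analysis B: emit (request_path, 1) for each log line."""
    return [(line.split()[2], 1)]


def shuffle(pairs):
    """Shuffle step: group all emitted values by key."""
    grouped = defaultdict(list)
    for key, value in pairs:
        grouped[key].append(value)
    return grouped


def reduce_count(key, values):
    """Reduce step: collapse the values for one key into a single count."""
    return key, sum(values)


def run_job(lines, mapper, reducer):
    """Run one map -> shuffle -> reduce pass over the input lines."""
    mapped = chain.from_iterable(mapper(line) for line in lines)
    return dict(reducer(key, values) for key, values in shuffle(mapped).items())


status_counts = run_job(LOG_LINES, map_status, reduce_count)
path_counts = run_job(LOG_LINES, map_path, reduce_count)
print(status_counts)
print(path_counts)
```

Even in this toy version, answering a second question (hits per path instead of hits per status code) means writing a second mapper. Richer analyses such as time series or geo-spatial work would need their own map and reduce logic as well.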
As a result, big data is in danger of entering the DIY (do it yourself) space. A line of business is now expected to support big clusters = big administration = big programs = big friction = low influence.
We have to acknowledge these challenges as technologists. If we let big data solutions be a DIY solution, only pockets of the enterprise will embrace big data – business leaders who are not technology savvy will be left out of the opportunity.
We have to simplify this equation. We need to enable line of business owners to benefit from big data a lot more easily. We have to make it simpler for business leaders to get from big data to big analytics.
Our goal: big data = small clusters = easy administration = big analytics = big influence.
This entails solving the following problems:
[1] Make platforms and tools easier to use for managing and curating data. Otherwise, garbage in = garbage out, and you will get garbage analytics.
[2] Provide rich analytics functions out of the box. Each line of programming cuts your reachable audience by 50%.
[3] Provide tools to update or delete data. Otherwise, data consistency will drift away from truth as history accumulates.
[4] Provide applications to leverage data and find answers relevant to business. Otherwise the cost of DIY applications is too high to influence business – and won’t be done.
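Taken literally, the rule of thumb in item [2] above (each line of programming cuts the reachable audience by 50%) is a simple halving formula. The Python sketch below works through it. The function name `reachable_fraction` is made up for this example, and the rule itself is a heuristic rather than a measured result.

```python
def reachable_fraction(lines_of_code: int) -> float:
    """Fraction of the audience still reachable if every line of code halves it."""
    if lines_of_code < 0:
        raise ValueError("lines_of_code must be non-negative")
    return 0.5 ** lines_of_code


# Print the reachable share of the audience for a few program sizes.
for n in (0, 1, 2, 5, 10):
    print(f"{n:>2} lines -> {reachable_fraction(n):.4%} of the audience")
```

Under this reading, even a ten-line program leaves less than a tenth of a percent of the original audience, which is the argument for shipping analytic functions out of the box.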
At Teradata Aster, we are continuing to lead the big data revolution. We have led the revolution for the past 5 years, and helped shape the market and technologies. We are convinced that the path to big data success is to connect it with Big Analytics in the coming 5 years.
Yesterday I presented at the Los Angeles Teradata User Group on the topic of “Data Science: Finding Patterns in Your Data More Quickly & Easily with MapReduce”. One point discussed was the common misconception that big data is only about volume. Volume is certainly part of the issue organizations are facing. However, the big story in big data is the complexity and additional processing required to make “unstructured” data actionable through analytics. This is where procedural frameworks like MapReduce can help. Here is a great post by Teradata’s own Bill Franks about unstructured data which helps describe the requirements unstructured data demands in the context of analytics.
As Franks notes, “the thought of using unstructured data really shouldn’t intimidate people as much as it often does.” Read more to learn why.