<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0"
	xmlns:content="http://purl.org/rss/1.0/modules/content/"
	xmlns:wfw="http://wellformedweb.org/CommentAPI/"
	xmlns:dc="http://purl.org/dc/elements/1.1/"
	xmlns:atom="http://www.w3.org/2005/Atom"
	xmlns:sy="http://purl.org/rss/1.0/modules/syndication/"
	xmlns:slash="http://purl.org/rss/1.0/modules/slash/"
	>

<channel>
	<title>The Data Blog: Aster Data Blog</title>
	<atom:link href="http://www.asterdata.com/blog/feed/" rel="self" type="application/rss+xml" />
	<link>http://www.asterdata.com/blog</link>
	<description>The convergence of Big Data, analytic applications, MPP data warehouses, and MapReduce</description>
	<lastBuildDate>Thu, 29 Sep 2011 13:58:23 +0000</lastBuildDate>
	<language>en</language>
	<sy:updatePeriod>hourly</sy:updatePeriod>
	<sy:updateFrequency>1</sy:updateFrequency>
	<generator>http://wordpress.org/?v=3.3.1</generator>
		<item>
		<title>The Simplest Way for Businesses to Analyze Big Data</title>
		<link>http://www.asterdata.com/blog/2011/09/29/the-simplest-way-for-businesses-to-analyze-big-data/</link>
		<comments>http://www.asterdata.com/blog/2011/09/29/the-simplest-way-for-businesses-to-analyze-big-data/#comments</comments>
		<pubDate>Thu, 29 Sep 2011 13:58:23 +0000</pubDate>
		<dc:creator>Tasso Argyros</dc:creator>
				<category><![CDATA[Analytic platform]]></category>
		<category><![CDATA[Analytics]]></category>
		<category><![CDATA[MapReduce]]></category>

		<guid isPermaLink="false">http://www.asterdata.com/blog/?p=357</guid>
		<description><![CDATA[One of the great things about starting your own company (if you’re lucky and your company does well) is that you take part in the evolution of a whole new market, from its nascent days to its heyday. This was the case with Aster and the “Big Data” market. Back when we started Aster, in [...]]]></description>
			<content:encoded><![CDATA[<p>One of the great things about starting your own company (if you’re lucky and your company does well) is that you take part in the evolution of a whole new market, from its nascent days to its heyday. This was the case with Aster and the “Big Data” market. Back when we started Aster, in 2005, MPP systems that could store and analyze data using off-the-shelf servers were still a pretty new concept. I also recall that in 2008, when we first came out with our native in-database MapReduce support (and our <a href="http://www.asterdata.com/resources/mapreduce.php">SQL-MapReduce</a>® technology), we had to explain to most people what MapReduce even was. In 2009, we came out with the first Big Data event series, “Big Data Summit,” because we knew we were doing something new and wanted a term to describe it. “Big Data” caught on more than we had imagined back then, and the rest is history. Product innovation was at the core of Aster’s existence, and we kept pushing ourselves and our product to become the best platform for enterprise-class data analytics using <strong><span style="text-decoration: underline;">both</span></strong> SQL <strong><span style="text-decoration: underline;">and</span></strong> MapReduce as first-class citizens on one analytic platform.</p>
<p>Today there is a lot of innovation in the big data market. However, we see a “chasm” between the SQL technologies—which are very enterprise-friendly—and the new wave of open source big data or “NoSQL” software which is used extensively by engineering organizations. In the middle is a very large number of enterprises trying to understand how they can use these new technologies to push their analytical capabilities beyond purely SQL, while at the same time utilizing their existing investments in technologies and people. This is the problem that Aster solves.</p>
<p>With last week’s <a href="http://www.asterdata.com/news/110922-Aster-Database.php">announcement</a> of our Teradata Aster MapReduce solutions, which include Aster Database 5.0 software (formerly Aster <em>n</em>Cluster) and our new Aster MapReduce Appliance, we bring to market the best answer for the organizations that are “caught in the middle.” Unlike SQL-only systems focused primarily on analyzing structured data, our database and appliance provide support for native MapReduce, which enables a new generation of analytics such as digital marketing optimization, social graph analysis, and fraud detection based on customer behavior. Our newly extended libraries of pre-built MapReduce analytical functions allow such applications to be developed with significantly less time and cost than with other MapReduce technologies. And, unlike other MapReduce-based systems, we offer full SQL support, integration with all major BI and ETL vendors, and a data adapter to EDW systems that allows enterprises to utilize existing tools and skills to bring big data analytics to their businesses. Finally, with our new appliance, we leverage Teradata’s strength and engineering to provide a proven and performance-optimized system for businesses to start analyzing untapped diverse data while cutting down on time, cost and frustration!</p>
<p>As we move forward, Aster is committed to being the leader in SQL and MapReduce analytics for multi-structured data. Having spent six years in this market, we believe that it’s not just the coolest technologies that will win, but the ones that make it easier for business analysts and data scientists within organizations to solve their business problems and innovate with analytics. With the launch of our new Teradata Aster solutions, including the revamped SQL-MapReduce interfaces and the new Aster MapReduce Appliance, we are pushing the state of the art in this direction (or as my marketing team likes to say, “bringing the science of data to the art of business”). <img src='http://www.asterdata.com/blog/wp-includes/images/smilies/icon_smile.gif' alt=':)' class='wp-smiley' /> </p>
]]></content:encoded>
			<wfw:commentRss>http://www.asterdata.com/blog/2011/09/29/the-simplest-way-for-businesses-to-analyze-big-data/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>A Design for the Multi-Structured Big Data Platform: The future is opening in front of us</title>
		<link>http://www.asterdata.com/blog/2011/08/03/a-design-for-the-multi-structured-big-data-platform-the-future-is-opening-in-front-of-us/</link>
		<comments>http://www.asterdata.com/blog/2011/08/03/a-design-for-the-multi-structured-big-data-platform-the-future-is-opening-in-front-of-us/#comments</comments>
		<pubDate>Wed, 03 Aug 2011 14:41:53 +0000</pubDate>
		<dc:creator>Mayank Bawa</dc:creator>
				<category><![CDATA[Analytic platform]]></category>

		<guid isPermaLink="false">http://www.asterdata.com/blog/?p=339</guid>
		<description><![CDATA[The world of big data will benefit tremendously from a hybrid big data platform. Teradata’s Aster Data nCluster provides such a hybrid big data platform. It enables multi-structured data to be stored natively in the database. Therefore, we can store relational data as tables with rows and columns. We can store PDF documents as PDF [...]]]></description>
			<content:encoded><![CDATA[<p>The world of big data will benefit tremendously from a hybrid big data platform. Teradata’s  Aster Data <em>n</em>Cluster provides such a hybrid big data platform.</p>
<p>It enables multi-structured data to be stored natively in the database. Therefore, we can store relational data as tables with rows and columns. We can store PDF documents as PDF documents, HTML pages as HTML pages – and the same with Java objects, JPG files, Word documents, GIS data, and others.</p>
<p>It enables multi-structured data to be automatically (dynamically) interpreted natively in the database. For example, we can process PDF data to retrieve the various text blocks in that document, HTML pages to retrieve its content, and JPG files to render images or extract features. In other words, we can interact with the data in its native form to leverage the structure inherent in the stored data.</p>
<p>The final piece is that it enables a human or application user to step across the different structures seamlessly. For example, you can write a query that:</p>
<table width="100%" border="0">
<tr>
<td width="5">&nbsp;</td>
<td  width="5" valign="top">1.</td>
<td>Identifies your valuable customers by analyzing a payment history table</td>
</tr>
<tr>
<td>&nbsp;</td>
<td  width="5" valign="top">2.</td>
<td>Analyzes and interprets customer sentiment by analyzing logs of customer calls</td>
</tr>
<tr>
<td>&nbsp;</td>
<td  width="5" valign="top">3.</td>
<td>Builds a decision tree to determine the most common problem detected in customer logs</td>
</tr>
<tr>
<td>&nbsp;</td>
<td  width="5" valign="top">4.</td>
<td>Builds a linear regression model to predict the loss in revenue that can be prevented by solving customers’ problems and the cost of acquiring net new customers to overcome the losses</td>
</tr>
</table>
<p></p>
<p>This can all be done in one workflow and one session. Impressive?</p>
<p>We live in interesting times. The future is opening up in front of us.</p>
]]></content:encoded>
			<wfw:commentRss>http://www.asterdata.com/blog/2011/08/03/a-design-for-the-multi-structured-big-data-platform-the-future-is-opening-in-front-of-us/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>What Big Data Can Learn From the PC Era: The Need for a Multi-Structured Big Data Platform</title>
		<link>http://www.asterdata.com/blog/2011/07/28/what-big-data-can-learn-from-the-pc-era-the-need-for-a-multi-structured-big-data-platform/</link>
		<comments>http://www.asterdata.com/blog/2011/07/28/what-big-data-can-learn-from-the-pc-era-the-need-for-a-multi-structured-big-data-platform/#comments</comments>
		<pubDate>Thu, 28 Jul 2011 14:40:34 +0000</pubDate>
		<dc:creator>Mayank Bawa</dc:creator>
				<category><![CDATA[Analytic platform]]></category>
		<category><![CDATA[Analytics]]></category>

		<guid isPermaLink="false">http://www.asterdata.com/blog/?p=333</guid>
		<description><![CDATA[I wrote earlier that data is structured in multiple forms. In fact, it is the structure of data that allows applications to handle it “automatically” &#8211; as an automaton, i.e., programmatically – rather than relying on humans to handle it “semantically”. Thus a search engine can search for words, propose completion of partially typed words, [...]]]></description>
			<content:encoded><![CDATA[<p>I <a href="http://www.asterdata.com/blog/2011/06/13/multi-structured-data-platform-capabilities-required-for-big-data-analytics/">wrote earlier</a> that data is structured in multiple forms. In fact, it is the structure of data that allows applications to handle it “automatically”  &#8211; as an automaton, i.e., programmatically – rather than relying on humans to handle it “semantically”.</p>
<p>Thus a search engine can search for words, propose completion of partially typed words, do spell checking, and suggest grammar corrections “automatically”.</p>
<p>In the last 30 years, we’ve built specialized systems to handle each data structure differently at scale. We index a large corpus of documents in a dedicated search engine for searches, we arrange lots of words in a publishing framework to compose documents, we store relational data in an RDBMS to do reporting, we store emails in an e-discovery platform to identify emails that satisfy a certain pattern, we build and store cubes in a MOLAP engine to do interactive analysis, and so on.</p>
<p>Each such system is a silo – it imposes a particular structure on big data, and then it leverages that structure to do its tasks efficiently at scale.</p>
<p>The silo approach imposes fragmentation of data assets. It is expensive to maintain these silos. It is inefficient for humans and programs to master these silos – they have to learn the nuances of each silo to become an expert in exploiting it. As a result, we have all kinds of data administrators – a cube expert, a text expert, a spreadsheet expert, and so on.</p>
<p>The state of data fragmentation reminds me of the “dedicated function machines” that pre-dated the “Personal Computer”. We used to have electronic typewriters that would create documents, calculators that would calculate formulae, fax machines that would transmit documents, even tax machines that would calculate taxes. All of these machines were booted to relic status at a museum by a general-purpose computer – the functions were ported on top of its computing framework and the data was stored in its file system. The unity of all of these functions and their data on the general-purpose computer gave rise to “integration” benefits. It made tasks easier: we can now fill out our tax forms in (structured form-based) PDF documents, do tax calculations, and file taxes by transmitting the document &#8211; all on one platform. Our productivity has gone up. Indeed, the assimilation of data is leading to net new tasks that were not possible before. We can let programs search for the previous year’s filings, read the entries, and populate this year’s forms from them to minimize data-entry errors.</p>
<p>We have the same opportunity in front of us now in the field of big data. For too long, we have relegated functions that work on big data to isolated “dedicated function machines.” These dedicated function machines are bad because they are not “open.” Data in a search engine can only be “searched” &#8211; it cannot be analyzed for sentiments or plagiarism or edited to insert or remove references. The data is the same, but each of these tasks requires a “dedicated function machine.”</p>
<p>We have the option to build a general purpose machine for big data – a multi-structured big data platform – that allows multiple structures of data to co-exist on a single platform that is flexible enough to perform multiple functions on data. </p>
<p>Such a platform, for example, would allow us to analyze structured payments data to identify our valuable customers, interpret sentiments of calls they made to us, analyze the most common problem across negative sentiment interactions, and predict the loss in revenue that can be prevented by solving that problem and the cost of acquiring net new customers to overcome the losses. Without a multi-structured big data platform, the above workflow is a 12-18 month cycle performed by a cross-functional team of “dedicated function experts” (CFO group, Customer Support group, Products group, Marketing group) – a bureaucratic mess of project management that produces results too expensively, too infrequently and too inaccurately, making simplifying assumptions at each step as they cannot agree on even basic metrics.</p>
<p>An open “Multi-Structured Big Data Platform” would be hugely enabling and open up vast efficiency and functionality that we can’t imagine today.</p>
]]></content:encoded>
			<wfw:commentRss>http://www.asterdata.com/blog/2011/07/28/what-big-data-can-learn-from-the-pc-era-the-need-for-a-multi-structured-big-data-platform/feed/</wfw:commentRss>
		<slash:comments>1</slash:comments>
		</item>
		<item>
		<title>Multi-structured Data: Platform Capabilities Required for Big Data Analytics</title>
		<link>http://www.asterdata.com/blog/2011/06/13/multi-structured-data-platform-capabilities-required-for-big-data-analytics/</link>
		<comments>http://www.asterdata.com/blog/2011/06/13/multi-structured-data-platform-capabilities-required-for-big-data-analytics/#comments</comments>
		<pubDate>Mon, 13 Jun 2011 14:54:46 +0000</pubDate>
		<dc:creator>Mayank Bawa</dc:creator>
				<category><![CDATA[Analytic platform]]></category>

		<guid isPermaLink="false">http://www.asterdata.com/blog/?p=320</guid>
		<description><![CDATA[The “big data” world is a product of exploding applications. The number of applications that are generating data has just gone through the roof. The number of applications that are being written to consume the generated data is also growing rapidly. Each application wants to produce and consume data in a structure that is most [...]]]></description>
			<content:encoded><![CDATA[<p>The “big data” world is a product of exploding applications. The number of applications that are <em><span style="text-decoration: underline;">generating</span></em> data has just gone through the roof. The number of applications that are being written to <em><span style="text-decoration: underline;">consume the generated data</span></em> is also growing rapidly. Each application wants to produce and consume data in a structure that is most efficient for its own use.  As Gartner points out in a recent report on big data<a href="#Citation">[1]</a>, “Too much information is a storage issue, certainly, but too much information is also a massive analysis issue.”</p>
<p>In our data-driven economy, business models are being created (and destroyed) and shifting based on the ability to compete on data and analytics. The winners realize the advantage of having platforms that allow data to be <em><span style="text-decoration: underline;">stored</span></em> in multiple structures and (more importantly) allow data to be <em><span style="text-decoration: underline;">processed</span></em> in multiple structures. This allows companies to more easily 1) harness and 2) quickly process ALL of the data about their business to better understand customers, behaviors, and opportunities/threats in the market. We call this “multi-structured” data, which has been a topic of discussion lately with IDC Research (where we first saw the term referenced) and <a href="http://www.dbms2.com/2011/05/17/poly-structured-database/">other industry analysts</a>. It is also the upcoming topic of a <a href="http://www.asterdata.com/wc_110615-Big-Data-Analytics/index.php?ref=blog">webcast we’re doing with IDC</a> on June 15<sup>th</sup>.</p>
<p>To us, multi-structured data means “a variety of data formats and types.” This could include any data “structured” or “unstructured”  - “relational” or “non-relational”. Curt Monash has blogged about naming such data Poly-structured or Multi-structured. At the core is the ability for an analytic platform to both 1) store and 2) process a diversity of formats in the most efficient means possible.</p>
<p><strong>Handling Multi-structured Data</strong></p>
<p>We in the industry use the term “structured” data to mean “relational” data. And data that is not “relational” is called “unstructured” or “semi-structured.”</p>
<p>Unfortunately, this definition lumps text, csv, pdf, doc, mpeg, jpeg, html, and log files together as unstructured data. Clearly, all of these forms of data have an implicit “structure” to them!</p>
<p>My first observation is that Relational is one way of manifesting the data. Text is another way of expressing the data &#8211; Jpeg, gif, bmp and other formats are structured forms of expressing images. For example, (Mayank, Aster Data, San Carlos, 6/1/2011) is a relational row stored in a table (Name, Company Visited, City Visited, Date Visited) – the same data can be expressed in text as “Mayank visited Aster Data, based in San Carlos, on June 1, 2011.” A geo-tagged photograph of Mayank entering the Aster Data office in San Carlos on June 1, 2011 will also capture the same information.</p>
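<p>As a small illustration of the same fact in two structures, here is a minimal Python sketch. The <code>Visit</code> type and the <code>as_sentence</code> helper are hypothetical names made up for this example; they are not part of any Aster Data product.</p>

```python
from collections import namedtuple
from datetime import date

# Hypothetical relational row: (Name, Company Visited, City Visited, Date Visited)
Visit = namedtuple(
    "Visit", ["name", "company_visited", "city_visited", "date_visited"]
)


def as_sentence(visit):
    """Express the relational row as an English sentence (text structure)."""
    d = visit.date_visited
    # %B gives the full month name in the default (C/English) locale.
    return (
        f"{visit.name} visited {visit.company_visited}, "
        f"based in {visit.city_visited}, on {d:%B} {d.day}, {d.year}."
    )


row = Visit("Mayank", "Aster Data", "San Carlos", date(2011, 6, 1))
print(as_sentence(row))
```

<p>Going from the row to the sentence is a one-line template. Going the other way, from free text back to a row, requires parsing and interpretation, which is where the difficulty lies.</p>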
<p>My second observation is that the “structure” of data is what makes applications understand the data and know what to do with it. For example, a SQL-based application can issue the right SQL queries to process its logic; an image viewer can decode JPG/GIF/BMP files to display the image; a text-engine can parse subject-object-verbs to interpret the data; etc.</p>
<p>Each application leverages the structure of data to do its processing in the most efficient manner. Thus, search engines recognize the white-space structure in English and can build inverted indexes on words to do fast searches. Relational engines recognize row headers and tuple boundaries to build indexes that can be used to retrieve selected rows very quickly. And so on.</p>
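<p>To make the search-engine case concrete, here is a minimal Python sketch of an inverted index built purely from white-space structure. It is an illustration under simplifying assumptions (no stemming, punctuation handling, or ranking), and the function names are hypothetical rather than taken from any real search engine.</p>

```python
from collections import defaultdict


def build_inverted_index(documents):
    """Map each lowercased word to the set of document ids containing it.

    `documents` is a dict of {doc_id: text}. Words are found by splitting
    on white space, which is the only structure this sketch relies on.
    """
    index = defaultdict(set)
    for doc_id, text in documents.items():
        for word in text.lower().split():
            index[word].add(doc_id)
    return index


def search(index, query):
    """Return the ids of documents that contain every word in the query."""
    words = query.lower().split()
    if not words:
        return set()
    result = set(index.get(words[0], set()))
    for word in words[1:]:
        result = result.intersection(index.get(word, set()))
    return result


docs = {
    1: "Mayank visited Aster Data",
    2: "Aster Data is based in San Carlos",
    3: "San Carlos is in California",
}
index = build_inverted_index(docs)
print(search(index, "aster data"))
print(search(index, "san carlos"))
```

<p>Each lookup touches only the postings for the query words instead of scanning every document, which is the efficiency the structure buys.</p>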
<p>My third observation is that each application produces data in a structure that is most efficient for its use. Thus, applications produce logs; cameras produce images; business applications produce relational rows; Web content engines produce HTML pages; etc. It is very hard to “Transform” data from one structure to the other. ETL tools have their hands full in just doing transformations from a relational schema to another relational schema. And semantic engines have a hard time “transforming” text to relational forms. All such “across structure” transforms lose information.</p>
<p>Relational databases handle relational structure and relational processing very efficiently, but they are severely limiting in their capabilities to store and process other structures (e.g., text, xml, jpg, pdf, doc). In these engines, <span style="text-decoration: underline;">r</span>elations are a first-class citizen; every other structure is a distant second-class citizen.</p>
<p>Hadoop is exciting in the “Big Data” world because it doesn’t pre-suppose any structure. Data in any structure can be stored in plain files. Applications can read the files and build their own structures on the fly. It is liberating. However, it is not efficient – precisely because it reduces all data to its base form of files and robs the data of its structure &#8211; the structure that would allow for efficient processing or storage by applications! Each application has to redo its work from scratch.</p>
<p>What would it take for a platform to treat multiple structures of data as first-class citizens? How could it natively support each format, yet provide a unified way to express queries or analytic logic at the end-user level so as to abstract away the complexity/diversity of the data and provide insights more quickly? It’d be liberating as well as efficient!</p>
<hr size="1" /><p><a name="Citation" id="Citation">[1]</a> “&#8217;Big Data&#8217; Is Only the Beginning of Extreme Information Management”. Gartner Research, April 7, 2011</p>
]]></content:encoded>
			<wfw:commentRss>http://www.asterdata.com/blog/2011/06/13/multi-structured-data-platform-capabilities-required-for-big-data-analytics/feed/</wfw:commentRss>
		<slash:comments>4</slash:comments>
		</item>
		<item>
		<title>Introducing the First Collaborative Community for SQL-MapReduce</title>
		<link>http://www.asterdata.com/blog/2011/05/25/introducing-the-first-collaborative-community-for-sql-mapreduce/</link>
		<comments>http://www.asterdata.com/blog/2011/05/25/introducing-the-first-collaborative-community-for-sql-mapreduce/#comments</comments>
		<pubDate>Wed, 25 May 2011 14:58:12 +0000</pubDate>
		<dc:creator>jonbock</dc:creator>
				<category><![CDATA[Analytics]]></category>
		<category><![CDATA[MapReduce]]></category>

		<guid isPermaLink="false">http://www.asterdata.com/blog/?p=315</guid>
		<description><![CDATA[In case you missed the news, Aster Data just took another step to make SQL-MapReduce the best programming framework for big data analytics. The Aster Data SQL-MapReduce® Developer Portal is the first collaborative online developer community for SQL-MapReduce analytics, our framework for processing non-relational data and ultra-fast analytics. It builds on other efforts to enable [...]]]></description>
			<content:encoded><![CDATA[<p>In case you missed the news, Aster Data just took <a href="http://www.asterdata.com/news/110502-Developer-Portal.php">another step</a> to make SQL-MapReduce the best programming framework for big data analytics. The <a href="http://www.asterdata.com/community">Aster Data SQL-MapReduce® Developer Portal</a> is the first collaborative online developer community for SQL-MapReduce analytics, our framework for processing non-relational data and ultra-fast analytics. It builds on other efforts to enable MapReduce analytics including: <a href="http://www.asterdata.com/resources/developer-center.php">Developer Center</a>, a resource center for MapReduce and SQL-MapReduce developers; <a href="http://www.asterdata.com/download_developer_express/index.php?ref=DataBlog">Aster Data Developer Express</a>, the first integrated development environment for SQL-MapReduce; and <a href="http://www.asterdata.com/product/advanced-analytics.php">Aster Data Analytic Foundation</a>, a suite of ready-to-use SQL-MapReduce functions.</p>
<p>The Developer Portal gives our customers and partners a community for collaborating with peers to leverage the flexibility and power of <a href="http://www.asterdata.com/resources/mapreduce.php">SQL-MapReduce</a> for analytics that were previously impossible or impractical. Data scientists, quantitative analysts, and developers from customers, partners, and Aster Data are using the portal to highlight insights and best practices, share analytic functions, and leverage the experience and knowledge of the community to easily harness the power of SQL-MapReduce for big data analytics.</p>
<p>The portal enables collaboration that is key in making it easy for our customers to become SQL-MapReduce experts so they can solve core business challenges. As Navdeep Alam, director of data architecture at Mzinga, said, the portal “will allow us the ability to share and leverage insights with others in using big data analytics to attain a deeper understanding of customers’ behavior and create competitive advantage for our business.”</p>
<p>We’re seeing strong interest in the Developer Portal from our current customers. Early activity and content on the portal includes discussions about using the GSL libraries, programming in .NET, and writing sessionization and sampling functions. We plan to expand on this with tutorials for additional functions over the next few months.</p>
<p>If you aren’t already a customer, we encourage you to get started at the <a href="http://www.asterdata.com/resources/developer-center.php">Aster Data Developer Center</a>, where you can get your hands on SQL-MapReduce by downloading <a href="http://www.asterdata.com/download_developer_express/index.php?ref=DataBlog">Aster Data Developer Express</a> for free and find links to other resources like <a href="http://www.mapreduce.org/">www.mapreduce.org</a>.  If you are an Aster Data customer, we encourage you to also register for access to the new SQL-MapReduce Developer Portal for additional content and learning.</p>
<p>We’re always interested in your feedback as to how we can better help developers learn about and use MapReduce and Aster Data’s SQL-MapReduce.  If you have any suggestions, please feel free to add them below in the comments.</p>
]]></content:encoded>
			<wfw:commentRss>http://www.asterdata.com/blog/2011/05/25/introducing-the-first-collaborative-community-for-sql-mapreduce/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>Going Big – Teradata to Acquire Aster Data</title>
		<link>http://www.asterdata.com/blog/2011/03/03/going-big-%e2%80%93-teradata-to-acquire-aster-data/</link>
		<comments>http://www.asterdata.com/blog/2011/03/03/going-big-%e2%80%93-teradata-to-acquire-aster-data/#comments</comments>
		<pubDate>Thu, 03 Mar 2011 12:55:10 +0000</pubDate>
		<dc:creator>Aster</dc:creator>
				<category><![CDATA[Statements]]></category>

		<guid isPermaLink="false">http://www.asterdata.com/blog/?p=277</guid>
		<description><![CDATA[We are very excited to share with you that today we announced our company, Aster Data, is being acquired by Teradata, who as you all know commands the #1 position in data warehousing. Together, we will tackle the massive opportunity in the big data and big data analytics market. Upon close, Aster Data will become [...]]]></description>
			<content:encoded><![CDATA[<p>We are very excited to share with you that today we announced our company, Aster Data, is being acquired by Teradata, who as you all know commands the #1 position in data warehousing. Together, we will tackle the massive opportunity in the big data and big data analytics market. Upon close, Aster Data will become part of the Teradata organization and our products will become part of the Teradata family of products, sold stand-alone, and integrated into their product line.</p>
<p>The combined goal is big, as said on Teradata’s web site home page:</p>
<p><img style="float: none;" src="http://www.asterdata.com/resources/images/blog/TD-hp-banner-450w.png" alt="Teradata homepage banner" /></p>
<p>Today marks a major milestone in our continuing journey, and we are thrilled to join forces with the market leader in data management. Our company has achieved a lot since our inception just 5 years ago, and we look forward to accelerating our innovation and market reach even further – with the market strength of Teradata and the speed of our combined cultures. In 5 years, we’ve played a big role in shaping the Big Data Analytics Platform market and innovated on new technologies that enable customers to store diverse, granular data and process it in diverse ways. The big data opportunity as we see it is more about extracting insights from your diverse data than just finding cost-effective ways to store it. Processing and extracting deep insights from diverse and big data is where we’ve innovated and broken new ground, and with this merger we will accelerate it further.</p>
<p>Our journey started when we realized that (a) it was hard and expensive to manage big data, and (b) it was nearly impossible to process and analyze diverse (non-relational) data types like Web clicks, social connections, and text files at scale. The two worlds of data management and data processing were separate – RDBMSs would store and manage data in their world; however, applications and tools would do analytics <em>outside </em>of the database. This division severely restricted the types of analytics possible on large amounts of data. We discussed this in more detail in an earlier blog post from <a href="http://www.asterdata.com/blog/2011/01/26/2011-the-year-of-the-analytics-platform-part-i/">January 26</a>.</p>
<p>The real impact of the above two restrictions was that organizations were drowning in a flood of data and couldn’t make any sense out of it. For instance, organizations couldn’t analyze enough data to understand their customers at an individual level, and thus they couldn’t improve their products and customer experience. Or, they couldn’t detect advanced fraud schemes because the offenders were hiding in terabytes of data (the outliers) and complicated money network schemes, resulting in huge losses.</p>
<p>Foreseeing this opportunity, we decided to change the enterprise data infrastructure and build a platform that (a) uses commodity hardware to scale at unprecedented levels while keeping costs low, (b) <em>combines</em> data management and data processing in one platform to allow much deeper analysis of data at much larger scale, and (c) accommodates the processing of diverse data types (e.g. machine generated data, social network data, text data, etc.) in a <em>single platform</em>.</p>
<p>Over the past 5 years we have been aggressively building our technology and developing this big new market. We’ve had continuous and increasing success &#8211; one recognition of this was Gartner’s recent <a href="http://www.asterdata.com/ar_Gartner-Magic-Quadrant-for-Data-Warehouse-Database-Management-Systems-2010/index.php?ref=blog">Magic Quadrant</a>. And <a href="http://www.asterdata.com/blog/2011/02/02/2011-the-year-of-the-analytics-platform-%e2%80%93-part-ii/">looking forward</a>, we were seeing a 2011 where the new market we were creating would become mainstream reality across organizations. As a Gartner press release recently stated: “<em>2011 will be the year when data warehousing reaches what could well be its most-significant inflection point since its inception&#8230; The biggest, and possibly most-elaborate data management system in the IT house is changing. The new data warehouse will introduce new scope for flexibility in adding new information types and change detection.”</em></p>
<p>And this execution now sets the stage for our joining forces with Teradata. We love this merger for 3 reasons:</p>
<ul>
<li>First, we love that Teradata is by far the most successful data warehousing and data-driven applications company in the world. As founders, we understood that Teradata will accelerate our vision and will back us in realizing the full potential of the Big Data Analytics Platform.</li>
<li>Second, we have always had a big and ambitious technology vision. A bold vision needs time and resources to execute to its full potential. As part of Teradata, we will have the resources and support needed to accelerate our technology. We will also have access to a global sales organization and channel to accelerate the adoption of the Big Data Analytics Platform, and ultimately bring more benefits to our customers, more quickly.</li>
<li>Third, Aster Data <em>n</em>Cluster is very complementary to Teradata’s existing product portfolio. By combining products from both companies, we can come to market with solutions that solve a very wide range of diverse data management and data analysis problems using “best of breed” components. We expect both Aster Data and Teradata customers to find our joint offerings very unique and valuable for their business, thus increasing their opportunities and decreasing their costs.</li>
</ul>
<p>In closing, we want to reiterate that we have never been more excited about our market, our company and our opportunity! Our vision has proven to be right early on and we’ve watched other players in our market try to follow suit – that’s just one external validation of our direction, and there have been many more as our customers use our products to break new ground in analytics insights on diverse and big data. As we innovated and as we delivered on the vision for big and diverse data management, our team’s execution has truly defined and helped shape the market. And in this evolution, we are more confident and tremendously excited as we write the next chapter of this market.</p>
<p>Upon close of this transaction, the merger with Teradata is about taking our products, our innovations, our IP, and the Aster Data team, and accelerating our lead in the big data and big data analytics market. Or, simply put, it’s about ‘going big.’</p>
<p>We really want to thank our customers, who believed in us, saw the big data opportunity, and drove key input into our product roadmap. We promise that our commitment and support to you all will only increase in the future. We also thank our team, who joined a small company and have worked hard to make it so successful. And finally, we thank our investors, who understood the opportunity and believed they were going to be part of something new, valuable and exciting.</p>
<p>For more information on today’s announcement read the full press release at <a href="http://www.asterdata.com/news/110303-Teradata-to-Acquire-Aster-Data.php">www.asterdata.com</a> and also visit Teradata’s web site <a href="http://www.teradata.com/" target="_blank">www.teradata.com</a></p>
<p>- Mayank Bawa &amp; Tasso Argyros</p>
]]></content:encoded>
			<wfw:commentRss>http://www.asterdata.com/blog/2011/03/03/going-big-%e2%80%93-teradata-to-acquire-aster-data/feed/</wfw:commentRss>
		<slash:comments>7</slash:comments>
		</item>
		<item>
		<title>2011: The Year of the Analytics Platform – Part II</title>
		<link>http://www.asterdata.com/blog/2011/02/02/2011-the-year-of-the-analytics-platform-%e2%80%93-part-ii/</link>
		<comments>http://www.asterdata.com/blog/2011/02/02/2011-the-year-of-the-analytics-platform-%e2%80%93-part-ii/#comments</comments>
		<pubDate>Wed, 02 Feb 2011 13:58:54 +0000</pubDate>
		<dc:creator>Tasso Argyros</dc:creator>
				<category><![CDATA[Analytic platform]]></category>
		<category><![CDATA[Analytics]]></category>

		<guid isPermaLink="false">http://www.asterdata.com/blog/?p=262</guid>
		<description><![CDATA[In my previous post, I spoke about how strongly I feel that this is the year that the analytic platform will become its own distinct and unique category.  As the market as a whole realizes the value of integrated data and process management, in-database applications and in-database analytics, the “analytic platform”, or “analytic computing system”, [...]]]></description>
			<content:encoded><![CDATA[<p>In my previous post, I spoke about how strongly I feel that this is the year that the analytic platform will become its own distinct and unique category.  As the market as a whole realizes the value of integrated data and process management, in-database applications and in-database analytics, the “analytic platform”, or “analytic computing system”, or “data analytics server” (pick your name) will gain even more momentum, reaching critical mass this year.</p>
<p>In this process, you will see significant movement from vendors, first in their marketing collateral (as is always the case for followers in a technology space) and then in scrambling to cover their product gaps in the 5 categories that define a true analytic platform, which I described in Part I of <a href="http://www.asterdata.com/blog/2011/01/26/2011-the-year-of-the-analytics-platform-part-i/" target="_self"><strong><span style="text-decoration: underline;">2011: &#8211; The Year of the Analytics Platform</span></strong></a>.</p>
<p>What took Aster Data 6+ years to build cannot be done overnight, or over a few releases (side note: if you are interested in software product development and haven’t read the <a href="http://www.amazon.com/Mythical-Man-Month-Software-Engineering-Anniversary/dp/0201835959#_" target="_blank">Mythical Man-Month</a>, now is a good time – it’s an all-time classic and explains this point very clearly), especially if the fundamental architecture is not there from day one.</p>
<p>But the momentum for the analytic platform category is there and, at this point, is irreversible. Part of this powerful trend is derived from the central place that analytics is taking in the enterprise and government. Analytics today is not a luxury, but a necessity for competitiveness. Every industry today is thinking how to employ analytics to better understand its customers, cut costs, and increase revenues. For example, companies in the financial services sector, a fiercely competitive space, want to use the wealth of data they have to become more relevant to their customers and increase customer satisfaction and retention rates. Governments’ use of data and analytics is one of the few last resorts against terrorism and cyber threats. In retail, the advent of the Internet, social networks, and globalization has increased competition and reduced margins. Using analytics to understand cross-channel behavior and preferences of consumers improves the returns of marketing campaigns and optimizes product pricing and placement, and can make the difference between red and black ink at the bottom of the balance sheet.<span id="more-262"></span></p>
<p>I believe I’m not alone in thinking the analytic platform revolution is here to stay. Probably the strongest statement about this came in October 2010 when Merv Adrian and Colin White released their research report through the BEyeNETWORK, “<a href="http://www.asterdata.com/ar_analytic_platforms/index.php?ref=DataBlog" target="_blank">Analytic Platforms: Beyond the Traditional Data Warehouse</a>” (registration required, but you’ll only be contacted by Aster Data vs the other 8 sponsors) <img src='http://www.asterdata.com/blog/wp-includes/images/smilies/icon_smile.gif' alt=':)' class='wp-smiley' />  . They did what no one vendor could do, which is build consensus and a broad view of the market forces at work which unequivocally put analytic platforms on the map as a distinct category with defined edges. The abstract says it all:</p>
<p><em>“The once staid and settled database market has been disrupted by an upwelling of new entrants targeting use cases that have nothing to do with transaction processing. Focused on making more sophisticated, real-time business analysis available to more simultaneous users on larger, richer sets of data, these analytic database management system (ADBMS) players have sought to upend the notion that one database is sufficient for all storage and usage of corporate information. <strong><span style="text-decoration: underline;">They have evangelized and successfully introduced the analytic platform and proven its value</span></strong>.”</em></p>
<p>If you read nothing else on the topic, read this report. Colin and Merv (who has since gone on to work for Gartner) did fantastic primary research on the core capabilities lacking in traditional data warehouse systems, the business needs analytic platforms meet, and much, much more – all backed with research and statistics.</p>
<p>Many others are thinking along the same lines, as pointed out in a recent <a href="http://www.information-management.com/newsletters/analytics_business_intelligence_databases_push_down-10019473-1.html?pg=2" target="_blank">Information Management Newsletter</a>:</p>
<p><em>“With enterprises becoming more aggressive in reducing their go-to-market time, alternate solutions to relational database management system products and solutions are emerging that focus on performance, efficient data storage and in-database analytical capabilities. It typically takes a fairly long time to build analytics out of transactional data and convert those insights into any tangible marketing action in any organization. <span style="text-decoration: underline;">The primary reason for this delay is that analytics is traditionally kept separate from the database layer. This can result in data replication, and with big data, agility does get compromised to an extent. The reason analytics is typically externalized (from the DB layer) is possibly because of the inadequacy and non-procedural nature of SQL, the language that has evolved as a standard to manipulate structured data. SQL in its native form is not meant for analytics but is intended for data storage, retrieval and creation of simple summaries.&#8221;</span></em></p>
<p>While this article does not talk about analytic platforms as a category, it does explain the technical reasons why critical new capabilities are necessary, such as advanced in-database analytic techniques that support both declarative SQL and procedural languages and process data without replicating it outside the system.</p>
<p>Curt Monash wrote a <a href="http://www.dbms2.com/2011/01/24/analytic-computing-system/" target="_blank">post</a> on January 24 saying:</p>
<p><em>“I’m going to refer to an analytic RDBMS that has been extended by advanced-analytics functionality as an<strong> analytic computing system, </strong>rather than as some kind of “platform,” although I suspect the latter term is more likely to wind up winning…</em>”<strong> </strong></p>
<p>While some haven’t fully adopted the idea of an analytic platform, several are already recognizing Aster Data’s potential as a leader of the category. For instance, on January 21, CIO pointed to Aster Data as one of<strong> </strong><a href="http://www.cio.co.uk/article/3257491/twenty-companies-to-watch-in-2011/" target="_blank"><strong>twenty companies to watch in 2011</strong></a><strong>:</strong></p>
<p><em>“Using a combination of embedded analytics and a high performance, large scale data management platform, Aster Data supports complex and data intensive applications such as those for web analytics, customer behaviour analytics and fraud detection. With organisations looking to find new ways to use and analyse the vast amount of data they have been collecting but not exploiting in 2011, Aster Data and its analytic database platform is well positioned to make the most of this opportunity.”</em></p>
<p>I’m interested in hearing your thoughts about this. Leave a comment below to share your own insights…</p>
]]></content:encoded>
			<wfw:commentRss>http://www.asterdata.com/blog/2011/02/02/2011-the-year-of-the-analytics-platform-%e2%80%93-part-ii/feed/</wfw:commentRss>
		<slash:comments>2</slash:comments>
		</item>
		<item>
		<title>2011: The Year of the Analytics Platform &#8211; Part I</title>
		<link>http://www.asterdata.com/blog/2011/01/26/2011-the-year-of-the-analytics-platform-part-i/</link>
		<comments>http://www.asterdata.com/blog/2011/01/26/2011-the-year-of-the-analytics-platform-part-i/#comments</comments>
		<pubDate>Wed, 26 Jan 2011 14:04:58 +0000</pubDate>
		<dc:creator>Tasso Argyros</dc:creator>
				<category><![CDATA[Analytic platform]]></category>
		<category><![CDATA[Analytics]]></category>
		<category><![CDATA[Database]]></category>
		<category><![CDATA[MapReduce]]></category>

		<guid isPermaLink="false">http://www.asterdata.com/blog/?p=251</guid>
		<description><![CDATA[When we kicked off Aster Data back in 2005, we envisioned building a product that would advance the state of the art in data management in two areas; (1) size and diversity of data and (2) depth of insight/analytics. My co-founders and I quickly realized that building just another database wouldn’t cut it. With yet-another-database, [...]]]></description>
			<content:encoded><![CDATA[<p>When we kicked off Aster Data back in 2005, we envisioned building a product that would advance the state of the art in data management in two areas; (1) size and diversity of data and (2) depth of insight/analytics. My co-founders and I quickly realized that building just another database wouldn’t cut it. With yet-another-database, even if we enabled companies to more cost-effectively manage large data sizes, it was not going to be enough given the explosion in diverse data types and the massive need to process all of it. So we set out to build a new platform that would solve these challenges &#8211; what’s now commonly known as the ‘Big Data’ challenge.</p>
<p>Fast forward to 2008 when Aster Data led the way in putting massive parallel processing <em>inside</em> an MPP database, using MapReduce, to advance how you process massive amounts of diverse data. While this was fully aligned with our vision for managing hordes of diverse data and allowing deep data processing in <em>a single platform</em>, most thought it was intriguing but couldn’t quite see the light in terms of where the future was going. At one point, we thought of naming our product XAP – “extreme analytic platform” or “extreme analytic processing” as that’s what it was designed to do from day one. However, we thought better of it, since we would have had to educate people too much on what an “analytic platform” was and how it was different from a traditional DBMS for data warehousing. Since we also were serving the data architects in organizations as well as the front-line business that demands better, faster analytics, we needed to use terminology that resonated with both.</p>
<p>Then, in the fall of 2009, with our flagship product Aster Data <em>n</em>Cluster 4.0, we made further strides in running advanced analytics inside the database by including all the <em>built-in application services</em> (e.g. dynamic WLM, backup, monitoring, etc.) to go with it. At that time, we referred to it as a Data-Application Server &#8211; which our customers quickly started calling a Data-Analytics Server.  I remember when analyst Jim Kobielus at Forrester <a href="http://pcworld.about.com/od/softwareservices/Aster-Adds-data-application-S.htm" target="_blank">said</a>,<em></em></p>
<p><em>&#8220;It&#8217;s really innovative and I don&#8217;t use those terms lightly. Moving application logic into the data warehousing environment is ‘a logical next step’.&#8221;<br />
</em></p>
<p>And <a href="http://www.cbronline.com/news/aster_data_tackles_big_data_analysis_281009" target="_blank">others</a> saying,<em></em></p>
<p><em>&#8220;The platform takes a different approach from traditional data warehouses, DBMS and data analytics solutions by housing data and applications together in one system, fully parallelizing both. This eradicates the need for movements of massive amounts of data and the problems with latency and restricted access that creates.&#8221;</em></p>
<p>What they started to fully appreciate and realize is that big data is not just about storing hordes of data, but rather about cracking the code on how to process all of it in deep ways, at blazing fast speeds.<span id="more-251"></span></p>
<p>By 2010, everyone whose roots were in cost-effective data warehousing suddenly started claiming to be a platform for deep analytics. How ironic &#8212; since none of the other vendors had even built any fundamental architecture that lent itself to being a true analytics machine.  Aster Data did so &#8211; from the very beginning &#8211; with a deep in-database analytics engine; SQL-MapReduce analytics processing; data-application services so you could really run all procedural code in-database and support it with all the key application services; out-of-the-box analytics modules (that now number over 1000); and a visual development environment for point-and-click development of apps and push down of apps into our platform.</p>
<p>By late 2010, the term “analytic platform” started to take shape. The definition of it fit exactly with what Aster Data has built. And now, traditional DW appliances are claiming to be analytic platforms. Even Netezza is taking the same box they had before and calling it “An Appliance for Deep Business Analytics,” and pure columnar MPP DBMS&#8217;s like Vertica and ParAccel overnight went from being &#8216;the world&#8217;s fastest database&#8217; to ALL claiming to be an analytic(s) platform.  This is a typical marketing trajectory if you now see where the future lies in big data management.  The market as a whole is gravitating to accept that if you truly want to manage big, diverse data, you ultimately want to analyze all of it, and for that you&#8217;re really in need of a big data <em>analytic platform</em> &#8211; not just a big data <em>store</em>. Curt Monash recently supported a similar notion when describing <a href="http://www.dbms2.com/2011/01/24/analytic-computing-system/" target="_blank">choices in analytic computing system design</a>.</p>
<p>I predict this year will be the year where the analytic platform &#8211; which we at Aster Data started to talk about and deliver in 2008 &#8211; will now be a distinct and unique category: distinct from an enterprise data warehouse (EDW); distinct from traditional DBMSs; distinct from even some pure MPP DBMSs; and distinct from even Hadoop.</p>
<p>An analytic platform, put simply, must have the following:</p>
<p>1.<em> Native in-database processing engine</em> for application embedding that provides the capability to run applications inside an MPP database with high performance and high reliability and provides necessary services so that applications can process the right data at the right time.  This is not UDFs, which due to architectural limitations have always been a small “niche” in the RDBMS world, or Stored Procedures which can’t match the performance and flexibility needed to push applications inside a large-scale MPP system.</p>
<p>2.<em> Native support for MapReduce</em>.  Just like the world needs a language for basic data management (SQL), it also needs a framework for writing and deploying applications inside the data analytics platform.  MapReduce is by far the most prominent and promising interface for doing exactly that. Hadoop’s open source popularity will only propel MapReduce’s significance and already MapReduce has become the de-facto standard for writing large, data intensive parallel applications inside the database.  “Native” means that MapReduce is not built on top of UDFs or Stored Procedures, nor is it a side-implementation on the DBMS, nor is it a simple Hadoop connector as every other DBMS vendor has done.  All these approaches make MapReduce totally unusable for big data applications &amp; analytics.</p>
<p>3.<em> Tight integration with SQL</em>. A great Analytics Platform for diverse data management, built for the Enterprise needs to respect SQL or otherwise risk running into objections in the Enterprise world where the majority of the skill set is around SQL. As of now, Aster Data’s patent-pending SQL-MapReduce is the only technology that manages to tightly blend two different worlds – SQL and MapReduce – together.  Integration means that SQL analysts can seamlessly use MR applications inside the database, and MR data scientists can use the flexibility and power of SQL to complement their MR applications.</p>
<p>4.<em> Enterprise feature support for in-database applications</em> – High Availability; Backup/Restore; Access and Security; Resource Partitioning and Prioritization for concurrent operations (WLM). Several database systems offer these capabilities when it comes to SQL queries, but none offers all of them when it comes to in-database applications.  Having the same support for both interfaces is critical because otherwise in-database apps are naturally forced into a second-class-citizen status, making DBAs reluctant to let users run in-db apps and Enterprise security and compliance groups reserved about risks of going beyond SQL. Aster Data’s platform is unique in that SQL is just another interface, just like MapReduce.  All platform capabilities have been built as services and treat each interface in the same way, resulting in 100% enterprise feature support for both interfaces.</p>
<p>5. <em>Making in-database application development EASY and cost-effective</em>.  An otherwise great Analytical Platform that’s too hard and too expensive to use will end up doing no analytics and thus does not deserve to be called an Analytical Platform.  This is why elements like the following are critical: libraries of pre-packaged SQL/MR analytic functions; 3rd party in-database application integrations; integrated, graphical, development environment for in-database apps and MapReduce; data visualization tools; unified monitoring and debugging of SQL and in-database application processes; and integrations with front-end visualization tools that can distribute the power of MapReduce to 1000s of analysts and business users.</p>
<p>So from what I see in organizations I talk to every week, their need for a new Analytics Platform for managing and processing big, diverse data is loud and clear. I think this is the year when it is validated and legitimized as a new and distinct category. If you agree – or disagree for that matter – leave a comment or point me to a resource that argues the counter points.  I look forward to hearing from you.</p>
]]></content:encoded>
			<wfw:commentRss>http://www.asterdata.com/blog/2011/01/26/2011-the-year-of-the-analytics-platform-part-i/feed/</wfw:commentRss>
		<slash:comments>2</slash:comments>
		</item>
		<item>
		<title>Clarifying the Terms Around MapReduce</title>
		<link>http://www.asterdata.com/blog/2010/12/08/clarifying-the-terms-around-mapreduce/</link>
		<comments>http://www.asterdata.com/blog/2010/12/08/clarifying-the-terms-around-mapreduce/#comments</comments>
		<pubDate>Wed, 08 Dec 2010 13:00:26 +0000</pubDate>
		<dc:creator>Tasso Argyros</dc:creator>
				<category><![CDATA[MapReduce]]></category>

		<guid isPermaLink="false">http://www.asterdata.com/blog/?p=234</guid>
		<description><![CDATA[In the past couple of years, MapReduce – once an unknown, funky word – became a prominent, mainstream trend in data management and analytics. However even today I meet people that are not clear on what MapReduce exactly is and how it relates to some other terms and trends. In this post I attempt to [...]]]></description>
			<content:encoded><![CDATA[<p>In the past couple of years, MapReduce – once an unknown, funky word – became a prominent, mainstream trend in data management and analytics. However even today I meet people that are not clear on what MapReduce exactly is and how it relates to some other terms and trends. In this post I attempt to clarify some of the MapReduce-related terminology. So here it goes.</p>
<p><strong>MapReduce (the framework).</strong> MapReduce is a framework that allows programmers to develop analytical applications that run on (usually large) clusters of commodity hardware and process (usually large) amounts of data. <a href="http://labs.google.com/papers/mapreduce.html" target="_blank">It was first introduced by Google</a> and it is language independent. It is abstract in the sense that an application that uses MapReduce doesn’t have to care about things like the number of servers/processes, fault tolerance, etc. MapReduce is supported by multiple implementations including the open source project Hadoop and Aster Data. Google also has its own proprietary implementation which, unfortunately, is also called MapReduce and sometimes creates confusion.</p>
<p><strong>MapReduce (the Google <em>implementation</em> of MapReduce framework). </strong>As mentioned above, Google has its own implementation of MapReduce. This was described in the <a href="http://labs.google.com/papers/mapreduce.html" target="_blank">2004 OSDI paper</a> and it was the theoretical basis upon which Hadoop was developed. Google’s MapReduce was a processing framework and it was using <a href="http://labs.google.com/papers/gfs.html" target="_blank">Google’s GFS (Google File System)</a> for data storage.</p>
<p><strong>Aster Data’s SQL-MapReduce. </strong>Aster Data has a <a href="http://www.asterdata.com/resources/mapreduce.php#SQLMR" target="_self">patent-pending implementation of MapReduce</a> that (a) uses a database for data persistence, and (b) is tightly integrated with SQL, i.e. an analyst or BI tool can invoke MapReduce via SQL queries, thus making MapReduce accessible to the enterprise. It supports multiple programming languages such as Java and C and it is accessible through standard interfaces such as ODBC and JDBC.</p>
<p><strong>Hadoop. </strong>Hadoop is an Apache “umbrella” project that hosts many sub-projects, including Hadoop MapReduce and HDFS, Hadoop’s version of the Google File System which Hadoop MapReduce uses for data storage. Hadoop is the core open source project &#8211; however, there are many distributions for Hadoop, just as there are many distributions for Linux. These distributions contain Hadoop binaries together with other utilities and tools. The most popular distributions are the <a href="http://www.cloudera.com/hadoop/" target="_blank">Cloudera</a> distribution, the <a href="http://developer.yahoo.com/blogs/hadoop/" target="_blank">Yahoo distribution</a> and the baseline <a href="http://hadoop.apache.org/common/docs/r0.20.0/quickstart.html#Download" target="_blank">Apache distribution</a>.</p>
<p><strong>HDFS. </strong>HDFS is Hadoop’s version of GFS and it is a distributed file system. HDFS can exist without Hadoop MapReduce, but usually Hadoop MapReduce requires HDFS. Aster Data’s MapReduce does not require HDFS as it uses an extensible MPP database for data storage and persistence.</p>
<p><strong>Cloudera. </strong>Cloudera usually means either (a) <a href="http://www.cloudera.com" target="_blank">the company</a> or (b) <a href="http://www.cloudera.com/hadoop/" target="_blank">Cloudera&#8217;s Distribution for Hadoop</a>.</p>
<p><strong>Sqoop. </strong>Sqoop, which is short for &#8220;SQL to Hadoop&#8221;, is an open source project that provides a framework for connecting to SQL data stores for data exchange.</p>
<p><strong>NoSQL. </strong>NoSQL started as a term to describe a collection of products that did not support or rely on SQL. This included Hadoop and other products like Cassandra. However, as more people realized that SQL is a necessary interface  for many data management systems, the term evolved to mean (N)ot (o)nly SQL. These days there are attempts to port SQL on top of Hadoop and other NoSQL products.</p>
<p>Are there any MapReduce-related terms I omitted? Please add them in the comments below and include a definition and links to good resources if you&#8217;d like.</p>
]]></content:encoded>
			<wfw:commentRss>http://www.asterdata.com/blog/2010/12/08/clarifying-the-terms-around-mapreduce/feed/</wfw:commentRss>
		<slash:comments>5</slash:comments>
		</item>
		<item>
		<title>Dell + Aster data analytics solution: Live and in use</title>
		<link>http://www.asterdata.com/blog/2010/11/19/dell-aster-data-analytics-solution-live-and-in-use/</link>
		<comments>http://www.asterdata.com/blog/2010/11/19/dell-aster-data-analytics-solution-live-and-in-use/#comments</comments>
		<pubDate>Fri, 19 Nov 2010 23:12:22 +0000</pubDate>
		<dc:creator>Barton George</dc:creator>
				<category><![CDATA[Analytics]]></category>
		<category><![CDATA[Cloud Computing]]></category>

		<guid isPermaLink="false">http://www.asterdata.com/blog/?p=217</guid>
		<description><![CDATA[Barton George is Cloud Computing and Scale-Out Evangelist for Dell. Today at a press conference in San Francisco we announced the general availability of our Dell cloud solutions. One of the solutions we debuted was the Dell Cloud Solution for Data Analytics, a combination of our PowerEdge C servers with Aster Data’s nCluster, a massively [...]]]></description>
			<content:encoded><![CDATA[<p><em>Barton George is Cloud Computing and Scale-Out Evangelist for Dell.</em></p>
<p>Today at a press conference in San Francisco <a href="http://content.dell.com/us/en/corp/d/press-releases/2010-11-19-dell-dcs-customer-wins.aspx">we announced</a> the general availability of our Dell cloud solutions. One of the  solutions we debuted was the Dell Cloud Solution for Data Analytics, a  combination of our <a href="http://www.dell.com/content/topics/topic.aspx/global/products/landing/en/poweredge-c-series?c=us&amp;l=en&amp;s=gen&amp;redirect=1" target="_blank">PowerEdge C servers</a> with <a href="../../product/index.php" target="_blank">Aster Data’s nCluster</a>, a massively parallel processing database with an integrated analytics engine.</p>
<p>Earlier this week I stopped by <a href="../../" target="_blank">Aster Data</a>&#8216;s  headquarters in San Carlos, CA and met up with their EVP of marketing,  Sharmila Mulligan. I recorded <a href="http://www.youtube.com/watch?v=LQzSRLt9AuA" target="_blank">this video</a> where Sharmila  discusses the Dell and Aster solution and the fantastic results a  customer is seeing with it.</p>
<p><strong>Some of the ground Sharmila covers:</strong></p>
<ul>
<li>What customer pain points and problems does this solution address  (hint: organizations trying to manage huge amounts of both structured  and unstructured data)</li>
<li>How Aster’s <a href="../../product/index.php">nCluster</a> software is optimized for Dell <a href="http://www.dell.com/us/en/enterprise/servers/poweredge-c2100/pd.aspx?refid=poweredge-c2100&amp;s=biz&amp;cs=555">PowerEdge C2100</a> and how it provides very high performance analytics as well as a cost effective way to store very large data.</li>
<li>(2:21) <a href="http://www.insightexpress.com/">InsightExpress</a>,    a leading provider of digital marketing research solutions, has    deployed the Dell and Aster analytics solution and has seen great  results:
<ul>
<li>Up and running w/in 6 weeks</li>
<li>Queries that took 7-9 minutes now run in 3 seconds</li>
</ul>
</li>
</ul>
<p>Pau for now…</p>
<p><strong>Extra-credit reading</strong></p>
<ul>
<li><a href="../../news/101119-Aster-Data-Dell-reseller.php">Aster  Data and Dell Partner to Provide Customers with Integrated Solutions  for Large Scale Data Management and Advanced Analytics</a></li>
<li><a href="../../news/101119-InsightExpress-Aster-Data-Dell.php">Aster Data and Dell Selected by InsightExpress to Enable Advanced Data Analytics for  Digital Marketing Research</a></li>
<li><a title="Permanent Link to Talking to Aster Data’s brand new CEO" rel="bookmark" href="http://bartongeorge.net/2010/09/30/talking-to-aster-datas-brand-new-ceo/">Talking to Aster Data’s brand new CEO</a></li>
<li>An Overview:<a title="Permanent Link: Aster’s Big Data Architecture" rel="bookmark" href="http://bartongeorge.net/2010/09/03/asters-big-data-architecture/"> Aster’s Big Data Architecture</a></li>
</ul>
]]></content:encoded>
			<wfw:commentRss>http://www.asterdata.com/blog/2010/11/19/dell-aster-data-analytics-solution-live-and-in-use/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
	</channel>
</rss>

