<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0"
	xmlns:content="http://purl.org/rss/1.0/modules/content/"
	xmlns:wfw="http://wellformedweb.org/CommentAPI/"
	xmlns:dc="http://purl.org/dc/elements/1.1/"
	xmlns:atom="http://www.w3.org/2005/Atom"
	xmlns:sy="http://purl.org/rss/1.0/modules/syndication/"
	xmlns:slash="http://purl.org/rss/1.0/modules/slash/"
	>

<channel>
	<title>Winning with Data</title>
	<atom:link href="http://www.asterdata.com/ceo-blog/index.php/feed/" rel="self" type="application/rss+xml" />
	<link>http://www.asterdata.com/ceo-blog</link>
	<description>Aster Data CEO Blog</description>
	<lastBuildDate>Wed, 28 Apr 2010 01:21:47 +0000</lastBuildDate>
	<generator>http://wordpress.org/?v=2.8.4</generator>
	<language>en</language>
	<sy:updatePeriod>hourly</sy:updatePeriod>
	<sy:updateFrequency>1</sy:updateFrequency>
			<item>
		<title>Design Patterns &#8211; The (Iterative) Analytical Data Warehouse</title>
		<link>http://www.asterdata.com/ceo-blog/index.php/2010/04/26/design-patterns-the-analytical-data-warehouse/</link>
		<comments>http://www.asterdata.com/ceo-blog/index.php/2010/04/26/design-patterns-the-analytical-data-warehouse/#comments</comments>
		<pubDate>Mon, 26 Apr 2010 19:57:48 +0000</pubDate>
		<dc:creator>Mayank</dc:creator>
				<category><![CDATA[Analytics]]></category>

		<guid isPermaLink="false">http://www.asterdata.com/ceo-blog/?p=289</guid>
		<description><![CDATA[I&#8217;ve remarked in an earlier post that the usage of data is changing and new applications are on the horizon. In the last couple of years, we&#8217;ve observed interesting design patterns for business processes that use data.
In a previous post, I outlined a design pattern that we call &#8220;The Automated Feedback Loop.&#8221; In this post, [...]]]></description>
			<content:encoded><![CDATA[<p>I&#8217;ve remarked in an earlier post that the usage of data is changing and new applications are on the horizon. In the last couple of years, we&#8217;ve observed interesting design patterns for business processes that use data.</p>
<p>In a previous post, I outlined a design pattern that we call &#8220;<a href="http://www.asterdata.com/ceo-blog/index.php/2008/05/20/the-automated-feedback-loop/">The Automated Feedback Loop</a>.&#8221; In this post, I want to outline a design pattern that we call &#8220;The (Iterative) Analytical Data Warehouse.&#8221;</p>
<p>The traditional well-understood design pattern of a <a href="http://www.asterdata.com/product/index.php">data warehouse</a> is a <strong><em>central </em></strong>(for Enterprise Data Warehouse) or <strong><em>departmental </em></strong>(for Data Marts) repository of data. Data is fed into the warehouse from ETL processes that pull data from a variety of sources. The data is organized in a data model that caters to 3 use-cases of the warehouse:</p>
<ol>
<li><strong><em>Reports </em></strong>- A set of BI queries is run at a regular frequency to monitor the state of the business. The reports target business users who want to understand what happened. The goal is to keep them in touch with the pulse of the business.</li>
<li><em><strong>Exports </strong></em>- A set of export jobs is run at adhoc frequency to provide data sets for further analysis. The exports target business analysts who want to optimize business practices. The goal is to provide them with true, quality-stamped data so that they can make confident optimization recommendations.</li>
<li><strong><em>Adhoc </em></strong>- A set of queries is run at adhoc frequency to detect or verify patterns that influence business events. The queries come from data scientists who want to understand and optimize business practices. The goal is to provide them with computation capabilities (good query interfaces and enough processing, memory and storage resources) to allow them to interact with the data.</li>
</ol>
<p>The exports and adhoc tasks are transient tasks. Once the data analysts or data scientists find a pattern valuable to the business, that pattern is incorporated into a report so that business users can monitor it on a frequent, repeatable basis.</p>
<p>In a typical data warehouse, the bulk of tasks (~80%) come from [1] Reports. The remaining ~20% come from [2] Exports and [3] Adhoc.</p>
<p>Since Reports are frequent and generate known queries, the data warehouse is designed to cater to reporting. This includes data models, indexes, materialized views or derived tables &#8211; and other optimizations &#8211; to make the known Reporting queries go fast.<span id="more-289"></span></p>
<p>Since exports and adhoc tasks are infrequent and generate unknown queries, the design of the data warehouse is unable to cater to them upfront. This means that exports and adhoc tasks generate queries that are harder to satisfy (since they have to fight the data modeling decisions made for reporting) and therefore impose more load on the data warehouse.</p>
<p>The net result is that reports run fast while exports and adhocs are slow. In fact, exports and adhocs consume so many resources that reporting starts running slower. And that is <em>not </em>good &#8211; reports are distributed widely and reach a wide variety of business users. Those users are unhappy and put pressure on the data warehousing team to &#8220;get the reports in time.&#8221; At the same time, the export and adhoc users are unhappy because they can&#8217;t get to the data fast enough to benefit the business.</p>
<p>The poor data warehouse team now looks for solutions, invariably prioritizing reports. Historically, the answer was to prioritize via workload management &#8211; constrain adhoc &amp; export usage to devote resources to reporting. If workload management didn&#8217;t work, the answer was to create walled gardens by enforcing rules:</p>
<blockquote><p>&#8220;thou shalt not report when data is loading; thou shalt not adhoc query when reports are being generated; thou shalt not export except in the evening&#8221;.</p></blockquote>
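<p>To make the two policies concrete, here is a minimal sketch of reporting-first workload management combined with time-window &#8220;walled garden&#8221; rules. Everything in it is an assumption made for illustration: the class name, the priority values and the hour windows are hypothetical and do not describe any particular product.</p>

```python
import heapq
from dataclasses import dataclass, field
from itertools import count

# Hypothetical priorities: a lower number is served first, so reports always win.
PRIORITY = {"report": 0, "adhoc": 1, "export": 2}

# Hypothetical walled-garden windows: the hours (24h clock) in which each task class may run.
ALLOWED_HOURS = {
    "report": range(0, 24),
    "adhoc": range(10, 17),
    "export": range(18, 24),
}


@dataclass(order=True)
class Task:
    priority: int
    seq: int  # submission order, used to break ties within a priority level
    kind: str = field(compare=False)
    name: str = field(compare=False)


class ReportingFirstScheduler:
    """Toy scheduler for a Reporting Data Warehouse."""

    def __init__(self):
        self._queue = []
        self._seq = count()

    def submit(self, kind, name):
        heapq.heappush(self._queue, Task(PRIORITY[kind], next(self._seq), kind, name))

    def next_task(self, hour):
        """Pop the highest-priority task that is allowed to run at this hour, or None."""
        deferred = []
        chosen = None
        while self._queue:
            task = heapq.heappop(self._queue)
            if hour in ALLOWED_HOURS[task.kind]:
                chosen = task
                break
            deferred.append(task)
        # Tasks outside their window go back on the queue and keep waiting.
        for task in deferred:
            heapq.heappush(self._queue, task)
        return chosen
```

<p>In this sketch an export submitted at 11:00 simply waits until the evening window opens, which is exactly the experience that frustrates export and adhoc users.</p>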
<p>Let&#8217;s call this data warehouse design pattern a <strong><em>&#8220;Reporting Data Warehouse&#8221;</em></strong>.</p>
<p>The first-class citizen of a &#8220;Reporting Data Warehouse&#8221; is reports. The exports and adhoc are second-class citizens &#8211; they do not get dedicated data models, they do not get large chunks of resources, and their protests are answered by asking them to constrain their requirements (use samples, use rolled up aggregates that were built to make reports faster, use smaller timeframes of history that were retained to just satisfy reporting requirements, phrase queries that are simpler even though they may be compromises on the pattern sought, &#8230;).</p>
<p>This motivates the definition of a different data warehouse design pattern, which we call an &#8220;<strong><em>Analytical (Iterative) Data Warehouse</em></strong>&#8221;.</p>
<p>The first-class citizens of an &#8220;Analytical Data Warehouse&#8221; are exports and adhoc analytics, and the primary users of the Analytical Data Warehouse are business analysts and data scientists. The data models are built to support their adhoc usage &#8211; fine-granularity data is retained, rich dimension tables are frequently imported, derived views and tables are created promptly, interfaces are opened up to express their patterns in a computationally simple and natural manner, scale-out is used to create resources for the tasks to finish interactively, enough storage is allocated for several exports to proceed simultaneously.</p>
<p>In other words, the infrastructure and the team exist to support export and adhoc usage as their primary customers.</p>
<p>Once an insight is confirmed, it can be added with careful design to the Reporting Data Warehouse &#8211; with carefully defined data models, indexes and materialized views and support to maintain it during the ETL process.</p>
<p>The infrastructure can play a significant role in enabling Analytical Data Warehouses.</p>
<ol>
<li><strong><em>Query interface to support in-database computations</em></strong>: The export and adhoc queries want to manipulate data in rich ways, and often SQL is not enough. The infrastructure should support an easy-to-express interface for rich computations (e.g., SQL/MapReduce). This is important because the correct perspective of data, amenable to downstream manipulation, cannot be defined upfront in the ETL process.</li>
<li><em><strong>Incremental scale-out MPP capabilities: </strong></em>The infrastructure should make it possible to scale out both storage and computing resources incrementally and easily (i.e., without months of planning). This is important because temporary storage requirements (as data is transformed for analysis) or temporary processing requirements (as several models are generated to validate insights) at the peak of the analysis can be much higher than normal use.</li>
<li><strong><em>Cheap hardware:</em></strong> The size of data demanded by exports and adhoc users may be large and the computations may be rich. The infrastructure must enable analysis of data at a cheap operating point.</li>
<li><strong><em>Workload management for both export and analysis</em></strong>: Some data analysts are comfortable manipulating data in their preferred tools (e.g., Excel, SAS, Matlab) &#8211; others are comfortable writing in-database queries (SQL, Java, C++, Perl, Python). The infrastructure should elegantly manage all tasks, and not require a &#8220;walled garden&#8221; to favor queries over export, or vice-versa.</li>
</ol>
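<p>As an illustration of the first requirement, here is a minimal sketch of the kind of computation that is awkward to express in plain SQL: splitting each user&#8217;s clicks into sessions. It is written in plain Python as a stand-in and is not the SQL/MapReduce interface itself; the function names, the row layout and the timeout value are all hypothetical.</p>

```python
from collections import defaultdict


def partition_by(rows, key):
    """Group rows by a key, the way a database would route rows to workers."""
    parts = defaultdict(list)
    for row in rows:
        parts[row[key]].append(row)
    return parts


def sessionize(clicks, timeout):
    """Number each click's session; a gap longer than `timeout` starts a new session."""
    out = []
    session = 0
    previous = None
    for click in sorted(clicks, key=lambda c: c["ts"]):
        if previous is not None and click["ts"] - previous > timeout:
            session += 1
        out.append({**click, "session": session})
        previous = click["ts"]
    return out


def run_in_partitions(rows, key, fn, **kwargs):
    """Apply fn independently to every partition and concatenate the results."""
    results = []
    for _, part in sorted(partition_by(rows, key).items()):
        results.extend(fn(part, **kwargs))
    return results


# Hypothetical click data: timestamps are seconds since some arbitrary start.
clicks = [
    {"user": "a", "ts": 0},
    {"user": "a", "ts": 20},
    {"user": "a", "ts": 500},
    {"user": "b", "ts": 5},
]
sessions = run_in_partitions(clicks, "user", sessionize, timeout=60)
```

<p>Because each partition is processed independently, the same function could in principle run in parallel next to the data, which is what makes this style of interface a natural fit for a scale-out system.</p>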
<p>My observation has been that the design methodology of an Analytical Data Warehouse is substantially different from that of a Reporting Data Warehouse. Understanding the primary customer of a data warehouse can often simplify its operations and substantially lower operating costs by making priorities clearer.</p>
]]></content:encoded>
			<wfw:commentRss>http://www.asterdata.com/ceo-blog/index.php/2010/04/26/design-patterns-the-analytical-data-warehouse/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>2010 and Beyond &#8211; Data Clouds and Next-generation Analytics</title>
		<link>http://www.asterdata.com/ceo-blog/index.php/2010/04/15/2010-and-beyond-data-clouds-and-next-generation-analytics/</link>
		<comments>http://www.asterdata.com/ceo-blog/index.php/2010/04/15/2010-and-beyond-data-clouds-and-next-generation-analytics/#comments</comments>
		<pubDate>Thu, 15 Apr 2010 21:10:54 +0000</pubDate>
		<dc:creator>Mayank</dc:creator>
				<category><![CDATA[Analytic applications]]></category>
		<category><![CDATA[Analytics]]></category>
		<category><![CDATA[Cloud Computing]]></category>
		<category><![CDATA[MapReduce]]></category>

		<guid isPermaLink="false">http://www.asterdata.com/ceo-blog/?p=270</guid>
		<description><![CDATA[In the last few years there has been a significant amount of market pickup, from users and vendors, on data clouds and advanced analytics &#8211; specifically a new class of data-driven applications run in a data cloud or on-premise. What&#8217;s different about this from past approaches is the frequency and speed at which these applications [...]]]></description>
			<content:encoded><![CDATA[<p>In the last few years there has been a significant amount of market pickup, from users and vendors, on data clouds and advanced analytics &#8211; specifically a new class of data-driven applications run in a data cloud or on-premise. What&#8217;s different about this from past approaches is the frequency and speed at which these applications are accessed, the depth of the analysis, the number of data sources involved and the volume of data mined by these applications &#8211; terabytes to petabytes. In the midst of this cacophony of dialogue, recent announcements from vendors in this space are helping to clarify different visions and approaches to the big data challenge.</p>
<p>Both Aster Data and Greenplum made announcements this week that illustrated different approaches. At the same time that Aster Data announced the Aster Analytics Center, Greenplum announced an upcoming product named Chorus. I wanted to take a moment to compare and contrast what these announcements say about the direction of the two companies.</p>
<p>Greenplum&#8217;s approach speaks to two traditional problem areas: (i) access to data, from provisioning of data marts to connectivity to data across marts, and (ii) some level of collaboration among certain developers and analysts. Their approach is to create a tool for provisioning, unified data access, and sharing of annotations and data among different developers and analysts. Interestingly, this is not an entirely new concept; these are well-known problems for which a number of companies and tools have already developed best-of-breed solutions over the last 15 years. For example, the capabilities for data access are another version of Export/Copy primitives that already exist in all databases and that have been built upon by common ETL and EII tools for cases in which richer support than Export &amp; Copy is needed – for instance, when data has to be transformed, correlated or cleaned while being moved from one context (mart) to another (mart).</p>
<p>This approach is indicative of a product direction in which the primary focus is on adding another option to the list of tools available to customers to address these problems. It&#8217;s really not a ground-breaking innovation that evolves the world of analytics. New types of analytics, or &#8216;data-driven applications,&#8217; are where the enormous opportunity lies. The Greenplum approach of data collaboration is interesting in a test environment or sandbox. When it comes to real production value, however, it effectively increases the functions available to the end user, but at a big cost due to significant increases in complexity, security issues and extra administrative overhead. What does this mean exactly?</p>
<ul>
<li>The spin-up of marts and moving data around can result in &#8220;data sprawl&#8221; which ultimately increases administrative overhead and is dangerous in these days of compliance and sensitivity to privacy and data leaks.</li>
<li>Adding a new toolset into the data processing stack creates difficult and painful work to either manage and administer multiple tool sets for similar purposes or to eliminate and transition away from investments in existing toolsets.</li>
<li>To enable effective communication and sharing, users need strong processes and features for source identification of data, data collection, data transformation, rule administration, error detection &amp; correction, data governance and security. The quality and security policies around meta-data are especially important as free-form annotations can lead to propagation of errors or leaks in the absence of strong oversight.</li>
</ul>
<p>In contrast, Aster Data&#8217;s <a href="http://www.asterdata.com/news/100412-Aster-Analytics-Center.php">recent announcements</a> support our long-standing investments in our unique advanced in-database architecture where applications run fully inside Aster Data&#8217;s platform with complete application services essential for complex analytic applications. The announcements highlight that our vision is not to create a new set of tools and layers in the data stack that recreate capabilities currently available from a number of leading vendors, but rather to deliver a new Analytics Platform, a Data-Application Server, to uniquely enable analytics professionals to create data-rich applications that were impossible or impractical before &#8211; namely, to create and use advanced analytics for rich, rapid, and scalable insights into their data. This focus is complemented by our partners, who offer proven best-of-breed solutions for collaboration and data transformation.</p>
<p><span id="more-270"></span></p>
<p>A key illustration of the investments that Aster is making in this vision is the formation of the new Aster Analytics Center: a center of excellence; ready-to-use analytics solutions that leverage MapReduce; and best practices for advanced analytics on big data. The Center&#8217;s charter is to develop products and provide insights that help organizations use data in clever ways to enable data-driven decisions. The Center is headed by Dr. Jonathan Goldman, our Director of Analytics and Applications, and a team of analytics experts. Jonathan joined us from LinkedIn, where as their Principal Scientist he led a team of analytics researchers to build cutting-edge products with the rich data sets LinkedIn has amassed. His team&#8217;s focus was on driving growth and user engagement for the LinkedIn social network. His team developed a successful model to build, ship, and iterate &#8211; to deliver value to LinkedIn effectively and sustainably. Across 3 years, he and his team delivered several industry-first features that surprised and delighted LinkedIn&#8217;s users &#8211; &#8220;People You May Know,&#8221; &#8220;Who Viewed My Profile?&#8221; &#8220;Jobs that are Similar to Mine,&#8221; and several others.</p>
<p>One of the first product solutions from the Aster Analytics Center is a suite of advanced analytics modules built on SQL and MapReduce called ‘Aster Data Analytic Foundation.’ The suite makes it easy for data analysts to leverage large volumes of diverse data effectively. This package, which made its debut with our <a href="http://www.asterdata.com/product/whats-new.php"><em>n</em>Cluster 4.5</a> release, <a href="http://www.asterdata.com/news/100222-Aster-Data-4-dot-5.php">announced</a> in February, provides a suite of rich analytic functions that enable data scientists and users to manipulate data easily rather than building primitives from scratch.</p>
<p>The second aspect of the Analytics Center&#8217;s charter, the methodology of using data, is being addressed by their work to create analytics best practices that provide blueprints for data analysts to develop their insights into an operational data product that can be delivered repeatably.</p>
<p>From what we see already with customers, the Aster Analytics Center &#8211; the Aster Data Analytic Foundation solution, big data analytics best practices, and deep analytics expertise &#8211; will be a catalyst accelerating a chain reaction that will revolutionize data usage across industries.</p>
]]></content:encoded>
			<wfw:commentRss>http://www.asterdata.com/ceo-blog/index.php/2010/04/15/2010-and-beyond-data-clouds-and-next-generation-analytics/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>Partnering with SAS Institute for &#8216;Big Data&#8217; Analysis</title>
		<link>http://www.asterdata.com/ceo-blog/index.php/2009/11/16/partnering-with-sas-institute-for-big-data-analysis/</link>
		<comments>http://www.asterdata.com/ceo-blog/index.php/2009/11/16/partnering-with-sas-institute-for-big-data-analysis/#comments</comments>
		<pubDate>Mon, 16 Nov 2009 09:31:02 +0000</pubDate>
		<dc:creator>Mayank</dc:creator>
				<category><![CDATA[Data-Application Server]]></category>
		<category><![CDATA[Statement]]></category>

		<guid isPermaLink="false">http://www.asterdata.com/ceo-blog/?p=207</guid>
		<description><![CDATA[
We are very excited to announce a strategic partnership between Aster Data and SAS Institute to further accelerate the “SAS In-Database Processing” initiative.
The objective of the partnership is to integrate SAS software capabilities within our MPP database which Aster Data’s 4.0 release uniquely supports. Last week we announced the capability to fully push down analytics [...]]]></description>
			<content:encoded><![CDATA[<p><img class="size-full wp-image-218 alignright" title="S285_sas100K" src="http://www.asterdata.com/ceo-blog/wp-content/uploads/2009/11/S285_sas100K3.jpg" alt="SAS Institute" width="116" height="48" /></p>
<p>We are very excited to announce a strategic partnership between Aster Data and SAS Institute to further accelerate the “SAS In-Database Processing” initiative.</p>
<p>The objective of the partnership is to integrate SAS software capabilities within our MPP database, which Aster Data’s 4.0 release uniquely supports. Last week we announced the capability to <a href="http://www.asterdata.com/ceo-blog/index.php/2009/11/02/the-era-of-big-data-applications/">fully push down analytics application logic inside our MPP database</a> so applications can now run inside the database, allowing analytics to be performed at massive data scale with very fast response. We call this a <a href="http://www.asterdata.com/product/">Massively Parallel Data-Application Server</a>. We had earlier presented more details on this unique implementation of SAS software inside Aster Data&#8217;s nCluster software at a co-hosted session <a href="http://www.sas.com/m2009">with SAS at M2009</a>.</p>
<p>Our architecture enables SAS software procs to run natively inside the database, thereby preserving the statistical integrity of SAS software computations while giving unprecedented performance increases during analysis of large data sets. SAS Institute partners in this initiative with other databases too – but the difference is that each of these databases requires the re-implementation of SAS software procs as proprietary UDFs or Stored Procedures.</p>
<p>We also provide dynamic workload management capabilities to enable graceful resource sharing between SAS software computations, SQL queries, loads, backups and scale-outs – all of which may be going on concurrently. The workload management enables administrators to dial up or dial down the resources given to data mining operations based on the criticality of the mining and other tasks being performed.</p>
<p>Our fast loading and trickle feed capabilities ensure that SAS software procs have access to fresh data for modeling and scoring, ensuring a timely and accurate analysis. This avoids the need to export snapshots (or samples) of data to an external SAS server for analysis, saving analysts valuable time in their iterations and discovery cycles.</p>
<p>We’ve been working with SAS Institute for a while now, and it is very evident why SAS has been the market leader in analytic applications for three decades. The technology team is very sharp, driven to innovate and execute. And as a result we’ve achieved a lot working together in a short time.</p>
<p>We look forward to working with SAS Institute to dramatically advance analytics for big data!</p>
]]></content:encoded>
			<wfw:commentRss>http://www.asterdata.com/ceo-blog/index.php/2009/11/16/partnering-with-sas-institute-for-big-data-analysis/feed/</wfw:commentRss>
		<slash:comments>1</slash:comments>
		</item>
		<item>
		<title>The Era of &#8220;Big Data&#8221; Applications</title>
		<link>http://www.asterdata.com/ceo-blog/index.php/2009/11/02/the-era-of-big-data-applications/</link>
		<comments>http://www.asterdata.com/ceo-blog/index.php/2009/11/02/the-era-of-big-data-applications/#comments</comments>
		<pubDate>Mon, 02 Nov 2009 04:03:33 +0000</pubDate>
		<dc:creator>Mayank</dc:creator>
				<category><![CDATA[Data-Application Server]]></category>
		<category><![CDATA[Statement]]></category>

		<guid isPermaLink="false">http://www.asterdata.com/ceo-blog/?p=58</guid>
		<description><![CDATA[I had commented that a new set of applications are being written that leverage data to act smarter to enable companies to deliver more powerful analytic applications. Operating a business today without serious insight into business data is not an option. Data volumes are growing like wildfire, applications are getting more data-heavy and more analytics-intensive, [...]]]></description>
			<content:encoded><![CDATA[<p>I had commented that <a href="http://www.asterdata.com/blog/index.php/2008/04/24/we-live-in-interesting-times/">a new set of applications are being written that leverage data to act smarter</a> to enable companies to deliver more powerful analytic applications. Operating a business today without serious insight into business data is not an option. Data volumes are growing like wildfire, applications are getting more data-heavy and more analytics-intensive, and companies are putting more demands on their data.</p>
<p>The traditional 20-year-old data pipeline of Operational Data Stores (to pool data), Data Warehouses (to store data), Data Marts (to farm out data), Application Servers (to process data) <img class="alignright" style="border: 5px solid black; margin: 3px;" title="Moving boulder uphill" src="http://www.asterdata.com/blog/wp-content/uploads/2009/11/sisyphus.thumbnail.jpg" border="5" alt="Moving boulder uphill" hspace="3" vspace="3" align="right" />and UI (to present data) is under severe strain – because we are expecting a lot of data to move from one tier to the other. Application Servers pull data from Databases for computations and push the results of the computation to the UI servers. But data is like a boulder – the larger the data, the more the inertia, and therefore the greater the time and effort needed to move it from one system to another.</p>
<p>The resulting performance problems of moving &#8216;big data&#8217; are so severe that application writers unconsciously compromise the quality of their analysis by avoiding “big data computations” – they first reduce the “big data” to “small data” (via SQL-based aggregations/windowing/sampling) and then perform computations on “small data” or data samples.</p>
<p><a title="Replacing sections of pipe" href="http://www.asterdata.com/blog/wp-content/uploads/2009/11/pipeline_replacement.jpg"><img class="alignleft" style="border: 5px solid black; margin: 3px;" title="Replacing sections of pipe" src="http://www.asterdata.com/blog/wp-content/uploads/2009/11/pipeline_replacement.thumbnail.jpg" border="5" alt="Replacing sections of pipe" hspace="3" vspace="3" width="128" height="96" align="left" /></a>The problem of &#8216;big data&#8217; analysis will continue to grow more severe in the next 10 years as data volumes grow and applications demand more data granularity to model behavior and identify patterns so as to better understand and service their customers. To do this, you have to analyze all your available data. For the last 5 years, companies have routinely upgraded their data infrastructure every 12-18 months as data sizes double and the traditional data pipeline buckles under the weight of larger data movement &#8211; and they will be forced to continue doing this in the next 10 years if nothing fundamental changes.</p>
<p>Clearly, we need a new, sustainable solution to address this state of affairs.</p>
<p style="padding-left: 30px;">The &#8216;aha!&#8217; for big data management is to realize that the traditional data pipeline suffers from an architecture problem &#8211; of <em>moving data to applications</em> &#8211; that must change to allow <em>applications to move to the data</em>.</p>
<p>I am very pleased to announce a new version of Aster Data <em>n</em>Cluster that addresses this challenge head-on.</p>
<p>Moving applications to the data requires a fundamental change to the traditional database architecture: applications are co-located inside the database engine so that they can iteratively read, write and update all data. The new infrastructure acts as a &#8216;Data-Application Server&#8217; managing both data and applications as first-class citizens. Like a traditional database, it provides a very strong data management layer. Like a traditional application server, it provides a very strong application processing framework. It co-locates applications with data, thus eliminating data movement from the Database to the Application server. At the same time, it keeps the two layers separate to ensure the right fault-tolerance and resource-management models &#8211; bad data will not crash the application, and vice-versa a bad application will not crash the database.</p>
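<p>The difference between the two architectures can be sketched in a few lines. This is only a toy model under stated assumptions: the partitions, the values and both function names are hypothetical, and the example computes a simple mean rather than a real analytic application.</p>

```python
# Hypothetical cluster: each inner list is the data stored on one server.
partitions = [
    [12.0, 7.5, 3.0],
    [8.0, 1.5],
    [20.0, 4.0, 6.0, 2.0],
]


def mean_by_moving_data(parts):
    """Pull every row to the application tier, then compute there."""
    moved = [value for part in parts for value in part]
    return sum(moved) / len(moved), len(moved)


def mean_by_moving_application(parts):
    """Run the computation next to each partition; move only (sum, count) pairs."""
    partials = [(sum(part), len(part)) for part in parts]
    total = sum(s for s, _ in partials)
    rows = sum(n for _, n in partials)
    return total / rows, len(partials)


# Each call returns the mean and the number of items that crossed the network.
pulled_mean, values_moved = mean_by_moving_data(partitions)
pushed_mean, partials_moved = mean_by_moving_application(partitions)
```

<p>Both functions return the same mean, but the first moves every value across the network while the second moves one small partial result per server, and that gap widens as the data grows.</p>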
<p>Our architecture and implementation ensure that apps do not have to be rewritten to make this transition. The application is pushed down into the Aster 4.0 system and transparently parallelized across the servers that store the relevant data. As a result, Aster Data <em>n</em>Cluster 4.0 also delivers a 10x-100x boost in performance and scalability.</p>
<p>Those using Aster Data&#8217;s solution &#8211; including comScore, Full Tilt Poker, Telefonica I+D and Enquisite &#8211; are testament to the benefits of this fundamental change. In each case, it was the embedding of the application with the data that enabled them to scale seamlessly and perform ultra-fast analysis.</p>
<p>The new release brings to fruition a major product roadmap milestone<a title="A clarion call" href="http://www.asterdata.com/blog/wp-content/uploads/2009/11/clarioncall.jpg"><img class="alignright" style="border: 5px solid black; margin: 3px;" title="A clarion call" src="http://www.asterdata.com/blog/wp-content/uploads/2009/11/clarioncall.thumbnail.jpg" border="5" alt="A clarion call" hspace="3" vspace="3" width="128" height="62" align="right" /></a> that we’ve been working on for the last 4 years. There is a lot more innovation coming – and this milestone is significant enough that we issue a clarion call to all persons working on “big data applications” – we need to move applications to the data because the other way round is unsustainable in this new era.</p>
]]></content:encoded>
			<wfw:commentRss>http://www.asterdata.com/ceo-blog/index.php/2009/11/02/the-era-of-big-data-applications/feed/</wfw:commentRss>
		<slash:comments>4</slash:comments>
		</item>
		<item>
		<title>Aster Data in Europe</title>
		<link>http://www.asterdata.com/ceo-blog/index.php/2009/09/14/aster-data-in-europe/</link>
		<comments>http://www.asterdata.com/ceo-blog/index.php/2009/09/14/aster-data-in-europe/#comments</comments>
		<pubDate>Mon, 14 Sep 2009 15:36:50 +0000</pubDate>
		<dc:creator>Mayank</dc:creator>
				<category><![CDATA[Statement]]></category>

		<guid isPermaLink="false">http://www.asterdata.com/ceo-blog/index.php/2009/09/14/aster-data-in-europe/</guid>
		<description><![CDATA[Aster Data has seen tremendous growth in North America. We announced today that we have opened a Europe office in West London, England. The office will be headed by Bob Pearson, our newly appointed Europe Area Director. Bob is an entrepreneurial industry leader and had earlier introduced Opsware into Europe, eventually propelling Opsware to be [...]]]></description>
			<content:encoded><![CDATA[<p>Aster Data has seen tremendous growth in North America. We announced today that we have <a href="http://www.asterdata.com/news/090914-Aster-Data-UK.php">opened a Europe office</a> in West London, England. The office will be headed by Bob Pearson, our newly appointed Europe Area Director. Bob is an entrepreneurial industry leader and had earlier introduced Opsware into Europe, eventually propelling Opsware to be #1 in Europe in its market. We had been in conversations with Bob for 12 months &#8211; understanding the European market &#8211; before we opened our office this summer.</p>
<p>We also <a href="http://www.asterdata.com/news/090914-Aster-Pocket-Kings.php">announced today that our first customer in Europe</a> is the #1 online poker gaming site in the world, Full Tilt Poker. We have been working with <a href="http://www.fulltiltpoker.com/">Full Tilt Poker</a> for 8 months now helping deploy Aster <em>n</em>Cluster to power their fraud prevention systems and provide enhanced customer service to their players.</p>
<p>It is no surprise that data size growth is a world-wide phenomenon, and certainly occurs across &#8220;the pond&#8221; as well. We have noticed that European customers in numerous industries, such as financial services and insurance, online retailing, social networking, communications, and gaming are deploying new (and sometimes custom) applications to leverage big data.</p>
<p>Aster Data is certainly the most application-friendly big-data infrastructure in the market, and we look forward to working with our European customers in the coming years!</p>
]]></content:encoded>
			<wfw:commentRss>http://www.asterdata.com/ceo-blog/index.php/2009/09/14/aster-data-in-europe/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>Netezza&#8217;s Change in Architecture &#8211; Move towards Commodity</title>
		<link>http://www.asterdata.com/ceo-blog/index.php/2009/08/03/netezzas-change-in-architecture-move-towards-commodity/</link>
		<comments>http://www.asterdata.com/ceo-blog/index.php/2009/08/03/netezzas-change-in-architecture-move-towards-commodity/#comments</comments>
		<pubDate>Tue, 04 Aug 2009 05:49:36 +0000</pubDate>
		<dc:creator>Mayank</dc:creator>
				<category><![CDATA[Statement]]></category>
		<category><![CDATA[TCO]]></category>

		<guid isPermaLink="false">http://www.asterdata.com/ceo-blog/index.php/2009/08/03/netezzas-change-in-architecture-move-towards-commodity/</guid>
		<description><![CDATA[Netezza pre-announced last week that they will be moving to a new architecture &#8211; one based around IBM blades (Linux + Intel + RAM) with commodity SAS disks, RAID controllers, and NICs. The product will continue to rely on an FPGA, but that would sit much further from the disks &#38; RAID controller, beyond the [...]]]></description>
			<content:encoded><![CDATA[<p>Netezza <a href="http://www.netezzacommunity.com/blogs/nzblog/2009/07/30/catch-a-wave-and-youre-sittin-on-top-of-the-world">pre-announced</a> last week that they will be moving to a new architecture &#8211; one based around IBM blades (Linux + Intel + RAM) with commodity SAS disks, RAID controllers, and NICs. The product will continue to rely on an FPGA, but that would sit much further from the disks &amp; RAID controller, beyond the RAM but adjacent to the Intel CPU, in contrast to their previous product line.</p>
<p>In assembling a new hardware stack, Netezza calls this re-architecture <a href="http://www.netezzacommunity.com/blogs/nzblog/2009/07/31/change-but-no-change">a change but not really a change</a> &#8211; the FPGA will continue to offload data compression/decompression, selection and projection from the Intel CPU; the Intel CPU will be used to push down joins and group bys; the RAM will be used to enable caching (thus helping improve mixed workload performance).</p>
<p>I think this is a pretty significant change for Netezza.</p>
<p><span id="more-119"></span></p>
<p>Clearly, Netezza would not have invested in this change &#8211; assembling &amp; shipping a new hardware stack and sharing revenue with IBM vs. a 3rd party hardware assembler &#8211; if Netezza&#8217;s old FPGA-dominant hardware were not being out-priced and out-performed by our Intel-based commodity hardware.</p>
<p>It was a matter of time before the market realized that FPGAs had reached their end-of-life status in the data warehousing market. In reading the writing on the wall, and responding to it early, Netezza has made a bold decision to change &#8211; and yet has clung to the warm familiarity of an FPGA as a &#8220;side car&#8221;.</p>
<p>Netezza, and the rest of the market, will soon become aware that a change in hardware stack is not a free lunch. The richness of CPU and RAM resources in an IBM commodity blade comes at a cost that a resource-starved FPGA-based architecture never had to account for.</p>
<p>In 2009, after having engineered its software for an FPGA over the last 9 years, Netezza will need to come to terms with commodity hardware in production systems and demonstrate that they can:</p>
<p>- Manage processes and memory spawned by a single query across 100s of blade servers</p>
<p>- Maintain consistent caches across 100s of blade servers &#8211; after all, it is Oracle&#8217;s Cache Fusion technology that is the bane of scaling Oracle RAC beyond 8 blade servers</p>
<p>- Tolerate the higher frequency of failures that a commodity Linux + RAID Controller/driver + Network driver stack incurs when put under rigorous data movement (e.g., allocation/de-allocation of memory contributing to memory leaks)</p>
<p>- Add a new IBM blade and ensure incremental scaling of their appliance</p>
<p>- Upgrade the software stack in place &#8211; unlike an FPGA-based hardware stack that customers are OK to floor-sweep in their upgrade</p>
<p>- Contain run-away queries from allocating the abundant CPU and RAM resources and starving other concurrent queries in the workload</p>
<p>- Reduce network traffic for a blade with 2 NICs that is managing 8 disks vs. a Power-PC/FPGA that had 1 NIC for 1 disk</p>
<p>- …</p>
<p>If you take a quick pulse of the market, apart from our known installations of 100+ servers, there is no other vendor &#8211; mature or new-age &#8211; that has demonstrated that 100s of commodity servers can be made to work together to run a single database.</p>
<p>And I believe that there is a fundamental reason for this lack of proof-point even a decade after Linux has matured and commodity servers have been used for computing &#8211; software <strong>not</strong> built from the ground up to leverage the richness and contain the limitations of commodity hardware is <strong>incapable</strong> of scaling. Aster <em>n</em>Cluster has been built from the ground up to have these capabilities on a commodity stack. Netezza’s software written for proprietary hardware cannot be retrofitted to work on commodity hardware (else, Netezza would have completely taken the FPGAs out, now that they have powerful CPUs!). Netezza has its work cut out &#8211; they have taken a dramatic shift that has the ability to bring the company and its production customers to their knees. And therein lies Netezza&#8217;s challenge &#8211; they must succeed in supporting their current customers on an FPGA-based platform while moving resources to build out a commodity-based platform.</p>
<p>And we have not even touched upon the extension of SQL with MapReduce to power big data manipulation using arbitrary user-written procedures.</p>
<p>If a system is not fundamentally designed to leverage commodity servers, it&#8217;s only going to be a band-aid on seams that are bursting. Overall, we will curiously watch how long it takes Netezza to eliminate their FPGAs completely and move to a real commodity stack so that customers can have the freedom to choose their own hardware and not be locked in to Netezza-supplied custom hardware.</p>
]]></content:encoded>
			<wfw:commentRss>http://www.asterdata.com/ceo-blog/index.php/2009/08/03/netezzas-change-in-architecture-move-towards-commodity/feed/</wfw:commentRss>
		<slash:comments>2</slash:comments>
		</item>
		<item>
		<title>Enterprise-Ready MapReduce Data Warehouse Appliance</title>
		<link>http://www.asterdata.com/ceo-blog/index.php/2009/06/29/enterprise-ready-mapreduce-data-warehouse-appliance/</link>
		<comments>http://www.asterdata.com/ceo-blog/index.php/2009/06/29/enterprise-ready-mapreduce-data-warehouse-appliance/#comments</comments>
		<pubDate>Mon, 29 Jun 2009 15:38:37 +0000</pubDate>
		<dc:creator>Mayank</dc:creator>
				<category><![CDATA[MapReduce]]></category>

		<guid isPermaLink="false">http://www.asterdata.com/ceo-blog/index.php/2009/06/29/enterprise-ready-mapreduce-data-warehouse-appliance/</guid>
		<description><![CDATA[We are announcing the availability of an Enterprise-Ready MapReduce Data Warehouse Appliance.
The appliance is powered by Dell hardware and Aster&#8217;s nCluster SQL/MR database, with an optional BI platform from MicroStrategy and the Aqua Data Studio data modeling software from AquaFold.
Our product portfolio now allows our customers to get the benefits of our flagship Aster nCluster [...]]]></description>
			<content:encoded><![CDATA[<p>We are announcing the availability of an Enterprise-Ready MapReduce Data Warehouse Appliance.</p>
<p>The appliance is powered by Dell hardware and Aster&#8217;s <em>n</em>Cluster SQL/MR database, with an optional BI platform from MicroStrategy and the Aqua Data Studio data modeling software from AquaFold.</p>
<p>Our product portfolio now allows our customers to get the benefits of our flagship Aster <em>n</em>Cluster SQL/MR database in the packaging that they are most comfortable with &#8211; on-premise software, in-cloud service, or pre-packaged appliance.</p>
<p>The appliance offering packs a lot of punch compared to other data warehousing appliances in the market &#8211; it has the highest ratio of compute &amp; memory to data sizes, allowing you to run rich queries on the appliance without breaking a sweat.</p>
<p><span id="more-118"></span>We are especially proud of the open nature of our appliance &#8211; the hardware is from Dell, built from industry-standard components; the BI server is from MicroStrategy; and the data modeling tool is from AquaFold (Aqua Data Studio). The appliance brings together industry-leading components of a full data warehouse stack &#8211; all pre-tested and configured for optimal performance.</p>
<p>Even the programming of our appliance is open &#8211; our SQL/MR framework allows applications to push computation into the appliance using industry-standard SQL augmented with MapReduce in the language of your choice (Java, C#, Perl, Python, etc.).</p>
<p>We have been approached by a number of customers seeking a get-started-quickly system, especially those groups of users and departments seeking a Hadoop framework to build their solutions upon.</p>
<p>In response to these requests, we are proud to announce an Express Edition of the appliance that is designed to work for up to 1TB of user data. And it comes at an even more attractive price &#8211; only $50K &#8211; complete with hardware and software!</p>
<p>Give us a call &#8211; we&#8217;ll get your warehouse set up on our appliance to ensure that the time-to-first-query is measured in hours, not months!</p>
]]></content:encoded>
			<wfw:commentRss>http://www.asterdata.com/ceo-blog/index.php/2009/06/29/enterprise-ready-mapreduce-data-warehouse-appliance/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>Goodbye, Rajeev</title>
		<link>http://www.asterdata.com/ceo-blog/index.php/2009/06/05/goodbye-rajeev/</link>
		<comments>http://www.asterdata.com/ceo-blog/index.php/2009/06/05/goodbye-rajeev/#comments</comments>
		<pubDate>Sat, 06 Jun 2009 01:48:39 +0000</pubDate>
		<dc:creator>Mayank</dc:creator>
				<category><![CDATA[Statement]]></category>

		<guid isPermaLink="false">http://www.asterdata.com/ceo-blog/index.php/2009/06/05/goodbye-rajeev/</guid>
		<description><![CDATA[Rajeev was a close friend and a cherished mentor. We were saddened to hear the news today and we will miss him dearly. Our thoughts are with his family.
]]></description>
			<content:encoded><![CDATA[<p>Rajeev was a close friend and a cherished mentor. We were saddened to hear the <a href="http://gigaom.com/2009/06/05/goodbye-old-friend-r-i-p-rajeev-motwani/">news</a> today and we will miss him dearly. Our thoughts are with his family.</p>
]]></content:encoded>
			<wfw:commentRss>http://www.asterdata.com/ceo-blog/index.php/2009/06/05/goodbye-rajeev/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>Partly Cloudy</title>
		<link>http://www.asterdata.com/ceo-blog/index.php/2009/05/01/partly-cloudy/</link>
		<comments>http://www.asterdata.com/ceo-blog/index.php/2009/05/01/partly-cloudy/#comments</comments>
		<pubDate>Fri, 01 May 2009 21:45:02 +0000</pubDate>
		<dc:creator>Mayank</dc:creator>
				<category><![CDATA[Cloud Computing]]></category>

		<guid isPermaLink="false">http://www.asterdata.com/ceo-blog/index.php/2009/05/01/partly-cloudy/</guid>
		<description><![CDATA[Data poetry by Mason Hale. Awesome!
]]></description>
			<content:encoded><![CDATA[<p><a href="http://flowdelic.org/archives/2009/04/partly-cloudy/" target="_blank">Data poetry</a> by Mason Hale. Awesome!</p>
]]></content:encoded>
			<wfw:commentRss>http://www.asterdata.com/ceo-blog/index.php/2009/05/01/partly-cloudy/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>Ah, it&#8217;s good to be young and talented (Congrats to Tasso on the Business Week recognition!)</title>
		<link>http://www.asterdata.com/ceo-blog/index.php/2009/04/26/ah-its-good-to-be-young-and-talented-congrats-to-tasso-on-the-business-week-recognition/</link>
		<comments>http://www.asterdata.com/ceo-blog/index.php/2009/04/26/ah-its-good-to-be-young-and-talented-congrats-to-tasso-on-the-business-week-recognition/#comments</comments>
		<pubDate>Sun, 26 Apr 2009 22:58:10 +0000</pubDate>
		<dc:creator>Mayank</dc:creator>
				<category><![CDATA[Uncategorized]]></category>

		<guid isPermaLink="false">http://www.asterdata.com/ceo-blog/index.php/2009/04/26/ah-its-good-to-be-young-and-talented-congrats-to-tasso-on-the-business-week-recognition/</guid>
		<description><![CDATA[A big congratulations to our CTO and Co-Founder, Tasso Argyros, who has been recognized as one of BusinessWeek’s Best Young Tech Entrepreneurs for 2009. I&#8217;d have given him a run for his spot, but I am over-the-hill and probably too old to run the distance &#8211; I wish they&#8217;d start a list for Best Entrepreneurs [...]]]></description>
			<content:encoded><![CDATA[<p>A big congratulations to our CTO and Co-Founder, Tasso Argyros, who has been recognized as one of <a href="http://images.businessweek.com/ss/09/04/0421_best_young_entrepreneurs/3.htm">BusinessWeek’s Best Young Tech Entrepreneurs for 2009</a>. I&#8217;d have given him a run for his spot, but I am over-the-hill and probably too old to run the distance &#8211; I wish they&#8217;d start a list for Best Entrepreneurs under the age of 40.</p>
<p>Tasso&#8217;s hard work, dedication, confidence and vision have been a huge part of our success to date, and we know they will be a big part of great things ahead for Aster. Congratulations to you, and to all the other great companies that made the list as well; it&#8217;s an honor for them to be recognized alongside you.</p>
]]></content:encoded>
			<wfw:commentRss>http://www.asterdata.com/ceo-blog/index.php/2009/04/26/ah-its-good-to-be-young-and-talented-congrats-to-tasso-on-the-business-week-recognition/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
	</channel>
</rss>
