Archive for November, 2009

 
Posted by Mayank in Data-Application Server, Statement on November 16, 2009

SAS Institute

We are very excited to announce a strategic partnership between Aster Data and SAS Institute to further accelerate the “SAS In-Database Processing” initiative.

The objective of the partnership is to integrate SAS software capabilities within our MPP database, which Aster Data’s 4.0 release uniquely supports. Last week we announced the capability to fully push down analytics application logic inside our MPP database, so applications can now run inside the database, allowing analytics to be performed at massive data scale with very fast response. We call this a Massively Parallel Data-Application Server. We had earlier presented more details on this unique implementation of SAS software inside Aster Data’s nCluster software at a co-hosted session with SAS at M2009.

Our architecture enables SAS software procs to run natively inside the database, thereby preserving the statistical integrity of SAS software computations while giving unprecedented performance increases during analysis of large data sets. SAS Institute partners in this initiative with other databases too, but the difference is that each of these databases requires the re-implementation of SAS software procs as proprietary UDFs or Stored Procedures.

We also provide dynamic workload management capabilities to enable graceful resource sharing between SAS software computations, SQL queries, loads, backups and scale-outs, all of which may be going on concurrently. Workload management lets administrators dial up or dial down the resources given to data mining operations based on the criticality of the mining and of the other tasks being performed.

Our fast loading and trickle feed capabilities ensure that SAS software procs have access to fresh data for modeling and scoring, ensuring a timely and accurate analysis. This avoids the need to export snapshots (or samples) of data to an external SAS server for analysis, saving analysts valuable time in their iterations and discovery cycles.

We’ve been working with SAS Institute for a while now, and it is very evident why SAS has been the market leader in analytic applications for three decades. The technology team is very sharp, driven to innovate and execute. And as a result we’ve achieved a lot working together in a short time.

We look forward to working with SAS Institute to dramatically advance analytics for big data!



 
Posted by Mayank in Data-Application Server, Statement on November 2, 2009

I had commented earlier that a new set of applications is being written that leverages data to act smarter, enabling companies to deliver more powerful analytics. Operating a business today without serious insight into business data is not an option. Data volumes are growing like wildfire, applications are getting more data-heavy and more analytics-intensive, and companies are putting more demands on their data.

The traditional 20-year-old data pipeline of Operational Data Stores (to pool data), Data Warehouses (to store data), Data Marts (to farm out data), Application Servers (to process data) and UI (to present data) is under severe strain, because we are expecting a lot of data to move from one tier to the other. Application Servers pull data from Databases for computations and push the results of the computation to the UI servers. But data is like a boulder: the larger the data, the more the inertia, and therefore the more time and effort needed to move it from one system to another.

The resulting performance problems of moving ‘big data’ are so severe that application writers unconsciously compromise the quality of their analysis by avoiding “big data computations” – they first reduce the “big data” to “small data” (via SQL-based aggregations/windowing/sampling) and then perform computations on “small data” or data samples.

The problem of ‘big data’ analysis will continue to grow more severe in the next 10 years as data volumes grow and applications demand more data granularity to model behavior and identify patterns so as to better understand and service their customers. To do this, you have to analyze all your available data. For the last 5 years, companies have routinely upgraded their data infrastructure every 12-18 months as data sizes double and the traditional data pipeline buckles under the weight of larger data movement, and they will be forced to continue doing this in the next 10 years if nothing fundamental changes.

Clearly, we need a new, sustainable solution to address this state of affairs.

The ‘aha!’ for big data management is to realize that the traditional data pipeline suffers from an architecture problem, that of moving data to applications, and that this must change to allow applications to move to the data.

I am very pleased to announce a new version of Aster Data nCluster that addresses this challenge head-on.

Moving applications to the data requires a fundamental change to the traditional database architecture: applications are co-located inside the database engine so that they can iteratively read, write and update all data. The new infrastructure acts as a ‘Data-Application Server’, managing both data and applications as first-class citizens. Like a traditional database, it provides a very strong data management layer. Like a traditional application server, it provides a very strong application processing framework. It co-locates applications with data, thus eliminating data movement from the Database to the Application server. At the same time, it keeps the two layers separate to ensure the right fault-tolerance and resource-management models: bad data will not crash the application, and conversely a bad application will not crash the database.
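To make the contrast concrete, here is a small illustrative sketch in Python. It is not Aster Data nCluster code and does not use any nCluster API. The partitions, the function names and the "items moved" counter are all assumptions made up for this illustration. It compares pulling every row out to a central application with shipping a small function to each partition and merging only the partial results.

```python
from collections import Counter

# Hypothetical data: each inner list stands in for the rows stored on one server.
PARTITIONS = [
    [("alice", 3), ("bob", 5), ("alice", 2), ("bob", 1)],
    [("alice", 7), ("carol", 1), ("carol", 2)],
    [("bob", 2), ("carol", 4), ("alice", 1), ("bob", 3)],
]


def pull_then_compute(partitions):
    """Move data to the application: copy every row out, then aggregate centrally."""
    moved = [row for part in partitions for row in part]
    totals = Counter()
    for user, amount in moved:
        totals[user] += amount
    # Return the result and the number of rows that crossed the "network".
    return dict(totals), len(moved)


def local_totals(rows):
    """The 'application': runs next to one partition and returns a small summary."""
    totals = Counter()
    for user, amount in rows:
        totals[user] += amount
    return totals


def push_down(partitions, app):
    """Move the application to the data: run app per partition, merge the partials."""
    partials = [app(part) for part in partitions]
    merged = Counter()
    for partial in partials:
        merged.update(partial)  # Counter.update adds counts together
    # Only the per-partition summaries cross the "network".
    return dict(merged), sum(len(partial) for partial in partials)


if __name__ == "__main__":
    print(pull_then_compute(PARTITIONS))
    print(push_down(PARTITIONS, local_totals))
```

Both approaches produce the same totals, but in the second one only the per-partition summaries move. On this toy data the saving is small; the gap widens as the number of rows per partition grows while the number of distinct keys stays modest.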

Our architecture and implementation ensure that apps do not have to be rewritten to make this transition. The application is pushed down into the Aster 4.0 system and transparently parallelized across the servers that store the relevant data. As a result, Aster Data nCluster 4.0 also delivers a 10x-100x boost in performance and scalability.

Those using Aster Data’s solution, including comScore, Full Tilt Poker, Telefonica I+D and Enquisite, are testament to the benefits of this fundamental change. In each case, it was the embedding of the application with the data that enabled them to scale seamlessly and perform ultra-fast analysis.

The new release brings to fruition a major product roadmap milestone that we’ve been working on for the last 4 years. There is a lot more innovation coming, and this milestone is significant enough that we issue a clarion call to everyone working on “big data applications”: we need to move applications to the data, because the other way round is unsustainable in this new era.