Blog: Winning with Data
Posted on May 20th, 2008 by Mayank Bawa

I am glad to share the news that one of our first customers, MySpace, has scaled their Aster nCluster enterprise data warehouse to more than 100 Terabytes of actual data. It is not easy to cross the 100TB barrier, especially when loads happen continuously and queries are relentless, as they are at

Hala, Richard, Dan, Jim, Allen, and Aber, you have been awesome partners for us! It has been a great experience for Aster to work with you and we can see the reasons behind MySpace’s continued success. Your team is amazingly strong and capable and there is a clear sense of purpose. Tasso and I often remark that we need to replicate that culture in our company as we grow. At the end of the day, it is the culture and the strength of a team that makes a company successful.

And to everyone at Aster, you have been great from Day 1. It is impressive how a fresh perspective and a clean architecture can solve a tough technical challenge!

Thank you. And I wish everyone as much fun in the coming days!

6 Responses to “MySpace crosses 100TB on Aster nCluster!”

  1. Hi Guys,
    Your solution is very interesting. I am curious, though: I have seen a lot of dialogue about the scalability of your solution and discussion of analytics as a key component, but I have had difficulty figuring out how analytics is deployed with the solution. Are you presenting a data discovery type solution (such as that offered by Visual Site) or proposing a plug-in architecture with analytics partners (such as SAS)? I suspect I am missing something important here! Congratulations on a great launch, BTW.
    Cheers, James.

  2. Thanks for the comments/questions, James!

    We allow you to store all your logs in one database on which queries can be run. The logs can be augmented with contextual information (e.g., information about pages, users, geographies, etc.). We can then use SQL to process data and generate reports. You can also use data-mining tools like SAS, SPSS, and R to analyze data in our database.

  3. Very interesting. How does your implementation differ from a distributed data system like Mnesia (Erlang)? I have read that MS-SQL 2008 Clustering will use a similar approach. Have you guys heard different?

    Back to Mnesia: the creators have said that the service begins to fail at the petabyte level. Have you guys done such theoretical tests?

    Regardless, congratulations, and great product!

  4. Nima - Mnesia and nCluster share many of the same goals (distribution, location transparency, ACID transactions, non-stop applications).

    The biggest difference is in our support for Normalization and fast Joins.

    Normalization helps limit data growth rates; denormalization causes values to be replicated multiple times, spurring data growth.

    Denormalization is usually preferred as a way to avoid Joins; it is fair to say that Joins are the slowest component in query execution. We do Joins pretty fast, even in a distributed environment.

  5. I definitely get that part. The support for horizontal fragmentation is clearly the main benefit of this type of scaling approach, but isn’t there a scaling issue vertically?

    Meaning, once you have a large enough data set vertically and a sufficiently horizontally fragmented schema, doesn’t the scaling fall apart due to the messaging overhead?

    Keep in mind, this is in no way a reflection of your product, and to anyone reading this: I have no experience with any Aster Data product. I am simply stating characteristics of other distributed systems that are *somewhat* similar to your approach. I’m just curious and hopeful!


  6. […] has been driving innovation very aggressively. 1. Revisiting database engines. MPP is the answer to Big Data, among other […]
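
The approach described in replies 2 and 4 above (logs kept in a normalized table, joined with contextual information at query time, and reported on with SQL) can be sketched roughly as follows. This is only an illustration under stated assumptions: it uses Python's built-in SQLite as a stand-in rather than nCluster, and the `users` and `page_views` tables, their columns, and the `views_by_country` helper are hypothetical names invented for the example.

```python
import sqlite3

# In-memory SQLite database used purely as a stand-in; this is NOT nCluster.
conn = sqlite3.connect(":memory:")

# Hypothetical normalized schema: each user's country is stored once in
# `users`; the log table `page_views` only carries the user_id key.
conn.executescript("""
    CREATE TABLE users (user_id INTEGER PRIMARY KEY, country TEXT);
    CREATE TABLE page_views (user_id INTEGER, page TEXT);
""")
conn.executemany("INSERT INTO users VALUES (?, ?)", [(1, "US"), (2, "DE")])
conn.executemany(
    "INSERT INTO page_views VALUES (?, ?)",
    [(1, "/home"), (1, "/music"), (2, "/home"), (1, "/home")],
)


def views_by_country(connection):
    """Join the log rows with user context and count views per country."""
    return connection.execute("""
        SELECT u.country, COUNT(*) AS views
        FROM page_views AS v
        JOIN users AS u ON u.user_id = v.user_id
        GROUP BY u.country
        ORDER BY u.country
    """).fetchall()


report = views_by_country(conn)
print(report)
```

A denormalized design would instead copy the country string into every log row (four copies here rather than two), which avoids the Join at the cost of faster data growth; that is the trade-off reply 4 describes.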

Copyright © 2008 Aster Data Systems, Inc. All rights reserved.