Archive for May, 2010

By jgoldman in Analytics on May 4, 2010
   

It has been a few weeks since we announced the Aster Analytics Center, so I think this is a good time to shed a little more light on what we are doing. Our goal is to make analytical work easier and faster to do on many types of data sets. We have already worked closely with many customers to architect solutions to their analytics challenges, including fraud detection, complex security analysis to detect communication anomalies, and graph analysis for social networks.

As part of the center, we are building an analytics infrastructure to make advanced analytics readily accessible to anyone using Aster Data. This includes using our SQL-MapReduce interface for analysis that can’t easily be expressed in SQL, an approach that often leads to huge performance gains. In addition, we are releasing a suite of functions built on Aster’s API for MapReduce that can be invoked easily from within SQL. The suite includes, for example, novel tools for sequence analysis, which are very useful for anyone doing pattern analysis. It’s important to note that many of our customers are already writing their own applications using this API, and it’s really straightforward to get started. Incidentally, development for our Java API has just become very easy with our new SDK, which uses a plug-in for Eclipse. We are also actively developing partnerships with analytic function and solution providers.

I’d like to give a brief background on why I’m so excited about what Aster is enabling and how it is indicative of a significant shift in how companies use and analyze their data. I first encountered Aster Data when I was at LinkedIn, building analytically driven products with the large data sets that LinkedIn has amassed. Our team faced severe limitations with our standard warehouse, but with the introduction of the MPP Aster system we were suddenly able to analyze data much faster. Analyses that previously took 10 hours to run could now run in 5 minutes. Our ability to think of an idea and get answers was no longer limited by the constraints of the equipment we owned; the bottleneck was instead how quickly we could think. With a 10-hour wait time, you frequently forgot what you were working on, or the stakeholder had moved on without a proper analysis being done. If you made a mistake or wanted to tweak your query, you had to wait another 10 hours. With the Aster-enabled approach to analytic development, however, a whole new way of thinking emerged, and we started to perform analyses we hadn’t previously thought were possible. The ability to iterate quickly on an idea is invaluable when solving problems: the answers we got back helped guide business decisions and enabled better products on LinkedIn.

As a customer, I worked directly with the Aster team on a number of problems and was amazed by their ability to innovate and by the depth of their knowledge of the challenges analytics practitioners face. Since joining the team, I’ve been pleased by Aster’s strong commitment to making analytics accessible to all. A scalable system that can do more with data will unleash a whole new set of capabilities for enterprises. I’m very excited that the field team has grown and that we have attracted top talent such as former particle physicist Puneet Batra and data mining expert Qi Su. Ajay Mysore, another member of the team, conducted master’s research on clustering algorithms. Our team lives and breathes data and is always ready for new challenges. Right now the field of analytics is undergoing a renaissance, and it’s exciting to be working with a leader in the field of big data and advanced analytics.