By Peter Pawlowski in Blogroll on June 9, 2009
   

The Aster SQL/MapReduce framework allows developers to push analytics code for applications closer to the data in the database, without the headaches of extracting and analyzing data outside of the database. We’ve supported a variety of languages from day one, including Java, Python, and Perl. Today we’re pleased to announce official support for the .NET family of languages via Mono, an excellent open source .NET implementation. This will allow developers who use .NET languages like C# and VB (and, of course, F#) to more easily leverage nCluster for massively parallel analytics.

Our .NET support is enabled through our Stream SQL/MR function, which allows users to process data via a simple streaming interface: provide a program that reads rows from the console (stdin) and writes rows back to the console (stdout). Let’s consider a simple C# program called Tokenize, which splits incoming rows into tokens and then outputs each token (one per line):

[Image: net-blog-post-code.jpg, the Tokenize.cs source listing]
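The original listing was published as an image. As a rough illustration of the streaming contract described above, here is a minimal sketch of what such a Tokenize program might look like. It assumes each input row arrives as one line on stdin and that tokens are separated by spaces or tabs; the delimiter choice and the class layout are assumptions for illustration, not a reproduction of the original code.

```csharp
using System;

// Hypothetical sketch of a Tokenize program; not the original listing.
public class Tokenize
{
    // Assumption: tokens are separated by spaces or tabs.
    private static readonly char[] Separators = new char[] { ' ', '\t' };

    public static void Main(string[] args)
    {
        string line;

        // Read one input row per line from stdin until end of input.
        while ((line = Console.ReadLine()) != null)
        {
            // Split the row into tokens, skipping empty entries caused by
            // consecutive separators.
            string[] tokens = line.Split(Separators, StringSplitOptions.RemoveEmptyEntries);

            foreach (string token in tokens)
            {
                // Emit one output row (a single token) per line on stdout.
                Console.WriteLine(token);
            }
        }
    }
}
```

Because the program only touches stdin and stdout, it can be checked locally before it goes anywhere near the database: after compiling with `gmcs Tokenize.cs`, piping a line such as `hello big world` into `mono Tokenize.exe` should print `hello`, `big`, and `world` on separate lines.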

To run this program over data stored in nCluster, a developer just needs to compile the above Tokenize.cs into Tokenize.exe with a C# compiler (in our case, the Mono C# compiler gmcs). With the compiled executable in hand, one command in our terminal client will install it in nCluster. The program can then be invoked from SQL. The example below runs the program over all the rows in the documents table, outputting a table with a single column (token). Each row in the query result corresponds to a single token in the input documents.

[Image: net-blog-post-code_2.jpg, the SQL query that invokes Tokenize]

It’s as simple as that. We hope our new .NET support will enable an ever-broader group of developers to take advantage of SQL/MR, our in-database analytics technology! If you’re interested in learning more, please check out a host of new resources around our implementation of MapReduce within Aster nCluster, including example applications and code.


Comments:
Eeraj on June 11th, 2009 at 9:56 am #

Is there a development version of your product that can be downloaded and evaluated? I was trying to find a download or eval link on your site, but couldn’t.

Peter Pawlowski on June 11th, 2009 at 1:03 pm #

Eeraj, thanks for your comment! We are definitely exploring the most effective way to make our nCluster software available for testing. The options under consideration include using the cloud, enabling installation on your laptop or desktop, and more. Stay tuned.

Paolo on June 21st, 2009 at 10:20 am #

Is this similar to the streaming functionality in Hadoop?

Peter Pawlowski on June 21st, 2009 at 2:11 pm #

Paolo,

Our Stream SQL/MR function was designed to support a broad range of languages, now including .NET. The stdin/stdout API was chosen for its simplicity and is compatible with the Hadoop Streaming API. You can therefore quickly migrate streaming functions to nCluster’s SQL/MR.

Cheers,
Peter
