Archive for the ‘Administration’ Category

17
Oct
   

“Big data” has always been a favorite subject of discussion among the Aster Data team. We’ve been talking about big data since at least 2009, long before the term became burning-hot. The big data hype has confused many organizations (and vendors) in the market about the best technology or method to solve their analytical business problems.

However, our vision hasn’t changed from the time we founded the company in 2005 to today, now that we are part of the Teradata family. Teradata Aster continues to lead the market with technology innovations and reference architectures that provide clear guidance and deliver significant business value to our customers.

Today, we are pushing the limits of analytical technology once more by launching the Teradata Aster Big Analytics Appliance. The Big Analytics Appliance is a unique machine that can help enterprises see their business in high definition. By harnessing all existing and new data types in the enterprise, we enable organizations to leverage our powerful SQL-MapReduce framework and business-ready analytics & apps that solve specific business problems in marketing attribution, fraud detection, graph analysis, pattern analysis, and much more. It unleashes the creativity of bright analysts to discover new insights that help their organizations grow revenue and create sustainable competitive advantage.

So what is the Big Analytics Appliance? It’s five things in one box:

  1. Aster + Apache Hadoop (100% open source via the Hortonworks HDP distribution), fully integrated in one box
  2. ANSI-standard SQL and next-generation MapReduce, fully integrated
  3. More than 50 ready-to-use MapReduce apps, to deliver immediate business value
  4. Full ecosystem connectivity for both Aster and Hadoop, with BI, ETL, and other existing IT systems
  5. The latest-generation, most efficient hardware platform, specifically optimized for Aster, Hadoop, and Big Analytics

Loyal to our Stanford roots, the appliance comes in Cardinal-red color!

Teradata Aster Big Analytics Appliance

The Big Analytics Appliance packs a long list of essential and unique technologies, including:

  • SQL-MapReduce®, the industry’s only true SQL/MapReduce integration
  • SQL-H™, the industry’s only ANSI-standard SQL and Hadoop integration
  • Teradata Viewpoint, the most advanced database monitoring platform, now extended to Aster and Hadoop
  • Teradata TVI, sophisticated hardware support and failure-prevention software, now ported to Hadoop as well as to Aster
  • InfiniBand network interconnect – makes ultra-high-performance connectivity between Aster and Hadoop, as well as scalability, a non-issue
  • Small form-factor disk drives and dense enclosures – make this appliance one of the most dense and space-efficient big data platforms in the market

And, of course, everything in this appliance is packaged, integrated, pre-tested and supported by Teradata – the most trusted brand in data management and analytics.

I also want to take a moment to talk about our Unified Data Architecture vision for the enterprise. While most vendors talk about big data at a very high level without explaining where it fits and how it relates to traditional technologies like data warehousing, we decided to do the hard work of figuring out how different technologies complement each other and for what purpose. The result is the diagram below, which showcases how Teradata, Aster, and Hadoop can work in tandem to provide a complete data solution for enterprise environments:

Teradata Unified Data Architecture

We also went one step further and now have a matrix that explains which technology (or technologies) is most appropriate for which use case – given a workload and a specific type of data. The result of that exercise is below:

Processing as a Function of Schema Requirements by Data Type

When To Use Which Technology? The best approach by workload and data type

If you want to know more about our Unified Data Architecture vision, read the whitepaper we co-authored with Hortonworks, or feel free to contact us and we’ll be happy to discuss this concept and how it would fit into your environment.

By tightly integrating Aster and Hadoop, the new Big Analytics Appliance addresses a large part of the Unified Data Architecture; and via the Teradata-Aster and Teradata-Hadoop connectors, Teradata now has all the necessary pieces to help enterprises extract the maximum business value from all their data and execute on their Big Data vision. At Aster, just like at Teradata, we are committed to continuously providing the best innovations to help our customers make the best decisions possible.

P.S. If you want to try out Aster without ordering a full Aster box, we now allow you to download an Aster virtual appliance! Go give it a try: http://www.asterdata.com/AsterExpress



12
Aug
By Tasso Argyros in Administration, Availability, Blogroll, Manageability, Scalability on August 12, 2008
   

- John: “What was wrong with the server that crashed last week?”

- Chris: “I don’t know. I rebooted it and it’s just fine. Perhaps the software crashed!”

I’m sure anyone who has been in operations has had the above dialog, sometimes quite frequently! In computer science such a failure would be called “transient” because it affects a piece of the system only for a limited amount of time. People who have been running large-scale systems for a long time will attest that transient failures are extremely common and can lead to system unavailability if not handled right.

In this post I want to explore why transient failures are an important threat to availability and how a distributed database can handle them.

To see why transient failures are frequent and unavoidable, let’s consider what can cause them. Here’s an easy (albeit non-intuitive) reason: software bugs. All production-quality software still has bugs; most of the bugs that escape testing are difficult to track down and resolve, and they take the form of Heisenbugs, race conditions, resource leaks, and environment-dependent bugs, both in the OS and the applications. Some of these bugs will cause a server to crash unexpectedly. A simple reboot will fix the issue, but in the meantime the server will not be available. Configuration errors are another common cause. Somebody inserts the wrong parameters into a network switch console and as a result a few servers suddenly go offline. And, sometimes, the cause of the failure just remains unidentified because it can be hard to reproduce and thus examine more thoroughly.

I submit to you that it is much harder to prevent transient failures than permanent ones. Permanent failures are predictable, and are often caused by hardware failures. We can build software or hardware to work around permanent failures. For example, one can build a RAID scheme to prevent a server from going down if a disk fails, but no RAID level can prevent a memory leak in the OS kernel from causing a crash!

What does this mean? Since transient failures are unpredictable and harder to prevent, MTTF (mean time to failure) for transient failures is hard to increase.

Clearly, a smaller MTTF means more frequent outages and larger downtimes. But if MTTF is so hard to increase for transient failures, what can we do to always keep the system running?

The answer is that instead of increasing MTTF we can reduce MTTR (mean time to recover). Mathematically this concept is expressed by the formula:

Availability = MTTF/(MTTF+MTTR)

It is obvious that as MTTR approaches zero, Availability approaches 1 (i.e., 100%). In other words, if failure recovery is very fast (instantaneous, in the extreme), then even if failures happen frequently, overall system availability will remain very high. This interesting approach to availability, called Recovery-Oriented Computing, was developed jointly by Berkeley and Stanford researchers, including my co-founder George Candea.
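To make the arithmetic concrete, here is a minimal sketch (illustrative numbers only, not from the original post) showing how availability climbs toward 100% as MTTR shrinks while MTTF stays fixed:

```python
def availability(mttf_hours: float, mttr_hours: float) -> float:
    """Availability = MTTF / (MTTF + MTTR)."""
    return mttf_hours / (mttf_hours + mttr_hours)

# Illustrative numbers: a node that suffers a transient failure once a week on average.
mttf = 7 * 24.0  # 168 hours between failures
for mttr in (4.0, 1.0, 0.1, 0.01):  # hours to recover
    print(f"MTTR = {mttr:5.2f} h  ->  availability = {availability(mttf, mttr):.5%}")
```

Even with a failure every week, cutting recovery time from hours to seconds pushes availability from roughly 97.7% to well beyond "four nines" – which is exactly the point of attacking MTTR instead of MTTF.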

Applying this concept to a massively parallel distributed database yields interesting design implications. As an example, let’s consider the case where a server fails temporarily due to an OS crash in a 100-server distributed database. Such an event means that the system has fewer resources to work with: in our example after the failure we have a 1% reduction of available resources. A reliable system will need to:

(a) Be available while the failure lasts and

(b) Recover to the initial state as soon as possible after the failed server is restored.

Thus, recovering from this failure needs to be a two-step process:

(a) Keep the system available with a small performance/capacity hit while the failure is ongoing (availability recovery)

(b) Upgrade the system to its initial levels of performance and capacity as soon as the transient failure is resolved (resource recovery)

Minimizing MTTR means minimizing the sum of the time it takes to do (a) and (b), t_a + t_b. Keeping t_a very low requires having replicas of data spread across the cluster; this, coupled with fast failure detection and fast activation of the appropriate replicas, will ensure that t_a remains as low as possible.

Minimizing t_b requires seamless re-incorporation of the transiently failed nodes into the system. Since in a distributed database each node has a lot of state, and the network is the biggest bottleneck, the system must be able to reuse as much of the state that pre-existed on the failed nodes as possible to reduce the recovery time. In other words, if most of the data that was on the node before the failure is still valid (a very likely case), then it needs to be identified, validated, and reused during re-incorporation.
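To illustrate why reusing pre-existing state keeps t_b small, here is a hypothetical sketch; the function and data layout are my own illustrative assumptions, not a description of how nCluster actually re-incorporates nodes. Partitions whose contents still match the surviving replicas are kept in place, and only stale or missing ones travel over the network:

```python
import hashlib

def checksum(blob: bytes) -> str:
    """Content fingerprint used to decide whether a local partition is still valid."""
    return hashlib.sha256(blob).hexdigest()

def reincorporate(local_partitions: dict, authoritative: dict) -> dict:
    """Rebuild a recovering node's state, reusing everything that is still valid.

    local_partitions: partition_id -> bytes found on the recovering node's disk
    authoritative:    partition_id -> bytes held by the surviving replicas
    Only partitions that are missing or stale are copied over the network.
    """
    recovered, copied = {}, 0
    for pid, master_blob in authoritative.items():
        local_blob = local_partitions.get(pid)
        if local_blob is not None and checksum(local_blob) == checksum(master_blob):
            recovered[pid] = local_blob   # reuse: no network transfer needed
        else:
            recovered[pid] = master_blob  # stale or missing: ship it over the network
            copied += 1
    print(f"reused {len(recovered) - copied} partitions, copied {copied}")
    return recovered
```

The more partitions fall into the "reuse" branch, the less data crosses the network, and the shorter t_b becomes.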

Any system that lacks the capacity to keep either t_a or t_b low does not provide good tolerance to transient failures.

And because transient failures only become more frequent as a system grows, any architecture that cannot handle failures correctly is – simply – not scalable. Any attempt to scale it up will likely result in outages and performance problems. Having a system designed with a Recovery-Oriented architecture, such as the Aster nCluster database, ensures that transient failures are tolerated with minimal disruption, and thus that true scalability is possible.



27
May
By George Candea in Administration, Blogroll, Manageability on May 27, 2008
   

When developing a system that is expected to take care of itself (self-managing, autonomic, etc.), the discussion of how much control to give users over the details of the system inevitably comes up. There is, however, a clear line between visibility and control.

Users want control primarily because they don’t have visibility into the reasons for a system’s behavior. Take for instance a database whose performance has suddenly dropped 3x… This can be due to someone running a crazy query, or some other process on the same machine updating a filesystem index, or the battery of a RAID controller’s cache having run out and forcing all updates to be write-through, etc. In order to figure out what is going on, the DBA would normally start poking around with ps, vmstat, mdadm, etc., and for this (s)he needs control. However, what the DBA really wants is visibility into the cause of the slowdown… the control needed to remedy the situation is minimal: kill a query, reboot, replace a battery, and so on.

To provide good visibility, one ought to expose why the system is doing something, not how it is doing it. Any system that self-manages must be able to explain itself when requested to do so. If a DB is slow, it should be able to provide a profile of the in-flight queries. If a cluster system reboots nodes frequently, it should be able to tell whether it’s rebooting due to the same cause or a different one every time. If a node is taken offline, the system should be able to say that it is because of a suspected failure of disk device /dev/sdc1 on that node. And so on… this is visibility.
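As a hypothetical illustration (my own sketch, not a feature of any particular product), a system could attach a machine-readable “why” to every autonomic action it takes, so an operator queries causes instead of poking around as root:

```python
from dataclasses import dataclass, field
from datetime import datetime, timezone

@dataclass
class ManagementAction:
    """An autonomic action together with the reason the system took it."""
    action: str     # what the system did
    why: str        # the cause, in operator-readable terms
    evidence: dict  # raw signals that led to the decision
    when: datetime = field(default_factory=lambda: datetime.now(timezone.utc))

log: list[ManagementAction] = []

# The system records its reasoning as it acts...
log.append(ManagementAction(
    action="took node 17 offline",
    why="suspected failure of disk device /dev/sdc1 on that node",
    evidence={"smart_reallocated_sectors": 153, "io_errors_last_hour": 42},
))

# ...so the administrator asks "why?" rather than being handed root access.
for event in log:
    print(f"{event.when:%Y-%m-%d %H:%M} | {event.action} | because {event.why}")
```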

We do see, however, many systems and products that substitute control for visibility, for example by providing root access on the machines running the system. I believe this is mainly because the engineers themselves do not understand very well how the “how” turns into the “why”, i.e., they do not understand all the different paths that can lead to poor system behavior.

Choosing to expose the why instead of the how influences the control knobs provided to users and administrators. Retrofitting complex systems to provide visibility instead of control is hard, so this really needs to be done from day one. What’s more, when customers get used to control, it becomes difficult to give it up in exchange for visibility, so the product must maintain the user-accessible controls for backward compatibility. This allows administrators to introduce unpredictable causes of system behavior (e.g., by allowing RAID recovery to be triggered at arbitrary times), which makes self-management that much harder and inaccurate. Hence the need to build visibility in from day one and to minimize unnecessary control.



17
May
By George Candea in Administration, Blogroll, Database, Manageability on May 17, 2008
   

I want databases that are as easy to manage as Web servers.

IT operations account for 50%-80% of today’s IT budgets and amount to tens of billions of dollars yearly(1). Poor manageability impacts the bottom line and reduces reliability, availability, and security.

Stateless applications, like Web servers, require little configuration, can be scaled through mere replication, and are reboot-friendly. I want to do that with databases too. But the way they’re built today, the number of knobs is overwhelming: the most popular DB has 220 initialization parameters and 1,477 tables of system parameters, while its “Administrator’s Guide” is 875 pages long(2).

What worries me is an impending manageability crisis, as large data repositories are proliferating at an astonishing pace; in 2003, large Internet services were collecting >1 TB of clickstream data per day(3). Five years later we’re encountering businesses that want SQL databases to store >1 PB of data. PB-scale databases are by necessity distributed, since no DB can scale vertically to 1 PB; now imagine taking notoriously hard-to-manage single-node databases and distributing them…

How does one build a DB as easy to manage as a Web server? All real engineering disciplines use metrics to quantitatively measure progress toward a design goal, to evaluate how different design decisions impact the desired system property.

We ought to have a manageability benchmark, and the place to start is a concrete metric for manageability, one that is simple, intuitive, and applies to a wide range of systems. We don’t just use the metric to measure, but also to guide developers in making day-to-day choices. It should tell engineers how close their system is to the manageability target. It should enable IT managers to evaluate and compare systems to each other. It should lay down a new criterion for competing in the market.

Here’s a first thought:

I think of system management as a collection of tasks the administrators have to perform to keep a system running in good condition (e.g., deployment, configuration, upgrades, tuning, backup, failure recovery). The complexity of a task is roughly proportional to the number of atomic steps Steps_i required to complete task i; the larger Steps_i, the more inter-step intervals, so the greater the opportunity for the admin to mess up. Installing an operating system, for example, has Steps_install in the 10s or 100s.

Efficiency of management operations can be approximated by the time T_i in seconds it takes the system to complete task i; the larger T_i, the greater the opportunity for unrelated failures to impact the atomicity of the management operation. For a trouble-free OS install, T_install is probably around 1-3 hours.

If N_i represents the number of times task i is performed during a time interval T_evaluation (e.g., 1 year) and N_total = N_1 + … + N_n, then task i’s relative frequency of occurrence is Frequency_i = N_i / N_total. Typical values for Frequency_i can be derived empirically or extracted from surveys(4),(5),(6). The less frequently one needs to manage a system, the better.

Manageability can now be expressed with a formula, with larger values of manageability being better:

manageability formula

This says that the more frequently a system needs to be “managed,” the poorer its manageability. The longer each step takes, the poorer the manageability. The more steps involved in each management action, the poorer the manageability. The longer the evaluation interval, the better the manageability, because observing a system longer increases the confidence in the “measurement.”

While complexity and efficiency are system-specific, their relative importance is actually specific to a customer: an improvement in complexity may be preferred over an improvement in efficiency, or vice versa; this differentiated weighting is captured by T. I would expect T > 2 in general, because having fewer, atomic steps is valued more from a manageability perspective than reducing task duration: the former reduces the risk of expensive human mistakes and training costs, while the latter relates almost exclusively to service-level agreements.
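To see how such a metric could be evaluated, here is a minimal sketch. The exact functional form below is an assumption on my part, chosen only to be consistent with the properties just described; it treats T as an exponent that weights step count (complexity) against task duration (efficiency), and all the task numbers are invented for illustration:

```python
# Assumed form (not the original formula):
#   Manageability = T_evaluation / sum_i( N_i * Steps_i**T * T_i )
# More tasks, more steps per task, and longer tasks all lower manageability;
# a longer evaluation window raises it.

def manageability(tasks, t_evaluation_sec: float, T: float = 2.0) -> float:
    """tasks: list of (times_performed, steps, seconds_per_task) tuples."""
    burden = sum(n * (steps ** T) * seconds for n, steps, seconds in tasks)
    return t_evaluation_sec / burden

# Illustrative numbers only: one year of operating a small cluster.
year = 365 * 24 * 3600.0
tasks = [
    (12, 5, 600.0),    # monthly backups: 5 steps, ~10 minutes each
    (4, 40, 7200.0),   # quarterly upgrades: 40 steps, ~2 hours each
    (2, 15, 3600.0),   # unplanned failure recoveries
]
print(f"manageability score: {manageability(tasks, year):.2f}")
```

Whatever the exact form, the useful property is that the score moves in the right direction when an engineer removes a step, automates a task, or makes it run faster.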

So would this metric work? Is there a simpler one that’s usable?