Emic CEO: Do you really need a cluster? How to find out

At LinuxWorld on Aug. 3 in San Francisco, Emic Networks Inc. of San Jose, Calif., will be announcing Emic Application Cluster 2.5, due out in about 6 months, and its new membership in the Novell PartnerNet for Technology Partners program.

Version 2.5 adds support for a segment of JBoss applications and 64-bit capabilities to EAC 2.0. In this interview, Emic CEO Eero Teerikorpi offers dos and don'ts for evaluating clusters and defines and compares types of clusters.

First of all, LinuxWorld is coming up. So -- other than your own company -- what are must-sees for system administrators attending the show?
I would check out Red Hat, HP, Oracle, IBM and BEA Systems, as they outline new opportunities for innovation, cost savings and productivity gains using Linux and open source technology. Otherwise, they should be sure to attend these keynotes: Matt Szulik, CEO of Red Hat; Martin Fink, vice president of Linux at HP; Michael S. Rocha, executive vice president of Oracle; and Nick Donofrio, senior vice president of technology and manufacturing at IBM.

What are your dos and don'ts for evaluating an enterprise to see if clustering is appropriate?
Do decide on your goals for clustering.

Customers should decide what their primary goals are: Are they trying to achieve high availability, or 99.999% uptime? Do they need a disaster recovery system, or "hot" or standby backup? Are they trying to gain better usage of existing servers? Are they trying to build high-quality systems from low-cost components? Are their applications read-intensive with large databases, such as an online store where a potential buyer may browse through many products and related product details before deciding on a purchase, or are they write-intensive?

The answers to these questions will help determine the criteria that their application clustering will need to meet. For example, many organizations are trying to reduce costs, and expensive shared storage solutions may not meet that requirement. Also, an "in-memory" solution may not be the right choice for a large database or one where data persistence is critical.
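To put an availability goal such as 99.999% uptime in concrete terms, the percentage can be converted into a yearly downtime budget. The following is a minimal Python sketch; the function name and the 365-day year are assumptions made for illustration and do not come from Emic.

```python
def allowed_downtime_minutes(uptime_percent: float, days: float = 365.0) -> float:
    """Return the downtime budget, in minutes, for a given uptime percentage.

    Assumes a 365-day year by default (an illustrative choice).
    """
    if not 0.0 <= uptime_percent <= 100.0:
        raise ValueError("uptime_percent must be between 0 and 100")
    total_minutes = days * 24 * 60
    return total_minutes * (100.0 - uptime_percent) / 100.0


if __name__ == "__main__":
    for target in (99.9, 99.99, 99.999):
        minutes = allowed_downtime_minutes(target)
        print(f"{target}% uptime allows about {minutes:.1f} minutes of downtime per year")
```

Under these assumptions, 99.999% uptime leaves a little over five minutes of downtime per year, which is why that goal tends to drive the choice of clustering approach.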

Do look for transparent solutions that do not require changes to your applications.

Do look for solutions that deliver good manageability, reliability, failover and load balancing. Do look for the most cost-effective solutions. There are many very expensive solutions that might also entail high installation and customization costs.

Don't choose a one-size-fits-all solution. Be sure that the solution you select can grow with your organization, its users, customers, or database size.

Don't assume that one needs to spend a very large amount of money and effort to build a cluster.

There are so many types of Linux clusters that some newcomers to the clustering concept get confused. What's the difference between some common types of clusters?
A Beowulf cluster is a type of high-performance massively parallel computer built primarily out of commodity hardware components, running a free-software operating system like Linux or FreeBSD, interconnected by a private high-speed network. It consists of a cluster of PCs or workstations dedicated to running high-performance computing tasks. The nodes in the cluster don't sit on people's desks; they are dedicated to running cluster jobs. It is usually connected to the outside world through only a single node. Thus, it is primarily a "super computer" built from low TCO parts.

Some Linux clusters are built for reliability instead of speed. These are not Beowulfs. Running any software on such hardware requires that it be specifically written or rewritten for that hardware architecture.

Storage clusters are solutions that allow a large disk farm to sit behind any number of servers. This means that there is only one copy of the data and all applications and users see this same copy. Normally, these solutions do not provide high reliability, as data can be lost, requiring it to be restored. Separate solutions are needed for disaster recovery and for load balancing (of users and of users' computing tasks). They can typically be very expensive to build and operate.

Application clusters are solutions that act as middleware sitting between the client applications and backend servers; this can be a two-tier or n-tier architecture.

Application clusters perform load balancing in order to route a user to the least busy machine and to fail over a user if the machine he or she is using fails or is taken down for routine maintenance. They should be transparent in that they do not require the application to be changed in order to run on the cluster.

Other functions of application clusters are to provide scalability and reliability. If a new computer needs to be added to the application in order to support more users or more queries, this new server can be added seamlessly, and if a server fails or is taken out of service, it has a way of recovering to the current state of the application's data. Each server can have its own copy of the data, building data redundancy and enabling use of "standard" hardware and storage such as Intel computers. Data is replicated across the servers in the cluster to ensure that all users can see the same data at the same time.
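As a rough illustration of each server holding its own copy of the data, the Python sketch below applies every write to all nodes and lets a newly added node catch up from an existing one. The Node and ReplicatedCluster classes are hypothetical, and the sketch ignores what a real replication system has to handle, such as transaction ordering, concurrent writes and partial failures.

```python
class Node:
    """Hypothetical cluster member holding its own full copy of the data."""

    def __init__(self, name):
        self.name = name
        self.data = {}


class ReplicatedCluster:
    """Toy model: every write goes to every node; reads can use any node."""

    def __init__(self):
        self.nodes = []

    def add_node(self, name):
        node = Node(name)
        if self.nodes:
            # Catch up from an existing member so the newcomer is current
            node.data = dict(self.nodes[0].data)
        self.nodes.append(node)
        return node

    def remove_node(self, name):
        self.nodes = [n for n in self.nodes if n.name != name]

    def write(self, key, value):
        if not self.nodes:
            raise RuntimeError("cluster has no nodes")
        # Apply the write to every copy so all readers see the same value
        for node in self.nodes:
            node.data[key] = value

    def read(self, key, node_index=0):
        return self.nodes[node_index].data[key]


if __name__ == "__main__":
    cluster = ReplicatedCluster()
    cluster.add_node("n1")
    cluster.write("sku-1", 10)
    cluster.add_node("n2")          # joins with a copy of the current data
    cluster.remove_node("n1")       # the remaining node still has the data
    print(cluster.read("sku-1"))
```

Because every node has a complete copy, losing one node does not lose data, which is the redundancy argument made above for using standard hardware instead of shared storage.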

Where does Emic's EAC fit into the Linux clustering picture?
EAC is an application clustering product that combines multiple physical MySQL databases, Apache web servers, Tomcat containers and JBoss application servers to create a failover and load-balancing cluster that provides scalable performance, high availability and a single point of manageability. The combination of transaction-level replication and underlying IP-level clustering provides advantages over legacy database clustering methods.
