Tractor Supply Co. of Nashville, Tenn., is the perfect example of a company that needs scalable computing. With $1 billion in 2002 sales, it runs more than 450 stores in the U.S., opening 113 last year. Until recently, however, its need for an easy-to-grow IT system went unfulfilled. Rapid growth and large peaks in sales mean that Tractor Supply Co. needs to scale out its processing power at almost a moment's notice, according to Stevan Townsend, its manager of database and BASIS administration. In this interview, he describes his search for a scalable, redundant, pay-as-you-grow replacement for a WinTel system.
What problems did Tractor Supply Company (TSC) have with IT system scalability?
Townsend: We were a Sun shop five years ago, running Solaris. Those machines are very expensive to keep up, so we moved our data warehouse to Intel in a Windows environment. We started running Windows NT on Pentium Pro 4-way servers. We outgrew that box quickly. We upgraded to a 4-way Xeon server. Eighteen months later, we outgrew that. Then, we got eight-processor Intel SMP servers running Oracle8i on Windows and Veritas Cluster Server to handle the workload. This approach was not only costly, but also resulted in unacceptable administration costs.
So, every 18 months, we had to throw out a server and get the latest and greatest. That's a lot of hassle just for scalability, not to mention the fact that we couldn't achieve sufficient failover with WinTel.
What alternative solutions did you consider?
Townsend: Originally we considered a failover cluster for redundancy. We thought about going into big Unix, but that was very cost-prohibitive. We wanted to stay with the Intel architecture. We evaluated Solaris on Intel, but we thought that (at that time) Sun wasn't committed to Intel. That left Linux. So, we started working with Red Hat Linux 6.2 two years ago. It's very solid.
Being an Oracle shop, we like the concept of 9i RAC. So, we tried implementing 9i RAC on Linux in a raw disk environment. But there was a disk slippage problem. If a disk disappears, you have to reconfigure everything in the system. We knew our backup strategies would have to change. We had 470 databases and didn't have enough people to manage raw disks.
We began looking at cluster file systems. We evaluated cluster file system products purporting to support Oracle9i RAC and general-purpose, file-based applications, and subjected them to a battery of ease of use, performance, scalability and high-availability testing.
At the time, Sistina had a single-point-of-failure architecture, which we didn't want. We tried PolyServe, release 1.1, and that worked well in some ways. Then, management challenged us to look at Oracle Clustered File System. Unfortunately, it was not a good fit for our production environment: it did not hold up in our performance tests, and it was very difficult to install. During that time PolyServe 1.2 came out, so we tried it. Now, we're in the process of putting in PolyServe 1.2 on Red Hat Linux.
What did PolyServe 1.2 offer that the previous release didn't?
Townsend: The first release of the product was better than anything else we could find at the time. But we could not run this cluster with anything else on the switch. We couldn't do I/O fencing, for example. In 1.2, that was built in. The difference between 1.1 and 1.2 was fairly substantial. With 1.2, it became so feature-complete that people could move it to production.
How does PolyServe 1.2 on Linux facilitate scalability?
Townsend: The PolyServe-Oracle9i RAC solution enables TSC to scale out processing power to meet its increasing performance demands. It will save us substantial money in the future because we will never again need to do a costly forklift upgrade of the servers that support the POS [point of sale] data warehouse. In addition, we were able to build redundancies into the system that will save time and money in terms of the availability of our mission-critical systems.
What's the manageability picture?
Townsend: We were attracted to PolyServe Matrix Server because of its strong integration with Oracle databases. In addition, Matrix Server installed easily, possessed an intuitive graphical user interface for centralized cluster configuration and, most importantly, offered seamless, highly scalable file system performance using standard Linux system calls. Matrix Server's fully symmetric architecture also eliminated single points of failure and provided high availability functionality and multi-path I/O capabilities.
What happens when you need to grow this system?
Townsend: When TSC outgrows its current configuration, additional servers and storage can be added to the cluster online without affecting existing members of the cluster.
What benefits do you expect to see from the new system?
Townsend: Using the PolyServe solution will significantly reduce server costs compared to the alternative of purchasing a failover cluster of two expensive, eight-processor systems. The six-node cluster will deliver higher levels of availability and will be easier to manage than a two-node cluster of bigger SMP machines.
So, what's happening with this implementation now?
Townsend: TSC is in the process of migrating its production POS data warehouse to a PolyServe-powered cluster. We have to be completely rolled out by the end of September 2003.