An application is only available if the network is capable of delivering it. As applications start piling up and requiring increasing amounts of bandwidth, hardware upgrades can help overcome problems with poor application performance and availability. Network upgrades for high-availability applications can take many forms, encompassing cabling, switches, servers and WAN connections. Solution providers should seek to understand the needs of their clients and provide the necessary upgrades while minimizing cost and downtime. The first installment of this Hot Spot Tutorial introduced the basic concepts of high-availability network applications. This second installment highlights some specific upgrade options and examines the tradeoffs involved.
Upgrading network hardware for high-availability applications
Any network upgrade for high-availability applications should start with an analysis of network performance against actual application traffic. It's important for you and your client to understand where the performance bottlenecks and potential single points of failure lie in relation to critical applications.
For example, an inspection of network documentation may reveal that the application's server is only running the minimum number of processors, though there are now far more employees using the application than when it was first deployed. Similarly, you may discover that the critical application server is only connected to the network across a single 10/100 Ethernet link. Not only could this older link pose a performance bottleneck, it may also present a single point of failure -- possibly compromising the application's availability. Ultimately, you have to do your homework. Once problem areas are identified, you can then discuss the issues with your client and make more informed recommendations for corrective action.
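A quick back-of-the-envelope calculation makes a bottleneck like that single 10/100 link concrete. The sketch below uses purely illustrative user counts and an assumed protocol-efficiency factor; none of these figures come from a real deployment:

```python
# Hypothetical bottleneck check: usable bandwidth per concurrent user
# on a shared server link. Efficiency accounts for headers/retransmits.
def per_user_mbps(link_mbps: float, users: int, efficiency: float = 0.9) -> float:
    """Estimate usable bandwidth per concurrent user on a shared link."""
    return link_mbps * efficiency / users

# When the app was deployed: 25 users sharing a 100 Mbps link
print(per_user_mbps(100, 25))   # 3.6 Mbps each -- adequate
# Today: 300 users on the same link
print(per_user_mbps(100, 300))  # 0.3 Mbps each -- a clear bottleneck
```

Even a rough model like this gives the client a concrete number to weigh against the cost of an upgrade.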
Start with an evaluation of the network cabling. While cabling does not decay with age, an older physical infrastructure can limit the type and variety of other hardware upgrade options available to you. "One of the most basic needs is just to have good wiring," said Scott Gorcester, president of Moose Logic, a network solution provider in Bothell, Wash. "We can't simply add 10 GigE switches to an aging system that cannot handle it on the wire."
Performance and availability can also be enhanced at the network switches, which can offer greater bandwidth, lower latency and redundant ports for trunking and failover. For example, consider the move from 10/100 Ethernet to 1 Gigabit Ethernet (1 GigE) at key locations in the network. At a minimum, 1 GigE can serve as a backbone, improving performance for other 10/100 users. Gigabit Ethernet is also commonly deployed directly to servers and important workstations.
There's a great deal of interest in 10 Gigabit Ethernet (10 GigE) for the same reasons -- adding even more bandwidth to busy networks. "Some of the newer 10 GigE switches have innovative cut-through architectures that do not inspect the whole packet, just the destination," said Bob Laliberte, an analyst at the Enterprise Strategy Group, an industry analyst firm in Milford, Mass. "That enables faster speeds and could be good enough for some HPC environments."
The network interface cards (NICs) themselves also play a role in bandwidth and latency improvements. For example, faster NICs are needed to match any switching improvements, and NICs with TCP Offload Engine (TOE) capabilities -- which move TCP/IP processing from the server's CPU onto the adapter -- can improve network traffic performance at critical servers or other network nodes.
When designing network upgrades for high-availability applications, a server may need multiple NICs, each with a separate link to a port on a different switch -- eliminating single points of failure and allowing for failover should one of the network links fail. Avoid relying on a single NIC with multiple ports, and avoid connecting all of the links to the same switch: in the first case the card itself remains a single point of failure, and in the second the switch does.
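The no-single-point-of-failure rule can be checked mechanically. The sketch below models a server's network attachment as hypothetical (NIC, switch) pairs -- the names are illustrative, not any real inventory format -- and reports any component whose failure would sever every path:

```python
# Sketch: detect single points of failure in a server's network attachment.
# Each path is a (nic, switch) pair the server uses to reach the network.
def single_points_of_failure(paths):
    """Return components whose individual failure would sever every path."""
    spofs = set()
    for kind in (0, 1):  # index 0 = NIC, index 1 = switch
        for component in {p[kind] for p in paths}:
            surviving = [p for p in paths if p[kind] != component]
            if not surviving:
                spofs.add(component)
    return spofs

# One dual-port NIC into two switches: the card itself is still a SPOF.
print(single_points_of_failure([("nic1", "sw1"), ("nic1", "sw2")]))
# Two NICs into two different switches: no single failure isolates the server.
print(single_points_of_failure([("nic1", "sw1"), ("nic2", "sw2")]))
```

The same check generalizes to routers and WAN circuits: any component that appears in every remaining path is a single point of failure.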
Resiliency also extends to the WAN, and high-availability applications that require remote access may benefit from multiple WAN links supplied by different WAN providers (or the same provider using entirely different circuits) and linked to the network through different routers.
Clustering is another technique that has emerged for availability, allowing multiple servers to be interconnected over high-speed Ethernet or InfiniBand links. Clustered servers work together, aggregating their processing resources and I/O. This allows greater application performance, which can accommodate a larger user base. Clustering also supports redundancy: a failure in one element of the cluster merely reduces the cluster's aggregate performance, whereas a standalone server would go offline entirely during an internal fault.
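The availability benefit of clustering follows from simple probability. Assuming independent node failures and purely illustrative uptime figures (the application is considered available if at least one node is up):

```python
# Rough availability math behind clustering -- illustrative figures only.
def cluster_availability(node_availability: float, nodes: int) -> float:
    """Probability that at least one node is up, assuming independent failures."""
    return 1 - (1 - node_availability) ** nodes

single = 0.99  # one server at ~99% uptime (~3.7 days of downtime per year)
print(cluster_availability(single, 1))  # 0.99
print(cluster_availability(single, 2))  # ~0.9999 -- "four nines" from two nodes
```

Note that this models availability only; as the article points out, losing a node still reduces the cluster's aggregate performance even though the application stays up.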
The pros and cons of network upgrades for high-availability applications
Network solution providers note that network upgrades can provide client benefits that extend far beyond application availability. Upgrades can often position a client for future projects, so solution providers can sometimes strengthen their justification for hardware upgrades by fitting their recommendations into the client's long-term growth plan.
For example, the introduction of 1 GigE (or even 10 GigE) may be necessary to support streaming video applications for your client, but may also allow the eventual introduction of VoIP into their environment. As another example, adding a redundant WAN connection may improve remote application availability, but can also help disaster recovery initiatives. Look for opportunities to position the client for future growth beyond the immediate needs for application availability.
Probably the single biggest disadvantage of network upgrades is the cost of high-performance components -- particularly TOE adapters and 10 GigE switches. The cost of any solution has to be less than the cost of a problem, or the solution isn't economically viable for the client.
Fortunately, hardware costs tend to fall as new technologies enter the mainstream. For example, Laliberte points to vendors offering 10 GigE switches at less than $500 per port, making the client's move to 10 GigE far more cost-effective than it was just a few months ago. When estimating upgrade costs, include ongoing operational, maintenance and support costs -- in addition to the up-front purchase and installation costs -- when presenting the total cost of ownership to a client.
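A simple model helps when presenting total cost of ownership. In the sketch below, the $500-per-port figure comes from the article; the installation and support costs are placeholders, not real vendor pricing:

```python
# Simple total-cost-of-ownership sketch for an upgrade proposal.
# All figures except the per-port price are placeholders.
def tco(purchase: float, install: float, annual_opex: float, years: int) -> float:
    """Up-front purchase and installation plus ongoing operational,
    maintenance and support costs over the evaluation period."""
    return purchase + install + annual_opex * years

# e.g. a 48-port 10 GigE switch at $500/port, over a five-year horizon
print(tco(purchase=48 * 500, install=2_000, annual_opex=3_500, years=5))  # 43500
```

Presenting the full multi-year figure up front avoids surprises and keeps the comparison against the cost of the problem honest.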
Your client should also understand the level of business disruption involved in an upgrade project. Old equipment may need to be removed, while new equipment must be installed, connected and configured. Many hardware upgrades can take place during normal maintenance windows, while others may be phased across several scheduled downtimes. In extreme cases, extended downtime may be required.
All of this demands coordination and cooperation with the client. "If properly implemented, it should not be disruptive," Sobel said, noting that knowledgeable and experienced solution providers can make a tremendous impact here. "A good smooth implementation has a value -- and also a cost."
Other revenue opportunities with network hardware upgrades
Solution providers should consider opportunities for additional revenue. One common approach involves making proposals for future projects that map to the client's needs and business goals. For example, suppose the client plans to add streaming video applications to their environment in the future. You might lay the groundwork for future projects by making recommendations for additional performance enhancements or upgrades that can be implemented later on. This requires you to understand your client's business and industry.
Solution providers can also find supplemental revenue in ongoing monitoring, maintenance and managed services. "SMBs [small and medium-sized businesses] historically don't have the means or the desire to monitor the health of each system," Gorcester said, noting that other tasks like patching, maintenance and backups aren't performed properly or consistently. Clients that rely on networks for everyday business simply cannot afford to run systems until they suffer costly failures.