With Pacemaker high-availability cluster technology, clients can ensure that they are making the most of their cluster infrastructure. In this Project FAQ, expert Sander van Vugt answers frequently asked questions about the resource manager. Find out what you will need to create a cluster, how the STONITH device helps clients and what Pacemaker does to improve server functionality.
• What is Pacemaker?
• What is the purpose of Pacemaker high-availability clustering?
• What hardware and software do I need to run Pacemaker?
• Do I need a SAN if I want to create a cluster?
• Why do I need STONITH?
• Is it possible to grant different servers simultaneous access to the same data?
• More on SUSE Linux Enterprise Server
• About the expert
Pacemaker is high-availability cluster software that was developed as an open source project, mainly by people who work for SUSE Linux. It helps you make sure that essential services on your network get the best possible availability.
As the name of the project suggests, the purpose of Pacemaker high-availability clustering is to make sure that vital resources receive increased availability. Without a clustering solution, a service becomes unavailable the moment its server crashes. If the service is configured as a resource in a Pacemaker cluster, Pacemaker ensures that the service remains available, even if a server in your network fails.
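The failover idea described above can be sketched in a few lines of Python. This is a toy model for illustration only: the `ToyCluster` class, the node names and the `webserver` resource are all hypothetical, and none of this reflects how Pacemaker is implemented or configured.

```python
from dataclasses import dataclass, field


@dataclass
class ToyCluster:
    """Illustrative model only; not Pacemaker's real API or behavior."""
    nodes: dict[str, bool]                                    # node name -> online?
    placement: dict[str, str] = field(default_factory=dict)   # resource -> node

    def online_nodes(self) -> list[str]:
        return [name for name, up in self.nodes.items() if up]

    def start(self, resource: str) -> str:
        """Place a resource on the first node that is still online."""
        candidates = self.online_nodes()
        if not candidates:
            raise RuntimeError(f"no online node left to run {resource}")
        self.placement[resource] = candidates[0]
        return candidates[0]

    def fail_node(self, node: str) -> None:
        """Mark a node as crashed and restart its resources elsewhere."""
        self.nodes[node] = False
        for resource, host in list(self.placement.items()):
            if host == node:
                self.start(resource)


cluster = ToyCluster(nodes={"node1": True, "node2": True})
cluster.start("webserver")              # the resource lands on node1
cluster.fail_node("node1")              # node1 crashes
print(cluster.placement["webserver"])   # the resource now runs on node2
```

Without the `fail_node` step that moves the resource, the service would simply stay assigned to a dead server, which is the situation a cluster is meant to prevent.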
You need at least two servers that run Linux. Pacemaker currently supports up to 16 servers, which are called nodes in the cluster, although some people run it on clusters that have hundreds of nodes. Virtually all Linux distributions are supported, but if you need enterprise support, Novell's SUSE Linux Enterprise Server is currently the only Linux distribution that offers it. The servers must be installed in the same LAN and, in most cases, a storage area network (SAN) is required as well. For optimal performance, you also need a special device that is capable of shutting down a server if needed: the STONITH device.
Clustering is about increased availability of services. To reach this goal, a service must be able to run on all servers in a cluster. The service must also be able to access its configuration files and data when it moves over to another server. To make this easy, I highly recommend using a SAN. Just put the data and configuration files on the SAN and make sure all servers can reach it. If you can't install a SAN, there are other ways to make the files available on every server, such as a Network File System (NFS)-based solution or a synchronization solution such as rsync. If your data is not too dynamic, you could even schedule a cron job to keep the data and configuration in sync.
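As a rough illustration of the cron-plus-rsync approach, the following Python sketch wraps an rsync call that pushes a directory to the other node. The host name `node2`, the directory `/srv/www`, the script path and the five-minute schedule are all assumptions made for the example, and it presumes that rsync is installed and that passwordless SSH access to the peer is already set up.

```python
import subprocess


def build_rsync_command(source_dir: str, peer_host: str, dest_dir: str) -> list[str]:
    """Build an rsync command that mirrors source_dir onto the peer node.

    -a        preserves permissions, ownership and timestamps
    --delete  removes files on the peer that no longer exist locally
    The trailing slash on the source copies the directory's contents
    rather than the directory itself.
    """
    source = source_dir.rstrip("/") + "/"
    return ["rsync", "-a", "--delete", source, f"{peer_host}:{dest_dir}"]


def sync_to_peer(source_dir: str, peer_host: str, dest_dir: str) -> None:
    """Run the sync and raise an error if rsync reports a failure."""
    subprocess.run(build_rsync_command(source_dir, peer_host, dest_dir), check=True)


if __name__ == "__main__":
    # Hypothetical crontab entry that runs this script every five minutes:
    #   */5 * * * * /usr/local/bin/sync_cluster_data.py
    sync_to_peer("/srv/www", "node2", "/srv/www")
```

Keep in mind that anything written after the last sync is lost if the active server fails, which is why this approach only suits data that does not change often.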
Imagine that a communication link fails in a two-node cluster. Both servers may think the other one is down and begin servicing the resource that you want to be highly available. If this resource needs access to the shared file system on the SAN, you may end up with a situation where both servers try to write to the same file system at the same time. If you are using a file system like Ext3, XFS or Ext4, this will cause severe file system corruption. The STONITH device (the name stands for "Shoot The Other Node In The Head") makes sure that one of the servers is really shut down before the other takes over its resources. It does that by cutting the power to the server so it really is down. This sounds like a strange solution, but it's much better than having file system corruption.
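The rule "confirm the other node is down before taking over" can be sketched as follows. The `PowerSwitch` class is a hypothetical stand-in for a STONITH device, not a real fencing API, and the model is deliberately simplified: the two takeover attempts run one after the other, whereas a real cluster also has to cope with both nodes trying to fence each other at the same moment.

```python
class PowerSwitch:
    """Hypothetical stand-in for a STONITH device that controls node power."""

    def __init__(self, nodes):
        self.powered = {node: True for node in nodes}

    def power_off(self, requester: str, target: str) -> bool:
        # A node that has already been powered off cannot fence anyone.
        if not self.powered[requester]:
            return False
        self.powered[target] = False
        return True


def take_over(node: str, peer: str, switch: PowerSwitch, writers: set) -> None:
    """Start writing to the shared file system only after the peer is fenced."""
    if switch.power_off(requester=node, target=peer):
        writers.add(node)


# The communication link has failed, so each node believes the other is down.
switch = PowerSwitch(["node1", "node2"])
writers = set()
take_over("node1", "node2", switch, writers)   # node1 fences node2 and takes over
take_over("node2", "node1", switch, writers)   # node2 is already off; nothing happens
print(writers)  # {'node1'}
```

If `take_over` skipped the fencing step, both nodes would end up in `writers`, which corresponds to the two servers writing to the same file system at the same time.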
You can grant different servers simultaneous access to the same data if you are using a special-purpose cluster file system. Currently, there are two of them: the Oracle Cluster File System 2 (OCFS2) and the Global File System (GFS). Both are open source, so you can choose whichever system you prefer. If, however, you are creating your cluster on Red Hat Linux, you'll most likely work with GFS, which Red Hat developed. If you use Novell's SUSE, you will be working with OCFS2, because it is the only cluster-aware file system supported on SUSE. The special thing about these file systems is that they have a shared cache. That means if one server writes to the file system, the other server knows about it.
- Installing SUSE Linux Enterprise Server 11
- Novell offers SUSE Enterprise Linux 11 partner training
- Enterprise Linux growing but still far behind Windows and Mac
Sander van Vugt is an independent trainer and consultant living in the Netherlands. Van Vugt is an expert in Linux high availability, virtualization and performance and has completed several projects that implement all three. He is also the writer of various Linux-related books, such as Beginning the Linux Command Line, Beginning Ubuntu Server Administration and Pro Ubuntu Server Administration.
This was first published in June 2009