Service provider takeaway: Service providers can simplify their customers' database consolidation efforts using Oracle RAC -- and they can stay ahead of the troubleshooting curve by following this advice.
For your Oracle-shop customers, Oracle Real Application Clusters (RAC) -- which enables the combination of multiple database servers into one large cluster -- can bring important benefits, namely higher scalability and availability and easier database consolidation.
Beyond those high-level benefits, there are some specific features worth noting in Oracle RAC -- as well as some gotchas to watch out for at customer sites. We'll take a look at them all to make sure you can help your customers make an informed choice about RAC. We'll also give you RAC troubleshooting advice to handle problems that might crop up when integrating applications into an Oracle RAC database.
- Caching. Oracle's Cache Fusion, a feature of the system since Oracle 9i, enables most applications to work in Real Application Clusters without any application code changes. Oracle RAC has an instance of Oracle running on each server in the cluster. All of the data is stored in a centralized disk storage system, normally a storage area network (SAN). Any Oracle instance can read a block of data from the shared disk system. If your application is connected to more than one server in the cluster, each server is still capable of reading the data in the database. But what happens if the application changes data? Before Oracle 9i, any changes had to be written to the shared disk storage before another server could read the changes.
Writing the changed data to disk only to be read by another server in the cluster is called "disk pinging." Disk pinging killed application performance when an application was ported to a clustered Oracle database in the precursor to RAC (Oracle Parallel Server). With Cache Fusion, disk pinging and its related performance drag were eliminated. The database's cache is shared among all servers in the cluster. If a transaction changes a block of data on one server and another server needs the new data, the block can be transferred across the cluster's private network, which is much more efficient than disk pinging. Cache Fusion has made it possible for virtually any application to use RAC right out of the box, without alteration to the code.
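To make the difference concrete, here is a minimal toy model in Python. It is only a sketch: the function name, the block name and the counters are invented for illustration and say nothing about Oracle's internals. It counts the I/O needed for a second instance to see a block that the first instance has just changed.

```python
def read_changed_block(cache_fusion: bool) -> dict:
    """Instance A changes a block; instance B then needs the new version.

    Toy model only: returns how many disk operations and interconnect
    transfers were needed for B to see A's change.
    """
    counts = {"disk_writes": 0, "disk_reads": 0, "interconnect_transfers": 0}

    shared_disk = {"block_42": "old value"}   # centralized storage (e.g. a SAN)
    cache_a = {"block_42": "new value"}       # changed in A's cache, not yet on disk
    cache_b = {}                              # B has not read the block yet

    if cache_fusion:
        # The block is shipped across the cluster's private network.
        cache_b["block_42"] = cache_a["block_42"]
        counts["interconnect_transfers"] += 1
    else:
        # "Disk pinging": A writes the block to disk, then B reads it back.
        shared_disk["block_42"] = cache_a["block_42"]
        counts["disk_writes"] += 1
        cache_b["block_42"] = shared_disk["block_42"]
        counts["disk_reads"] += 1

    assert cache_b["block_42"] == "new value"  # B sees the change either way
    return counts


print("disk pinging:", read_changed_block(cache_fusion=False))
print("cache fusion:", read_changed_block(cache_fusion=True))
```

Both paths deliver the same data to the second instance. The only difference in the model is whether the shared disk has to be touched to get it there.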
Many third-party applications do not have support for Oracle 11g RAC. Service providers will need to work with their customer's third-party application vendor to see if the application is supported in a RAC environment. In some cases, your client may be the first to deploy the application on RAC. You and your client may have to troubleshoot Oracle RAC problems, showing the third-party vendor how well their application works on RAC without any modifications.
- Services. If your customer is consolidating database servers into a RAC configuration, you should definitely use database services. Defined with Oracle's Enterprise Manager (EM), database services let database administrators (DBAs) monitor, control and allocate database resources with Oracle 11g's Resource Manager. Say, for instance, that your customer used the Oracle RAC consolidation to combine its human resources (HR) application and accounting application into one environment. The DBA uses EM to create two services, HR_SVC and ACCT_SVC. At the end of each month, it's vitally important that the accounting department balance the books and have as many database resources as possible. With Resource Manager, the DBA can configure the database to give priority to the ACCT_SVC service. This way, if someone in human resources requests a long-running report, accounting will still get its work done without a troubleshooting request.
In many RAC configurations, and by default, each server in the cluster provides all database services equally. This way, the workload is spread among all servers in the cluster. You can also configure RAC so that services are limited to a subset of the cluster's servers, an approach called application partitioning. For example, you can tell Server A and Server B to run the HR_SVC service while Server C runs the ACCT_SVC service. This way, you ensure that a server's resources are devoted to specific services. If a server in the cluster does go down, the other servers can take over the service's workload.
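The partitioning example can be sketched as a small placement function. This is a toy model in Python, not the Enterprise Manager or any Oracle interface. The server names and the `servers_for` helper are hypothetical, and the fallback rule (use any surviving server when none of the assigned ones is up) is an assumption made for illustration.

```python
# Hypothetical service-to-server assignment, mirroring the example in the text.
ASSIGNED = {
    "HR_SVC": ["server_a", "server_b"],
    "ACCT_SVC": ["server_c"],
}


def servers_for(service, running, assigned=ASSIGNED):
    """Return the servers that should carry `service`, given the running servers.

    Assigned servers are used while at least one of them is up. If all of
    them are down, the surviving servers take over the workload.
    """
    up = [server for server in assigned[service] if server in running]
    if up:
        return up
    return sorted(running)


# Normal operation: each service stays on its own partition.
print(servers_for("HR_SVC", {"server_a", "server_b", "server_c"}))
# Server C has failed: the accounting service moves to the survivors.
print(servers_for("ACCT_SVC", {"server_a", "server_b"}))
```

A model like this can help when you walk a customer through what happens to each service if a particular server is lost.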
- High availability. When a server goes down, any work being done on that server immediately terminates. Without a failover mechanism, the user's application can connect to any other server in the cluster supporting that database service, but the user must make that connection manually. With RAC, you can avoid this scenario using its high-availability feature, Transparent Application Failover (TAF). As the name indicates, if a server goes down, the applications can transparently fail over to any server still running in the cluster. End users likely won't notice that a server in the cluster has terminated.
- TAF. TAF is not always seamless, however. If the user is issuing a SELECT statement (the most common interaction type) to the database when the server is terminated, TAF is truly transparent to the end user. But if a user is issuing Data Manipulation Language (DML) statements such as INSERT, UPDATE or DELETE, that user won't be shielded from the termination in the same way: The DML statement will abort. TAF will reconnect the session to a new server in the cluster, but it is up to the application to reissue the DML statement, and the application may need some minor changes to do so. Any application changes would need to be made by the development team that created the software or by the third-party vendor.
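The kind of "minor change" involved is typically a retry wrapper around DML. The Python sketch below shows the idea under stated assumptions: `FakeConnection` and `ConnectionLost` are hypothetical stand-ins for a real database driver and its error, so the example runs without a database. A real application would catch the specific errors its driver raises after a failover.

```python
class ConnectionLost(Exception):
    """Stand-in for the driver error seen when the server dies mid-statement."""


class FakeConnection:
    """Hypothetical connection whose first `failures` DML calls are aborted."""

    def __init__(self, failures=1):
        self.failures = failures
        self.rows = []

    def execute(self, sql, params):
        if self.failures > 0:
            self.failures -= 1
            raise ConnectionLost("instance terminated; session failed over")
        self.rows.append(params)

    def rollback(self):
        # Nothing to undo in the fake; a real application would roll back
        # the aborted transaction here before trying again.
        pass


def execute_dml_with_retry(conn, sql, params, max_attempts=3):
    """Reissue an aborted DML statement; return the attempt that succeeded."""
    for attempt in range(1, max_attempts + 1):
        try:
            conn.execute(sql, params)
            return attempt
        except ConnectionLost:
            conn.rollback()
            if attempt == max_attempts:
                raise  # give up and let the caller report the failure


conn = FakeConnection(failures=1)
attempts = execute_dml_with_retry(
    conn, "INSERT INTO invoices (id) VALUES (:1)", (1001,)
)
print("succeeded on attempt", attempts)
```

Capping the number of attempts is a deliberate choice here, so that a cluster-wide outage surfaces as an error instead of an endless loop.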
- Tablespaces. Another area that can require troubleshooting is tablespaces. When moving applications to an Oracle RAC environment, it is a good idea to make sure your customer's database uses Automatic Segment Space Management (ASSM) for its tablespaces. With ASSM, allocation of blocks of data for a table or index is managed in the tablespace rather than in the Data Dictionary. This is important for RAC environments that have high rates of DML activity. Without ASSM, the Data Dictionary can be a point of contention for applications. The DBA creates the tablespaces for the application and, if necessary, can migrate the tablespace to ASSM using Oracle-supplied tools.
- Sequences. Finally, the handling of database sequences can require troubleshooting. Many applications use database sequences to generate unique, increasing numbers, such as invoice numbers or check numbers. Normally, not much thought is given to sequences in non-RAC environments, but in an Oracle RAC environment, it's good practice to make sure that the application's sequences are cached in memory. Otherwise, accessing the next number in the sequence requires an update in the Data Dictionary, and the Data Dictionary becomes a point of contention for the applications.
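The effect of sequence caching can be shown with a small simulation. The Python class below is a toy model, not Oracle's implementation. `ToySequence` and its counters are hypothetical names, and the model simply assumes that one Data Dictionary update reserves a whole batch of numbers.

```python
class ToySequence:
    """Toy sequence: one simulated dictionary update reserves `cache_size` numbers."""

    def __init__(self, cache_size=1):
        self.cache_size = max(1, cache_size)  # 1 behaves like an uncached sequence
        self.reserved_up_to = 0               # highest number recorded in the dictionary
        self.next_value = 1
        self.dictionary_updates = 0

    def nextval(self):
        if self.next_value > self.reserved_up_to:
            # The in-memory batch is used up: reserve the next one.
            self.reserved_up_to += self.cache_size
            self.dictionary_updates += 1
        value = self.next_value
        self.next_value += 1
        return value


uncached = ToySequence(cache_size=1)
cached = ToySequence(cache_size=20)
for _ in range(1000):
    uncached.nextval()
    cached.nextval()

print("uncached dictionary updates:", uncached.dictionary_updates)
print("cached dictionary updates:  ", cached.dictionary_updates)
```

In Oracle itself, the cache size is set with the CACHE clause of CREATE SEQUENCE or ALTER SEQUENCE. The right value for a given application is something to work out with the customer's DBA.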
A few hiccups: Troubleshooting Oracle RAC
About the author
Brian Peasland has been in the IT field for 18 years and has worked as a computer operator, operations analyst, systems administrator, application developer and finally a database administrator. He holds a B.S. and an M.S. in computer science, specializing in database systems. See Brian's Oracle consulting website at www.peasland.net.