This chapter from Oracle Database 11g: A Beginner's Guide explains how to work with high availability features in Oracle 11g such as Real Application Clusters (RAC), Automatic Storage Management (ASM), and Data Guard. In this section, learn how to define, install, and test Real Application Clusters (RAC), including how to work with the RAC environment and Oracle 11g RAC.
Define high availability
Define, Install and Test Real Application Clusters (RAC)
ASM Instance and ASM Disk Groups
How to manage an ASM instance with ASMCMD and ASMLIB
Implementing Oracle 11g Data Guard and Data Guard protection modes
Creating an Oracle physical standby server
CRITICAL SKILL 8.2
Understand Real Application Clusters
Oracle Real Application Clusters (RAC) provides a database environment that is both highly available and scalable. If a server in the cluster fails, the database will continue to run on the remaining servers, or nodes, in the cluster. With Oracle 11g Clusterware, implementing a new cluster node is made simple. RAC lets applications scale beyond the resources of a single server, which means that the environment can start with what is currently needed and servers can then be added as necessary. Oracle 9i introduced Oracle Real Application Clusters; with each subsequent release, managing and implementing RAC has become more straightforward, with new features providing a stable, well-performing environment.
In Oracle 11g, Oracle introduced rolling patches for the RAC environment. Previously, downtime could be minimized by failing over to another node for patching, but an outage was still required to finish patching all of the nodes in a cluster. Now with Oracle 11g, patches can be applied to a node while the other servers continue working, even on the non-patched version. Reducing outages, planned or unplanned, is key in companies with 24x7 operations. Along the same lines as the rolling patches, the deployment of new RAC instances and nodes has been significantly enhanced in 11g. Oracle Clusterware is the piece that helps in setting up new servers, and it can clone an existing ORACLE_HOME and database instances. It can also convert a single-node Oracle database into a RAC environment with multiple nodes.
The RAC environment consists of one or more server nodes; of course, a single-server cluster doesn't provide high availability, because there is nowhere to fail over to. The servers or nodes are connected through a private network, also referred to as an interconnect. The nodes share the same set of disks, and if one node fails, the other nodes in the cluster take over.
A typical RAC environment has a set of disks that are shared by all servers; each server has at least two network ports: one for outside connections and one for the interconnect (the private network between nodes and a cluster manager). The shared disk cannot just be a simple file system because it needs to be cluster aware, which is the reason for Oracle Clusterware. Oracle 11g made several improvements with the Oracle Clusterware, which now provides several interfaces for managing the cluster. RAC still supports third-party cluster managers, but the Oracle Clusterware provides the hooks for the new features for provisioning or deployment of new nodes and the rolling patches. The Oracle Clusterware is also necessary for Automatic Storage Management (ASM), which will be discussed in the later part of this chapter.
The shared disk for the clusterware consists of two components: a voting disk for recording node membership and an Oracle Cluster Registry (OCR) that contains the cluster configuration. The voting disk needs to be shared and can be on raw devices, Oracle Cluster File System files, ASM, or NTFS partitions. The Oracle Clusterware is the key piece that allows all of the servers to operate together. Without the interconnect, the servers do not have a way to talk to each other; without the clustered disk, another node has no way to access the same information. Figure 8-1 shows a basic setup with these key components. Next we will look at how to set up and install these pieces of the RAC environment.
CRITICAL SKILL 8.3
Before runInstaller or setup.exe is even executed, a checklist of pre-installation steps needs to be completed. These vary from network setup to making sure the proper disk is in place. Also before the database is even installed on one of the nodes for RAC, the clusterware needs to be present. Several of these steps only need to be done once to set up the backbone of the RAC environment no matter how many nodes are being installed. Then tools for cloning the configuration can be used for deployment of new nodes in the cluster.
FIGURE 8-1. RAC components
Each server needs to be set up with the needed kernel parameters and system parameters that are required for the operating system. (Please refer back to Chapter 2 for more details on installing Oracle.) Just as there were steps that needed to be completed for that install, network addresses and the shared disks need to be configured before Clusterware is installed.
Configurations for network addresses and connections are different from a standalone database. There are three different IP addresses that are needed: the virtual network, the private network (interconnect), and the normal or public network. The hosts need a non-domain name listed for each node in the /etc/hosts files on the nodes as well as the IP addresses. That is, each host will have at least three listings in the /etc/hosts files, and each one will have its own unique IP address and alias or name for the host.
cat /etc/hosts
#eth0 - Public Network
mmrac1.domain1.com mmrac1
mmrac2.domain1.com mmrac2
#eth1 - Private/Interconnect Network
10.0.0.1 mmrac1priv.domain1.com mmrac1priv
10.0.0.2 mmrac2priv.domain1.com mmrac2priv
#VIPs - Virtual Network
192.168.10.104 mmrac1vip.domain1.com mmrac1vip
192.168.10.05 mmrac2vip.domain1.com mmrac2vip
The public and private networks need to be configured on the same adapter for all of the nodes. So from the example hosts file, all of the nodes in the cluster must have eth0 set to the public network and eth1 to the private. The nodes should be tested to confirm that they are reachable by pinging them. The interconnect network should be reserved for traffic between the nodes only, and it is even recommended that it have its own physically separate network. (This means that the hardware setup should include at least two network adapters.) This will certainly help with the performance of Cache Fusion, which is the memory sharing of the buffer caches between the nodes.
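The naming rules above lend themselves to a quick automated check. The following is a minimal Python sketch, not an Oracle tool: it assumes the priv and vip name suffixes used in the example hosts file, and the function names and the sample addresses are illustrative assumptions.

```python
import ipaddress

# Sample hosts content. Every address here is a placeholder for illustration.
SAMPLE_HOSTS = """\
# Public network
192.0.2.11 mmrac1.domain1.com mmrac1
192.0.2.12 mmrac2.domain1.com mmrac2
# Private/interconnect network
10.0.0.1 mmrac1priv.domain1.com mmrac1priv
10.0.0.2 mmrac2priv.domain1.com mmrac2priv
# Virtual network
192.0.2.21 mmrac1vip.domain1.com mmrac1vip
192.0.2.22 mmrac2vip.domain1.com mmrac2vip
"""


def parse_hosts(text):
    """Map every host name or alias to the IP address at the start of its line."""
    names = {}
    for raw in text.splitlines():
        line = raw.split("#", 1)[0].strip()  # drop comments and blank lines
        if not line:
            continue
        fields = line.split()
        try:
            ip = ipaddress.ip_address(fields[0])
        except ValueError:
            continue  # the line does not start with a valid IP address
        for name in fields[1:]:
            names[name] = ip
    return names


def check_rac_hosts(text, nodes):
    """Return a list of problems; an empty list means every check passed."""
    names = parse_hosts(text)
    problems = []
    seen = {}  # ip -> alias that first used it
    for node in nodes:
        # Assumed convention: <node>, <node>priv, <node>vip
        for suffix in ("", "priv", "vip"):
            alias = node + suffix
            ip = names.get(alias)
            if ip is None:
                problems.append(f"missing entry for {alias}")
            elif ip in seen:
                problems.append(f"{alias} reuses {ip} (already used by {seen[ip]})")
            else:
                seen[ip] = alias
    return problems


if __name__ == "__main__":
    for problem in check_rac_hosts(SAMPLE_HOSTS, ["mmrac1", "mmrac2"]):
        print(problem)
```

The check only confirms that each node has three uniquely addressed entries; it does not replace pinging the nodes.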
The shared disk needs to be available to record the configuration of the cluster being installed. This disk will house the Oracle Cluster Registry and the cluster membership.
There should be multiple voting disks available to the Oracle Cluster; if not added at installation, disks can be added, backed up, and restored if necessary. To add disks, the following must be run as root; the path is the fully qualified name for the disk that is being added:
crsctl add css votedisk path -force
Verify by pulling a current list of voting disks:
crsctl query css votedisk
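One reason to configure several voting disks is that a node is expected to be able to reach more than half of them, which is why an odd number of disks is the usual choice. The arithmetic can be sketched in Python as follows; the function name is an illustrative assumption, not part of any Oracle tool.

```python
def tolerated_voting_disk_failures(total_disks):
    """Number of voting disks that can be lost while a strict majority remains."""
    if total_disks < 1:
        raise ValueError("at least one voting disk is required")
    # A strict majority needs total_disks // 2 + 1 disks to stay reachable,
    # so the remainder is what can be lost.
    return (total_disks - 1) // 2


if __name__ == "__main__":
    for count in (1, 2, 3, 5):
        print(count, "disk(s): survives the loss of",
              tolerated_voting_disk_failures(count))
```

Under this rule, moving from one disk to two gains nothing, while three disks allow one to be lost.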
To back up voting disks in Linux/Unix, run the following:
dd if=voting_disk_name of=backup_file_name
In Windows, use the following:
ocopy voting_disk_name backup_file_name
To restore voting disks in Linux/Unix, run the following:
dd if=backup_file_name of=voting_disk_name
In Windows, use the following:
ocopy backup_file_name voting_disk_name
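The Linux/Unix backup and restore commands above differ only in the direction of the copy. The sketch below, in Python, builds both dd command lines without running them, so they can be reviewed or logged first. The function names and the example paths are hypothetical.

```python
import shlex


def backup_command(voting_disk, backup_file):
    """dd command line that copies a voting disk to a backup file."""
    return ["dd", f"if={voting_disk}", f"of={backup_file}"]


def restore_command(backup_file, voting_disk):
    """dd command line that copies a backup file back onto a voting disk."""
    return ["dd", f"if={backup_file}", f"of={voting_disk}"]


if __name__ == "__main__":
    disk = "/dev/raw/raw1"            # hypothetical voting disk path
    backup = "/backup/votedisk1.bak"  # hypothetical backup location
    # Print the commands instead of executing them.
    print(shlex.join(backup_command(disk, backup)))
    print(shlex.join(restore_command(backup, disk)))
```

Keeping the commands as argument lists makes it straightforward to hand them to subprocess.run once they have been checked.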
With all of these different pieces needed before RAC can even be installed, the importance of verifying and checking the configurations is extremely high. When installing the clusterware, it is critical that the initial configuration of the virtual and private networks is set up properly. Verifying the network, disk, operating system, and hardware prerequisites is the first step for installation. The clusterware will not install properly if any of these requirements is missing. The option to install Real Application Clusters when installing the Oracle software will not be available if the clusterware is not installed or is not installed correctly.
The Cluster Verification Utility (CVU) assists in this area and should be run before attempting to install the clusterware. It will verify the hardware and operating system prerequisites as well as the network configurations. From the software install directory run the following:
./runcluvfy.sh stage -pre crsinst -n mmrac1,mmrac2
Unknown outputs could mean that the user doesn't have the privileges needed to run the check, or that a node is unavailable or has resource errors. Run the cluvfy checks again after the clusterware is installed to verify the install and the other prerequisites before the Oracle database install. Failures should be addressed here before attempting the database install; otherwise, you may find yourself uninstalling and reinstalling many more times than necessary.
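The node list passed to the verification utility is a single comma-separated argument with no spaces, which is easy to get wrong when typing it by hand. The following Python sketch builds the command line from a list of node names. The helper name and its parameters are illustrative assumptions, and the sketch only constructs the command without running it.

```python
def cluvfy_stage_command(when, stage, nodes, script="./runcluvfy.sh"):
    """Build a 'stage' check command line for the Cluster Verification Utility.

    when  -- "pre" or "post"
    stage -- stage name, for example "crsinst"
    nodes -- iterable of node names
    """
    if when not in ("pre", "post"):
        raise ValueError("when must be 'pre' or 'post'")
    cleaned = [name.strip() for name in nodes if name.strip()]
    if not cleaned:
        raise ValueError("at least one node name is required")
    # The node list is one argument: names joined by commas, no spaces.
    return [script, "stage", f"-{when}", stage, "-n", ",".join(cleaned)]


if __name__ == "__main__":
    print(" ".join(cluvfy_stage_command("pre", "crsinst", ["mmrac1", "mmrac2"])))
```

The same helper can build the check that follows the clusterware install by passing "post" instead of "pre".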
NOTE When setting up users for the installs of Oracle software, it might be wise to plan for different logins for Clusterware and ASM. When ASM and Clusterware are used, they should be set up in Oracle homes separate from the database instance and can have separate Oracle software owners, which is a best practice for security. You can create users such as crs, asm, and oracle, but they must share the same Oracle Software Inventory and have the oinstall group as their primary group.
Now that the requirements are in place, the clusterware is ready to be installed, and the Oracle installer should offer the option to install it. If the option to install the clusterware does not come up, go back and run the Cluster Verification Utility and fix any issues first. Figure 8-2 shows the network information that was configured for the three network addresses as well as the name of the cluster being defined.
FIGURE 8-2. Clusterware Install
After your clusterware is successfully installed, it's time to install Oracle Database because the framework has already been completed on the nodes. From the first node in the cluster, run the Oracle installer (runInstaller on Linux/Unix, setup.exe on Windows). Install the Enterprise Edition, which follows the same path as a single-instance install, except for the Cluster installation choices (see Figure 8-3) after the location of the install information. The recommended path is to do a software-only install without creating the database, so that the software install can be verified first. Then, you can use the Database Configuration Assistant to create the database on the nodes of the cluster.
FIGURE 8-3. RAC install
CRITICAL SKILL 8.4
Test RAC
RAC environments should first be created for testing and for proving how failover works for different applications. The RAC test environments should remain available for testing on behalf of production systems, because different workloads and failover can't be tested against a single node. After the installs of the clusterware and database nodes, it's useful to test several different failover scenarios in order to verify the install as well as to determine how the application fails over.
Create a test list and establish the pieces of the application that need to be part of that testing. Testing should include hardware, interconnect, disk, and operating system failures. Simulations of most of these failures are possible in the environment.
Here is an example test list:
- Interconnect: Disconnect the network cable from the network adapter.
- Transaction failover when a node fails: Connect to one node, run the transaction, and then shut down the node; try this with selects, inserts, updates, and deletes.
- Backups and restores
- Loss of a data file or disk
- Test load balancing of transactions: Verify that the services are valid and allow for workload balancing.
Using a RAC database on the back end of an application doesn't necessarily mean that the application is RAC aware. The application as a whole may not fail over, even though the database is failing over current transactions and connections. There might be a small outage when one of the nodes needs to fail over. However, the server publishes events about the failover, and these events can be used to trigger automated processes for reconnecting or restarting application pieces. These are the Fast Application Notification (FAN) events, and they can be used for failover and for workload balancing.
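An application that is not RAC aware can still be made more tolerant of a node failure by reconnecting and retrying its work. The Python sketch below shows a plain retry loop; it is a simpler stand-in for event-driven handling and does not use Fast Application Notification. The connect and work callables are hypothetical placeholders for the application's own connection and transaction code.

```python
import time


def run_with_reconnect(connect, work, attempts=3, delay=1.0,
                       retry_on=(ConnectionError,)):
    """Run work(connection), reconnecting and retrying after a connection failure.

    connect  -- callable that returns a new connection (hypothetical)
    work     -- callable that takes the connection and returns a result;
                it should be safe to run more than once
    attempts -- total number of tries before giving up
    delay    -- seconds to wait between tries
    """
    last_error = None
    for attempt in range(1, attempts + 1):
        try:
            connection = connect()  # may land on a surviving node
            return work(connection)
        except retry_on as exc:
            last_error = exc
            if attempt < attempts:
                time.sleep(delay)
    raise last_error


if __name__ == "__main__":
    state = {"calls": 0}

    def flaky_connect():
        # Simulate a node that is down on the first attempt only.
        state["calls"] += 1
        if state["calls"] == 1:
            raise ConnectionError("node unavailable")
        return "connection"

    print(run_with_reconnect(flaky_connect, lambda c: f"ran on {c}", delay=0))
```

Because the work may run again after a failure, this approach fits queries and repeatable transactions better than operations that must happen exactly once.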
In Oracle 10g, having different pieces of the application connect to different nodes helped with load balancing in some ways. Now, thanks to the improvements in Oracle 11g, the Oracle Clusterware, and the Load Balancing Advisor, workload can be distributed across the RAC environment more effectively. Application workloads connect via a service, and it is through these services that load balancing and failover are handled. The services are designed to be integrated with several areas of the database, and not just CPU resources. The advisor bases its information on SERVICE_TIME and THROUGHPUT.
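To make the idea of distributing workload by service more concrete, the toy Python sketch below splits a number of sessions across instances in proportion to a set of weights. It is not Oracle's algorithm; the weights stand in for whatever guidance a load balancing component might supply, and the instance names and numbers are made up.

```python
def distribute_sessions(sessions, weights):
    """Split a session count across instances in proportion to their weights.

    Uses the largest-remainder method so the counts always add up to sessions.
    """
    total = sum(weights.values())
    if total <= 0:
        raise ValueError("weights must sum to a positive number")
    exact = {name: sessions * w / total for name, w in weights.items()}
    counts = {name: int(share) for name, share in exact.items()}
    leftover = sessions - sum(counts.values())
    # Hand the remaining sessions to the largest fractional shares first.
    by_fraction = sorted(exact, key=lambda n: exact[n] - counts[n], reverse=True)
    for name in by_fraction[:leftover]:
        counts[name] += 1
    return counts


if __name__ == "__main__":
    # Hypothetical weights: the first instance is asked to take more work.
    print(distribute_sessions(10, {"rac1": 60, "rac2": 40}))
```

A real deployment would rely on the services and the advisor rather than on hand-set weights like these.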