Creating a Standby Database with RMAN on Oracle 10g
Reading the Oracle Online manuals on how to create a standby database using the Recovery Manager (aka RMAN) is quite confusing: references on what to do next jump between articles back and forth, and you even have to remember which steps to skip in the referenced chapters. After all, you may end up even more confused than you have been curious before! So this article shall help you to overcome this, and give you an easy way on getting started with your Standby Database.
What's this article about
As said above, here we deal with setting up a standby database on Oracle 10g. We are going to use RMAN, and our server runs in a Unix/Linux environment. Though, there are only a few operating-system specific things, so the article should be applyable to other environments as well – at least up to ~90%.
What this article is NOT about
This article will not deal with OS specifica other than *nix – as said above. And it will probably not deal with other issues than mentioned above.
- Creating the Standby Database
- Postprocessing and Remarks
- SwitchOver / FailOver
Hardware and Software
Well – we do not talk about the server installation itself, so the special requirements for Oracle will not go here. Instead, we go to list the required environment. So here comes what we need for this article:
- Two physical servers – one for the master, and one for the standby system
- Oracle 10g binaries installed on both servers
- Concerning Oracle, all directory structures are identical on both servers; i.e. datafiles, dumps, etc. go to the same places – or, in other words, concerning Oracle the only difference in both servers is that one database will be "open", while the other is "standby"
- The master (primary) database ready, up and running on Oracle 10g
- File Recovery Area (FRA) is configured and used for Backups and ArchiveLog
init.ora for the standby database
Work to do
In our very first step, we will prepare the
init.ora for the standby
database. As base, we use the one from the master database. So if you use a
"Server Parameter File", spfile, instead of the ASCII pfile, make sure you
apply all changes you possibly made to it (e.g.: Make a backup from the
existing pfile, then
CREATE PFILE FROM SPFILE; in SQL*Plus, and
compare/adjust the values). Copy this pfile to the standby database, place it
where it belongs to (
$ORACLE_HOME/dbs directory), and open it with your
favorite editor. The important keywords to change/add are:
LOG_ARCHIVE_DEST_1='SERVICE=<master-tns-alias> VALID_FOR=(ALL_LOGFILES,PRIMARY_ROLE) STANDBY_FILE_MANAGEMENT=AUTO STANDBY_ARCHIVE_DEST='LOCATION=USE_DB_RECOVERY_FILE_DEST' DB_UNIQUE_NAME=<unique name for the standby DB> FAL_SERVER=<master-tns-alias> FAL_CLIENT=<standby-tns-alias> REMOTE_LOGIN_PASSWORD_FILE=EXCLUSIVE
<standby-tns-alias> are the aliases you will
configure in the
tnsnames.ora for the respective databases in a later step. A
DB_UNIQUE_NAME is needed, since both databases will share the same SID and
DBID – and thus need to be distinguished by something else (later you will see,
that this unique name is used in the FRA for directory naming by Oracle
The same changes we will apply to the master database once the standby has been
created – only with exchanged values for the
LOG_ARCHIVE_DEST1 pointing to the TNS alias of the Standby database.
What these parameters mean
Just in case, a short explanation of these parameters:
- LOG_ARCHIVE_DEST_1: Here we tell the master database to feed the
standby with redo. The keyword
VALID_FOR=(PRIMARY_ROLE)makes sure that this is only served when the instance is running as primary node. This applies to the standby after a failover/switchover, so we better have it ready here, too.
- STANDBY_FILE_MANAGEMENT: What to do when there are (data) files
created/deleted on the master.
AUTOautomatically applies these changes on the standby as well.
- STANDBY_ARCHIVE_DEST: Where the standby instance saves redo logs received from the master. Using the FRA again here automates this.
- DB_UNIQUE_NAME: As already explained, this is needed to identify the database - since both have the same SID and DBID.
- FAL_SERVER/FAL_CLIENT: This is for the transmission of the archived
log files (FAL = Fetch Archive Log), these settings define who serves whom.
Remember that after the failover, the roles change. These settings are
evaluated on the FAL_SERVER only - so we can prepare them on the
FAL_CLIENTin a way that they automatically fit after a failover/switchover.
- REMOTE_LOGIN_PASSWORD_FILE: This is needed in order to login
to the remote database AS SYSDBA. In above code, we set that to
EXCLUSIVE(each database on the same server uses its own password file). The steps below rely on this – but with some changes, you could also use
NONE, which would be the default).
The Oracle Password File
If you did not have an Oracle password file up to now, it is time to create it
now. Change to the
$ORACLE_HOME/dbs directory and issue the command
file=orapw<ORACLE_SID> password=<password>. Important: The file name must
orapw<ORACLE_SID>, and the file must be in the
directory, or it will not be found by the instance. Furthermore, the password
must be identical on the master and the standby side.
These steps are necessary to make a login AS SYSDBA possible via Oracle Net.
Now we need to adjust the listener on both, the master and the standby side. We need distinguished names for both instances which should be identically on both systems.
First we set up the
tnsnames.ora usually located in the
$ORACLE_HOME/network/admin directory. Here we add two new entries, one for
<master-tns-alias> = (DESCRIPTION = (ADDRESS = (PROTOCOL = TCP) (Host = <DNS-Name/IP des Master Servers>) (Port = 1521)) (CONNECT_DATA = (SID = <ORACLE_SID>) )) <standby-tns-alias> = (DESCRIPTION = (ADDRESS = (PROTOCOL = TCP) (Host = <DNS-Name/IP des Standby Servers>) (Port = 1521)) (CONNECT_DATA = (SID = <ORACLE_SID>) ))
Next comes the
listener.ora located in the same directory. Here we need to
add the local instance to the
SID_LIST_LISTENER (if your listener is not
LISTENER, as the default listener is, chose the corresponding block):
(SID_DESC = (SID_NAME = <ORACLE_SID>) (ORACLE_HOME = /local/oracle/ora102))
As said, these entries should be identical on both servers. If the changes are done, you need to (re-) start the listener:
oracle$ lsnrctl stop oracle$ lsnrctl start
Checking the database status
Now we go to check the status of our databases. We use SQL*Plus to connect as
SYSDBA, and issue the command
SELECT status FROM v$instance. On our master
database, this should return
OPEN or at least
MOUNTED, on the standby we
NOMOUNT. If this is not the case, we need to take the necessary steps:
On the master, issue a
startup (or at least
startup mount) command, on the
Check also if you can connect to the databases remotely. From the master, try connecting the standby and vice versa, both times as SYSDBA. If this fails, the standby creation will fail as well.
A last pre-requisite are the backups. We need a full backup of the master database, plus a copy of the controlfile for standby – both created by RMAN. If you do not have them yet, here we go:
oracle$ rman target / rman> BACKUP CURRENT CONTROLFILE FOR STANDBY; rman> BACKUP DATABASE;
Of course, these commands have to be executed on the master server. For more details on these commands, please refer to the RMAN documentation – you may want to finetune them for later use.
To finish preparations, you now need to copy the backups to the standby server.
The easiest way is to login to the standby server,
cd to the recovery area,
and rsync the files:
rsync -Pae master:/path/to/recovery/area/ .
Now we are prepared to create the standby database.
Creating the Standby Database
As the article is named, we want to use RMAN to create the standby database – so we do now. This page will be short, since the commands themselves are:
oracle$ rman target / auxiliary sys/password@standby-tns-alias DUPLICATE TARGET DATABASE FOR STANDBY NOFILENAMECHECK;
Sit back and wait – if that command is finished, your standby database is created and mounted!
Postprocessing and Remarks
- If not yet done, now is the time to apply the parameter adjustments to the master database, as mentioned in the first section
- Your new standby database is created, but not yet automatically recovering.
So you need to issue the
ALTER DATABASE RECOVER MANAGED STANDBY DATABASE DISCONNECT;command at the sqlplus prompt, logged in as SYSDBA
- Though starting with 10g, Oracle automatically detects a standby database when
STARTUPcommand is issued, and opens it read only – it again does not start the managed recovery automatically. So you still need to take care that this is done after server restart.
- Optionally, you may want to create STANDBY LOGFILE GROUPs (see below).
Concerning the standby logs, let me quote a part of a Metalink documentation here:
"Standby redo logs are necessary for the higher protection levels such as Guaranteed, Instant, and Rapid. In these protection modes LGWR from the Primary host writes transactions directly to the standby redo logs. This enables no data loss solutions and reduces the amount of data loss in the event of failure. Standby redo logs are not necessary if you are using the delayed protection mode."
That is, by default we don't need them (since we use the deferred method: Redo log is applied to the standby after the primary has archived it, which means a certain delay). If your concern is to have the standby immediately applying redo information as soon as it is generated on the primary instance, i.e. without any delay – then you will need the standby logs. But remember in this case to configure them on both, the primary and the standby database – though they are not used on a primary instance, after a SwitchOver/FailOver the primary becomes the standby and should be prepared for this task.
There are some more things to take care of if you are going to configure
standby logs – too much for our little article, so please refer to the Oracle
documentation. Two hints I give you: One is the command,
ALTER DATABASE ADD
STANDBY LOGFILE, so you know what to look for. The other is a Metalink
document on this topic, see note
Most of these steps can be automated using the RMAN wrapper from the DBAHelper framework. These are:
- Automatic backups using the FRA
- Of course also managing those backups, restoring, recovering …
- Starting with v0.2.2: Guided creation of a standby database
The last mentioned feature will guide you step by step through what was explained in this article. Many things as creating a backup, creating a copy of the controlfile, turning on the managed recovery, will be done automatically when you answered the corresponding question with yes. However, some steps must still be done manually, as editing the configuration files, and copy files to the correct locations.
On the standby database, we cannot use Oracles DBConsole (since this one always places its repository in the local database itself, which is read only in case of a standby database and thus cannot be used; there is no way to place the repository in a different database). So if you want an interface like that for administration, you will need to setup Grid Control. But this is a topic for another article.
- MetaLink Note 789370.1: Creating physical standby using RMAN duplicate without shutting down the primary
SwitchOver / FailOver
SwitchOver or FailOver?
Though the fact, that the previous Standby database became the primary afterwards applies to both variants – there are small differences which should for sure be considered. The most important is: After a FailOver there is no way to convert the former primary instance into a standby database for the new primary system: Due to the fact that both of them created redo information after the FailOver, they are completely out of sync, and in no way they can be synchronized again (see below for details). The only possibility to get a new standby system again is to create it from the scratch (i.e. start at the first steps again).
After a SwitchOver, on the other hand, we have both, the primary and the standby system, available again as we had before. This means you should think twice before doing a FailOver. And though the options seem to be obvious, I will give you some additional hints:
- Both databases are physical and logical in the best state, and reachable for the applications. However, you need to point the apps away from the primary instance, e.g. due to maintenance issues as a kernel update or the like: SwitchOver.
- Same as 1., but the applications cannot reach the primary instance anymore e.g. due to network problems – while they could reach the standby system: SwitchOver if both databases can reach each other (i.e. when a switchover is still possible), FailOver only otherwise.
- The primary database is completely dead and cannot be brought up again in time, while the apps urgently need to use the system: OK, here we have absolutely no choice but to FailOver.
In other words: As long as you can do a SwitchOver, you should NOT do a FailOver.
So we go for the SwitchOver first. It's not hard to do – but important to stick to the order (i.e. do not switch the steps):
-- Make sure that nobody accesses the primary database anymore sqlplus@master> SHUTDOWN IMMEDIATE sqlplus@master> STARTUP [RESTRICT] -- Prepare the SwitchOver sqlplus@master> ALTER DATABASE COMMIT TO SWITCHOVER TO STANDBY WITH SESSION SHUTDOWN; sqlplus@standby> ALTER DATABASE COMMIT TO SWITCHOVER TO PRIMARY; -- Do it sqlplus@master> SHUTDOWN IMMEDIATE sqlplus@master> STARTUP MOUNT sqlplus@master> ALTER DATABASE RECOVER MANAGED STANDBY DATABASE DISCONNECT; sqlplus@standby> ALTER DATABASE OPEN
Now the systems switched their roles. Please note: "@master" always stands for the old primary, "@standby" always for the old standby system in above code, even after the switch!
Once more to make it clear: Only do a FailOver if you cannot do a SwitchOver! So only in case of emergency and no other choice, do the following on the Standby system:
sqlplus@standby> ALTER DATABASE RECOVER MANAGED STANDBY DATABASE FINISH; sqlplus@standby> ALTER DATABASE COMMIT TO SWITCHOVER TO PRIMARY;
If we compare the SQL of both methods, it becomes quite clear why after a FailOver there is no chance anymore to make the old primary system the new standby – while this is substantial part of a SwitchOver: On the failover, we immediately opened the Standby system read-write, so it immediately started to produce redo – while on the SwitchOver we first made sure that the old primary system turns "read-only" and is even waiting for not (yet) coming Redo-information from a not (yet) existing primary instance.