RMAN Framework - Easy Interface for beginners
Oracle's "Recovery Manager" – aka RMAN – is known as a powerful command line tool for backup and restore. But as usual with such powerful command line tools, many are concerned about their complexity and, because of that, tend not to use them as they are "too complicated to handle". It is true, you can do very complicated things with RMAN – but who says you have to? This article shows you how to start off easily – and extend when necessary.
What's this article about
While the RMAN framework, as a part of the DBAHelper package, ships with documentation, that documentation is more a reference for the tools contained in the package than a detailed "workshop". This article fills that gap: it explains everything in a little more detail, with examples and screenshots.
As usual, we again use a Linux server, as the RMAN framework depends on Bash. Of course, it should also run on Linux distributions other than RHEL / CentOS, and on other Unixes (Solaris, AIX, …). As the RMAN framework is targeted at Oracle databases version 10g and higher, you should have one installed. And yes, you will need to have the Flash Recovery Area configured here.
What this article is NOT about
Well, the usual disclaimer, a bit harsher this time: No Windoze here at all, since the RMAN framework is (Unix) shell based (requires Bash). I never tried whether it works with Cygwin – but I doubt anybody has installed an Oracle database using Cygwin.
Components
What is RMAN?
RMAN is the Oracle Recovery Manager. But before we can recover, we need a backup first, and RMAN covers this area as well. It is a very powerful tool, and we will not discuss it in too much detail here (if you want more details, see Database Backup and Recovery Basics at Oracle's website, which will give you all the details you may want).
In short: With RMAN, you can back up your database to a local disk (e.g. to the Flash Recovery Area, as we do here), to tape or somewhere else, and it can use storage managers to access these locations. You can use RMAN to restore from a backup – whether the complete database or only a tablespace or just a single datafile. It is also possible to check for and repair block corruptions, and more.
What is DBAHelper?
DBAHelper is a collection of tools for the "daily needs" of the Oracle DBA, from simple scripts to show used undo or lazy sessions, via scripts to automatically move objects from one tablespace to another or rebuild all/broken indexes, to complex things such as the RMAN framework. DBAHelper is provided under the terms of the GNU General Public License (GPL) and as such is available free of charge.
What does the RMAN framework provide?
(Screenshot: RMan Framework: Menus)
The RMAN framework provides easy access to central functions of RMAN. For the beginner, it offers backup and restore facilities that run out of the box: generating and maintaining backups (e.g. cleaning up obsolete ones), full/tablespace restore, repair of block corruptions, moving your Flash Recovery Area to a new location, and more. Even the setup of a standby database can be done. And all this can be found easily thanks to a well organized menu structure, grouping commands into intuitive submenus.
The framework can be customized concerning its interface as well as the database settings. Both will be discussed in this article, as will its interactive and automated use.
Installation
Pre-Requisites
As usual, there are some preconditions to be met. E.g. it makes no sense to install the DBAHelper package if there is no database available: Most scripts – and the RMAN framework belongs to these – expect a local database to work on. Moreover, the RMAN framework also needs the rman executable installed on the local machine.
Another precondition is to have the database configured to use the Flash Recovery Area for backups. This is done using the parameters db_recovery_file_dest and db_recovery_file_dest_size in your init.ora, or by executing the corresponding ALTER SYSTEM...SCOPE=both; command when using an SPFile. Just make sure you have the FRA configured, and you are ready to proceed.
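For an instance using an SPFile, the two parameters could be set roughly as sketched below. This is only an illustration: the helper name fra_sql, the directory /u01/app/oracle/fra and the size 10G are made-up example values. The function just prints the statements so you can review them first. Note that Oracle wants the size to be set before the destination.

```shell
#!/bin/bash
# Print the ALTER SYSTEM statements that configure the Flash Recovery Area.
# fra_sql is a hypothetical helper name; directory and size are example values.
fra_sql() {
  local dest="$1" size="$2"
  # The size has to be set before the destination.
  printf "ALTER SYSTEM SET db_recovery_file_dest_size = %s SCOPE=both;\n" "$size"
  printf "ALTER SYSTEM SET db_recovery_file_dest = '%s' SCOPE=both;\n" "$dest"
}

fra_sql /u01/app/oracle/fra 10G
```

To apply the statements, the output can be piped into SQL*Plus as a privileged user, e.g. with sqlplus -s "/ as sysdba".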
Installation
Depending on the download type, there may be different ways to install DBAHelper (with the RMAN framework) or the RMAN framework alone:
- Having downloaded the TAR archive (*.tar.gz), simply unpack all (for all DBAHelper contents) or just the rman/ folder (for the RMAN framework) to any destination of your choice
- Having downloaded a package (either RPM or DEB), use the corresponding packaging software to install it, e.g. rpm -i dbahelper*.rpm or dpkg -i dbahelper*.deb
- Having downloaded nothing, you can simply check out the latest code from the repository (check its FAQ in the wiki)
- Having downloaded nothing, you can also add IzzySoft's Apt Repository to your APT, YUM or RPM configuration as described there, and simply use apt-get install dbahelper or yum install dbahelper after you updated your configuration.
There is no special configuration required – all should run fine out of the box. However, some configuration is possible to suit your special requirements and/or preferences: This is described in the next section.
Configuration
Basically, there are two configuration files for the RMAN framework: One deals with the behaviour, look and feel of the rman.sh wrapper script and thus affects the interface you use. The other is the backup configuration for RMAN itself and thus affects your backup policy and how backups are made. Both are explained below:
Look, feel and behaviour
The configuration file for "your convenience" is named rmanrc, and resides in the main directory of the RMAN framework (rman/ in the distribution archive – or /usr/local/share/dbahelper/rman when you installed from a package). The keywords are explained in the shipped documentation, so we will focus on the main things here:
The LOGDIR should point to an existing directory to store your logfiles in. rman.sh will create logfiles for all actions, storing at least the output produced by rman – so you will always be able to reproduce what has happened or what you've done. Each new call will produce a new logfile there with the name rman_<action>_yyyymmdd_hhmiss. So if you e.g. run rman.sh backup_daily at 10:27:03 on June 11th, 2008, you will get a logfile called rman_backup_daily_20080611_102703 there. It is unlikely you'll run the same command twice within the same second ;)
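The naming scheme is easy to reproduce in your own scripts, e.g. to find the logfile of a particular run. A minimal sketch, assuming nothing more than the pattern described above (rman_logfile_name is a hypothetical helper, not a function of rman.sh):

```shell
#!/bin/bash
# Build a logfile name following the rman_<action>_yyyymmdd_hhmiss pattern.
# rman_logfile_name is a hypothetical helper, not part of rman.sh.
rman_logfile_name() {
  local action="$1"
  # Use the given timestamp, or the current time if none was passed.
  local stamp="${2:-$(date +%Y%m%d_%H%M%S)}"
  printf 'rman_%s_%s\n' "$action" "$stamp"
}

rman_logfile_name backup_daily 20080611_102703
# prints: rman_backup_daily_20080611_102703
```

Since the timestamp sorts chronologically, a plain alphabetical listing of the LOGDIR already shows the runs of one action in the order they happened.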
With the TEMPTS_* keywords you can define your temporary tablespace. This tablespace is never backed up (why should it be – it's only temporary data), so it won't be restored either. However, if it is missing, you will get a bunch of errors – so rman.sh restore_temp will recreate your temp tablespace interactively, using the values defined here as defaults but asking you for confirmation; i.e. you can override these defaults interactively.
The USEDIALOG and TIMEOUT settings only affect the (G)UI: If you have the package dialog installed, we can create a much more user-friendly UI for interactive use (so for non-interactive use, we can ignore this). Here you can set whether we shall do so if possible (setting USEDIALOG=1) or not. If dialog is not found, the script automatically falls back to a "plain text" mode. You can still override the setting made here on the command line with the --[no]dialog option.
The TIMEOUT (in seconds) just defines after what time textboxes waiting for acknowledgement are closed when you've fallen asleep. By default, this is 3600s (1 hour).
That leaves the TMPFILE and SPOOLFILE settings, where the only important point is that they should point to a non-existing file within an existing directory. Both files are of temporary nature – i.e. they are created, but usually also deleted at runtime – so the name itself should be of no closer interest to you.
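Put together, the settings discussed above might look like the following fragment. Treat it as a sketch only: it assumes the file uses shell-style KEY=value assignments, and all paths are made-up examples. Only USEDIALOG=1 and the 3600 second default are taken from the text above, and the TEMPTS_* keywords are left out because their exact names are listed in the shipped documentation.

```shell
# Example rmanrc values (shell-style assignments assumed; paths are made up)
LOGDIR=/var/log/rman          # must exist; one logfile per rman.sh call
USEDIALOG=1                   # use dialog for the UI if it is installed
TIMEOUT=3600                  # seconds before unattended textboxes close
TMPFILE=/tmp/rman.tmp.$$      # non-existing file in an existing directory
SPOOLFILE=/tmp/rman.spool.$$  # same rule as for TMPFILE
# TEMPTS_* keywords: see the shipped documentation for their exact names
```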
DB Configuration
General Configuration
Now come the really important things: Configuring the backups themselves. This is done in the rman.conf file. The initial configuration as shipped is a working one, of course (as long as you have the FRA configured as demanded) – but it may or may not suit your local requirements. So I will explain what is configured, and what may be (basically) changed. Of course there are many more options we will not discuss here: For the advanced stuff I already referred you to the Oracle Backup documentation.
The first is the retention policy. The RMAN framework uses a recovery window – i.e. it should be possible to restore to any point in time during the specified period. This is achieved by making an initial copy of all the datafiles, storing additional incremental backup pieces on top of these, and, once the window is reached, rolling the datafile copies forward with these pieces up to the beginning of the window. So with our default policy of 7 days, the datafile copies will always reflect the data as it was 7 days ago. From there we can apply the incremental backups up to the time of the last backup, and then the archived logfiles up to the last one to reach the most up-to-date state – or any combination to reach any point in between.
Surely this needs quite some disk space – so you may want to decrease the window size. Though RMAN itself also supports other strategies (e.g. redundancy), this is not (yet) covered by this framework. But setting the window to 1 day is the same as a redundancy of 1, if you make one backup per day – so it should be sufficient in most cases.
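This strategy corresponds to what the Oracle documentation calls incrementally updated backups. A rough sketch of the RMAN commands behind it, wrapped into a shell function so the window can be varied. The function name merge_script and the tag 'daily_incr' are invented for this example, and the script actually shipped with the framework may well look different:

```shell
#!/bin/bash
# Print an RMAN script for an incrementally updated backup.
# merge_script and the tag name are hypothetical; $1 is the window in days.
merge_script() {
  local days="${1:-7}"
  cat <<EOF
RUN {
  # roll the datafile copies forward, but keep them ${days} days behind
  RECOVER COPY OF DATABASE WITH TAG 'daily_incr' UNTIL TIME 'SYSDATE-${days}';
  # take today's incremental (creates the copies on the very first run)
  BACKUP INCREMENTAL LEVEL 1 FOR RECOVER OF COPY WITH TAG 'daily_incr' DATABASE;
}
EOF
}

merge_script 7
```

With a smaller number of days, the copies follow the live database more closely, and fewer incremental pieces have to be kept on disk.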
You should leave the control file autobackup configured to "ON" for more safety – this way RMAN automatically creates a new backup of the controlfile if it has changed. With our default configuration, this will be on each new backup and/or cleanup, since we don't use a catalog database but rather store our backup information in the controlfile.
The other settings affect the device type for the backup, and are all configured to "disk", i.e. backups are stored on a local hard drive, since we use the Flash Recovery Area (FRA) as backup target. For advanced settings, please consult the RMAN documentation.
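Expressed as plain RMAN commands, the settings discussed in this section would look roughly like this. The sketch only prints them; rman_settings is a hypothetical helper name, and the shipped rman.conf may contain more (or differently written) lines:

```shell
#!/bin/bash
# Print RMAN CONFIGURE commands matching the settings discussed above.
# rman_settings is a hypothetical name; $1 is the recovery window in days.
rman_settings() {
  local days="${1:-7}"
  cat <<EOF
CONFIGURE RETENTION POLICY TO RECOVERY WINDOW OF ${days} DAYS;
CONFIGURE CONTROLFILE AUTOBACKUP ON;
CONFIGURE DEFAULT DEVICE TYPE TO DISK;
EOF
}

rman_settings 7
```

Inside RMAN, SHOW ALL; lists the values currently stored in the controlfile, which is a quick way to compare them with what you intended.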
Database specific configuration
If you have multiple databases running on a single host, and want to apply different settings, you will need multiple rman.conf copies. You can either name them as you like, and supply the config file to use with the -c <configfile> switch – or, better, name the configuration file after the database's ORACLE_SID, i.e. rman_$ORACLE_SID.conf (for a database with the ORACLE_SID "orcl", this would be rman_orcl.conf). This way you don't need the -c switch, since rman.sh automatically finds this configuration and even prefers it over the default one.
These database specific configuration files are also useful if you want to handle multiple databases in one run with the backup_daily or cleanup_obsolete commands. If the settings are similar, it will also suffice to simply create symlinks for this case.
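The lookup order described here can be sketched in a few lines of shell. This is not the code used inside rman.sh, just an illustration of the rule "prefer rman_<SID>.conf, fall back to rman.conf"; pick_conf is a hypothetical helper, and the directory in the example call is the package location mentioned earlier:

```shell
#!/bin/bash
# Pick the configuration file the way the text describes it:
# prefer rman_<SID>.conf, fall back to rman.conf.
# pick_conf is a hypothetical helper, not the code used inside rman.sh.
pick_conf() {
  local dir="$1" sid="$2"
  if [ -n "$sid" ] && [ -e "$dir/rman_${sid}.conf" ]; then
    printf '%s\n' "$dir/rman_${sid}.conf"
  else
    printf '%s\n' "$dir/rman.conf"
  fi
}

pick_conf /usr/local/share/dbahelper/rman "${ORACLE_SID:-orcl}"
```

For the symlink case, something like ln -s rman.conf rman_orcl.conf inside the framework directory is all it takes.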
Interactive use
Syntax
Calling rman.sh without any parameters will display a help screen with usage instructions. This is mainly a list of all parameters to be used, and a few examples – so if you forgot the names, this is the easiest way to retrieve them. Nothing else will happen then: After displaying the help screen, the script will exit.
If you call rman.sh with the command to execute as only parameter, it will enter its interactive mode to guide you through the process – which will be shown below by a few examples.
First start
On its first start, rman.sh will display its disclaimer. This is not to nag you (it won't be displayed a second time) – but just to remind you of the usual possible danger: While it is free software meaning "free of charge", it is also free software meaning "free of warranty". Not that there are any dangerous routines put there incidentally – but there's always something that may go wrong. And no software is free of bugs. Just keep in mind you are using it at your own risk … the usual stuff.
Dryrun
Worried by that disclaimer, we look for a "less dangerous" method of testing. Here it is: the option --dryrun, which on the one hand is also covered by the disclaimer, but at least theoretically should not change anything. This option can also be used for learning purposes: Instead of executing the intended action, it simply displays what it would do (although in order to do so, it sometimes has to execute at least some read-only commands). This way you can see how rman.sh does its job, and decide whether you want to trust it for a real run. As a side-effect, you learn something about RMAN and SQL*Plus commands ;)
As the screenshot for this section shows, in DryRun mode rman.sh will not only display the OS command it would execute, but also the contents of the script file used (if any). This should allow you to really investigate the case, and makes the actions more transparent to you. Of course, interactive questions will still be asked even in DryRun mode – the only difference to the "RealRun" should be that no changes are executed; the commands which would cause them are displayed instead.
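If you want the same safety net in your own scripts, the general pattern is simple. The following is a generic sketch of such a switch, not an excerpt from rman.sh; run_cmd and the DRYRUN variable are illustrative names:

```shell
#!/bin/bash
# Generic dry-run pattern: print a command instead of executing it.
# run_cmd and DRYRUN are illustrative names, not taken from rman.sh.
DRYRUN="${DRYRUN:-1}"

run_cmd() {
  if [ "$DRYRUN" = "1" ]; then
    printf 'DRYRUN: %s\n' "$*"
  else
    "$@"
  fi
}

run_cmd echo "this text is only echoed for real when DRYRUN=0"
```

Routing every changing command through one such function keeps the dry-run decision in a single place, so a read-only command can still be called directly where its output is needed.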
Backup
The simplest example is to run the daily backup command:
rman.sh backup_daily
For the daily backup, no special input is needed – everything is already configured: We know the backups shall be stored into the Flash Recovery Area (FRA), and it should be a complete backup (so no objects to specify explicitly), and the backup strategy is already defined in the rman.conf configuration file. So you only need to lean back and watch the tool doing the job.
You may want to provide the additional option --all to run the backup for all configured databases – this will not change anything for your input, since again all configuration is already available – and you only need to lean back and watch until the program has finished.
Validate your backup
(Screenshot: Validating Backups)
Good to know that you have a backup now! But even better if you knew it is not broken. A broken backup is not very helpful, right? Luckily, RMAN also has commands to verify the data integrity. So now you may wish to run rman.sh validate and see what happens.
Again, no questions are asked (what for – we should know where we stored the backups we made ourselves, shouldn't we?). The validation process starts immediately, and so does the reporting: A so-called tailbox shows the progress of the running command. Even if you close this box before the process has finished, the process will continue running – unless you press Ctrl-C. But if you want to see the final results (and why else did you call this command), you should wait until the end before closing the tailbox.
But if you fell asleep (or the phone rang, the party started, whatever) before the process was finished, and the box closed due to the configured timeout: Remember there are always the logfiles from the RMAN commands executed. So in this case, if you forgot where they are stored, have a look into your rmanrc file for the directory, and then consult the logfile …
Cleanup the Flash Recovery Area
No more laziness – now your interaction will be required: The commands cleanup_expired and cleanup_obsolete (see screenshot) will check for files to remove – either because they are no longer needed according to your backup policy (cleanup_obsolete), or for some other reason such as failing a crosscheck (cleanup_expired). From now on you will be guided through the process, and asked to confirm the individual steps: rman.sh first tells you what it is going to do, and what you should pay attention to. Then it lists the results of a check for the files to be purged, and finally asks you again to confirm that you really want to delete these files.
Of course this can be automated to not ask you any questions, or even to not output anything – but hey, we are talking about interactive use here, so for automating things please check the corresponding chapter ;)
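For the curious: in plain RMAN, the two cleanups boil down to a handful of standard commands. The sketch below prints them for each case. cleanup_script is a hypothetical helper name, and rman.sh may issue additional or slightly different commands:

```shell
#!/bin/bash
# Print the plain RMAN commands that roughly correspond to the two cleanups.
# cleanup_script is a hypothetical name; rman.sh may issue other commands.
cleanup_script() {
  case "$1" in
    expired)
      printf '%s\n' 'CROSSCHECK BACKUP;' 'CROSSCHECK COPY;' \
                    'DELETE EXPIRED BACKUP;' 'DELETE EXPIRED COPY;'
      ;;
    obsolete)
      printf '%s\n' 'REPORT OBSOLETE;' 'DELETE OBSOLETE;'
      ;;
    *)
      echo "usage: cleanup_script expired|obsolete" >&2
      return 1
      ;;
  esac
}

cleanup_script obsolete
```

Both DELETE variants ask for confirmation when typed into RMAN directly, which matches the "are you really sure" step of the interactive run.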
Moving the Flash Recovery Area to a different location
There may come a time when you have to move the FRA to a different location. The simplest reason may be that you underestimated the needed space, and now the disk holding the FRA is so full that the archiver will soon hang because there is no space left on that device – so you add a new disk, or mount another partition, and ask yourself how to get the FRA moved there.
(Screenshot: Moving the FRA)
While it is easy to tell the database to use the new location instead, the old backups still remain in the old place. You may think they will expire over time; but since we use a recovery window, we would keep merging our incrementals into the datafile copies there. And even if not, maybe you don't want to wait?
I would not mention it here if the RMAN framework did not supply the answer, without too much "fiddling around": You simply call rman.sh move_fra, and again will be guided through all necessary steps. On the screenshot you may also note that some kind of progress is shown, so you know roughly how far you already got and how many steps will follow.
After having asked you for the database to work on (just to be sure), rman.sh retrieves the current location of the FRA from that database and asks for your confirmation (see screenshot). Usually you will just have to confirm this: It is very unlikely this specification is wrong. Only exception: You already changed it in the database, and then started to think about how to move the files. For this case, rman.sh offers you the choice of manual input here.
In the same manner, it will ask you for the new location (which you now have to enter manually – since how should we guess that?). Then it updates the database accordingly, and asks you whether to move the files. This is done then (if you approved) with the OS command cp, and the following steps will update the RMAN catalog accordingly (expiring the old location with a crosscheck, cleaning up the expired files, and finally re-cataloging the new location). Thus in the end, you have all files in the new place, and your RMAN catalog is fixed with the new information.
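Done by hand, the sequence would look roughly like the plan printed by the following sketch. It is an approximation of the steps described above, not a dump of what rman.sh executes: move_fra_plan is a hypothetical helper, the paths in the example call are made up, and the exact cp options are an assumption. Keep in mind that a crosscheck can only mark the old entries as expired once the files are really gone from the old location.

```shell
#!/bin/bash
# Print the manual steps for moving the FRA, in the order described above.
# move_fra_plan is a hypothetical helper; it only prints, it changes nothing.
move_fra_plan() {
  local old="$1" new="$2"
  cat <<EOF
SQL>  ALTER SYSTEM SET db_recovery_file_dest = '${new}' SCOPE=both;
OS>   cp -rp '${old}'/. '${new}'/
RMAN> CROSSCHECK BACKUP;
RMAN> CROSSCHECK COPY;
RMAN> DELETE EXPIRED BACKUP;
RMAN> DELETE EXPIRED COPY;
RMAN> CATALOG RECOVERY AREA;
EOF
}

move_fra_plan /u01/fra /u02/fra
```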
Restore
(Screenshot: Restore a Tablespace)
Now we have created backups, verified them, and cleaned up old ones. And finally comes the day we wished never to see: Something weird happened, the database is broken! We need to restore data from our backup.
Since this (luckily!) does not happen every day, one easily forgets the commands involved in the restore process. But we are lucky enough to have our little helpers, right? So there are quite a few commands available we could use to restore the lost stuff:
- No time to figure out what is needed? Let RMAN decide: rman.sh restore_full
- Know exactly what is broken – just a dumb file from one tablespace? Restore just this one, run rman.sh restore_ts (see screenshot)
- Urgs, the alert log says something like ORA-1578: ORACLE data block corrupted (file # 6, block # 1234)? Write down the file# and block#, and call rman.sh block_recover
- The TEMP tablespace got lost? Well, that's an easy case – no data lost ;) Simply run rman.sh restore_temp
In all these cases, you will be guided through the process as usual. rman.sh will ask you for some data (if needed – like the name of the tablespace to recover when you invoked rman.sh restore_ts, or the details for the temp tablespace, or – in the case of the block corruption – which file and block are affected), and do the necessary work.
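Instead of writing down the file# and block# by hand, you can also let the shell pick them out of the error message. A small sketch, assuming the message has exactly the format quoted above (parse_corruption is a hypothetical helper for illustration):

```shell
#!/bin/bash
# Extract file# and block# from an ORA-1578 line such as the one quoted above.
# parse_corruption is a hypothetical helper for illustration only.
parse_corruption() {
  sed -n 's/.*(file # \([0-9][0-9]*\), block # \([0-9][0-9]*\)).*/\1 \2/p'
}

echo 'ORA-1578: ORACLE data block corrupted (file # 6, block # 1234)' \
  | parse_corruption
# prints: 6 1234
```

These are the two numbers rman.sh block_recover will ask for. In plain RMAN on 10g, the matching command would be BLOCKRECOVER DATAFILE 6 BLOCK 1234;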
Creating a standby database
The best (and most complex) thing last: You can also use the RMan Wrapper to create a standby database, as described in this article. Like with the other commands described above, you will again be guided step by step through the entire process.
On the first screen it tells you the preconditions needed – i.e. what you have to prepare manually before running this script. It is not that much anymore: You simply have to create all directories on the standby host to exactly match those on your primary, and run the script on the primary host – that's it. The SSH setup mentioned there is optional but recommended, so that files on the standby host can be updated automatically as well. Of course, if that is not possible, the script will tell you what you then have to do manually. If these conditions are all met, you can continue with the program.
You will now be asked to give the specifications (such as hostname, SID, tnsname) for the standby and primary system. rman.sh will then try to verify that these data are correct (e.g. by doing a tnsping to check for the standby), and – if needed – add the corresponding entries to your $ORACLE_HOME/network/admin/tnsnames.ora. If everything looks fine so far, it will take the primary database's init.ora file (which you approved among others), make 2 copies of it, and apply the necessary changes to those copies: One for the standby to create, and one for the primary to change later. It also copies the standby init.ora to the standby machine – but before this you will get a chance to check the created file and, if needed, apply changes.
Step number 3 is the shortest in this process: It will change your primary database to force logging. This is a mandatory step (though you can skip it, which is only meant for the case that you already did this before): Without it, one could bypass the logging with direct writes. And you remember how the standby feed works? By transferring the logged information – so the direct-write information would be lost! Forced logging avoids this problem.
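If you are not sure whether you "already did that before", you can check it yourself. The following sketch prints the two SQL statements involved, the query for the current state and the statement that switches forced logging on. force_logging_sql is a hypothetical helper name; the statements themselves are standard Oracle SQL:

```shell
#!/bin/bash
# Print the SQL to check and to enable forced logging on the primary.
# force_logging_sql is a hypothetical helper name.
force_logging_sql() {
  # Quoted delimiter: keeps the shell from expanding $database in the heredoc.
  cat <<'EOF'
SELECT force_logging FROM v$database;
ALTER DATABASE FORCE LOGGING;
EOF
}

force_logging_sql
```

If the query already returns YES, skipping this step in the script is safe.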
Step number 4 just asks two questions – but may result in some waiting: Now we make sure we have all backups available. As we create the standby database using RMAN, this works via backups. Well, hopefully you already made backups with RMAN up to now – maybe you even used rman.sh backup_daily for this: Then you just need to create the backup controlfile for the standby database (which is done fast), and you are done with this.
There is just a single question in the next step – but if answered with "Yes", it may be the most time consuming part (depending on your database size and network configuration): The script copies all necessary files (init.ora, backups) from the primary to the standby host. If you answer with "No", you either have shared the drive containing the backups, e.g. using NFS, or you need to copy the files manually. They must be available locally on the standby host in order to proceed.
Step number 6 again is a fast one: Before we can do the real thing, we need to make sure the databases are in the correct state – i.e. the primary is opened or at least mounted, and the standby is started (but not mounted or open). The screenshot for this step shows we found these OK and can proceed. Otherwise, you would be asked whether the script should try to start up the database into the correct state – or an error message would be displayed (and the script aborted) if the standby database is already mounted or opened.
Now lean back and wait: Step number 7 is the last one, and will create your standby database. If no errors occur, this will end up with your new database being mounted, and you will be asked whether to start the managed recovery.
Congratulations! All you need to do now is to decide whether you want the database to start automatically when the server is rebooted (by making the corresponding entry in your /etc/oratab), take care of the managed recovery after each restart – and update your primary database's init.ora. A new init.ora file was already created in a location you specified, so you could either replace the original one or, if your database uses a server parameter file (SPFile), check the end of the new init.ora for the changes made, and issue the corresponding ALTER … SCOPE=both; commands via SQL*Plus.
Conclusion
As we have shown, working with RMAN need not be cryptic and complicated. And you don't need to be nervous about remembering the correct commands in their right order for the (hopefully) rare cases when you need them. The RMAN framework especially helps beginners to have an easy start. Once you are familiar with RMAN, you may outgrow this framework and develop sophisticated scripts doing the really complicated stuff – but that's fine, we got you started! On the other hand: If you extended the RMan Framework with additional scripts, I would be happy to get a copy and maybe even add them to the distribution!
Next we will see that there's also potential for some of this more advanced stuff: We leave the interactive world, and use the RMAN framework for automated maintenance.
Non-interactive use
For the non-interactive mode, we can use a subset of the same commands from the interactive mode described in the previous chapter. Of course, there are some exceptions: Those commands really requiring user interaction – e.g. the input of names and passwords – are not available. If there are only "Yes/No" questions to answer, we can override that with the --yestoall option.
So for the non-interactive mode, we have to overrule some things. We don't just want to assume "yes" to all questions – we also don't want any questions asked. That means we can tell the script to "shut up". Since the only output really needed goes to the log files anyway, we can even tell it to "shut up completely" by giving it the -q option 3 times: No subprogram STDOUT, no STDERR, and no direct STDOUT from the program itself.
Since the UI is not needed here, the script already falls back to "plain text" mode when -q is specified twice; in this case, it only logs the progress to the screen.
Useful commands in non-interactive mode
The following commands can be used in the non-interactive mode – e.g. from a cron job:
- rman.sh backup_daily -q -q -q --yestoall [--all] to run the daily backup
- rman.sh cleanup_obsolete -q -q -q --yestoall [--all] to purge the FRA
- rman.sh crosscheck -q -q -q --yestoall to cross-check files against catalog
- rman.sh validate -q -q -q --yestoall to validate our backups
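Scheduled via cron, these calls might be arranged as in the entries printed by the following sketch. The times, the weekly validate run and the default installation path are assumptions made for this example, and cron_lines is a hypothetical helper; adjust all of them to your installation:

```shell
#!/bin/bash
# Print example crontab entries for unattended runs.
# cron_lines is a hypothetical helper; the times and the default path
# are assumptions, adjust them to your installation.
cron_lines() {
  local dir="${1:-/usr/local/share/dbahelper/rman}"
  cat <<EOF
# m h dom mon dow command
0 1 * * * ${dir}/rman.sh backup_daily -q -q -q --yestoall --all
0 3 * * * ${dir}/rman.sh cleanup_obsolete -q -q -q --yestoall --all
0 4 * * 0 ${dir}/rman.sh validate -q -q -q --yestoall
EOF
}

cron_lines
```

Remember that cron starts jobs with a very small environment, so variables like ORACLE_HOME, ORACLE_SID and PATH have to be set in the crontab or in a small wrapper script.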
While backup and cleanup are very useful daily maintenance tasks, automated crosscheck and validate runs allow a quick check of the logfiles once the unattended run (which can take a while on large databases) has completed, without the need to wait all the time.
I felt an unattended restore to be not only of little use, but also a bit dangerous, so these commands are not listed here. The same applies to moving the FRA and creating a standby database …
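Such a check of the logfiles can itself be scripted. The sketch below looks for Oracle and RMAN error codes in the newest validate log. log_has_errors is a hypothetical helper, the search pattern is a simple heuristic rather than anything the framework provides, and LOGDIR is expected to hold the directory configured in your rmanrc:

```shell
#!/bin/bash
# Report whether an rman.sh logfile contains Oracle or RMAN error codes.
# log_has_errors is a hypothetical helper for checking unattended runs.
log_has_errors() {
  grep -Eq '(ORA|RMAN)-[0-9]{4,5}' "$1"
}

# Check the newest validate log; LOGDIR defaults to the current directory.
latest="$(ls -t "${LOGDIR:-.}"/rman_validate_* 2>/dev/null | head -n 1)"
if [ -n "$latest" ] && log_has_errors "$latest"; then
  echo "Errors found in $latest"
fi
```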