DISASTER RECOVERY PREPAREDNESS
Mainly relating to Unix servers.
Offsite Backups
If at all possible, keep a set of backups separate from your day-to-day
backups. Prudence suggests storing them in a different building altogether;
some people keep backups off campus. How often you refresh them will differ
between departments/units, but the important point is that if your server
room and onsite backups are both toasted, your clients will have lost weeks
of work instead of years.
Onsite Backups
To restore both client and server files, some kind of daily archival system
should be installed. The amount of data to back up will help determine the
size and type of system you will need. I am familiar with the amanda
(Advanced Maryland Automatic Network Disk Archiver) utility and could help
you plan, install, and configure it.
www.amanda.org
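As a rough first step in sizing, you can total the space currently used on
your file systems. The following is a minimal sketch, not part of amanda; the
function name total_used_mb is made up for illustration, and it assumes the
usual `df -k` column layout (used kilobytes in the third column).

```shell
#!/bin/sh
# Sketch: sum the "used" column (in KB) of `df -k` style output read from
# stdin and print the total in whole megabytes.
# Assumption: standard df -k layout. When a long device name pushes the
# numbers onto a continuation line, that line has 5 fields and "used" is
# the second one.
total_used_mb() {
    awk 'NR > 1 && NF >= 6 { kb += $3 }
         NR > 1 && NF == 5 { kb += $2 }
         END { printf "%d\n", kb / 1024 }'
}
```

On a Solaris server this could be run as `df -k -F ufs | total_used_mb`. The
figure is only the raw amount of data; compression and how many days of
archives you keep will change what the backup system actually has to hold.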
Realtime readiness
When something goes wrong on a server, whether a failed boot drive, a bad
patch, or a breach of security, the recovery options are generally time
consuming (usually measured in hours or days). If a patch has mucked up your
system you can back it out, as long as you patched with the appropriate
option and you know which patch is the culprit. If your boot drive has failed
you can replace it and then either install from scratch or jumpstart. In
either case you may have to bring the system up to date using your backups.
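Backing out a patch might look something like the sketch below. It assumes
the Solaris patchadd/patchrm tools (the text above does not name the
commands), and the helper names run and backout_patch are made up for
illustration. By default it only prints what it would do.

```shell
#!/bin/sh
# Sketch only: back out a single Solaris patch with patchrm.
# Assumes the patchadd/patchrm tool set. Any patch ID you see in the usage
# note is a placeholder, not a real patch.

DRY_RUN=${DRY_RUN:-1}    # 1 = print commands only, 0 = actually run them

# Print the command in dry-run mode, otherwise execute it.
run() {
    if [ "$DRY_RUN" -eq 1 ]; then
        echo "would run: $*"
    else
        "$@"
    fi
}

# Back out one patch. This can only work if backout data was saved when
# the patch was applied.
backout_patch() {
    patch_id=$1
    # Expect the usual NNNNNN-NN form before handing it to patchrm.
    case "$patch_id" in
        [0-9][0-9][0-9][0-9][0-9][0-9]-[0-9][0-9]) ;;
        *) echo "unexpected patch ID: $patch_id" >&2; return 1 ;;
    esac
    run patchrm "$patch_id"
}
```

For example, `showrev -p | grep 123456` would confirm that the (placeholder)
patch 123456-01 is installed, and `backout_patch 123456-01` would show the
patchrm command. Set DRY_RUN to 0 once you are sure you have the right patch.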
With a small investment in hardware and time you could recover in the time
it takes to reboot your machine. By installing a 2nd drive and mirroring the
boot drive onto it (at a time you choose), you can simply boot from the 2nd
drive in a time of crisis. I mirror every Sunday morning at 5am and before I
conduct any patching (even though I patch using the backout option).
1) Install a 2nd drive that can hold at least the file systems you have on
the master drive. I would recommend getting either a different model of
drive or, if you want the same model, making certain the two drives are from
different manufacturing lots.
2) Copy this script file to / of your master drive and configure it to meet
the needs of your file systems.
3) Set DEBUG to 1 and then run /mirror to test and see what would happen.
If all looks correct, set DEBUG back to 0 and run /mirror.
4) At the prom level you can use the 'boot' command followed by a disk
definition. However, I have found that the definition you get from the
format command is not always the correct one. In format you can choose a
disk and then use the 'current' command to 'describe the current disk'.
You'll get something back like the following:
format> current
Current Disk = c0t1d0
<IBM-DNES-318350Y-SA30 cyl 11199 alt 2 hd 10 sec 320>
/pci@1f,0/pci@1/scsi@8/sd@1,0
The definition /pci@1f,0/pci@1/scsi@8/sd@1,0 may not work to boot at the
prom level. This particular case is from my SunFire V100, where booting at
the prom using this definition does NOT work. I watched a subsequent normal
boot carefully and noticed the word 'disk' in place of 'sd', so the
following definition does work to boot from the prom level:
boot /pci@1f,0/pci@1/scsi@8/disk@1,0
Taking this a step further: to reduce the stress level in a time of crisis
I create an nvalias on each of my servers. At the ok prompt I can then
simply type
boot mirror
How to set the nvalias:
ok nvalias mirror /pci@1f,0/pci@1/scsi@8/disk@1,0
5) Now you are ready to boot to your 2nd drive. Time permitting, you should
test this and make sure it works. Once booted, run
df -k -F ufs
and note the target numbers to ensure you really booted from the drive you
expected.
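The script referred to in step 2 is not shown here, so the following is only
a sketch of what a mirror script with a DEBUG switch might look like. It is
not the actual /mirror script. It assumes the master drive is c0t0d0, the 2nd
drive is c0t1d0, slice 0 holds /, and the 2nd drive already carries a
partition table matching the master. The function names are made up for
illustration.

```shell
#!/bin/sh
# Hypothetical sketch of a boot-drive mirror script. This is NOT the /mirror
# script from step 2; device names and the slice list are assumptions.
#
# A root crontab entry for a Sunday 5am run could look like:
#   0 5 * * 0 /mirror

DEBUG=${DEBUG:-1}            # 1 = only show what would happen, 0 = do it
MASTER=${MASTER:-c0t0d0}     # assumed boot (master) drive
MIRROR=${MIRROR:-c0t1d0}     # assumed 2nd drive
SLICES=${SLICES:-"0"}        # UFS slices to copy; slice 0 is assumed to be /
MNT=${MNT:-/mnt}             # temporary mount point for the mirror slice

# In DEBUG mode print the command line; otherwise run it, stopping on failure.
run() {
    if [ "$DEBUG" -eq 1 ]; then
        echo "DEBUG: $1"
    else
        sh -c "$1" || exit 1
    fi
}

# Copy one slice from the master drive to the same slice on the mirror.
mirror_slice() {
    s=$1
    run "echo y | newfs /dev/rdsk/${MIRROR}s${s}"
    run "mount /dev/dsk/${MIRROR}s${s} ${MNT}"
    run "ufsdump 0f - /dev/rdsk/${MASTER}s${s} | (cd ${MNT} && ufsrestore rf -)"
    run "rm -f ${MNT}/restoresymtable"
    if [ "$s" = "0" ]; then
        # Point the copied vfstab at the mirror so it mounts its own slices,
        # then write a boot block so the mirror is bootable.
        run "sed 's/${MASTER}/${MIRROR}/g' ${MNT}/etc/vfstab > ${MNT}/etc/vfstab.new && mv ${MNT}/etc/vfstab.new ${MNT}/etc/vfstab"
        run "installboot /usr/platform/\`uname -i\`/lib/fs/ufs/bootblk /dev/rdsk/${MIRROR}s0"
    fi
    run "umount ${MNT}"
}

# Mirror every slice listed in SLICES.
mirror_all() {
    for s in $SLICES; do
        mirror_slice "$s"
    done
}
```

Calling mirror_all with DEBUG left at 1 prints each command without touching
either drive, which matches the test run described in step 3. Only after the
output looks right for your own slices would you set DEBUG to 0.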
If you are interested in implementing this tool I will gladly help out. Get
in touch with me so we can work something out.