Daniel G. Kluge
dkluge at acm.org
Thu Jun 12 20:21:05 GMT 2003
On Thursday, 12.06.03, at 16:19 (Europe/Zurich), Deb Hale wrote:
> Have any of you on the list had experience with the design and
> development of Hot Sites for Disaster Recovery? I am considering
> proposing this to the local community and am trying to get
> information. Any ideas? Deb
I do have some advice here, but since your question is pretty
open-ended, I'll just give some more general pointers.
The first thing is to figure out what you want to do: which systems
have to be replicated, how far away they have to be, and what the
required recovery time is.
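One way to pin those three answers down is to write them into a small
inventory before talking to anybody about hardware. The sketch below is
only an illustration: the record layout, the system names, and every
number in it are hypothetical and need to be replaced with your own.

```python
from dataclasses import dataclass


@dataclass
class DrRequirement:
    """One row of a (hypothetical) disaster-recovery requirements inventory."""
    system: str
    replicate: bool              # does this system need a copy at the hot site?
    min_distance_km: float       # how far away the copy has to be
    recovery_time_hours: float   # how quickly it has to be back in service


def replication_plan(requirements):
    """Return the systems to replicate, tightest recovery time first."""
    selected = [r for r in requirements if r.replicate]
    return sorted(selected, key=lambda r: r.recovery_time_hours)


# Hypothetical example values, for illustration only.
REQUIREMENTS = [
    DrRequirement("mail", True, 50, 8),
    DrRequirement("intranet-wiki", False, 0, 72),
    DrRequirement("billing-db", True, 50, 1),
]

for req in replication_plan(REQUIREMENTS):
    print(f"{req.system}: >= {req.min_distance_km} km away, "
          f"back within {req.recovery_time_hours} h")
```

Sorting by recovery time puts the systems that will drive most of the
cost at the top of the list.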
The next thing is to make sure that everybody working on the system
knows the disaster-recovery requirements. If you don't have change
management in production, don't even think about replicating that
system; it will never work! There is nothing more interesting than
firing up a cold standby and discovering that neither the OS nor the
application version matches the current production system...
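A cheap safeguard against that surprise is to compare version manifests
of production and standby on a regular basis. This is a minimal sketch
that assumes you can already collect a component-to-version mapping from
each side by some means of your own; the component names and version
strings below are made up.

```python
def find_version_drift(production, standby):
    """Return {component: (production_version, standby_version)} for every
    component whose version differs or that exists on one side only."""
    drift = {}
    for component in sorted(set(production) | set(standby)):
        prod_version = production.get(component)
        standby_version = standby.get(component)
        if prod_version != standby_version:
            drift[component] = (prod_version, standby_version)
    return drift


# Hypothetical manifests, for illustration only.
production = {"os": "5.8", "database": "9.2.0.3", "app": "4.1"}
standby = {"os": "5.8", "database": "9.2.0.1", "app": "4.0"}

for component, (prod, stby) in find_version_drift(production, standby).items():
    print(f"{component}: production={prod} standby={stby}")
```

A component missing on one side shows up with None for that side, so a
forgotten installation is reported as well as a stale one.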
If you're replicating complete sites with everything, the next point
isn't that much of an issue. But make sure everything is able to talk
to the disaster recovery site; there's nothing more stressful than
hunting for the config file entry in some obscure application where it
specifies its TCP peers, or having to reconfigure the firewall so your
new system is actually visible.
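Reachability can be tested long before a disaster by attempting a plain
TCP connect from each client network to every service at the recovery
site. The following is a rough sketch using only the Python standard
library; the host names and ports in PEERS are placeholders, and a
successful connect only shows that the firewall and the listener are in
place, not that the application behind it works.

```python
import socket


def check_peers(peers, timeout=3.0):
    """Attempt a TCP connect to each (host, port) pair.

    Returns a list of (host, port, error_message) for every peer that
    could not be reached; an empty list means all connects succeeded.
    """
    failures = []
    for host, port in peers:
        try:
            with socket.create_connection((host, port), timeout=timeout):
                pass  # connect succeeded; close the socket again right away
        except OSError as exc:
            failures.append((host, port, str(exc)))
    return failures


# Placeholder peers, for illustration only; replace with your own services.
PEERS = [
    ("dr-db.example.com", 1521),
    ("dr-app.example.com", 8080),
]
```

Calling check_peers(PEERS) from each network that has to reach the
recovery site gives a list of what is still blocked or misconfigured.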
Now the hard part, of course, is replicating data in near real time, or
even doing a transparent failover. Here you will be constrained by
money and distance.
For most relational databases there are multiple variants of
replication. The cheapest is a shadow database, where you just reapply
the archived transaction (redo) logs to the database on the disaster
recovery site whenever a log rollover occurs. More expensive and
complex are replicated databases, using the database vendor's tools or
third-party products.
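The transport half of the shadow-database approach amounts to copying
each log file the database has finished writing over to the recovery
site. The sketch below shows only that copy step, under several
assumptions: both directories are reachable as local paths (for example
through a network mount), the file names sort in the order in which the
logs must be applied, and the standby database picks up and applies the
files with its own tooling, which is not shown here.

```python
import shutil
from pathlib import Path


def ship_new_logs(archive_dir, standby_dir):
    """Copy archived log files that are not yet present in standby_dir.

    Returns the names of the files shipped, in sorted (assumed apply) order.
    """
    archive = Path(archive_dir)
    standby = Path(standby_dir)
    standby.mkdir(parents=True, exist_ok=True)
    shipped = []
    for log in sorted(archive.iterdir()):
        if not log.is_file():
            continue
        target = standby / log.name
        if target.exists():
            continue  # already shipped on an earlier run
        # Copy under a temporary name, then rename, so the standby side
        # never sees a half-written log file under its final name.
        partial = standby / (log.name + ".partial")
        shutil.copy2(log, partial)
        partial.replace(target)
        shipped.append(log.name)
    return shipped
```

Run from a scheduler after each rollover, this keeps the standby at
most one log file behind production, which is also the amount of data
you stand to lose with this variant.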
The most expensive solutions, which guarantee anything from failover
within an hour to real-time failover, mostly involve private fibers
between the sites. One method is to replicate the data storage, i.e.
have the SAN with your data replicate itself. Similarly, you can extend
a cluster configuration to have the second half of the cluster in the
disaster recovery site a short distance away.
The most expensive solution is, of course, having two live sites:
everything runs replicated, and the last element before the user
switches/decides which site to use. Such a solution has virtually no
A final recommendation: for any setup, be sure that your
consultants/vendors have done such a setup before and have enough
know-how available to support you. This check should include everybody,
even the industry's largest names; depending on your location they
might not have the experience or manpower.