Wednesday, November 4, 2009

10.6 Data Redundancy Solutions



[ Team LiB ]






10.6 Data Redundancy Solutions



Redundant data is another
option for dealing with primary site failure. Implementing a
redundant data approach differs from using a standby database, which
duplicates the entire primary database. Data redundancy is achieved
by having a copy of your critical data in an entirely separate Oracle
database with a different structure. The data, not the database
itself, is redundant. If the primary site fails, users can continue
working using the redundant data in the secondary database.



Oracle provides automated synchronous and
asynchronous data-replication
features to support data redundancy. For simplicity, in the following
sections we'll examine replication using a simple
two-site example�a primary and a secondary. Oracle can,
however, perform N-way or multimaster
replication involving more than two sites with all sites replicating
to all others.




10.6.1 Data Replication: Synchronous and Asynchronous



Whenever you have a data replication scenario, you always have a
primary site, from which the replication originates, and a secondary
site, which is the recipient of the data replication. (In a
multimaster scenario, you can have more than one master site, and a
single machine can be a master site for one replication plan and a
secondary site for another.) When you design your replication plan,
you must consider the degree to which data at the secondary site can
differ for a period of time from the data at the primary site. This
difference is referred to as data divergence.
When you implement replication, Oracle generates triggers on all
specified tables. These triggers are fired as part of the primary
site transactions. The triggers either update the secondary
site's data as part of the same transaction
(synchronous replication) or place an entry in a
deferred transaction queue that will be used later to update the
secondary site (asynchronous replication).



Key considerations in setting up a replication environment include
the following:




Tolerance for data divergence



The smaller the data
divergence, the more individual replication actions will have to be
performed. You will reduce the resources needed to implement the
replication by increasing the data divergence.




Performance requirements



Because replication requires resources, it can have an impact on
performance. However, Oracle Database 10g allows
Oracle Streams to capture change data from log files, which greatly
reduces the performance impact of replication on an active database.




Network bandwidth



Because
replication uses network bandwidth, you
have to consider the availability of this resource.




Distance between sites



The more distance between sites, the longer the physical transfer of
data will take and the longer each application will take.




Site and network stability



If a site or a network goes down, all replications that use that
network or are destined for that site will not be received. When
either of these resources comes back online, the stored replication
traffic can have an impact on the amount of time it takes to recover
the site.




Experience level of your database administrators



Even the most effective replication plan can be undone by DBAs who
aren't familiar with replication.





Figure 10-11 illustrates synchronous and asynchronous replication.




Figure 10-11. Oracle replication for redundant data



Synchronous, or real-time,
replication can be used when there is no tolerance for data
divergence or lost data. The data at the secondary site
must match the primary site at all times and
reflect all committed transactions. Each transaction at the primary
site will fire triggers that call procedures at the secondary site to
reproduce the transaction. Synchronous replication uses distributed
transactions that will add overhead to every transaction at the
primary site. Whether or not this additional overhead is acceptable
will clearly depend on your specific requirements. Synchronous
replication introduces system interdependencies�the secondary
site and the network connecting the sites must be up or the primary
site will not be able to perform transactions.



You can also use asynchronous, or deferred, replication to provide
redundant data. With asynchronous replication,
transactions are performed at the primary site and replicated some
time later to the secondary site. Until the deferred transaction
queue is "pushed" to the secondary
site, replicating the changes, the data at the secondary site will
differ from the primary site data. If the primary database is
irrevocably lost, any unpushed transactions in the deferred queue
will also be lost.



The extent of the data divergence and potential data loss resulting
from the divergence is a very important consideration in configuring
asynchronous replication. In addition, asynchronous replication
allows the primary site to function when the network or the secondary
site is down, while synchronous replication
requires that the secondary site be available.
Asynchronous
replication
adds overhead to transactions at the primary site, so once again,
you'll need to carefully consider throughput
requirements and perform appropriate testing. Typically, asynchronous
replication adds less overhead than synchronous replication, because
the replication of changes can be efficiently batched to the
secondary site. However, asynchronous replication will still add
overhead to the operation of the primary site, so you should consider
and test the effect of both types of replication on your database
environment.





10.6.2 Old-Fashioned Data Redundancy



You can also achieve data redundancy using
Oracle's standard utilities. Historically, one of
the most common backup methods for Oracle was simply to export the
contents of the database into a file using the
Oracle
Export utility. This file could then be shipped in binary form to any
platform Oracle supports and subsequently imported into another
database with Oracle's Import utility. This approach
can still provide a simple form of data redundancy if the amount of
data is manageable.



Oracle 7.3 introduced a direct path export
feature that runs about 70% faster than a traditional export. The
direct path export avoids some of the overhead of a normal export by
directly accessing the data in the Oracle datafiles.



You can also improve the performance of an import with some planning.
An import is essentially a series of INSERT statements, so you can
optimize the insert process by inserting the data into a table
without indexes and then adding the indexes after the table is built.
The time and resources needed to perform large exports and imports
may make this an appropriate option.



Another export
solution is to unload data from the desired tables into simple flat
files by spooling the output of a SELECT statement to an operating
system file. You can then ship the flat file to the secondary site
and use Oracle's SQL*Loader utility to load the data
into duplicate tables in the secondary database. For cases in which a
significant amount of data is input to the primary system using
loads, such as in a data warehouse, a viable disaster-recovery plan
is simply to back up the load files to a secondary site on which they
will wait, ready for reloading to either the primary or secondary
sites should a disaster occur.



Oracle Database 10g
further speeds this process through introduction of the
new Data Pump, a higher-speed export/import.



While these methods may seem relatively crude, they can provide
simple data redundancy for targeted sets of data. Transportable
tablespaces can also be used to move entire tablespaces to a backup
platform. Oracle Database 10g lets you transport
tablespaces from one type of system to another, which increases their
flexibility for implementing redundancy, moving large amounts of
data, and migrating to another database platform.




Export/Import, Standby Database, or Replication?



All the choices
we've discussed in this chapter offer you some type
of protection against losing critical data�or your entire
database. But which one is right for your needs?



To quote the standard answer to so many technical questions,
"It depends:"



  • Export/Import, whether in its original form or in the Oracle Database
    10g Data Pump, provides a simple and proven
    method, but the overhead involved with this method typically leaves
    larger time periods where data is lost in the event of a failure.

  • Transportable tablespaces can provide the same functionality with
    better performance, but are less granular.

  • A physical standby database typically leaves smaller data gaps or, in
    the case since introduction of Oracle9i
    zero-data loss, no data gap; however, this solution does require the
    expense of redundant hardware.

  • Replication also requires redundant hardware and ensures consistent
    and complete data on both the primary and backup server, but this
    solution is the most resource-intensive of all.


You should carefully balance the cost, both in extra hardware and
performance, of each of these solutions, and balance them against the
potential cost of a database or server failure. Of course, any one of
these solutions is infinitely more valuable than not implementing any
of them and simply hoping that a disaster never happens to you.












    [ Team LiB ]



    No comments:

    Post a Comment