Monday, November 2, 2009

12.3 Moving Data Between Distributed Systems



[ Team LiB ]






12.3 Moving Data Between Distributed Systems



The previous section discussed the
use of multiple database servers acting together as part of a single
logical database for users. The following situations call for the
contents of a database to be duplicated and moved between systems:



  • When data available locally eliminates network bandwidth issues or
    contention for system resources

  • When mobile database users can take their databases with them and
    operate disconnected from the network

  • When redundant databases can help to deliver higher levels of
    reliability, as each database can be used as a backup for other
    databases


The biggest issue facing users of multiple identical or similar
databases is how to keep the data on all of the servers in sync as
the data is changed over time. As a user inserts, updates, or deletes
data on one database, you need to have some way to get this new data
to the other databases. In addition, you will have to deal with the
possible data-integrity issues that can crop up if the changes
introduced by distributed users contend with each other.



Oracle offers a number of strategies to address this situation. The
following sections discuss techniques for data and table movement
among distributed systems.




12.3.1 Advanced Replication




The
copying and maintaining of database tables among multiple Oracle
databases on distributed systems is known as
replication.
Changes that are applied at any local site are propagated
automatically to all of the remote sites. These changes can include
updates to data or changes to the database schema. Replication is
frequently implemented to provide faster access for local users at
remote sites or to provide a disaster-recovery site in the event of
loss of a primary site. Oracle's Advanced
Replication features support both asynchronous replication and
synchronous replication. Oracle also supports heterogeneous
replication with DB2 through its Replication Services, bundled in the
Mainframe Integration Gateways.



Replication continues to evolve. Oracle8 moved execution of
replication triggers to the database kernel and enabled automatic
parallelization of data replication to improve performance.
Oracle8i added replication triggered by changes
to selected rows or columns of a table. Oracle9i
replication added support for object datatypes and
multi-tier updateable materialized views. Release 2 of Oracle9i added
log-based replication via Oracle Streams (described later in this
chapter).



Asynchronous
replication
is the storage of changes locally for later
forwarding to a remote site. Some types of asynchronous replication
include read-only snapshots replicated from a single updateable
master table and updateable snapshots that, although disconnected,
can also be updated.



In the Standard Edition of Oracle, you can have only one master site,
which replicates changes to its child sites. In the Enterprise
Edition, multiple master sites can exist and updates can take place
at any of these sites. The updates to these sites must be
synchronized, meaning that an update is not
completed until all of the target sites have been updated; otherwise,
conflicts can remain unresolved. Conflicts can occur when more than
one site updates the same data element during the same replication
interval. Changes are propagated using deferred remote procedure
calls (RPCs) based on events or at points in time when connectivity
is available or communications costs are minimal.



Several conflict-resolution routines provided with Enterprise Edition
can be automatically used to resolve replication conflicts. An
administrator can simply choose which conflict-resolution strategy he
wishes to use for a particular replication. For updates that may
affect a column or groups of columns, standard resolution choices
include the following:




Overwrite and discard value



Used when there is a single master (originating) site for new values
to update current values at destination sites.




Minimum and maximum value



Minimum compares the new value at the originating site and the
current value at the destination and applies the new value only if it
is less than the current value.



Maximum compares the new value at the originating site and the
current value at the destination and applies the new value only if it
is greater than the current value.




Earliest and latest timestamp value (with designation of a column of type DATE)



Earliest dictates that when there are multiple new values, the value
used for updates will be in the row with the earliest timestamp.



Latest dictates that when there are multiple new values, the value
used for updates will be in the row with the latest timestamp.




Additive and average value for column groups with single numeric columns



Additive takes the difference of new and old values at the
originating site and adds them to the current value at the
destination site.



Average takes the current value at the destination and the new value
at the originating site, divides by 2, and applies the new value.




Priority groups and site priority



When priority levels are assigned to columns and multiple new values
occur, columns with a lower priority will be updated by columns with
a higher priority.





Also available for use are built-in uniqueness conflict-resolution
routines, which will resolve conflicts that result from the
distributed use of primary key and unique constraints. The built-in
routines include the following:




Append site name to duplicate value



Appends the global database name of the originating site to the
replicated column




Append sequence to duplicate value



Appends a generated sequence number to the column value




Discard duplicate value



Discards the row at the originating site that causes errors





You can also write your own custom conflict-resolution routines and
assign them if your business requirements are not addressed by the
standard routines.




12.3.1.1 Managing advanced replication


You can manage replication through Oracle's
Replication Manager, which is launched
from the Oracle Enterprise Manager. Administrators can configure
database objects that need to be replicated, schedule replication,
troubleshoot error conditions, and view the deferred transaction
queue at each location through this central interface. A deferred
transaction queue is a queue holding transactions that will be
replicated (and applied) to child sites.



For example, to set up a typical multimaster replication, you must
first define master groups and tables and objects to be replicated in
each of the databases.



The Replication Manager wizard, which is part of the Oracle
Enterprise Manager Java version, then defines and sets up how
replication will occur. You first define a connection to the master
definition site, then create one or more master groups for
replicating tables and objects to the multiple master sites. Next,
you assign conflict-resolution routines for replicated tables in each
master group. Finally, you grant appropriate access privileges to
users of applications that access the data at the multiple
sites.






12.3.2 Transportable Tablespaces



Transportable tablespaces
are a way to speed up the distribution of complete
tablespaces between multiple databases. Transportable tablespaces
were introduced with Oracle8i Enterprise Edition
to rapidly copy and distribute tablespaces among database instances.
Previously, tablespaces needed to be exported from the source
database and imported at the target (or unloaded and loaded).
Transportable tablespaces enable copies to be moved simply through
the use of file transfer commands such as ftp.
Before you copy and move a copy of the tablespace, you should make
the tablespace read-only to avoid inadvertently changing it. Data
dictionary information needs to be exported from the source prior to
transfer, then imported at the target.



Some of the most popular reasons to use transportable tablespaces
include:



  • Rapid copying of tablespaces from enterprise data warehouses to data
    marts

  • Copying of tablespaces from operational systems to operational data
    stores for use in consolidated reporting

  • Publishing of tablespaces for distribution on CD-ROM

  • Use of backup copies for rapid point-in-time tablespace recovery


Prior to Oracle9i, Oracle block sizes needed to
be the same and, prior to Oracle Database 10g,
platform operating systems needed to be the same for both Oracle
instances in order to use this feature. Oracle Database
10g introduces transportable tablespaces that
can be used across different platforms.








    [ Team LiB ]



    No comments:

    Post a Comment