12.3 Moving Data Between Distributed Systems
The previous section discussed the use of multiple database servers acting together as part of a single logical database for users. The following situations call for the contents of a database to be duplicated and moved between systems:
When data available locally eliminates network bandwidth issues or contention for system resources When mobile database users can take their databases with them and operate disconnected from the network When redundant databases can help to deliver higher levels of reliability, as each database can be used as a backup for other databases
The biggest issue facing users of multiple identical or similar databases is how to keep the data on all of the servers in sync as the data is changed over time. As a user inserts, updates, or deletes data on one database, you need to have some way to get this new data to the other databases. In addition, you will have to deal with the possible data-integrity issues that can crop up if the changes introduced by distributed users contend with each other.
Oracle offers a number of strategies to address this situation. The following sections discuss techniques for data and table movement among distributed systems.
12.3.1 Advanced Replication
The copying and maintaining of database tables among multiple Oracle databases on distributed systems is known as replication. Changes that are applied at any local site are propagated automatically to all of the remote sites. These changes can include updates to data or changes to the database schema. Replication is frequently implemented to provide faster access for local users at remote sites or to provide a disaster-recovery site in the event of loss of a primary site. Oracle's Advanced Replication features support both asynchronous replication and synchronous replication. Oracle also supports heterogeneous replication with DB2 through its Replication Services, bundled in the Mainframe Integration Gateways.
Replication continues to evolve. Oracle8 moved execution of replication triggers to the database kernel and enabled automatic parallelization of data replication to improve performance. Oracle8i added replication triggered by changes to selected rows or columns of a table. Oracle9i replication added support for object datatypes and multi-tier updateable materialized views. Release 2 of Oracle9i added log-based replication via Oracle Streams (described later in this chapter).
Asynchronous replication is the storage of changes locally for later forwarding to a remote site. Some types of asynchronous replication include read-only snapshots replicated from a single updateable master table and updateable snapshots that, although disconnected, can also be updated.
In the Standard Edition of Oracle, you can have only one master site, which replicates changes to its child sites. In the Enterprise Edition, multiple master sites can exist and updates can take place at any of these sites. The updates to these sites must be synchronized, meaning that an update is not completed until all of the target sites have been updated; otherwise, conflicts can remain unresolved. Conflicts can occur when more than one site updates the same data element during the same replication interval. Changes are propagated using deferred remote procedure calls (RPCs) based on events or at points in time when connectivity is available or communications costs are minimal.
Several conflict-resolution routines provided with Enterprise Edition can be automatically used to resolve replication conflicts. An administrator can simply choose which conflict-resolution strategy he wishes to use for a particular replication. For updates that may affect a column or groups of columns, standard resolution choices include the following:
- Overwrite and discard value
Used when there is a single master (originating) site for new values to update current values at destination sites.
- Minimum and maximum value
Minimum compares the new value at the originating site and the current value at the destination and applies the new value only if it is less than the current value.
Maximum compares the new value at the originating site and the current value at the destination and applies the new value only if it is greater than the current value.
- Earliest and latest timestamp value (with designation of a column of type DATE)
Earliest dictates that when there are multiple new values, the value used for updates will be in the row with the earliest timestamp.
Latest dictates that when there are multiple new values, the value used for updates will be in the row with the latest timestamp.
- Additive and average value for column groups with single numeric columns
Additive takes the difference of new and old values at the originating site and adds them to the current value at the destination site.
Average takes the current value at the destination and the new value at the originating site, divides by 2, and applies the new value.
- Priority groups and site priority
When priority levels are assigned to columns and multiple new values occur, columns with a lower priority will be updated by columns with a higher priority.
Also available for use are built-in uniqueness conflict-resolution routines, which will resolve conflicts that result from the distributed use of primary key and unique constraints. The built-in routines include the following:
- Append site name to duplicate value
Appends the global database name of the originating site to the replicated column
- Append sequence to duplicate value
Appends a generated sequence number to the column value
- Discard duplicate value
Discards the row at the originating site that causes errors
You can also write your own custom conflict-resolution routines and assign them if your business requirements are not addressed by the standard routines.
12.3.1.1 Managing advanced replication
You can manage replication through Oracle's Replication Manager, which is launched from the Oracle Enterprise Manager. Administrators can configure database objects that need to be replicated, schedule replication, troubleshoot error conditions, and view the deferred transaction queue at each location through this central interface. A deferred transaction queue is a queue holding transactions that will be replicated (and applied) to child sites.
For example, to set up a typical multimaster replication, you must first define master groups and tables and objects to be replicated in each of the databases.
The Replication Manager wizard, which is part of the Oracle Enterprise Manager Java version, then defines and sets up how replication will occur. You first define a connection to the master definition site, then create one or more master groups for replicating tables and objects to the multiple master sites. Next, you assign conflict-resolution routines for replicated tables in each master group. Finally, you grant appropriate access privileges to users of applications that access the data at the multiple sites.
12.3.2 Transportable Tablespaces
Transportable tablespaces are a way to speed up the distribution of complete tablespaces between multiple databases. Transportable tablespaces were introduced with Oracle8i Enterprise Edition to rapidly copy and distribute tablespaces among database instances. Previously, tablespaces needed to be exported from the source database and imported at the target (or unloaded and loaded). Transportable tablespaces enable copies to be moved simply through the use of file transfer commands such as ftp. Before you copy and move a copy of the tablespace, you should make the tablespace read-only to avoid inadvertently changing it. Data dictionary information needs to be exported from the source prior to transfer, then imported at the target.
Some of the most popular reasons to use transportable tablespaces include:
Rapid copying of tablespaces from enterprise data warehouses to data marts Copying of tablespaces from operational systems to operational data stores for use in consolidated reporting Publishing of tablespaces for distribution on CD-ROM Use of backup copies for rapid point-in-time tablespace recovery
Prior to Oracle9i, Oracle block sizes needed to be the same and, prior to Oracle Database 10g, platform operating systems needed to be the same for both Oracle instances in order to use this feature. Oracle Database 10g introduces transportable tablespaces that can be used across different platforms.
|
No comments:
Post a Comment