Friday, November 6, 2009

Section 15.4.  BULK COLLECT









15.4. BULK COLLECT













Oracle8i Database introduced a very powerful new feature that improves the efficiency of queries in PL/SQL: the BULK COLLECT clause

. With BULK COLLECT you can retrieve multiple rows of data through either an implicit or an explicit query with a single roundtrip to and from the database. BULK COLLECT reduces the number of context switches between the PL/SQL and SQL engines and thereby reduces the overhead of retrieving data.


Take a look at the following code snippet. I need to retrieve hundreds of rows of data on automobiles that have a poor environmental record. I place that data into a set of collections so that I can easily and quickly manipulate the data for both analysis and reporting.



DECLARE
CURSOR major_polluters_cur
IS
SELECT name, mileage
FROM transportation
WHERE TYPE = 'AUTOMOBILE'
AND mileage < 20;
names name_varray;
mileages number_varray;
BEGIN
FOR bad_car IN major_polluters
LOOP
names.EXTEND;
names (major_polluters%ROWCOUNT) := bad_car.NAME;
mileages.EXTEND;
mileages (major_polluters%ROWCOUNT) := bad_car.mileage;
END LOOP;
-- Now work with data in the collections
END;



This certainly gets the job done, but the job might take a long time to complete. Consider this: if the transportation table contains 2,000 vehicles, then the PL/SQL engine issues 2,000 individual fetches against the cursor in the SGA.


To help out in this scenario, use the BULK COLLECT clause for the INTO element of your query. By using this clause in your cursor (explicit or implicit) you tell the SQL engine to bulk bind the output from the multiple rows fetched by the query into the specified collections before returning control to the PL/SQL engine. The syntax for this clause is:



... BULK COLLECT INTO collection_name[, collection_name] ...



where collection_name identifies a collection.


Here are some rules and restrictions to keep in mind when using BULK COLLECT:


  • Prior to Oracle9i Database, you could use BULK COLLECT only with static SQL. With Oracle9i Database and Oracle Database 10g, you can use BULK COLLECT with both dynamic and static SQL. See Chapter 16 for more details and an example.

  • You can use BULK COLLECT keywords in any of the following clauses: SELECT INTO, FETCH INTO, and RETURNING INTO.

  • The collections you reference can store only scalar values (strings, numbers, dates). In other words, you cannot fetch a row of data into a record structure that is a row in a collection.

  • The SQL engine automatically initializes and extends the collections you reference in the BULK COLLECT clause. It starts filling the collections at index 1, inserts elements consecutively (densely), and overwrites the values of any elements that were previously defined.

  • You can't use the SELECT...BULK COLLECT statement in a FORALL statement.

  • SELECT...BULK COLLECT will not raise NO_DATA_FOUND if no rows are found. Instead, you must check the contents of the collection to see if there is any data inside it.

  • The BULK COLLECT operation empties the collection referenced in the INTO clause before executing the query. If the query returns no rows, this collection's COUNT method will return 0.


Let's explore these rules and the usefulness of BULK COLLECT through a series of examples. First, here is a rewrite of the major polluters example using BULK COLLECT:



DECLARE
names name_varray;
mileages number_varray;
BEGIN
SELECT name, mileage
FROM transportation
BULK COLLECT INTO names, mileages
WHERE TYPE = 'AUTOMOBILE'
AND mileage < 20;
 
-- Now work with data in the collections
END;



I am now able to remove the initialization and extension code from the row-by-row fetch implementation.


I don't have to rely on implicit cursors to get this job done. Here is another reworking of the major polluters example, retaining the explicit cursor:



DECLARE
CURSOR major_polluters IS
SELECT name, mileage
FROM transportation
WHERE TYPE = 'AUTOMOBILE'
AND mileage < 20;
names name_varray;
mileages number_varray;
BEGIN
OPEN major_polluters;
FETCH major_polluters BULK COLLECT INTO names, mileages;
CLOSE major_polluters;
 
-- Now work with data in the collections
 
END;



And with Oracle9i Database Release 2 and above, I can simplify my life by fetching into a collection of records, as you see here:



DECLARE
TYPE transportation_aat IS TABLE OF transportation%ROWTYPE
INDEX BY PLS_INTEGER;
BEGIN
SELECT *
FROM transportation
BULK COLLECT INTO l_transportation
WHERE TYPE = 'AUTOMOBILE'
AND mileage < 20;
 
-- Now work with data in the collections
END;




15.4.1. Limiting Rows Retrieved with BULK COLLECT



Oracle provides a LIMIT clause
for BULK COLLECT that allows you to limit the number of rows fetched from the database. The syntax is:



FETCH cursor BULK COLLECT INTO ... [LIMIT rows];



where rows can be any literal, variable, or expression that evaluates to an integer (otherwise, Oracle will raise a VALUE_ERROR exception).


LIMIT is very useful with BULK COLLECT, because it helps you manage how much memory your program will used to process data. Suppose, for example, that you need to query and process 10,000 rows of data. You could use BULK COLLECT to retrieve all those rows and populate a rather large collection. However, this approach will consume lots of memory in the Process Global Area for that session. If this code is run by many separate Oracle schemas, your application performance may degrade because of PGA swapping.


The following block of code uses the LIMIT clause in a FETCH that is inside a simple loop. Notice that I check the COUNT of the collection returned to determine if any rows were retrieved. The %FOUND and %NOTFOUND attributes should not be used to determine this.



DECLARE
CURSOR allrows_cur IS SELECT * FROM EMPLOYEE;
 
TYPE employee_aat IS TABLE OF allrows_cur%ROWTYPE
INDEX BY BINARY_INTEGER;
 
l_employees employee_aat;
l_row PLS_INTEGER;
BEGIN
OPEN allrows_cur;
LOOP
FETCH allrows_cur BULK COLLECT INTO l_employees
LIMIT 100;
EXIT WHEN l_employees.COUNT = 0;
 
-- Process the data by scanning through the collection.
 
l_row := l_employees.FIRST;
WHILE (l_row IS NOT NULL)
LOOP
upgrade_employee_status (l_employees(l_row).employee_id);
l_row := l_employees.NEXT (l_row);
END LOOP;
END LOOP;
 
CLOSE allrows_cur;
END;





15.4.2. Bulk Fetching of Multiple Columns




As you have seen in previous examples, you certainly can bulk fetch the contents of more than one column. It would be most elegant if we could fetch those multiple columns into a single collection of records. This feature became available in Oracle9i Database Release 2.


Suppose that I would like to retrieve all the information in my transportation table for each vehicle whose mileage is less than 20 miles per gallon. In Oracle9i Database Release 2 and above, I can do so with a minimum of coding fuss:



DECLARE
-- Declare the type of collection
TYPE VehTab IS TABLE OF transportation%ROWTYPE;

-- Instantiate a particular collection from the TYPE.
gas_guzzlers VehTab;
BEGIN
SELECT *
BULK COLLECT INTO gas_guzzlers
FROM transportation
WHERE mileage < 20;
...



Prior to Oracle9i Database Release 2, the above code would raise this exception:



PLS-00597: expression 'GAS_GUZZLERS' in the INTO list is of wrong type



If you are running Oracle8i Database or Oracle9i Database Release 1, you will need to declare multiple collections and then fetch individually into those collections:



DECLARE
guzzler_type name_varray;
guzzler_name name_varray;
guzzler_mileage number_varray;

CURSOR low_mileage_cur IS
SELECT vehicle_type, name, mileage
FROM transportation WHERE mileage < 10;
BEGIN
OPEN low_mileage_cur;
FETCH low_mileage_cur BULK COLLECT
INTO guzzler_type, guzzler_name, guzzler_mileage;
END;



Note that you can use the LIMIT clause with a BULK COLLECT into a collection of records, just as you would with any other BULK COLLECT statement.




15.4.3. Using the RETURNING Clause with Bulk Operations







You've now seen BULK COLLECT used for both implicit and explicit query cursors. You can also use BULK COLLECT inside a FORALL statement, in order to take advantage of the RETURNING clause.


The RETURNING clause, introduced in Oracle8 Database, allows you to obtain information (such as a newly updated value for a salary) from a DML statement. RETURNING can help you avoid additional queries to the database to determine the results of DML operations that just completed.


Suppose that Congress has passed a law requiring that a company pay its highest-compensated employee no more than 50 times the salary of its lowest-paid employee. I work in the IT department of the newly merged company Northrop-Ford-Mattel-Yahoo-ATT, which employs a total of 250,000 workers. The word has come down from on high: the CEO is not taking a pay cut, so we need to increase the salaries of everyone who makes less than 50 times his 2004 total compensation package of $145 millionand decrease the salaries of all upper management except for the CEO. After all, somebody's got to make up for this loss in profit.


Wow! I have lots of updating to do, and I want to use FORALL to get the job done as quickly as possible. However, I also need to perform various kinds of processing on the employee data and then print a report showing the change in salary for each affected employee. That RETURNING clause would come in awfully handy here, so let's give it a try.


See the onlyfair.sql file on the book's web site for all of the steps shown here, plus table creation and INSERT statements.


First, I'll create a reusable function to return the compensation for an executive:



/* File on web: onlyfair.sql */
FUNCTION salforexec (title_in IN VARCHAR2) RETURN NUMBER
IS
CURSOR ceo_compensation IS
SELECT salary + bonus + stock_options +
mercedes_benz_allowance + yacht_allowance
FROM compensation
WHERE title = title_in;
big_bucks NUMBER;
BEGIN
OPEN ceo_compensation;
FETCH ceo_compensation INTO big_bucks;
RETURN big_bucks;
END;



In the main block of the update program, I declare a number of local variables and the following query to identify underpaid employees and overpaid employees who are not lucky enough to be the CEO:



DECLARE
big_bucks NUMBER := salforexec ('CEO');
min_sal NUMBER := big_bucks / 50;
names name_varray;
old_salaries number_varray;
new_salaries number_varray;

CURSOR affected_employees (ceosal IN NUMBER)
IS
SELECT name, salary + bonus old_salary
FROM compensation
WHERE title != 'CEO'
AND ((salary + bonus < ceosal / 50)
OR (salary + bonus > ceosal / 10)) ;



At the start of my executable section, I load all of this data into my collections with a BULK COLLECT query:



OPEN affected_employees (big_bucks);
FETCH affected_employees
BULK COLLECT INTO names, old_salaries;



Then I can use the names collection in my FORALL update:



FORALL indx IN names.FIRST .. names.L*
UPDATE compensation
SET salary =
GREATEST(
DECODE (
GREATEST (min_sal, salary),
min_sal, min_sal,
salary / 5),
min_sal )
WHERE name = names (indx)
RETURNING salary BULK COLLECT INTO new_salaries;



I use DECODE to give an employee either a major boost in yearly income or an 80% cut in pay to keep the CEO comfy. I end it with a RETURNING clause that relies on BULK COLLECT to populate a third collection: the new salaries.


Finally, because I used RETURNING and don't have to write another query against the compensation table to obtain the new salaries, I can immediately move to report generation:



FOR indx IN names.FIRST .. names.L*
LOOP
DBMS_OUTPUT.PUT_LINE (
RPAD (names(indx), 20) ||
RPAD (' Old: ' || old_salaries(indx), 15) ||
' New: ' || new_salaries(indx)
);
END LOOP;



Here, then, is the report generated from the onlyfair.sql script:



John DayAndNight Old: 10500 New: 2900000
Holly Cubicle Old: 52000 New: 2900000
Sandra Watchthebucks Old: 22000000 New: 4000000



Now everyone can afford quality housing and health care. And tax revenue at all levels will increase, so public schools can get the funding they need.


The RETURNING column values or expressions returned by each execution in FORALL are added to the collection after the values returned previously. If you use RETURNING inside a nonbulk FOR loop, previous values are overwritten by the latest DML execution.











    No comments:

    Post a Comment