Tuesday, October 27, 2009

Resource Pooling


3 4



Resource Pooling



Running the previous code sample to iterate over Northwind data 1000 times takes 7 seconds. That is with connection pooling turned on. Turning it off increases that time to 28 seconds—a 300 percent increase! This shows that the bulk of the work this sample does is spent establishing the database connection. Establishing this connection is so expensive that it dwarfs command execution, rowset retrieval, and rowset iteration combined. This is why resource pooling is so essential with the MTS scalability model. You simply cannot afford to create a brand-new connection to the database whenever one is needed. These precious resources must be maintained in a way that survives the short, request-based lifecycles of components in scalable applications.



Pooling benefits for scalable applications cannot be overestimated. The reduction from 28 to 7 seconds is observed simply by installing the component in the COM+ catalog. The transaction support setting is immaterial when it comes to this particular performance measurement; the sample runs in 7 seconds with any of the five transaction support settings. This transparency of the pooling service is a double-edged sword: it is easy to use and great for maintainability, but it sometimes can be difficult to verify in the development environment whether pooling is actually working because its benefits become apparent only under load. (However, it will be much easier to verify that pooling is working in the production environment; your customers will be sure to let you know if they have to wait four times as long for an operation to finish—but you probably want to know sooner than that.) I will examine pooling services in more detail in this section so that you understand and can predict exactly when and how
pooling will work for you.



When I described how Resource Dispensers interact with DispMan at the beginning of the chapter, I mentioned that it would be rather difficult to build an efficient, centralized, and resource-fit-rating pooling mechanism yourself. Also, like the Windows 2000 thread pool services described in Chapter 13, pooling is a process-based optimization. Arbitrarily combined components from different sources and vendors transparently collaborate to produce the full optimization benefit. Increasing the efficiency of a service by factoring out commonality when separate software modules are combined is the hallmark of solutions based on component technology standards.



Pooling is implemented at the level of the Resource Dispenser, and its particulars therefore vary with different data access technologies. Let's begin by examining ODBC's Resource Dispenser. With ODBC, the connections are pooled. If you have ODBC 3.5 or higher, you can enable pooling on a driver-by-driver basis by using the ODBC Data Source Administrator in control panel. There, you also can configure how long each connection should survive in the pool if it is not reacquired before being closed and discarded. If you have a version of ODBC lower than 3.5, you configure these settings in the registry. Given that a particular driver supports connection pooling and that pooling is enabled for that driver via the control panel or registry, an application must enable pooling by calling SQLSetEnvAttr with the SQL_ATTR_CONNECTION_POOLING attribute. Note that this is a process-level attribute. When running under MTS, COM+, or IIS (version 4 or higher), this call automatically is made for you.
Otherwise, connection pooling is turned off by default, regardless of the control panel and registry settings.



Version 3.0 with service pack 1 or higher of the ODBC driver manager will detect bad connections in the pool and will not return them to the caller when it requests a new connection. This is particularly important in high-availability clustered environments, where scheduled or unscheduled failover invalidates outstanding database connections. This should be transparent to clients, except for forcing them to close current connections and open new ones with the same attributes. Clustering would not work if clients repeatedly received broken connections to failed or shut-down servers from the pool. You also can configure a wait retry setting, which determines how long ODBC will wait before contacting a server again after that server has failed to respond.



Concurrency throttling is an obvious risk of any pooling strategy. Pools need to be locked when connections are added or removed, which might stall threads that try to create a connection. Therefore, ODBC maintains as many pools as there are processors on the server. When a thread needs a connection, ODBC goes to the first unlocked pool and begins a search for an entry with the proper attributes. If no such entry exists, a new one will be created in that pool. This is a reasonable compromise between facilitating concurrency and avoiding too much duplication of connection resources.



Now let's turn to OLE DB. The OLE DB Resource Dispenser pools data sources and one session per data source. (Thus the advice from Microsoft not to create more than one session, even if the provider supports doing so; it would work, but those additional sessions would not be pooled.) When discussing transparent OLE DB services in the section, "A Crash Course in OLE DB," I mentioned the conditions under which service components will be engaged, how clients must create providers if they want transparent services, and how clients can override which services will be layered on top of a provider. Resource pooling is only one such service. Refer to the discussion of transparent OLE DB services in that section to understand what you must do to turn on OLE DB resource pooling. The value to be bitwise combined with or bitwise excluded from the DBPROP_INIT_OLEDBSERVICES property of the data source before initializing it to override the provider's pooling setting is DBPROPVAL_OS_RESOURCEPOOLING.



As of Microsoft Data Access Components (MDAC) 2.1 with service pack 1 and a hotfix or later (including MDAC 2.1 with service pack 2 and no hotfix), you can configure how long unused connections will survive in the OLE DB resource pool. This is done via registry settings. A global setting applies to all providers. After selecting this setting, you can override it on a per-provider basis. The default value is 60 seconds. A wait retry setting is also supported, but with a twist on what ODBC has to offer—you can set an additional value to indicate the factor by which the waiting interval should increase between subsequent, unsuccessful retries. The default wait retry setting is 64 seconds. The default multiplication value is 2. OLE DB also will not return a bad connection from a pool if the provider can detect connection status. The provider signals this capability to the OLE DB service components with a property.



OLE DB does not support resource pooling on Windows 95 and 98. Apart from that, a number of conditions prevent a specific connection from being added to a pool when the client closes it, even if the client has met all other conditions for activating this transparent service. These are the conditions:




  • The client calls the Uninitialize method of the data source's IDBInitialize interface before releasing it. Calling Uninitialize will close the connection immediately and unconditionally.


  • If the client does set the data source's DBPROP_INIT_PROMPT property, it must be set to the value DBPROMPT_NOPROMPT. Setting this property to any other value will prevent the connection from being added to a pool because prompted connections are not allowed to be pooled. However, you can use the OLE DB initialization component's IDataInitialize::GetInitializationString method to retrieve the connection string from a prompted connection. (The value of DBPROP_INIT_PROMPT always is omitted from a connection string generated this way.) You then can manufacture more nonprompted connections that can be pooled from that connection string.


  • The client queries the data source for a provider-specific interface, calls IDBProperties::GetPropertyInfo, or establishes a connection point prior to calling IDBInitialize::Initialize. Performing any such operation on a data source before initializing it makes it unpoolable.





Unlike ODBC, the OLE DB Resource Dispenser does not necessarily hold its pools until the process terminates. If OLE DB is not running under COM+, MTS, or IIS (version 4 or higher), it jettisons all pools when the last service component is released. You therefore must hold onto an IDataInitialize or IDBPromptInitialize interface pointer for as long as you want pools to remain active in your application. If you are using ADO, you must keep at least one connection object around. This connection object does not necessarily have to remain open, but you must not release it. Things are much simpler if you are running under MTS, COM+, or IIS 4 or higher: OLE DB will hold all pools until your process terminates, just as ODBC does whenever connection pooling is turned on.



OLE DB's concurrency strategy is significantly different from that of ODBC. Instead of maintaining as many heterogeneous pools as there are processors, OLE DB maintains homogeneous pools on a per-processor basis. Each pool contains only data sources with the same attributes. Whenever a client produces a new attribute combination, OLE DB creates as many pools for data sources with that attribute combination as there are processors, plus one. On a single-processor machine, a request for a connection with a new attribute combination therefore results in two new pools. A user map determines which pools contain connections with the right attributes, and at least one of those pools is likely to be unlocked at any given moment. This strategy implies that OLE DB can block an entire pool when a server fails to respond to a request for the duration of the wait retry period. It also has an implication for application programs: because data sources remain associated with a particular pool, you must not change
their attributes after they have been initialized. (For example, don't change the DefaultDatabase property of the connection object once it is initialized.)



Pooling with OLE DB on Top of ODBC


What happens when you connect to your data source via the ODBC provider for OLE DB? Both technologies support pooling, but here they are layered on top of one another. So whose pooling services actually will be active?

The answer depends on how old your system software is. Prior to MDAC 2.1, pooling was disabled by default (via the registry) for the ODBC provider for OLE DB. An application that did not take any special steps to activate the OLE DB resource pooling service therefore used ODBC connection pooling (if it had been turned on), even though it interacted with the OLE DB provider interface.



With MDAC 2.1, the provider's default was changed to request the OLE DB resource pooling service. Applications therefore began using resource pooling instead of connection pooling as a result of this MDAC upgrade. Applications still can use ODBC connection pooling instead of OLE DB resource pooling, but for that to happen, they must turn off resource pooling with one of the mechanisms I described earlier.




Pooling is significantly easier to manage if COM+ services are engaged in a process. This is true for both ODBC and OLE DB data access. This is one reason I recommend that you install your COM+ business objects in the COM+ catalog. That way, you will not have to worry about whether ODBC connection pooling is turned on. And you will not need to deal with managing that single, special IDataInitialize or IDBPromptInitialize interface pointer or worry about holding onto an ADO connection object after it has been closed. Pooling like this is simpler, it's cleaner, and it will work whether or not your components run under IIS. Thus, your components become more reusable.



While the acquisition of a pooled resource is invisible to the application, there is a minor change to the programming model in a resource pooling environment. Besides having to be aware of the circumstances a data source can be pooled under, this change affects your ability to rely on session termination for database resource cleanup. In the past, programmers liked to have the RDBMS automatically remove certain temporary artifacts when it detected that they were no longer needed, since the only session that could have access to them was being closed. Such artifacts include temporary stored procedures for prepared SQL statements and temporary tables. In a pooling environment, the RDBMS might not clean up these artifacts for a very long time, as long as the connections in whose scope they were created still are drawn from the pool. This can lead to unacceptable bloat in the database. In a resource pooling environment, you therefore might have to take responsibility for destruction of temporary database
artifacts in your own application code.



No comments:

Post a Comment