
INTRODUCTION

Objectivity/DB supports high availability applications with a patented schema evolution mechanism that allows developers to modify the structure of persistent data without requiring the database to be taken off-line. This increases the flexibility of the application development process, reducing the technical and business risks associated with modifying deployed applications.

As the requirements of a database application evolve over time, changes are made to the definition of the physical data structure, or "schema", of the data elements stored within the database. Schema evolution is the process of redefining the persistent datatypes in a database application. Data conversion is the step in the process that converts the contents of the database from the old schema to the new schema.

Objectivity/DB provides a mechanism for implementing changes to existing applications with large, populated databases. This ability to modify database schema "on-the-fly" provides developers the option of making data structure changes to deployed applications that would not otherwise be feasible. This reduces the risk of application deployment and makes project management more flexible.

The rest of this document describes Objectivity/DB schema evolution in more detail. A general discussion of the restrictions placed on schema evolution by relational technology is followed by a description, including examples, of Objectivity/DB's schema evolution capabilities.

RELATIONAL SCHEMA EVOLUTION

Relational technology provides little support for schema evolution and data conversion, offering, at best, the ability to add a statically initialized column to a table. The bulk of the work for more complex schema changes is the conversion of the existing data, which must be performed after the database has been taken off-line.

Application Changes

Data is usually stored in a relational database in a normalized format to provide a common view of the data across multiple applications. Persistent data is defined strictly as rows and columns within tables. The translation of data from the logical data structures of the application into the tables that form the schema of the database is left to the application.

As a result, relational database applications must always be conscious of the tables, field definitions, alternate views of tables, and, most importantly, the table joins that must be performed during normal execution. Since the database schema is defined by SQL, the same mechanism that provides access to the database, every point in the application that accesses the database must be altered to reflect a change in the external data structure.

Of course, it is possible to encapsulate the physical data structure in a relational database application by providing alternate views of tables. The proper use of modular programming techniques can isolate knowledge of the physical structure of the database. However, object-oriented applications, built on object databases, perform this encapsulation as a natural part of the development process, rather than as an extra step in the design, programming, and administration of the database.

Data Conversion

Assume for a moment that the application is sufficiently modular to be changed with relatively small effort. What about the existing database, already filled with operational data?

The effort required to convert the data in an existing relational database to a new schema is a familiar one. Traditionally, changing the definition of a table in a relational database requires shutting down the database, converting its contents, updating the applications to use the new table definition, redistributing the application, restarting the database, and allowing the users back on the system.

Making a copy of some or all of the database and doing the job offline allows the users to remain operational, but it raises the problems of disk space and data integrity. The converted data will be out of sync due to normal operations that occur during the conversion process, requiring some form of update conflict resolution to be performed.

The key issue with changing the schema is that the on-line database must be converted at some point in time. There is no way to avoid inconveniencing the end-users during this phase of the data conversion process.

While the technique of schema evolution and data conversion described above is also applicable to some object databases, it is possible for an object database to provide assistance with the schema evolution process. In particular, the conversion of previously stored data is a process that is facilitated by Objectivity/DB's schema evolution capabilities.

A Relational Example

Consider an example in which the performance of a relational database application is found to be limited by the normalization of the data model.

The administrators discover that one of the original assumptions in their system design is false. Table A was expected to be accessed independently of Table B most of the time, but actual usage shows that A is only read after B is accessed. This lookup of A for every B is performed through a join that is repeated over and over, causing extremely poor performance. Denormalizing the physical implementation of the data model, merging A information into each record of B, would improve performance by eliminating the unnecessary join.

RDBMSs do not provide support for such schema modifications. To make the changes indicated in the example above, the development team must perform some variation of the following general steps:

  • Modify the application to use the new schema

  • Write a monolithic Upgrade Application that performs three steps:

1. Reads the old data from the database DB
2. Converts each record from the old schema to the new schema
3. Writes the database with the new schema into DB'

  • Kick all the users off and shut down the database

  • Perform the monolithic data conversion

  • Distribute the new version of the application

  • Let users run the new application

The Problems with the Relational Approach

The problems with the monolithic data conversion described above are the lack of database availability during the data conversion process, excess disk space requirements, and general risk involved.

Availability

User inconvenience can be reduced by the steps outlined above, but it cannot be completely removed. The data conversion takes a finite amount of time. The larger the database, the longer it is unavailable during data conversion.

This puts a great deal of pressure on the development staff to make the conversion process go smoothly. It also requires access to the database when it is not being heavily used so that it can be shut down without disrupting operations. This is not possible for many applications, since they require the database to be available continuously.

Disk Space

During data conversion itself, the data is copied from one place to another, meaning that there will be two images of the database in storage. This could, in the worst case, double the disk space requirements.

If the entire database were copied during the conversion, the disk requirements would double. If the tables are copied back into the same database, then obsolete versions of tables may remain that can be archived or deleted, as appropriate.

Disk space is a particular issue for schema evolution in object databases, since object databases are able to hold significantly more operationally useful data than relational databases.

Risk

Monolithic data conversion adds tremendous risk to the schema evolution process in the form of business opportunities lost while the database is being converted. The business costs associated with the database being unavailable are entirely application dependent, but can be considerable in strategic applications.

The primary technical risk involves data integrity. At some point, the users will all change over to a new version of the end-user application to run against a new version of the database that was created with a separate upgrade application. The possibility of corrupting the database with two applications is greater than with a single application.

SCHEMA EVOLUTION WITH OBJECTIVITY/DB

Objectivity/DB provides a robust schema evolution mechanism that handles most schema changes quite simply, giving the developer control over the timing and the granularity of the data conversion process. A developer is able to alter the schema of a deployed application and convert the existing database without forcing end-users off the database during a lengthy off-line, monolithic data conversion process.

Rather than having to convert the entire database at once, only the objects whose definitions have changed are candidates for data conversion. Those objects are referred to as "affected" objects. They may be converted one at a time, or in groups of various sizes. When converted, affected objects are written back into the space in which they existed before, allocating or freeing incremental disk space according to the type of change being made to the schema.

During the data conversion process, and in stark contrast to relational databases, the database remains on-line for the business function it supports, minimizing business risk. The technical risk is also minimized because in most cases the "conversion program" and the "end-user application" are the same. Data is converted automatically by Objectivity/DB in the end-user application, which greatly reduces the risk of programming errors.

The remainder of this discussion revolves around the type of schema changes that can be made, and how the timing and granularity of the data conversion is controlled by the developer.

Types of Schema Change

Many schema changes are possible, ranging from purely logical changes (such as changing the name of a data member) to inheritance changes. The basic types of schema changes supported by Objectivity/DB are:

  • Logical changes

  • Class member changes

  • Association and reference changes

  • Class changes

  • Inheritance changes

Basic schema changes of each of these types can be handled automatically by Objectivity/DB, with the optional use of Conversion Functions as required.

Automatic Conversion

Objectivity/DB handles many types of schema changes automatically, such as the conversion of one primitive datatype to another, the addition or deletion of class members, and the modification of the access control of a base class.

Conversion Functions

Conversion Functions are developer defined call-back functions that provide an opportunity for application dependent processing to be applied at the point of data conversion. The Conversion Function is executed by the database engine during the automatic conversion of affected objects. Each time an object of the old schema is accessed, the Conversion Function is executed. When objects of the new schema are accessed, the Conversion Function is not executed.

Conversion Modes

After the type of schema change has been specified, the issues of timing and granularity of data conversion must be addressed. In other words, we have to decide when to convert the existing data, and how much of it to convert at a time.

Relational databases, and some object databases, only provide monolithic data conversion. In object database terms this is called "Immediate Mode" conversion, because all the data has to be converted immediately before any user application can be given access to the database. This makes the database completely unavailable to the users.

By comparison, Objectivity/DB does not limit access to the database during data conversion.

In addition to Immediate Mode, Objectivity/DB offers alternative conversion modes that allow the application requirements to dictate the timing and granularity of data conversion. Data conversion can either be deferred until objects are physically accessed by an application (Deferred Mode) or performed when the developer demands (On-Demand Mode); neither mode limits database availability. Even Immediate Mode data conversion leaves the database available, because only the affected objects are made unavailable.

Mode        Granularity
----------  ------------------
Deferred    Object
On-Demand   Container
            Database
            Federated Database
Immediate   Federated Database

Deferred Mode Conversion

Deferred Mode Conversion leaves the affected objects in the database in the old form until they are required for use by the end-user application. Objectivity/DB converts each affected object as it is used in the course of normal end-user operations.

Deferred Mode, which encompasses the majority of schema changes, is the easiest form of conversion from the developer's standpoint: the end-user application is simply modified to use the new schema. The process of changing the schema in the application source code will automatically set the program up to convert affected objects as they are encountered; i.e. in Deferred Mode. If a Conversion Function is required to augment the data conversion, it would be added to the end-user application.

The end-user simply receives a new version of the application and operates it as before. The conversion takes place in the database engine automatically. There is no need to stop all the users from using the system for an extended period of time, because down-time for an individual end-user is limited to the amount of time it takes them to restart their application.

On-Demand Mode Conversion

On-Demand Mode Conversion performs the same conversions as Deferred Mode, described above, but applies them to groups of objects explicitly indicated by the application developer at chosen points in an application.

On-Demand Mode is implemented by calling a member function on one of the data storage constructs in the end-user application. The function call would be placed at the point where the application encounters new containers, databases, or federated databases.

Objects, and groups of objects, are flagged as they are converted, so that each affected object is only converted once. Unless On-Demand conversion is used for the entire federation, it is likely that there will be some unconverted objects in the database. This is not a problem, since they will simply be converted when the end-user application tries to use them.

Immediate Mode Conversion

Immediate Mode data conversion allows schema evolution to be performed despite the presence of unidirectional associations and inherited references in the schema. Objectivity/DB offers a flexible implementation of Immediate Mode conversion that leaves the database on-line, making only the affected objects unavailable during data conversion. Note that Immediate Mode conversion is only required for two specific types of schema change: replacing base classes and deleting classes.

OTHER SCHEMA EVOLUTION ISSUES

Multiple Changes Over Time

Objectivity/DB can keep track of an arbitrary number of Deferred Mode schema changes to the same class. This becomes an important issue when multiple changes are made to the schema over time, and not all of the objects in the database have been converted.

For example, assume that a particular class is changed two or three times using Deferred Mode data conversion. The database will contain both converted and unconverted objects midway through a Deferred Mode conversion. Objectivity/DB allows subsequent schema evolution processes to be started even though not all the data has been converted from the earlier schema changes. As objects are accessed, they will be converted to the newest schema automatically. This example assumes there are no user-defined conversions in use; only one Conversion Function can be registered per class per application.

Upgrade Applications

Upgrade Applications are typically small programs that make one or more calls to the On-Demand Mode member functions to convert objects in containers, databases, or across the entire federation. Such an Upgrade Application can usually be run in parallel with the new version of the end-user application, with the knowledge that when it completes, all affected objects will have been converted.

Of course, it is not possible to anticipate every type of schema change. Objectivity/DB schema evolution supports unanticipated schema changes through the sequential execution of multiple schema changes. Some of these multiple step schema changes will require an Upgrade Application to explicitly traverse all of the affected objects prior to moving on to the next step in the schema evolution.

In a similar fashion, schema changes that are complex in nature, such as those requiring Immediate Mode conversion, depend on application-specific information, provided in an Upgrade Application, to apply integrity constraints during the schema evolution process.

SCHEMA EVOLUTION SCENARIOS

Application Distribution

Objectivity/DB In Centralized Client/Server Applications

Take the example of a repository built using Objectivity/DB, where the end-users start and stop the client application each day. In this scenario, the client applications are able to be redistributed as a normal part of operations. In an application in which Objectivity/DB resides in the client workstations, performing schema evolution simply requires updating the client workstation applications.

The only time that the database would be "unavailable" is during the brief moment when the client applications are being restarted.

Objectivity/DB In Server Application Only

Removing Objectivity/DB from the client workstations changes the situation.

This might be an advanced Web server application built with Objectivity/DB, where the "client application" is an off-the-shelf Web browser. Since the client and server processes are effectively decoupled through the use of HTML, it is unnecessary to redistribute the client portion of the application. The only time that the Web site would be unavailable is during the restart of the Web server application.

One way to prevent even this minor interruption of service is the Objectivity/DB Data Replication Option, which allows an individual server to be taken off-line for service, and brought back on-line again, without disrupting access to replicated data in a federated database.

SCHEMA EVOLUTION EXAMPLES

Adding Data Members

This example is the classic situation where a new piece of information needs to be maintained in an object.

The steps to performing the schema evolution are quite simple.

  • Change the schema and application to add the new data member.

  • Recompile, redistribute, and run the application as before.

The conversion of objects residing in the database will be deferred until they are accessed in the normal operation of the application. Objectivity/DB can automatically initialize a new data member to a predefined value. If the initial value must be calculated, a Conversion Function is required.

Conversion Of Primitive Datatypes

In this example, the number of unique BufferIDs required was underestimated. Converting BufferID from a short to a long will solve the problem. The physical conversion of the object is shown below.

The steps to performing the schema evolution are quite simple.

  • Change the schema and application to use the new data type.

  • Recompile, redistribute, and run the application as before.

Logical Schema Change

In this example, no change to the physical structure of the persistent objects is required. Objects of a persistent class each have an association to another object, and new requirements dictate that the visibility of the association be changed from private to public. Changing the visibility of an association, or of a data member, requires only a logical schema change.

The steps to performing such an operation are as follows:

  1. Change the schema to reflect the different access.
  2. Recompile, redistribute, and run the application as before.

Modifying Inheritance

Objectivity/DB's schema evolution support is not limited to modifying the contents of a class. It is also possible to modify the inheritance relationships between existing classes in Objectivity/DB.

For example, adding a non-persistent base class to a persistent class is a schema change that can be implemented in Deferred Mode. The same is true for removing a non-persistent base class. In this example, we also wish to ensure that all the objects are converted in a finite amount of time.

The steps are the same as in the earlier examples:

  • Change the end-user application to add or delete the base class.

  • Recompile, redistribute, and run the user application as before.

Over time, most of the affected objects are likely to be converted. In order to force the remaining affected objects to be converted, an Upgrade Application can be written that calls the function to convert the remaining affected objects in the federated database. If Conversion Functions are used in the end-user application, they should also be used in the Upgrade Application.

  • Create an Upgrade Application that also converts the objects in On-Demand mode against the entire Federated Database.

  • Run the Upgrade Application to convert the affected objects simultaneously with the execution of the end-user applications.

Managing Deployed Applications

Providing mechanisms that allow the migration of objects for changes in the database schema meets the needs of maintaining a deployed database, but how do you manage deploying the new schema? The traditional requirement that you build the new schema on the same platform as the existing database is often a problem for embedded applications that are deployed in a minimal environment. What is required is the ability to make all changes to the schema in the development environment and then simply take the revised schema to the field to marry it with the existing data. Two tools, to dump and load schemas, are provided for just this purpose. These tools can also be used to allow independent development of database changes, which can then be merged into one schema.

CONCLUSION

Schema evolution is a key requirement in high availability applications. Objectivity/DB provides powerful and flexible schema evolution capabilities which clearly demonstrate our support of mission-critical application environments in which the database must remain available at all times.

Not only are application developers able to make schema changes that were not possible before, but they are able to do it easily with Objectivity/DB. The flexibility of Deferred and On-Demand Mode data conversion allows the developer to select the timing and granularity of the data conversion appropriate for the application.

Objectivity/DB's support of on-line data conversion minimizes the risk traditionally associated with making schema changes in deployed applications. Application developers are better able to plan incremental application modifications, reducing the risk of being locked into a deployed application that might be inadequate to meet future needs.

 
Copyright © Objectivity, Inc. 2000-2004. All Rights Reserved.