Facilitating Schema Evolution with Automatic Program Transformations

 

Michael Werner

College of Computer Science

Northeastern University

werner@ccs.neu.edu

Experience shows that even after programs have been designed, built, and tested, change is more the norm than the exception. Consider a shared object-oriented persistent system built to serve the business needs of a company. Changes such as additions of classes or class members, renamings, and retypings may be needed to introduce new applications, enhance existing ones, or integrate separately built systems. Additional changes, such as factoring out duplicate code, rearranging the class hierarchy, and delegating responsibilities to other classes, may be made for efficiency or clarity of design. The primary focus of this research is to provide a mechanism for making these changes easily and safely. Certain changes must be rejected because they would introduce subtle errors that might undermine the compilability or the behavior of the system. The theoretical interest of this part of the research is to determine, for each kind of transformation, the preconditions required to preserve safety. Adherence to weak preconditions preserves type soundness; satisfying strong preconditions additionally guarantees behavior preservation. The source programs are assumed to be written in Java, so the preconditions are consequences of the language rules of Java.
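
The distinction between weak and strong preconditions can be illustrated with a small sketch for one transformation, renaming a method. This is only an illustration under stated assumptions, not the precondition set used by the research: the SchemaClass model and the RenameMethodPreconditions helper are hypothetical, and methods are identified by name alone (parameter lists and overloading are ignored). The weak check rejects a rename that would declare the same method twice in one class, which Java would not compile. The strong check additionally rejects a rename whose new name is already declared in an ancestor or descendant, since the renamed method would then start overriding, or being overridden by, an unrelated method and dynamic dispatch could change behavior.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

// Hypothetical, simplified schema model: a class knows its superclass,
// its subclasses, and the names of the methods it declares.
final class SchemaClass {
    final String name;
    final SchemaClass superclass;              // null for a root class
    final Set<String> methods = new HashSet<>();
    final List<SchemaClass> subclasses = new ArrayList<>();

    SchemaClass(String name, SchemaClass superclass, String... methods) {
        this.name = name;
        this.superclass = superclass;
        this.methods.addAll(Arrays.asList(methods));
        if (superclass != null) {
            superclass.subclasses.add(this);
        }
    }
}

// Illustrative precondition checks for "rename method oldName to newName in c".
final class RenameMethodPreconditions {

    // Weak precondition (assumed): the method exists and the new name does not
    // clash with a method already declared in the same class.
    static boolean weak(SchemaClass c, String oldName, String newName) {
        return c.methods.contains(oldName) && !c.methods.contains(newName);
    }

    // Strong precondition (assumed): the weak one holds, and no ancestor or
    // descendant declares newName, so no accidental overriding is introduced.
    static boolean strong(SchemaClass c, String oldName, String newName) {
        if (!weak(c, oldName, newName)) {
            return false;
        }
        for (SchemaClass s = c.superclass; s != null; s = s.superclass) {
            if (s.methods.contains(newName)) {
                return false;
            }
        }
        return !declaredBelow(c, newName);
    }

    // True if any direct or indirect subclass of c declares the given method name.
    private static boolean declaredBelow(SchemaClass c, String methodName) {
        for (SchemaClass sub : c.subclasses) {
            if (sub.methods.contains(methodName) || declaredBelow(sub, methodName)) {
                return true;
            }
        }
        return false;
    }
}
```

For example, if Circle extends Shape, Shape declares draw, and Circle declares render, then renaming Circle.render to draw passes the weak check but fails the strong one, because the renamed method would silently override Shape.draw.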

A prototype tool, STP (Schema Transformation Processor), demonstrates the feasibility of this approach. With STP, a designer can reverse engineer a set of Java source programs to recover a design that is shown visually as a graph whose nodes represent classes and whose arcs represent IS-A and HAS-A links. A language called Change Specification Language (CSL) is introduced for describing schema changes. CSL contains both low-level primitives for simple changes, such as renaming a class, and high-level primitives for broader changes, such as factoring out common properties. High-level primitives are expanded into sequences of low-level ones at run time. Program transformation is accomplished by parsing the source programs into an abstract syntax tree, then visiting the tree with transformation visitor objects, which update the code.
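
The visitor-based transformation step can be sketched roughly as follows. This is a minimal illustration, not STP's implementation: the node types (ClassDecl, FieldDecl), the AstVisitor interface, and RenameClassVisitor are hypothetical names, and the tree is reduced to class declarations with typed fields. The visitor corresponds to one low-level change, renaming a class, and updates every place in the tree where the old name appears as a declaration, a superclass (IS-A), or a field type (HAS-A).

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical visitor interface over a deliberately tiny AST.
interface AstVisitor {
    void visit(ClassDecl c);
    void visit(FieldDecl f);
}

abstract class AstNode {
    abstract void accept(AstVisitor v);
}

// A field declaration such as "Person manager;" (a HAS-A link).
final class FieldDecl extends AstNode {
    String typeName;
    final String fieldName;

    FieldDecl(String typeName, String fieldName) {
        this.typeName = typeName;
        this.fieldName = fieldName;
    }

    @Override
    void accept(AstVisitor v) {
        v.visit(this);
    }
}

// A class declaration with an optional superclass (an IS-A link) and its fields.
final class ClassDecl extends AstNode {
    String name;
    String superName;                          // null if no explicit superclass
    final List<FieldDecl> fields = new ArrayList<>();

    ClassDecl(String name, String superName) {
        this.name = name;
        this.superName = superName;
    }

    @Override
    void accept(AstVisitor v) {
        v.visit(this);
        for (FieldDecl f : fields) {
            f.accept(v);                       // the traversal lives in the nodes
        }
    }
}

// Transformation visitor for the low-level change "rename class oldName to newName".
final class RenameClassVisitor implements AstVisitor {
    private final String oldName;
    private final String newName;

    RenameClassVisitor(String oldName, String newName) {
        this.oldName = oldName;
        this.newName = newName;
    }

    // Returns newName if n is the class being renamed; otherwise n unchanged (null stays null).
    private String rename(String n) {
        return oldName.equals(n) ? newName : n;
    }

    @Override
    public void visit(ClassDecl c) {
        c.name = rename(c.name);
        c.superName = rename(c.superName);
    }

    @Override
    public void visit(FieldDecl f) {
        f.typeName = rename(f.typeName);
    }
}
```

Applying new RenameClassVisitor("Person", "Staff") to every class declaration in a program would rewrite both "extends Person" clauses and fields of type Person, leaving all other names untouched. A real tool would also have to cover method signatures, local variables, casts, and constructor calls.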

A second focus of this research is itineraries. A running application is envisaged in terms of its navigation along certain HAS-A and IS-A links. The term itinerary, taken from the travel industry, is used as a metaphor for describing this navigation. STP has a mode that makes it easy to specify an itinerary by clicking on the required links. The itinerary is then automatically attached to the source programs by augmenting the classes involved with takeoff and land methods. These methods take a Visitor object as a parameter. In accordance with the Visitor design pattern, the code that supports navigation is cleanly separated from the code that does work at the visited classes. A skeleton Visitor class is generated for each itinerary. The code for a specialized task is encapsulated in one of its concrete subclasses. By creating additional subclasses, new functionality can easily be added to existing itineraries without disturbing the underlying classes.
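
One way the generated code might look is sketched below. The exact roles of takeoff and land are not spelled out here, so the sketch assumes that takeoff starts the traversal at the itinerary's first class and that land is invoked on each object reached along a selected link. The itinerary (Firm to Division to Worker), the class names, and the atFirm/atDivision/atWorker hooks are all hypothetical.

```java
import java.util.ArrayList;
import java.util.List;

// Skeleton visitor for a hypothetical "payroll" itinerary. Every hook is a no-op,
// so concrete subclasses override only the stops they care about.
abstract class PayrollVisitor {
    void atFirm(Firm f) {}
    void atDivision(Division d) {}
    void atWorker(Worker w) {}
}

final class Worker {
    final long salary;

    Worker(long salary) {
        this.salary = salary;
    }

    // Assumed meaning of land: the traversal has arrived at this object.
    void land(PayrollVisitor v) {
        v.atWorker(this);
    }
}

final class Division {
    final List<Worker> workers = new ArrayList<>();

    void land(PayrollVisitor v) {
        v.atDivision(this);
        for (Worker w : workers) {
            w.land(v);                         // follow the HAS-A link to Worker
        }
    }
}

final class Firm {
    final List<Division> divisions = new ArrayList<>();

    // Assumed meaning of takeoff: begin the itinerary at its starting class.
    void takeoff(PayrollVisitor v) {
        v.atFirm(this);
        for (Division d : divisions) {
            d.land(v);                         // follow the HAS-A link to Division
        }
    }
}

// A concrete subclass holding one specialized task: summing salaries.
final class TotalSalaryVisitor extends PayrollVisitor {
    long total;

    @Override
    void atWorker(Worker w) {
        total += w.salary;
    }
}
```

Calling firm.takeoff(new TotalSalaryVisitor()) walks the itinerary and accumulates the total. A second task, such as counting divisions, would be another PayrollVisitor subclass; Firm, Division, and Worker would not need to change.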

A set of itineraries constitutes a subgraph of the graph representing the complete system. It is analogous to a view in a relational database. It is envisaged that by granting privileges on itineraries, access to a shared system can be limited on a need-to-know basis. By serving as targets for granting privileges, itineraries can play a role in securing shared object systems.