Use a Local Database Sandbox
A significant challenge on many software development projects is
making changes to the database structure. Many projects I've observed
typically use one shared database, so when developers make changes
to this shared development database they can adversely affect others
on the team—causing each developer's private build to break (if their
tests are part of the build). If developers have their own local code
"sandbox" to isolate their coding changes from other developers,
wouldn't it be great if they had a "database sandbox" too?
Multiple Database Instances
You may not have the resources to get a database for each
developer. In this situation, you could assign each developer a
separate schema on a central database server or use one of the
freely available, lightweight, open source equivalent databases.
Furthermore, many of the more widely used RDBMSs provide
free developer versions.
Another important capability you gain by automating your database
integration is that everyone on the team will be able to create a
local instance of the database on their workstations. Every team member
can then create a database "sandbox" to make and test database
changes without affecting others. If your database integration is
scripted, creating a new database instance is a push-button affair; conversely,
if you don't automate your database integration, it is more difficult
to recreate your database and run tests on your workstation.
Figure 5-3 provides an illustration of each developer using a local
database instance.
Using automated database integration, you are able to get the latest
version of your database scripts along with your application source
code. Each developer is able to create a local instance of the database,
modify the version of the database on his workstation, test the
changes, and commit the changes back to the repository. These
changes will be integrated and tested with the rest of the software as
part of the CI system. When another developer refreshes her private
workspace with changes from the repository, the database changes are
Figure 5-3 Each developer uses a local database sandbox copied down to her workstation along with the other source code
changes, and her next private build will incorporate the changes in her local database instance.
Supporting Multiple Database Environments
The next logical step after creating a local database sandbox is
creating different database instances to support multiple database
environments. For example, you may need to create a database
that contains all of your migrated production data. Assuming there
are many records in this database, you probably don't want to
include it in your local development database. Usually, this will
only be the DML (data changes), not the DDL (create, alter, and
drop statements to the database). By automating your database
integration, you can modify build script parameters to include the
data to support these environments. This way, you can execute
one command to provide data for different database environments.
The same goes for versions. You may want to test new
code against a prior version of the database. Use automated database
integration to provide this capability with a "push of the Integrate
button."
The next section identifies the reasons and approach for using a
version control repository for database integration.
Use a Version Control Repository to Share
Database Assets
Sharing your database integration scripts is a best practice, plain and simple.
All software assets need to be in a version control repository, and this
includes all database assets. Such assets might include the following:
- DDL to drop and create tables and views, including constraints
and triggers
- Stored procedures and functions
- Entity relationship diagrams
- Test data for different environments
- Specific database configurations
For numerous project scenarios, you should be able to recreate
your entire database from "scratch" using the scripts in your version
control repository (for large data sets, you may store data export
scripts rather than row-by-row DML scripts). Once you've applied all
your database assets to the version control repository, you'll have a
history of all of the database changes, so you can run prior versions of
the database with the latest code (or with prior versions of the code as
well). This also reduces the gridlock on projects when all the developers
need to go to the DBA for everything. Once database assets are in
one place, you can make a change to a database column, perform a private
build on your machine, commit it to the version control system,
and know you will receive feedback after the integration build is run.
Sometimes during development the database will need to undergo
large-scale changes. In most cases, these changes will require the
expertise of several people on the team and a longer duration to complete.
When such situations arise, it is best to create a task branch5 to
commit the changes back into the version control repository rather
than break the mainline and slow the activity of the rest of the team.
Without CDBI, often the DBA will be making these large-scale database
alterations, and he may be less suited to make all the changes at
once to the database, dependent application source code, associated
test code, and shared scripts because he may lack the knowledge of the
source code that developers are writing.
Just as you have a consistent directory structure for your source
code, you'll want to do the same for your database. Define the location
of database assets—probably somewhere in the implementation/construction
directory where your source code is located. In your database
directory, define subdirectories for each of the database entity types
and environments. Listing 5-7 shows a directory structure for an
implementation directory (using a MySQL database).
Listing 5-7 Sample Implementation Directory
____________________________________________________________________________
implementation
bin
build
filtered-sql
config
properties
xml
database
migration
lib
mysql
src
tests
tools
mysql
Just as with your source code, choose a directory structure that
works well for you, one that clearly defines the entities while making it
adaptable to changes.
____________________________________________________________________________
Directory Structure and Script Maintenance
In the beginning, you may find that the directory structure is less
important, but beware of making frequent directory structure
changes, as you'll spend additional time updating your scripts to
account for these changes.
____________________________________________________________________________
Now that you've automated your database integration activities
and are checking them into the version control repository to share with
others on the team, let's make the process continuous so that it is run
with every change to the software.
Continuous Database Integration
This is where the "rubber meets the road." The reason to automate,
share, and build the database integration processes is so you can make
these processes continuous. Using CDBI, your database and your
source code are synchronized many times a day. Once you commit
your database changes to your version control repository, the CI system
proceeds like this: It gets a complete copy of the system source
code, including your database data definition and manipulation scripts;
recreates your database from the source; integrates your other source
code; and then runs through your automated tests and inspections to
ensure that the change(s) didn't introduce defects into your system's
code base. Figure 5-4 demonstrates how the changes made by each
developer are synchronized with the integration build based on the
mainline in the version control repository.
Figure 5-4 shows that the changes that were made at 10 a.m. (by
Mike) and the changes that were made at 10:15 AM (by Sandy) are included in the integration build that occurred at 10:30 AM. The integration
build machine uses a single source point, provided by the version control repository, to synchronize and test changes as a part of the integration build.
FIGURE 5-4 Single source for database changes
Once you have automated your database integration and incorporated
it into your build scripts, making it run continuously is simple.
Your database integration tasks, along with the rest of your build,
should be executed using one command (such as an Ant/NAnt target).
To run your database integration tasks continuously, you only need to
make sure these database integration build task commands are executed
as a part of the automated build.
Give Developers the Capability to Modify the Database
Each developer should have the capability to modify any of the database
scripts. This doesn't mean that every developer will modify these
database scripts, because not every developer will have the necessary
database expertise. Because each developer will have his own database
sandbox, each can modify the local database and then commit the
changes to the version control repository. This will reduce the DBA
bottleneck and empower developers to make necessary changes. The
DBA can evaluate the new changes to the repository by reviewing the
integration builds or working with the developers if the build breaks.
As the adage goes, with this additional authority comes additional
responsibility. Changes to the underlying database structure can have
far-reaching impacts on the system. The developer who makes changes
to the database structure must assume the responsibility for thorough
testing before committing these changes. We feel it is far more likely
in today's industry for a developer to have a knowledge of databases
and database scripting—and the DBA is still there to "oversee" what
changes, if any, move into the system.
The Team Focuses Together on Fixing
Broken Builds
Since you treat the database the same as the other source code, you
may experience broken builds because of a database error. Of course,
errors may occur in any part of your build: source code, deployment,
tests, inspections, as well as the database. When using CDBI, database
integration is just another part of the build, so the playing field is leveled:
Whatever breaks the build, the priority is to fix it. The payoff
comes after this; the fix is now integrated, and that particular issue is
prevented from recurring.
Make the DBA Part of the Development Team
Break down barriers and make members of your database team a part
of the development team. You may already be doing this, but all too
often there is a "wall" between the DBA and the software developers.
As mentioned earlier, treat your database code and your other source
code in the same manner. The same goes for the people on your team.
This is probably the most controversial of the CDBI practices. We've
worked on teams that have used CDBI with the DBA on the development
team, and we've also seen the more traditional approach with the
DBA on another team, the database team. CDBI worked in both environments,
but it worked significantly better when the DBA was a part
of the team.
Some people ask, "If the DBA is no longer dropping and recreating
tables, creating test environments, and granting access, then what
is she doing?" The simple answer is, "Now she can do her job!"—
spending more time on higher-level tasks such as improving database
performance, improving SQL performance, data normalization, and
other value-added improvements.
Database Integration and the Integrate Button
The rest of this book covers topics concerning the additional parts of
the Integrate button: continuous testing, inspection, deployment, and
feedback. This section covers some specific issues concerning these
practices when it comes to database integration.
Testing
Just as with source code, you'll want to test your database. We cover
testing in detail in Chapter 6. There are tools you can use for database specific
testing such as PL/Unit, OUnit for Oracle, and SQLUnit. Your
database may contain behavior in stored procedures or functions that
needs to be tested and executed as a part of the build script, just like
the behavior of your other source code. You may also want to test the
interactions of constraints, triggers, and transactional boundaries by
performing application security data tests.
Inspection
As with your other source code, you should be running inspections on
your data source. This includes not just your DDL, but reference and
testing data as well. There are tools you can incorporate and run in
your automated build process so that you do not need to run these
inspections manually. Here are a few ideas for inspections on your
database.
- Ensure efficient data performance by running set explain
against your project's rules to target optimizations for your SQL
queries.
- Analyze data to ensure data integrity.
- Use a SQL recorder tool to determine which queries are being
run the most. These queries might be candidates for stored procedures.
- Ensure adherence to data naming conventions and standards.
Deployment
 |
| About the authors: |
| Paul Duvall is the CEO of Stelligent, a firm that helps clients create production-ready software every day. Duvall contributed a chapter to No Fluff, Just Stuff Anthology: The 2007 Edition. Stephen M. Matyas III is the vice president of AutomateIt, a service branch of 5AM Solutions. Matyas is an expert in applied software engineering and Java. Andrew Glover is the president of Stelligent Incorporated and contributes to online publications such as IBM's developerWorks and Oreilly's ONJava and ONLamp portals. |
|
|
 |
 |
As we have indicated, the goal of CDBI is to treat your database
source code and other source code in the same manner. The Continuous
Deployment process will deploy your database to your development
and test database instances just as it deploys your other code to
its different environments (e.g., application servers). If you need to
migrate from one database to another, you will be able to better test the
migration process by running through the process on a continuous or
scheduled basis.
Feedback and Documentation
When you incorporate continuous feedback and CDBI into your CI
system, you will find out if your build failed because of the latest database
changes. By default, most CI systems send the build status to the
people who last applied changes to the version control repository. Just
like with the source code, the CI system notifies those who made database
changes quickly so that they can make the necessary fixes to the
database.
Documentation is about communication, and there is much about
the database you'll want to communicate to other project members or
your customer. Your Entity Relationship Diagram (ERD) and data dictionary
are excellent candidates for generating as a part of your continuous
build process, perhaps as a secondary build (described in Chapter 4).
Summary
This chapter demonstrated that database assets are the same as
other source code. Therefore, the same principles apply.
Table 5-2 summarizes the practices covered in this chapter.
Table 5-2 CI Practices Discussed in This Chapter
____________________________________________________________________________
| Practices |
Description |
Automate database integration
|
Rebuild your database and insert test data as part of
your automated build. |
Use a local database sandbox
|
All developers should have their own copy of the
database that can be generated via SQL scripts. This
can be on their workstations or even shared on a
development server—as long as all developers have
their own copy on this shared server. |
Use a version controlrepository to share
database assets
|
Commit your DDL and DML scripts to your version
control system so that other developers can run the
same scripts to rebuild the database and test data. |
Give developers the capability to modify
the database
|
Avoid the DBA bottleneck that occurs when database
changes are restricted to just one or two people. Give
developers the capability to modify the DDL and DML
scripts and commit them to the version control
repository. |
Make the DBA part of the development
team
|
Be sure the DBA can run the same automated
build—which includes a database rebuild that other
developers run—to ensure consistency. By making
the DBA a part of the development team, the shared
experiences can benefit both the database and the
development teams. |
____________________________________________________________________________
Let's see how Julie, Scott, and Nona are doing now that they're
using CDBI.
Nona (Developer): I need to refresh my test data. What do I need to
do?
Scott (Technical Lead): Just run ant db:refresh from the command
line. Before you do that, get the latest changes out of Subversion by
typing ant scm:update, because I made a few changes to the USER
database table and the source code that uses this change.
Julie (DBA): Do you guys need any help?
Scott: Yeah, we are having a performance problem on one of the queries.
Do you have time to look at it? Also, I think we need to denormalize
the PRODUCT table. Can you model the table changes, prototype
the DDL changes, and set up a code branch so Nona can modify her
code for your changes? When you two are satisfied with the changes,
merge the branch and commit it to Subversion so that they run as part
of the integration build. Thanks, Julie.
Nona: . . . Sure, Scott. Should we use the test database rather than the
development database?
Scott: Yeah, just run ant –Denvironment=test db:refresh.
The developers and DBAs, who often perform roles that seem opposing
or distant, are now continually working toward the same goal, and
both are accomplishing more of their tasks that require analysis or
design.
Questions
These questions can help you determine your level of automation and
continuous database integration.

Automated database integration
How to automate database integration
Using continuous database integration and a database sandbox
This chapter is an excerpt from the book, Continuous Integration: Improving Software Quality and Reducing Risk, authored by Paul Duvall, Steve Matyas, Andrew Glover. Published by Addison-Wesley Professional, June 2007. Copyright 2007. Pearson Education Inc., ISBN 0321336380. For an additional sample chapter, please visit Addison-Wesley. Safari Books Online subscribers can access the book here.