Relational databases are the long-term memory of an organization;
when they are fragmented or hard to access,
the organization can appear schizophrenic to outsiders.
Experience has made clear that the databases
which provide the long-term memory of an organization
are themselves very long-lived.
It is appropriate, therefore, in considering their design,
construction, enhancement, and administration,
to consider long-term as well as short-term requirements.
It is like driving a car: most of our attention should be focused on the
road just ahead,
but we want to keep in mind the road to come as well.
The great problem in trying to design with an eye
on long-term requirements
is that we frequently can't be sure what those
requirements will be.
We often have some reasonable ideas
as to what those requirements are likely to be,
but in practice competition, unexpected successes
and failures, changes of strategic direction,
mergers and divestitures,
new laws, and the like
will invalidate many of those ideas.
How are we to get a design which
satisfies today's requirements
while still being able to adapt to
tomorrow's surprises?
The first rule is to make the design,
as nearly as possible,
reflect the nouns and verbs used
in the industry.
These generally do not change much with time.
For instance,
if the client talks about "patients" and "visits",
create tables called "patients" and "visits".
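A first-cut schema in this spirit might look like the sketch below.
(SQLite via Python is used here purely for illustration, and the column names are hypothetical;
the table names, though, come straight from the client's own vocabulary.)

    import sqlite3

    # Table names mirror the client's nouns: "patients" and "visits".
    conn = sqlite3.connect("clinic.db")
    conn.executescript("""
        CREATE TABLE IF NOT EXISTS patients (
            patient_id INTEGER PRIMARY KEY,
            name       TEXT NOT NULL,
            birth_date TEXT
        );
        CREATE TABLE IF NOT EXISTS visits (
            visit_id   INTEGER PRIMARY KEY,
            patient_id INTEGER NOT NULL REFERENCES patients(patient_id),
            visit_date TEXT NOT NULL,
            reason     TEXT
        );
    """)
    conn.commit()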
Another way to cope with continual change is
to deliberately underbuild:
build just enough structure to accommodate the immediate
requirements,
but don't build what isn't yet needed.
Given the inevitable twists in the road,
many of the structures proposed to meet anticipated needs
will turn out not to be needed in practice.
And even when they are,
experience will often have changed our views on how best to implement them.
Even basically sound initial designs tend to be weakened over time.
Every active database is subject to continuous tuning to
- fix bugs
- add features
- improve performance
- upgrade to more recent versions of the application, database, operating system, or hardware
There is a natural tendency to implement
the necessary adjustments
by adding new tables, new code, or the like.
The problem is that after a few generations of this,
the database and its applications start to become
unmanageably complex.
Multiple copies of what is essentially the same code
come into existence,
and gradually even small changes start to have unexpected
side-effects.
The great rule here is to "think globally, act locally".
That is,
to consider the system as a whole before making any significant
change,
but then to make the smallest necessary change that still gets the job done.
For instance,
the natural tendency when creating a new feature similar to an existing one
is to clone the existing code, then "hack it over" to provide the new feature.
This is quick and
relatively safe in the short term,
but tends to multiply the amount of code unmanageably over time.
It is often better to refactor and generalize the existing code to
handle both functions.
We have worked out a number of practical techniques
for doing this
while still keeping the risk to the existing system small.
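As a schematic illustration (the report functions and columns below are invented,
and deliberately simplified), the clone-and-hack pattern and its generalized
replacement might look like this:

    # Clone-and-hack: two near-identical copies of the same report query.
    def monthly_visit_report(conn, month):          # month as 'YYYY-MM'
        return conn.execute(
            "SELECT patient_id, COUNT(*) FROM visits "
            "WHERE strftime('%Y-%m', visit_date) = ? GROUP BY patient_id",
            (month,)).fetchall()

    def yearly_visit_report(conn, year):            # year as 'YYYY'
        return conn.execute(
            "SELECT patient_id, COUNT(*) FROM visits "
            "WHERE strftime('%Y', visit_date) = ? GROUP BY patient_id",
            (year,)).fetchall()

    # Refactored: one generalized function handles both periods,
    # so the next fix or feature lands in exactly one place.
    def visit_report(conn, date_format, period):
        # date_format is '%Y-%m' for monthly or '%Y' for yearly totals.
        return conn.execute(
            "SELECT patient_id, COUNT(*) FROM visits "
            "WHERE strftime(?, visit_date) = ? GROUP BY patient_id",
            (date_format, period)).fetchall()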
Performance tuning in particular tends to put stress on a design.
A number of the tricks normally used to improve performance
-- e.g. pre-calculation, post-calculation, ancillary tables --
tend to make the overall system significantly more complicated.
Even when the tricks work,
they can create "code fear",
a not unreasonable fear that the slightest change
may cause the entire house of cards to disassemble itself.
Ironically -- and especially with modern hardware --
much performance trickery turns out to be unnecessary.
With the improvements in hardware speeds
and optimizer technology,
it often happens that a well-designed, well-maintained
system will run at least acceptably well over most or all of its
operating range.
A clean design should not be complicated
by additions meant to enhance performance
-- until an actual need is demonstrated.
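To make the point concrete, here is a small sketch (continuing the hypothetical
visits table) of the kind of complication worth deferring: the plain, indexed
query is often fast enough, and the pre-calculated summary table only earns
its keep once measurement says otherwise.

    # The straightforward form: one indexed aggregate query, no extra machinery.
    conn.execute("CREATE INDEX IF NOT EXISTS visits_patient_idx "
                 "ON visits(patient_id)")

    def visits_per_patient(conn):
        return conn.execute(
            "SELECT patient_id, COUNT(*) FROM visits GROUP BY patient_id"
        ).fetchall()

    # The deferred "trick": a pre-calculated ancillary table that must now be
    # kept in step with every insert, update, and delete against visits.
    #   CREATE TABLE visit_counts (
    #       patient_id INTEGER PRIMARY KEY,
    #       n_visits   INTEGER NOT NULL
    #   );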
If performance problems do appear,
then the first rule of performance tuning is to
monitor the system to locate the problems,
before
attempting to fix them:
programmers are notoriously poor at locating performance
problems by intuition.
Once the actual problem has been located,
care should be taken to avoid fixing the problem
in ways that create bugs and other problems elsewhere.
Again, "think globally; act locally".
That a performance problem reveals itself in one area
does not automatically imply that that
is the best spot to make a fix.
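A minimal measuring habit, again using SQLite for illustration, might be nothing
more than timing the suspect query and asking the optimizer what it plans to do,
before any code is touched:

    import time

    def profile_query(conn, sql, params=()):
        # Ask the optimizer for its plan -- e.g. whether an index is used.
        for row in conn.execute("EXPLAIN QUERY PLAN " + sql, params):
            print(row)
        start = time.perf_counter()
        rows = conn.execute(sql, params).fetchall()
        print(f"{len(rows)} rows in {time.perf_counter() - start:.3f} seconds")
        return rows

    # Example:
    # profile_query(conn,
    #     "SELECT COUNT(*) FROM visits WHERE patient_id = ?", (42,))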
It is a mournful truth of our industry
that half of all bug fixes either fail to fix the original problem,
introduce new problems, or both.
The stereotypical method of debugging -- trace the code's execution
till the bug is found, then swat it -- is not a particularly efficient way to
either find or fix bugs.
For one thing,
clean code should be readable enough that most bugs are obvious "by inspection".
If they aren't,
then their obscurity itself is a bug.
And what we refer to as "scientific debugging" can usually find bugs faster
than tracing can.
For another,
every bug gives one a chance to ask:
"how did this bug come into existence?"
Do we need to:
- drain the swamp: clean up the whole module, sub-system, or whatnot
- add more guard code: the earlier a problem is seen, the easier it is to debug (see the sketch after this list)
- rethink the design: poorly understood and overly complex design
is at the root of many bugs
- add more automatic testing: make the machine do the dirty work of debugging
And so on.
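By way of example, guard code and an automated test for the hypothetical visits
table might look like the sketch below; the point is that the failure is reported
at the boundary where the bad data arrives, and the machine re-checks the
behaviour from then on.

    def record_visit(conn, patient_id, visit_date, reason=None):
        # Guard code: fail loudly here, where the bad data arrives, rather
        # than deep inside a report that runs weeks later.
        ok = conn.execute("SELECT 1 FROM patients WHERE patient_id = ?",
                          (patient_id,)).fetchone()
        if not ok:
            raise ValueError(f"unknown patient_id {patient_id}")
        conn.execute(
            "INSERT INTO visits (patient_id, visit_date, reason) "
            "VALUES (?, ?, ?)",
            (patient_id, visit_date, reason))

    # Automated test: the machine does the dirty work of re-checking.
    def test_record_visit_rejects_unknown_patient():
        import sqlite3, pytest
        conn = sqlite3.connect(":memory:")
        conn.execute("CREATE TABLE patients (patient_id INTEGER PRIMARY KEY)")
        conn.execute("CREATE TABLE visits (patient_id INTEGER, "
                     "visit_date TEXT, reason TEXT)")
        with pytest.raises(ValueError):
            record_visit(conn, patient_id=99, visit_date="2024-01-01")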
The subject is large enough
-- and important enough, given that 60% or more of all programming
time is spent debugging --
that we are putting together a book,
Zen & the Art of Debugging,
addressing this one issue.
Database administration is now getting
more attention than it once did.
Issues such as
- disaster recovery preparation (with fire drills)
- backup and restore
- index tuning
- load balancing
- query performance analysis
- lock management
- defragmentation
do now seem to get routine attention.
There is still sometimes a reactive character to this
work.
Given that there are now many more GUI performance and system
monitoring tools available,
we recommend taking a few minutes a day
to simply watch the system go through its paces
and thereby get a sense of what "normal" means.
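Those few minutes of watching need not involve anything elaborate.
The sketch below assumes a PostgreSQL back end reached through psycopg2
(adjust the query for your vendor's catalog views; the connection string
is hypothetical):

    import time
    import psycopg2  # assumes a PostgreSQL back end

    def watch(dsn, interval=5, rounds=12):
        # Print a one-line snapshot of session states every few seconds,
        # simply to build a feel for what "normal" looks like here.
        conn = psycopg2.connect(dsn)
        conn.autocommit = True
        with conn.cursor() as cur:
            for _ in range(rounds):
                cur.execute("SELECT state, count(*) FROM pg_stat_activity "
                            "WHERE pid <> pg_backend_pid() GROUP BY state")
                print(time.strftime("%H:%M:%S"), dict(cur.fetchall()))
                time.sleep(interval)

    # watch("dbname=clinic user=dba")   # hypothetical connection string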
After unscheduled disaster recovery exercises,
conversions from one database vendor to another
are perhaps the scariest part of database administration.
A large body of expertise will have been built up on system A;
unless the local developers are already familiar with system B,
a comparable investment of time may be needed just to get back
to current levels of proficiency.
With that said, application and/or performance requirements,
changes in corporate policy, mergers, and the like may require a conversion.
The great rule in this case is to break the process out into substeps.
If database enhancements are required,
it is generally better to do them before or after the conversion.
Ditto hardware upgrades.
The conversion itself should be automated to the extent practicable.
The first effort will fail.
If the effort is being driven by scripts,
then the scripts can be fixed and rerun,
hopefully to fail at a new and higher level.
But if increasingly fatigued humans are trying to run
the procedures off scribbled notes,
then there is no guarantee that old errors will not be repeated
in new runs.
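In practice the automation often amounts to nothing fancier than a driver
script with explicit, ordered sub-steps, each of which can be fixed and rerun
from the point of failure. The step names below are hypothetical:

    import logging, subprocess, sys

    logging.basicConfig(level=logging.INFO, format="%(asctime)s %(message)s")

    STEPS = [                         # each entry is its own small script
        "export_schema_from_a.sh",
        "translate_schema_for_b.py",
        "bulk_copy_data.sh",
        "rebuild_indexes.py",
        "verify_row_counts.py",
    ]

    def run(start_at=0):
        for i, step in enumerate(STEPS[start_at:], start=start_at):
            logging.info("step %d: %s", i, step)
            if subprocess.run(["./" + step]).returncode != 0:
                logging.error("step %d failed; fix and rerun from step %d", i, i)
                sys.exit(1)

    if __name__ == "__main__":
        run(int(sys.argv[1]) if len(sys.argv) > 1 else 0)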
Once the conversion is done,
it will turn out that years of adapting the system
to suit the sometimes highly specific requirements of system A
have simultaneously "un-adapted" it
for the equally specific requirements of system B.
Performance enhancement tricks that worked brilliantly
for system A will turn out to reduce performance with B
to a nearly hopeless crawl.
Post-conversion,
a fair amount of tweaking will usually be required,
and should be budgeted for.
We also provide other relational database services
in support of client requirements
for business intelligence,
operational reliability,
data warehousing,
and the like.