Migrating your Database Upon Deployments


Options, plan B’s, are handy things to have. We all desire to have these backups, and yet, it is amazing how often we choose not to exercise this comfort within our production deployments.

Glacier in the sun

Rolling back our production application should be the plan B. The most often cited reason for having the inability to roll back a production installation is the database. Once the database is incremented by that particular schema or data patch, it renders the previous code incompatible with the database.

It is possible that you are able to rollback your database schema so that the old application code becomes compatible with the database. However, there are risks involved such as potentially losing data that your newer version of the application captured into the database. Or worst case: your rollback scripts fail which leaves your database schema in a limbo state, a version not fully recognized by the older version of your application nor by the newer version.

With such high risks, it is no wonder why development teams would opt to fixing the existing (newer) code base, and deploying a new version than risk going back into time. But when things do go wrong, reactions will be like knee jerks, and more mistakes could be made - ultimately, ending up a longer than necessary downtime, the enemy metric of a scalable system.

This blog post will explore methods and techniques that would encourage the following attributes:

  1. Allow the database to be hot/compatible with the previous version.
  2. Allow both versions to operate against the same point of time. Yes, the same time.

Binary Compatibility

Just imagine that the framework that you develop against decided to change their API without any prior notice. Code that used to compile without any warnings, suddenly no longer compiles! That would be utterly unacceptable. Instead, you would expect the API to mark the old API with some kind of “obsolete” attribute, and detailing the newer member to use.

So why do developers fail to treat their database in like manner - that is - depreciate old members after a period of time? Do they feel that this technique is too cumbersome to manage?

I do not think it has to be too cumbersome. We do go through great lengths ensuring that our web services and our libraries are binary compatible, so why not treat and view the database as another library? After all, it is a service to data (regardless how it is stored and retrieved).

With that in mind, we can break the database feature life cycle into two steps:

  1. Expansion (without altering pre-existing schema or data)
  2. Contraction

Expansion

Appending optional attributes upon your database schema is the key for adding new features to the data store. Do you need a new column? New table? New view? Sure, just add one, but let us discuss them in a little bit more detail.

But one thing that we need to take away - no “simple renames” or “drops” are allowed until our application has long deprecated its usage on that old column’s value, or that old table’s value.

Adding Columns

Here are a few ideas for adding columns that ensures that the data store remains compatible with the previous version of the application while serving the new requirements of the newer version of the application.

  1. Add a simple default value to your column (for example, IsEnabled is set to true)

  2. Allow the new column to be nullable. Your data model is allowed to temporarily not enforce the business rules; your application code should be pretty good at that.

    1. A patch may be executed to update any null values to the appropriate value for the new system.

    2. Or if the value is too hard to be determined by a simple script, rather determine the value during runtime. I like to call this a progressive data evolution. Whenever the application reads the new column as being null, it will then determine through a code path on how to deal with that particular case.

I personally prefer the (1) and the (2.2) variations because (1) deals with the dead simple case, and (2.2) for everything else which is neatly versioned controlled through the application’s version.

Don’t Alter the Value or Meaning of a Column Value

Imagine that you have a column which stores a set of enumeration values. If you have to change the meaning of what those values map to, this would confuse the previous version of the application because they literally got their wires crossed.

In a similar manner, adding a new member to the enumeration set would also send the previous version of the application into confusion. Why? How should the old application deal with the enumeration value that it was not told to deal with? It just simply cannot deal with the unknown.

If you need to add a new member/value to the enumeration set, or change the meaning of the values it maps to, rather create a new column with the new set of enumerations and keep the old column’s original enumeration set intact. When the newer version of your application updates a pre-existing record, it would be up to the updating logic of how to keep that old column in a valid ‘synchronized’ - or meaningful - state for consumption by the old application.

Adding Tables

Adding a new table to a database is most unlikely to break the previous version of the application. However, considering that new tables may form part of the logic of stored procedures and views, one does need to consider it.

  1. If joining to the new database table, prefer to use left joins if the new table brings “value-added” data to the query.
  2. Utilize feature flips if you the table performs some vote in the logic. See later on how to deal with database views and stored procedures.

Altering Logic in the Database

New schema information within the database are - essentially - new features for your data set. If an application has to rollback on the application version, it may also be necessary to rollback the logic within your database too; such as stored procedures.

As mentioned before, marking down certain database features would be a great advantage for the database. It is minimal work to re-run the previous version of the application’s stored procedures (remember that you would still desire to have the database accessible to the newer version of the application). Another good reason for not running the stored procedures in again is that when you decide to rollback, you want to minimise the number of steps to restore operation to your application.

There are two ways that you may wish to mark down your database features.

  1. You may want to have a table within your database with the different mark down flags. I disfavor this approach because it prevents the application from stating what version the calling application is - the older version, or the newer version.

  2. You may pass a parameter to the stored procedure which states the version of the application of the callee. The disadvantage is that you cannot use this technique in database views nor in triggers.

With regards to database views and database triggers: if there are any of these logical systems, it might be best to refactor it out into database stored procedures. They would easily be able to mimic the work that you performed with the triggers and views, from the application’s point of view.

Contraction of the Schema

I am yet to see any members marked Obsolete within .NET to actually be removed from the framework. But since your application may be on a more aggressive release cycle, house work is inevitably on a more aggressive chore too.

Simply put, if your application no longer uses a certain column or table within the last four to six weeks, then it is time to retire that column. That would give your company enough time to evaluate whether or not your new version of your application is indeed stable.

Simply add a scheduled reminder to retire those obsolete members.

Conclusion

NoSql has popularized many of the deployment techniques to the average “web hippie.” Techniques such as “progressive data evolution” and gradually retiring undesired properties of the data record makes rolling back the application an easy obstacle to overcome.

I certainly hope that I have convinced you that if you really have to appraise your development process for database related code, you are able to open up the avenue to roll back your application version when necessary; ultimately enabling seamless uptime and scalable services.

Another benefit that you may receive is the ability to do midday deployments. That means less time working outside of the normal office hours; which also just improves your team’s chances of reducing the time to respond to a bad deployment.

With these two benefits - byproducts - within your application, you may be able to implement continuous delivery model with relative ease. But that is for another blog post for some time in the future.

blog comments powered by Disqus