I thought I’d blog my thoughts on service and message versioning, which is one of the most overlooked topics in computer science.

Part I: Food For Thought

1.1 Why Do We Need Versioning

Applications should be deployable independently rather than coupled together as one big application. This lets teams work separately, each aiming for releases that match its own delivery schedule, and it keeps the systems loosely coupled.

Coupling releases creates pressure to ship one big release, which makes the architecture volatile and lets failures spread throughout the system before they can be resolved quickly, efficiently, and in a controlled manner. It also makes rollbacks more chaotic.

Having fewer versions (preferably no more than two) ensures that old versions are phased out gradually and that the transition is smoother, without creating maintenance overhead.

Not having a versioning policy can be expensive. The time of architects, project managers, business analysts, and developers may be spent formulating and implementing a spike to decide on the best approach to versioning at each release. Not having a consistent and clear policy may also cause inconsistencies to develop between versions of the same project over time, which may include:

1)      Not considering simpler solutions.

2)      Making assumptions when deciding on versioning.

3)      Choosing an incorrect or inconsistent versioning method based on the team’s composition at the time rather than on a company-wide policy.

4)      Incurring extra cost in hardware (load) and support.

5)      Not phasing out old versions in a systematic manner, which creates greater complexity.

Thus, versioning has far-reaching consequences for the business and for the speed with which we implement change and deliver requirements.

1.2 What Governance Do We Need

Business functionality may change as newer requirements alter how existing services operate. That means existing users of the system must still be supported (i.e. System B supports multiple versions of System A, receiving a FulfillmentMessage and sending a FulfillmentResponse back to the correct System A version).

Risks should also be reduced, with potential bug fixes and optimisations kept minor, independent, internal, and quick to make. Support for logging messages, tracking errors, and monitoring performance should be available, based on business requirements and future planning. This helps in planning resource requirements adequately, because expensive behaviours can be isolated and bottlenecks resolved.

Factors to Consider When Choosing a Versioning Method

1. The amount of code to write.

2. How resilient it is to change.

3. How easy it is to maintain and phase out over the life-cycle.

4. How little impact it has on the systems.

Forcing consumers to upgrade to a newer version of the contract is expensive, and the upgrade may have a low priority in the consumer’s release schedule. Governance should cover how the interface is designed, which bindings are used, and which versioning strategy is adopted, so that these stay consistent, with provision for an unfortunate rollback and the ability to support multiple versions simultaneously unless management decides otherwise.

Versioning should not be left without the right level of governance, as there is a high risk of it turning into a maintenance nightmare (this is an argument for governance rather than one against versioning).

In the case of Global System, where there is only one consumer for each service, there is also a risk that a new version of System C never gets used: System B does not talk to it, because System A does not talk to the new System B.

1.3 Type of Change

Changes to schema should be marked as:

a. Minor Change: The schema remains backward compatible. Changes of this type involve adding optional fields or changing types. For the most part, old services should be able to plug in and use the new schema without major changes to the consumer. A minor change does not break the service contract, and it should impact neither the consumer implementation nor the service. A field can be marked as deprecated ahead of a potential major change (which is more of a revision than a minor change) and deleted as per governance policy. This has the least impact on versioning and should be the first option. However, it may leave the code with a lot of if statements, depending on what the optional attribute is.

b. Major Change: The schema changes in ways that are not backward compatible: adding required fields, removing required fields, making optional fields required, or changing the namespace or schema name. Since the schema is not backward compatible, it requires the more in-depth versioning mechanisms discussed below.
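To make the distinction concrete, here is a minimal sketch of the minor/major rules above. It is only an illustration under simplifying assumptions: `Field` and `classify_change` are hypothetical names, a schema is reduced to a list of field names with a required flag, and type or namespace changes are ignored.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class Field:
    """Hypothetical, simplified view of one schema field."""
    name: str
    required: bool


def classify_change(old: List[Field], new: List[Field]) -> str:
    """Classify a schema change as "none", "minor" or "major"."""
    old_by_name = {field.name: field for field in old}
    new_by_name = {field.name: field for field in new}

    for name, field in new_by_name.items():
        previous = old_by_name.get(name)
        if field.required and (previous is None or not previous.required):
            # A required field was added, or an optional one became required.
            return "major"

    for name, field in old_by_name.items():
        if field.required and name not in new_by_name:
            # A required field was removed.
            return "major"

    # Anything else (e.g. a new optional field) stays backward compatible.
    return "none" if old_by_name == new_by_name else "minor"
```

For example, adding an optional `gift` field to a schema comes back as "minor", while adding the same field as required comes back as "major".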

1.4 Service Versioning [Versioning Services & Versioning Methods in Services]

1. Versioning the service as a whole [new version runs side-by-side with the older ones].

This approach works well in object-oriented design, where similar objects are not used in similar ways for matching functions. However, with coarse-grained services (services which expose interfaces to perform complete business functions, rather than fine-grained services) it may seem inappropriate. If the contract changes, we would have to version all the internal services and sub-components of that package, even though they may be almost identical.

oldDeployed.dll

processRequest(ObjectTypeV1 yourObject) {
    CallA();
    CallB();
    CallC();
}

newDeployed.dll

// Same body as the old version; only the contract type has changed.
processRequest(ObjectTypeV2 yourObject) {
    CallA();
    CallB();
    CallC();
}

2. Versioning methods inside the services.

Having methods that represent different versions inside the service allows old versions to be deprecated easily without changing the name of the service itself. It also minimizes the effect of changing a service (by providing additional methods), so that consumers of other methods are not affected. Only methods with new versions need to be deployed. However, it means that each method has to expose its own endpoint (so that it can provide a different SLA (Service Level Agreement) for each method within the same service). This requires an addressing scheme in which the consumer specifies the service, the operation, and the version number. This may work well as long as there are only a limited number of consumers using few versions.

Endpoint1 – First.dll

processRequest(ObjectTypeV1 yourObject) {
    CallA();
    CallB();
    CallC();
}

Endpoint2 – Second.dll

processRequest(ObjectTypeV2 yourObject) {
    CallAV2();
    CallB(); // upon analysis does not require versioning
    CallCV2();
}
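The addressing scheme mentioned above (service, operation, and version number) could look roughly like the sketch below. The service name `Fulfillment`, the handler functions, and `invoke` are all made up for illustration; a real deployment would map each address to an endpoint rather than an in-process function.

```python
from typing import Callable, Dict, Tuple

Handler = Callable[[dict], str]
Address = Tuple[str, str, int]  # (service, operation, version)


def process_request_v1(message: dict) -> str:
    return "v1 processed " + message["id"]


def process_request_v2(message: dict) -> str:
    return "v2 processed " + message["id"]


# Hypothetical table: one entry per versioned method of the service.
HANDLERS: Dict[Address, Handler] = {
    ("Fulfillment", "processRequest", 1): process_request_v1,
    ("Fulfillment", "processRequest", 2): process_request_v2,
}


def invoke(service: str, operation: str, version: int, message: dict) -> str:
    """Resolve the versioned method from its full address and call it."""
    try:
        handler = HANDLERS[(service, operation, version)]
    except KeyError:
        raise LookupError(f"no handler for {service}.{operation} v{version}") from None
    return handler(message)
```

Retiring a version then means removing one entry from the table, and the consumers of every other method are left alone.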

1.5 Semantic Versioning (Accepting XML as a String)

The exposed service provides a method which takes an XML string and can therefore accept any version of the schema. All the changes are thus contained in the semantics of the message. The messages are defined using schemas, which can be identified from the message contents. The drawbacks are that the code ends up full of XPath expressions that are not resilient to schema changes, and that the method can accept invalid messages which cause the code to break.

processRequest (string anyObject)
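As a rough sketch of what such a method ends up looking like, assume (purely for illustration) that the root element carries a `version` attribute and that the customer name moved from `Customer` in v1 to `Buyer/Name` in v2. None of these element names come from a real schema.

```python
import xml.etree.ElementTree as ET


def process_request(any_object: str) -> str:
    """Accept any schema version as a string and return the customer name."""
    try:
        root = ET.fromstring(any_object)
    except ET.ParseError as error:
        # Nothing stops a caller from sending something that is not even XML.
        raise ValueError(f"message is not well-formed XML: {error}") from error

    # Hypothetical convention: a version attribute on the root, v1 if absent.
    version = root.get("version", "1")
    if version == "1":
        customer = root.findtext("Customer")
    elif version == "2":
        customer = root.findtext("Buyer/Name")
    else:
        raise ValueError(f"unsupported version: {version}")

    if customer is None:
        # The hard-coded paths break as soon as the schema is restructured.
        raise ValueError(f"v{version} message is missing the customer name")
    return customer
```

Every new version adds another branch and another set of paths, which is exactly the brittleness described above.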

1.6 Adding a Version Attribute to the XML Schema

As discussed earlier, one option is to have an attribute which identifies the version. This field should be optional rather than required because:

  1. Custom processing may be required to identify the version (aside from other validation and parsing) to ensure the correct one is being used.
  2. XML validation tools cannot validate an instance of the XML (checking whether it can be deserialised may be possible, but seems an inappropriate approach).
  3. Since versioning happens inside the XML document, it is hard to tell, without parsing, which schema the document is an instance of (and deserialising into the correct one using interface classes may add code unnecessarily).
  4. XSD and XSDObjectGen can be used (System B is planning to change its approach and use custom hand-written classes so that Sandcastle can be used), but they generate namespaces based on the schema namespace, not the schema version. Moreover, auto-generators may not always be suitable (consult the David Ezzo link and Notes at the bottom, or the architect team, about whether auto-generators suit the project).

It is possible to denote the schema version in the XML namespace with each major release. This resolves the marshalling issue (marshalling being the preparation of data to serialise or transmit): code is then generated into different packages, which enables multiple schemas to be supported simultaneously.
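A small sketch of the namespace idea, using Python's standard ElementTree: the namespace of the root element decides which version-specific code handles the message. The two namespace URIs, the element names, and the handler functions are assumptions made up for this example.

```python
import xml.etree.ElementTree as ET

# Hypothetical namespaces, one per major release.
NS_V1 = "http://example.com/fulfillment/v1"
NS_V2 = "http://example.com/fulfillment/v2"


def handle_v1(root: ET.Element) -> str:
    order_id = root.findtext(f"{{{NS_V1}}}OrderId")
    return f"v1 order {order_id}"


def handle_v2(root: ET.Element) -> str:
    order = root.find(f"{{{NS_V2}}}Order")
    order_id = order.get("id") if order is not None else None
    return f"v2 order {order_id}"


HANDLERS = {NS_V1: handle_v1, NS_V2: handle_v2}


def dispatch(xml_text: str) -> str:
    """Pick the handler from the namespace of the root element."""
    root = ET.fromstring(xml_text)
    # ElementTree renders a qualified tag as "{namespace}localname".
    namespace = root.tag[1:].partition("}")[0] if root.tag.startswith("{") else ""
    handler = HANDLERS.get(namespace)
    if handler is None:
        raise ValueError(f"unsupported schema namespace: {namespace!r}")
    return handler(root)
```

Because the version is visible on the root element, the right handler can be chosen before any version-specific parsing happens.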

Another similar technique is to add an element for passing custom information about how the messages should be dealt with. The Open Applications Group’s Business Object Document includes this kind of custom field as part of the schema, to pass custom information and maximize extensibility without changing the namespace. However, this makes the schema a bit more complicated to deal with, and it does not work well across similar but different schemas (e.g. when there are major changes in the XML schema and both versions have to be supported this way).

1.7 Base Type Versioning

A colleague suggested using a base type and derived types for versioning:

“Common service gateway to process different type (multiple versions) of messages

Objective: Process different type (multiple versions) of messages derived from same base message, without changing end point.

Description: In this approach we will have a service gateway, which will accept and process multiple versions of messages as long as they are derived from the base message.

Advantages:

  • Can process multiple version of messages from single end point
  • Fully backward compatible
  • Help us to write code in Object Oriented way
  • Help us to share common code base
  • Loosely couple the different systems
  • Help us to model our service contract in object oriented way
  • Easily maintainable code base
  • We can share only required information in service contract with client.

Disadvantages:

  • We will have to expose a new endpoint if we make any change to our base message. We can easily solve that by keeping the base message lightweight and including only the information we think is mandatory in every message.”

This can only work for extending the schema, and possibly deprecating some nodes. However, this will not work when restructuring the schema (and possibly introducing redundancy) which can also raise maintenance issues.
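A minimal sketch of my colleague's idea follows. The class names, the fields, and the `gateway` function are invented for illustration; the point is only that one entry point accepts anything derived from a lightweight base message.

```python
from dataclasses import dataclass


@dataclass
class BaseMessage:
    """Lightweight base: only what every version must carry."""
    message_id: str

    def summary(self) -> str:
        return f"message {self.message_id}"


@dataclass
class FulfillmentV1(BaseMessage):
    address: str

    def summary(self) -> str:
        return f"{super().summary()} to {self.address}"


@dataclass
class FulfillmentV2(FulfillmentV1):
    gift_wrap: bool = False  # new in v2, optional so v1 data still fits

    def summary(self) -> str:
        text = super().summary()
        return f"{text} (gift wrapped)" if self.gift_wrap else text


def gateway(message: BaseMessage) -> str:
    """Single entry point for every message derived from BaseMessage."""
    if not isinstance(message, BaseMessage):
        raise TypeError("message must derive from BaseMessage")
    return message.summary()
```

Note that v2 can only extend v1 here. If v2 needed to restructure the address rather than add a field, the inheritance chain would stop helping, which is the limitation described above.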

1.8 Deployment and Endpoints Versioning

1. Version Parameter Routing [one endpoint]: This implements content-based routing, where the incoming message carries a version parameter and the padding (hosted at that endpoint) uses it to decide which version to forward to. This simplifies service addressing (and minimizes the impact), as the consumer calls only one endpoint and encodes the version number it requires. However, since multiple service methods are now coupled, collisions can arise between class names, namespaces, and database access, and in turn possibly between components (causing tighter coupling between them), making the issue more complex than it was initially. (This can be mitigated, but not without raising code smells.) In essence, this hides the complexity from the consumer to simplify usage, at the cost of maintaining obsolete code and, depending on the implementation, violating DRY [the Don’t Repeat Yourself principle]. [Methods calling the same internal methods vs methods calling different internal methods which have to be versioned in turn vs deploying to different IIS virtual directories: how many would be needed with each release?]

An ESB (Enterprise Service Bus) [using intermediaries] is often thought to resolve the routing and transformation issue. However, it lowers performance, introduces a hop, and must support all the SLAs of the services being accessed through it (think of messages going through this router) in order to resolve the endpoints dynamically. This may be overkill when there are only a limited number of clients.

2.  Multiple Endpoints: Each service has its own endpoint address, which is directly exposed to the service consumer. The assumption is that consumers are able to resolve the correct endpoint (and the return path for asynchronous calls) based on the service/method version they require. This allows complete separation of multiple methods in deployment, at the cost of maintaining the addressing (which can be mitigated by passing the return path of the asynchronous reply endpoint, i.e. the Acknowledgement monitor request). There is one hop fewer than in point 1, and coupling between multiple versions is lower.

Steve Vinoski, in “The Social Side of Services”, mentions that services have to know each other. Hard-wiring this information in code or config files is not a good solution; it should instead be resolved dynamically, via a registry, to avoid high coupling. The emphasis here is that exposing multiple endpoints requires the client or the system to have a mechanism that ensures the correct one gets called. This may not be an issue when only one or two clients are involved.

When different schemas are exposed externally through different endpoints, on the implementation side the XML messages can be converted into a baseline internal structure, so that each new requirement does not demand too many changes. This amounts to translating into an internal object. [Marshalling by reference for different types of XML, and inheritance, may be possible, but they nevertheless involve many internal objects, which raises a code smell as they are all in turn used to do the same thing.]
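The translation into a baseline internal structure can be sketched like this. Everything here is hypothetical: the `Booking` object, the `version` attribute on the root element, and the element names of the two external schemas are only there to show the shape of the idea.

```python
import xml.etree.ElementTree as ET
from dataclasses import dataclass


@dataclass
class Booking:
    """Hypothetical baseline internal object shared by all versions."""
    reference: str
    passenger: str


def _text(root: ET.Element, path: str) -> str:
    value = root.findtext(path)
    if value is None:
        raise ValueError(f"missing element: {path}")
    return value


def from_v1(root: ET.Element) -> Booking:
    return Booking(_text(root, "Ref"), _text(root, "Passenger"))


def from_v2(root: ET.Element) -> Booking:
    # v2 restructured the passenger name into two elements.
    name = f"{_text(root, 'Passenger/First')} {_text(root, 'Passenger/Last')}"
    return Booking(_text(root, "Reference"), name)


TRANSLATORS = {"1": from_v1, "2": from_v2}


def to_baseline(xml_text: str) -> Booking:
    """The padding: turn any supported external version into a Booking."""
    root = ET.fromstring(xml_text)
    translator = TRANSLATORS.get(root.get("version", ""))
    if translator is None:
        raise ValueError("unknown or missing version attribute")
    return translator(root)


def process(booking: Booking) -> str:
    """Internal logic is written once, against the baseline only."""
    return f"{booking.reference}: {booking.passenger}"
```

Supporting a new external version then means adding one translator, while `process` and everything behind it stay untouched, as long as the new version still fits the baseline object.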

1.9 Considerations for Internal Implementation Changes

Bug fixes, additional features, and modifications that change only the implementation of the service do not modify the service contract and thus should not need to be versioned. However, given that different versions of consumers may rely on different components working in a certain way, end-to-end testing and a test-driven development approach should ensure that such changes do not break the service for some consumers. Such changes could involve:

  1. The time synchronous calls take to process
  2. Additional validation rejecting the message
  3. Security details changing (this was seen as a sporadic issue in the Automated Deployment project when deleting and creating files over the network).

1.10 Phasing Out Versions

As mentioned earlier, corporate governance and policy play a strong part in ensuring that older versions are supported only for a limited period of time, to avoid the extra overhead of maintaining them. However, a shorter support period forces consumer projects to implement and release an upgrade, which may conflict with and delay other essential features across the whole system.

1.11 Infrastructure

It is important to adopt the bindings consistently and to make sure that they satisfy security requirements as well as the network capacity needed for future growth.

1.12 Analysis

To compare: at Standard Chartered, when binary updates of Murex (a commercial stock-related application, with functionality extended internally) are taken, the consumers are forced to update. The eBay system (a web app used by large sellers) tends to keep providing older versions (functionality marked for deprecation) for 18 months.

In order to make a decision, the main criteria are how many different systems will consume the service, how the messages will be processed, and whether the service will itself consume any other services.

The most flexible approach seems to be to have a different namespace with each major release (minor versions are backward compatible, thus requiring no change). However, the purpose of this strategy document is not to make the decision, but rather to provide options for a decision based on individual projects and management direction.

If RJIS is moved out as a separate service with multiple consumers (System A, System B), it may follow SOA principles well. If we release a [major] new version of System C which is not consumed, because the new System B is not consumed, because System A does not make calls to the new System B, then we are still following the big bang approach, where results won’t be known until the full cycle of the new system is run, thus risking our ability to deploy new business requirements safely [Q: Has the dependency been broken?]. This has been mitigated to a certain extent by separating technical releases from other types. A better approach might be for the system to accept a dummy booking which gets discarded just before critical points (e.g. before putting it in the SDCI document or sending it off to be printed).

The questions that should be asked when considering an approach are:

  1. Is it simple? Does it require padding (translation to a baseline object)?
  2. Does it affect our existing tests?
  3. Will there be IIS downtime?
  4. Are there any rehydration issues (using BizTalk)?
  5. Will different teams be working on different versions?
  6. If multiple versions are running side by side, will bug fixes be replicated across versions?
  7. Will it require additional resources?
  8. How big a factor will maintenance be?
  9. Will the build pipeline, deployment, and installer creation be affected?
  10. Will the system be able to get good feedback in case of errors?
  11. Can functionality be deprecated?
  12. How easy will it be to break the code? Will there be redundant code? How easy is it to get rid of older versions?
  13. How will rollback be done in an unfortunate circumstance?

Depending on the kind of change required, it may be best always to opt for a minor change first, as a stepping stone, before setting it in stone as a major change.

Using padding may work efficiently as long as there are no drastic internal changes between versions; otherwise the versions are incompatible (e.g. if v2 adds to SDCI and LENNON whereas v1 does not, we would also have to add an attribute saying which version a message is meant for, and have lots of if-statements in the code).

1.13 Conclusion

This is a strategy document which provides information on different approaches to versioning. It was commissioned to bring consistency while making provision for the long-term approach. Short-term solutions should be implemented with the aim of achieving the long-term goals successfully. Versioning should be done consistently with each new release (V2 to V3 should be implemented in the same way as V1 to V2). Since versioning primarily deals with backward compatibility, forward versioning (a v1 consumer trying to communicate with v2) should be a non-issue, and should not be considered for major releases.

Given that change is not always easy to implement, it is vital to consider the factors that help in deciding how to implement changes and what the best approach is.

References:

Durvasula, S. (May 2008), Why you need a stated “service versioning policy”, http://entarch.blogspot.com/2008/05/why-you-need-stated-service-versioning.html [Accessed March 27, 2009]

Rosen, M., Lublinsky, B., Smith, K., Balcer, M. (2008), Applied SOA: Service-Oriented Architecture and Design Strategies, Wiley Publishing (pp. 67, 325, 383, 484)

Lublinsky, B. (2007), Versioning in SOA, Microsoft Architecture Journal, April 2007, http://msdn.microsoft.com/en-us/library/bb491124.aspx

Ezzo, D., BEA, http://mail-archives.apache.org/mod_mbox/openjpa-dev/200707.mbox/%3C46A674A3.8060601@bea.com%3E

Test Driven .NET Development with FitNesse by Gojko Adzic

Reading through this book, I made a point of earmarking pages that I think we can use to improve our tests in FA (and potentially Trumps). As I read on, a lot of pages near the middle of the book became earmarked. These include the suggestion that quick-fit tests should be run separately (which can be done in the existing environment, as the FA tests were refactored two iterations ago, and would thus reduce our waiting time from 6+15 minutes to just 7 minutes), as well as making the tests more customer-centric and readable rather than developer-oriented. The suggestions range from page formatting to using Do Fixtures. Some ideas from this book will be implemented in the next release of my project.

This is one of the books that I would like to keep close and refer back to, to remind myself to write better Fit tests, to use this framework to its full potential, and to keep things simpler.

“97 Things Every Software Architect Should Know” is an excellent book which reflects on the experiences of people who have practiced agile as well as those who have been in IT for more than a decade (it has some (ex-)ThoughtWorkers but, interestingly, none from Conchango). It gives valuable insight into the work of an architect and reiterates some important points (e.g. software is evolved and not developed, commit-and-run is a crime, avoiding “good ideas” which can quite easily affect the project negatively, etc.). I think it is a good quick read for reflecting on one’s experiences on different projects, on how other people went about resolving similar issues, and on project smells which no longer feel like something that needs resolving. However, to keep the book concise it is written in a point format, which makes most of the points hard to remember, as they come from experience and are retained that way rather than learned beforehand. Nevertheless, it could be used as a reference to see how we can improve ourselves and excel.

I can usually tell if someone is Pakistani, Indian, or Bengali (unless they are Roma). However, South Asians have many times mistaken me for an Arab. Maybe I can be one:

The Arab Kashan

Reflecting back after using C# for 6 months, I remember the “environment hell” I faced in Java, with conflicting jar versions as well as going on a wild goose chase for a missing jar: when I found one, there was yet another that was required but not packaged. I have never faced such an issue in C#. Here is a Rails Envy video which accurately describes my feelings:

M – Model, V – View, C – Controller, P – Presenter

Many people, if not most, do not know the difference between the MVC and MVP patterns. Recently, when someone asked me to describe the MVC pattern, they said it sounded like I was talking about MVP. Most of us thought that there was indeed no difference, but in fact there is – and it’s very small.

In MVC, the Controller and the View can talk to each other, and both can send messages to the Model. In MVP, all communication goes via the Presenter: the Model and the View do not communicate directly, and only the Presenter can send data to the Model and to the View and/or its interface (decoupling!).
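Here is a tiny sketch of the MVP side of that description. The class and method names are mine and only illustrative; the thing to notice is that the View never touches the Model, and the Model never touches the View.

```python
from typing import List, Optional


class Model:
    """Holds the data; knows nothing about the View or the Presenter."""

    def __init__(self) -> None:
        self.names: List[str] = []

    def add(self, name: str) -> None:
        self.names.append(name)


class View:
    """Passive view: forwards input and shows whatever it is given."""

    def __init__(self) -> None:
        self.presenter: Optional["Presenter"] = None
        self.rendered: List[str] = []

    def user_typed(self, text: str) -> None:
        # The View forwards the event; it does not touch the Model itself.
        if self.presenter is not None:
            self.presenter.on_name_entered(text)

    def show(self, lines: List[str]) -> None:
        self.rendered = list(lines)


class Presenter:
    """The only object that talks to both the Model and the View."""

    def __init__(self, model: Model, view: View) -> None:
        self.model = model
        self.view = view
        view.presenter = self

    def on_name_entered(self, text: str) -> None:
        name = text.strip()
        if name:
            self.model.add(name)
        self.view.show(self.model.names)
```

In an MVC version of the same thing, the View would be allowed to read `model.names` directly instead of waiting for the Presenter to hand the list over.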

As I began new work, I came across an interesting post on job hunting.

Watching one of the current affairs programs, here are some strong words which I remember (in an economy where over 2000 people are losing work every day):

“I’ve been crying through the nights,

and I’ve been up all hours

keep looking what I can’t believe how the markets have gone down

I’ve had the Internet running til 2-3 O’clock in the morning,

having a look,

hoping that just perhaps the markets around the world have picked,

jobs are being created again,

that there are new openings once more.”


“It’s just not fair – big banks gambling and getting billions,

but they’re not helping us, we just have to suffer through it.”


“Every morning I hear the postman coming,

and you just know that it’s going to be another bill.”


In response to the government increasing its savings guarantee from £34,000 to £50,000:

“I’ve been affected by financial problems 5-6 years now.

It’s not just over the last few days, few months that’s had a problem.

I mean I’ve always been on a low income and struggling with rent, bills, expenses and inflation.

Savings up to 50k – anyone with that much saving is in a very fortunate position.

It doesn’t affect me too much, or my family because we’re not in that position

where we can worry about our savings, 50k is a bit of a dream for us.”


Jay Fields talks about Language Specialisation (or Generalisation). I think that would make life more difficult for someone in the early stages of their career because:

1) When job hunting a lot of recruitment agencies look for “5 years experience Java” and discard C# altogether.

2) It is impossible to learn the more fun stuff in different languages, e.g. the different frameworks, in sufficient detail (one can only become a Generalizing Specialist with time, and not in “boot camp” mode).

3) Perhaps one might be sufficiently proficient in Java before considering taking a plunge into C#, but a person will nevertheless have one strong point at any stage rather than be fully depth-oriented in both.

I’ve come to the end of the road in Java, and I am now pursuing C#, mostly for commercial reasons. My personal experience has shown me that there are more C# jobs than Java jobs (about a 65/35 ratio). C# offers a more concentrated opportunity to learn and contribute, as a lot of the work is being done under one roof by Microsoft, with strong open-source community involvement. I’ve done a lot of Java; now it’s time to do some C#. Time might be cyclical, but it might not be this time.

Good bye Java. Welcome C#. *Mem Reset*

Quite frequently, when talking to people in small or medium-sized companies, one of the most common questions I ask is which technologies they are using. The common response is that they are a full Microsoft shop and use Microsoft products only.

Probing further, I usually discover that this is because they are using C# and .NET technologies. More often than not, CruiseControl, Subversion, NAnt, and NHibernate are in use as well, and these are not Microsoft products but rather open source software, much of it developed in C#. Microsoft has its own source control and testing framework, which are not commonly used yet.

The only places I would truly consider Microsoft shops are many of the Gold Partners. Many exist, especially in the Guildford area, that say they don’t use anything unless it is developed by Microsoft. This has the benefit that the development effort is more concentrated; however, there is an issue of group-think.

When there is group-think, new ideas are not necessarily created, and if the group starts going in the wrong direction, there is no turning it around. In open source communities, by contrast, the audience is wide and has direct access to the inner workings of the software, so creativity and the adoption of good principles are likely to be high.

Many agile projects adopt variations of XP and Scrum as they see fit for their environment. One point which is hard to understand is how companies designing reporting tools decide whether to use burn up or burn down charts.

I am a strong supporter of the burn up chart, as it gives a more accurate picture of a long-running, extensible project. In a burn down chart it always seems that we are aiming for zero but can’t quite ever seem to reach it.

Let’s take a scenario:

Project A
Points: 100

After 5 iterations, 50 points are complete, and new requirements come in.

Points completed: 50
Points remaining: 50
New points: 50

Total remaining: 100

So now, in a burn down, it is not visible that we are aiming for 150 points in total; rather, it looks as if we are back where we were at the start of the project, with 100 points to do.

In a burn up we would clearly be able to see that we finished 50, 50 more came in, and our new target is 150 points, with 100 still to do. Burn down is not suitable for scope changes, and thus does not seem very agile.
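The arithmetic of the scenario can be written out in a few lines. The function names are just for illustration: a burn down chart plots a single number (points remaining), whereas a burn up chart plots two (points completed and total scope).

```python
from typing import Tuple


def burn_down(total_scope: int, completed: int) -> int:
    """A burn down chart plots one number: the points remaining."""
    return total_scope - completed


def burn_up(total_scope: int, completed: int) -> Tuple[int, int]:
    """A burn up chart plots two numbers: completed work and total scope."""
    return completed, total_scope


initial_scope = 100
completed = 50
new_points = 50

# Burn down: the start of the project and the moment after the scope
# change plot the same value, so the change itself is invisible.
at_start = burn_down(initial_scope, 0)
after_change = burn_down(initial_scope + new_points, completed)

# Burn up: the completed line stays at 50 while the scope line
# moves from 100 to 150, so the change is explicit.
up_before = burn_up(initial_scope, completed)
up_after = burn_up(initial_scope + new_points, completed)
```

Here `at_start` and `after_change` are both 100, while the burn up pair goes from (50, 100) to (50, 150).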