Wednesday, February 16, 2005

Maintainability

OSS Should Strive for Even Greater Maintainability
Open Source Code Maintainability Analyzed
Halstead Complexity Measures
Cyclomatic Complexity

What makes software maintainable? Is it the number of lines of code, the formatting, the density of operators, the frequency of comments, the Halstead Volume, or perhaps the Cyclomatic Complexity?

I don't think so. There's only one way you can measure software maintainability, and that's how well it's been written. The problem is that there's no algorithm that can tell you how well written a piece of software is.

Like a good movie script or a compelling novel, well-written software has less to do with metrics like the number of syllables or chapters and more to do with how the story is told.

The problem with most software is that it is not written to be read or maintained; it is written to be compiled and run. If the software industry understood even a little of the psychology involved in conveying information, it would have realised long ago that, like a novel, code needs to be presented in a readable format.

Which brings us to the first important factor in maintainability.

Understanding

For code to be understandable it must be clear, unambiguous and easy to read. This immediately excludes languages like Perl, frameworks like MFC, and standards like Hungarian Notation.

If you are an English speaker, then it makes sense that the easiest thing for you to understand is plain English, so it follows that code should be written in that form. Intent and semantics are far more important than type and scope. Mixed case isMuchEasierToRead than singlecasewithnospaces or single_case_with_underscores, and MUCH LESS OFFENSIVE THAN UPPERCASE. Understandable code is easier to maintain because it takes less time and effort to comprehend.
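
As a small illustration of intent over type and scope, the sketch below writes the same calculation twice: once with Hungarian-style prefixes and once with plain English names. Every function and variable name here is invented for the example, and TypeScript is used only because mixed case is its usual convention.

```typescript
// Hungarian-style: the prefixes repeat type information the compiler
// already knows, and say nothing about what the values mean.
function fnCalc(arrdPrices: number[], dRate: number): number {
  let dTmp = 0;
  for (const dP of arrdPrices) {
    dTmp += dP;
  }
  return dTmp * (1 + dRate);
}

// Plain English: the names carry the intent, so the body reads
// like a description of the calculation.
function totalPriceWithTax(itemPrices: number[], taxRate: number): number {
  let subtotal = 0;
  for (const itemPrice of itemPrices) {
    subtotal += itemPrice;
  }
  return subtotal * (1 + taxRate);
}
```

Both functions compute the same result. Only the second tells a reader what that result is for.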

Comprehension

Once you understand the code, you can work out what it's doing. When you work out what it's doing, you can determine its purpose and therefore comprehend it. When you can comprehend code, you are well on your way to being able to make effective modifications. Maintainable code should always be easy to modify.

But just comprehending the code is not enough. You also need to be able to work within the confines of the implementation.

Implementation

There is no one way to implement an algorithm; a thousand people will implement the same code in a thousand different ways. Implementations differ because different people have different priorities and ways of thinking about things. Good implementations are those which don't just achieve the desired outcome but also follow principles.

An implementation is a balance between performance, complexity, size, extensibility, and clarity. It is a common misconception that high-performance code necessitates a smaller and less extensible implementation with reduced clarity and increased complexity. It is an equally common misconception that larger code with greater extensibility results in increased clarity and reduced complexity.

The very best implementations are those which break down the problem and express it in terms of domain specific concepts and nomenclature, aiding in comprehension and making it easier to maintain.

Expressive code which is separated into obvious concepts with logical relationships and sensible naming makes it easier for a developer to modify one area without adversely impacting other, unrelated areas. It can also assist by removing complexity from otherwise monolithic implementations and encouraging re-use, without which maintenance would be a never-ending and impossible task.
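
To make this concrete, here is a minimal sketch of a calculation broken down into domain concepts rather than written as one monolithic function. The order model, the discount rate, and the shipping rules are all hypothetical and chosen only for illustration.

```typescript
// Hypothetical domain model for the example.
interface OrderLine {
  unitPrice: number;
  quantity: number;
}

interface Order {
  lines: OrderLine[];
  isMember: boolean;
}

// Assumed business rules, named so that they can change in one place.
const memberDiscountRate = 0.25;
const freeShippingThreshold = 100;
const flatShippingFee = 8;

// Each function expresses exactly one concept from the domain.
function orderSubtotal(order: Order): number {
  return order.lines.reduce(
    (sum, line) => sum + line.unitPrice * line.quantity,
    0,
  );
}

function memberDiscount(order: Order): number {
  return order.isMember ? orderSubtotal(order) * memberDiscountRate : 0;
}

function shippingFee(order: Order): number {
  return orderSubtotal(order) >= freeShippingThreshold ? 0 : flatShippingFee;
}

// The top-level rule reads like the sentence a domain expert would say.
function amountDue(order: Order): number {
  return orderSubtotal(order) - memberDiscount(order) + shippingFee(order);
}
```

A change to the shipping rule touches only shippingFee, and a reader who wants to know how the total is built needs to read only amountDue.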

Sometimes you may see an implementation which is so simple and understandable that you may wonder why it is that no one thought of it before. Such implementations are akin to graphical proofs of mathematical theorems. Powerful in their impact and yet obvious in their semantics.

Semantics

Semantics is possibly the most important aspect of maintainability, and yet the least understood. It is the meaning conveyed by the code. It influences how developers interact with and use the libraries or environment with which they develop their software.

Code which is written with good semantics is more likely to be used correctly and extended to produce better, more reliable software. Code with bad semantics, on the other hand, is likely to be difficult for other programmers to use and will therefore result in more maintenance issues.

The essential point here is that ambiguity of purpose leads to incorrect use, and incorrect use leads to bugs.
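
One common case of ambiguous purpose is a bare number whose unit is not stated. The sketch below contrasts an ambiguous signature with one that makes the unit part of the call. The Delay type and all function names are hypothetical and are not taken from any real library.

```typescript
// Ambiguous: nothing in the signature says whether `timeout` is in
// seconds or milliseconds, so callers have to guess.
function retryDelayAmbiguous(attempt: number, timeout: number): number {
  return attempt * timeout;
}

// Explicit: a Delay can only be built by naming its unit.
interface Delay {
  readonly milliseconds: number;
}

function seconds(value: number): Delay {
  return { milliseconds: value * 1000 };
}

function milliseconds(value: number): Delay {
  return { milliseconds: value };
}

// The caller must state the unit, so a seconds/milliseconds mix-up
// is visible at the call site instead of hiding inside a number.
function retryDelay(attempt: number, baseDelay: Delay): Delay {
  return milliseconds(attempt * baseDelay.milliseconds);
}
```

A call such as retryDelay(3, seconds(2)) states its meaning at the point of use, whereas retryDelayAmbiguous(3, 2) leaves the reader to look up what 2 stands for.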

Different types of software require different approaches. Writing a library or framework involves vastly different concerns from writing an application.

Writing great software is an art, not a technique, but by obeying common sense and using good principles you are more likely to succeed.

My favorite is the principle of least surprise.


There are a lot of common words for the concepts that I've described here which I have deliberately chosen not to use because of their ambiguous semantics. Different people interpret things in different ways, and labeling concepts can often result in the adoption of a false understanding of what they mean.