Friday, January 11, 2008

Code Metrics in Visual Studio 2008

If there's one thing in a development project with which I have an almost unhealthy obsession, it's statistics. I'm rarely happier than when I'm knee-deep in profiler output or digging through build statistics, trying to work out from the code coverage levels where the unit testing needs beefing up. So when I discovered that the Team Developer versions of Visual Studio 2008 include some new Code Metrics functionality, I must admit it got me quite excited.

The metrics are available under the "Analyze" menu and allow you to scan selected projects or the whole solution. The results look like this:



The five indicators that are calculated for each project are:

Maintainability Index
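This is a summary value from 0 to 100 that is derived from other measurements. As I understand Microsoft's description, it combines cyclomatic complexity and lines of code with Halstead volume (a measure based on the operators and operands in the code, which isn't shown as a column of its own), rather than using all four of the other metrics. The higher the value, the more maintainable the code should be.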

Cyclomatic Complexity
This is an interesting measure: for a single method it is effectively the number of independent code paths through that method, and at the project level it is the total across all methods. The idea is that the more decision points there are in the code, the more unit tests are needed to cover it.

Depth of Inheritance
At the project level, the depth of inheritance is the maximum number of levels of inheritance across all types in the project. A deep inheritance hierarchy can point to over-engineering of the class structure. It can also hinder maintenance: the more layers of inheritance there are, the harder it is for someone to work out exactly where the code for a particular member lives.

Class Coupling
This metric indicates the level of coupling between classes in the module. It includes dependencies through properties, parameters, template instantiations, method return types, etc. The current mantra in development, particularly in these times of TDD, is that code should have high cohesion and low coupling, because this makes unit testing individual classes much easier to achieve.

Lines of Code
Finally, the lines of code count. I'm not entirely sure, but I suspect that this is an approximation based on the IL generated for each method rather than a straight count of the physical lines of code in the source files. It is useful when drilling down through the namespaces and types to see which methods are possibly doing too much and could be good candidates for refactoring.
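
To make the Maintainability Index a little less of a black box, here is a minimal sketch of the calculation. The formula is the one Microsoft has published for this metric, to the best of my knowledge; the class and method names are my own invention, and the Halstead volume has to come from somewhere else, since the metrics window doesn't show it. I've left the result as a double because I'm not certain how Visual Studio rounds the value it displays.

```csharp
using System;

public static class MaintainabilitySketch
{
    // Assumed formula (as published by Microsoft for Visual Studio Code Metrics):
    //   MAX(0, (171 - 5.2 * ln(Halstead Volume)
    //               - 0.23 * Cyclomatic Complexity
    //               - 16.2 * ln(Lines of Code)) * 100 / 171)
    public static double MaintainabilityIndex(
        double halsteadVolume, int cyclomaticComplexity, int linesOfCode)
    {
        // The natural log is only defined for positive inputs.
        if (halsteadVolume <= 0)
            throw new ArgumentOutOfRangeException("halsteadVolume");
        if (linesOfCode <= 0)
            throw new ArgumentOutOfRangeException("linesOfCode");

        double raw = 171
            - 5.2 * Math.Log(halsteadVolume)
            - 0.23 * cyclomaticComplexity
            - 16.2 * Math.Log(linesOfCode);

        // Rescale from the 0-171 range to 0-100 and clamp negatives to zero.
        return Math.Max(0.0, raw * 100.0 / 171.0);
    }

    public static void Main()
    {
        // Hypothetical inputs, purely for illustration.
        Console.WriteLine(MaintainabilityIndex(250.0, 4, 7));
    }
}
```

The two logarithmic terms mean that size (volume and lines of code) is penalised heavily at first and then tails off, while each extra decision point costs a fixed amount.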


The metric I'm most interested in at the moment, being a TDD acolyte, is the Cyclomatic Complexity (CC) value. I'm a big fan of using code coverage analysis tools such as NCover as part of a continuous integration environment, both to keep an eye on the level of coverage that the current set of unit tests is providing and to target those parts of the project where additional testing is required. CC is another tool to help with that: by drilling down into projects and namespaces with high CC, we can find the specific methods that have a high value and target those with extra unit tests.

So how exactly is Cyclomatic Complexity calculated? As a simple example have a look at the following code:

public int CountItemsWithStatus(IList<Item> items, StatusCode status)
{
    if (items == null)
    {
        throw new ArgumentNullException("items");
    }

    int count = 0;

    foreach (Item item in items)
    {
        if (item.Status == status)
        {
            count++;
        }
    }
    return count;
}


Obviously this is pretty standard stuff, but we're looking to count the number of possible paths through the method, so here we go:

1) The "items" parameter is null. The code will throw an exception.
2) The parameter contains an empty list. The code will skip the foreach and return 0.
3) The list is non-empty but none of the items have the required status, so the method returns 0.
4) The list is non-empty and one or more items have the required status, so the count++ is executed and the method returns the count.

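So by my reckoning, we have a Cyclomatic Complexity here of 4, in a method which contains approximately 7 lines (so a high ratio). That agrees with the usual shortcut of counting the decision points and adding one: the method has two ifs and a foreach, so 3 + 1 = 4. And in theory, we would need our unit tests to cover these four scenarios in order to completely cover all code in the method.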
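
As a rough sketch of what covering those four scenarios might look like, here is a small self-contained program with one check per path. To keep it free of any test framework I've used a hand-rolled Check helper instead of NUnit-style asserts, made the method static, and invented minimal Item and StatusCode types (with made-up Open and Closed values), so treat all of those as assumptions rather than the real code.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical stand-ins for the real types.
public enum StatusCode { Open, Closed }

public class Item
{
    public StatusCode Status { get; set; }
}

public static class ItemCounter
{
    // Same logic as the method above, made static for this sketch.
    public static int CountItemsWithStatus(IList<Item> items, StatusCode status)
    {
        if (items == null)
        {
            throw new ArgumentNullException("items");
        }

        int count = 0;
        foreach (Item item in items)
        {
            if (item.Status == status)
            {
                count++;
            }
        }
        return count;
    }
}

public static class CountItemsScenarios
{
    // Minimal assertion helper so no test framework is needed.
    static void Check(bool condition, string scenario)
    {
        if (!condition) throw new Exception("Failed: " + scenario);
    }

    public static void Main()
    {
        // 1) Null list: expect an ArgumentNullException.
        bool threw = false;
        try { ItemCounter.CountItemsWithStatus(null, StatusCode.Open); }
        catch (ArgumentNullException) { threw = true; }
        Check(threw, "null list throws");

        // 2) Empty list: the foreach is skipped and 0 is returned.
        Check(ItemCounter.CountItemsWithStatus(new List<Item>(), StatusCode.Open) == 0,
              "empty list returns 0");

        // 3) Non-empty list with no matching items: returns 0.
        var closedOnly = new List<Item> { new Item { Status = StatusCode.Closed } };
        Check(ItemCounter.CountItemsWithStatus(closedOnly, StatusCode.Open) == 0,
              "no matches returns 0");

        // 4) Non-empty list with matching items: count++ runs.
        var mixed = new List<Item>
        {
            new Item { Status = StatusCode.Open },
            new Item { Status = StatusCode.Closed },
            new Item { Status = StatusCode.Open }
        };
        Check(ItemCounter.CountItemsWithStatus(mixed, StatusCode.Open) == 2,
              "two matches returns 2");

        Console.WriteLine("All four scenarios passed");
    }
}
```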

One of the nice things about this method of analysis is that, by using the decision points (foreach, if, for, while, etc.) as a basis for complexity, we could use tools to automatically generate a lot of the unit tests for us.
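
To show how mechanical the decision-point idea is, here is a deliberately naive sketch that estimates CC by counting keywords in a method's source text and adding one. This is not how Visual Studio arrives at its figure, and the list of keywords and operators is my own assumption; it will also be fooled by keywords that appear inside strings or comments. The ComplexityEstimate class and its Estimate method are hypothetical names.

```csharp
using System;
using System.Text.RegularExpressions;

public static class ComplexityEstimate
{
    // Tokens treated as decision points. This list is an assumption,
    // not a definitive rule set.
    static readonly Regex DecisionPoints =
        new Regex(@"\b(if|for|foreach|while|case|catch)\b|&&|\|\|");

    // Estimated cyclomatic complexity: decision points + 1.
    public static int Estimate(string methodSource)
    {
        return DecisionPoints.Matches(methodSource).Count + 1;
    }

    public static void Main()
    {
        string source = @"
            public int CountItemsWithStatus(IList<Item> items, StatusCode status)
            {
                if (items == null)
                {
                    throw new ArgumentNullException(""items"");
                }

                int count = 0;

                foreach (Item item in items)
                {
                    if (item.Status == status)
                    {
                        count++;
                    }
                }
                return count;
            }";

        // Two ifs and one foreach give 3 decision points, so this prints 4.
        Console.WriteLine(Estimate(source));
    }
}
```

A real tool would work from the parsed code or the compiled IL rather than raw text, but the arithmetic is the same.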
