
# Calculating cyclomatic complexity for a whole class

harii haran
Greenhorn
Posts: 11
Hi,
I have been reading about calculating cyclomatic complexity, but all the examples given are for just a single method. How do we calculate it for an entire class?
Any help would be appreciated.

Lasse Koskela
author
Sheriff
Posts: 11962
5
I'm moving this to the OO forum. Please continue the discussion there.

Ilja Preuss
author
Sheriff
Posts: 14112
Cyclomatic Complexity is a metric for a flow of control - it basically counts the number of decision points. As a class doesn't represent a flow of control, it doesn't make sense to measure its cyclomatic complexity.
What you could do, though, is calculate its Weighted Method Count. This is the sum of a metric value over all methods of a class, and it could well be used together with Cyclomatic Complexity.
Did that help?

harii haran
Greenhorn
Posts: 11
But how do I calculate the Weighted Method Count? Could you please give me an example, or some links to read more about it?
Thanks.
thanks.

Ilja Preuss
author
Sheriff
Posts: 14112
You calculate the metric value for every method and then just sum them up.
For example, if you are using Cyclomatic Complexity as your method metric and you have a class with three methods of CC 2 each and one method with CC 1, the class would have a Weighted Method Count of 7.
Does that compute?
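As a quick sketch of that arithmetic, assuming Cyclomatic Complexity as the per-method metric (the class and method names here are made up for illustration, not part of any tool's API):

```java
// Hypothetical sketch: Weighted Method Count as the sum of a
// per-method metric (here, the Cyclomatic Complexity of each method).
public class WmcExample {

    // Sums the per-method metric values to get the class's WMC.
    public static int weightedMethodCount(int[] methodComplexities) {
        int sum = 0;
        for (int cc : methodComplexities) {
            sum += cc;
        }
        return sum;
    }

    public static void main(String[] args) {
        // Three methods of CC 2 each, plus one method of CC 1,
        // as in the example above: 2 + 2 + 2 + 1 = 7.
        int[] ccPerMethod = {2, 2, 2, 1};
        System.out.println(weightedMethodCount(ccPerMethod)); // prints 7
    }
}
```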

Reid M. Pinchback
Ranch Hand
Posts: 775
Generally the reason for computing McCabe cyclomatic complexity (I'll use 'MCC' for short) is to estimate testing or maintenance effort. A Weighted Method Count ('WMC') of class methods won't really give a correct estimate (correct in the sense that if you agree with the rationale behind using MCC, then the weighted method count won't compute a value consistent with that theory).
The simplest algorithm generally published in books is based on the analysis of a program (where 'program' is meant in the computing theory sense... a single terminating function). Ilja's point about flow of control is correct with respect to this commonly-published algorithm.
MCC(method)=#decisions + 1
There is a variation of the algorithm used for multi-module systems, which is where the reason behind that "+ 1" comes into play. The simpler algorithm is a special-case adaptation of a graph-theoretic computation. The graph-theoretic version can apply to a collection of disconnected subgraphs, which is what allows for applying it to multiple modules. The "+ 1" becomes "+ m" where m is the number of modules (think: methods, constructors, initializer code).
Here is an example of why WMC != MCC:
class Foo {
    public Foo() {}
    public int methodA(int x, int y) {
        if (x > 0) { return methodB(y); }
        return 0;
    }
    public int methodB(int y) {
        if (y > 0) { return 2*y; }
        return 0;
    }
}
MCC(methodA) = 1 (for the if) + MCC(methodB) + 1
MCC(methodB) = 1 (for the if) + 1
MCC(constructor) = 1
From a graph-theoretic standpoint, there are two disconnected subgraphs here; one containing just the constructor, the other containing methodA and methodB. MCC(methodA+methodB) != MCC(methodA)+MCC(methodB); you already accounted for methodB when computing MCC for methodA, because any use of methodA will tend to cause methodB to be used.
MCC(Foo) = MCC(constructor)+MCC(methodA) = 5
WMC(Foo) = MCC(methodA)+MCC(methodB)+MCC(constructor) = 7
The problem is that WMC is treating every method as if it were a disconnected subgraph, and that isn't correct. The more methods you have in a class, and the more they call each other, the more WMC will diverge from MCC.
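The arithmetic above can be sketched directly. The per-method CC values are the ones from the Foo example; the class name is made up for illustration:

```java
// Sketch of MCC vs. WMC for the Foo example above.
public class MccVsWmc {
    // Per-method values from the example:
    static final int MCC_CONSTRUCTOR = 0 + 1; // no decisions
    static final int MCC_METHOD_B = 1 + 1;    // one if
    // methodA's subgraph includes methodB, because methodA calls it:
    static final int MCC_METHOD_A = 1 + MCC_METHOD_B + 1;

    public static void main(String[] args) {
        // MCC sums over the two disconnected subgraphs
        // (constructor alone; methodA connected to methodB):
        int mccFoo = MCC_CONSTRUCTOR + MCC_METHOD_A;
        // WMC treats every method as its own subgraph,
        // so methodB's contribution is counted twice:
        int wmcFoo = MCC_METHOD_A + MCC_METHOD_B + MCC_CONSTRUCTOR;
        System.out.println("MCC(Foo) = " + mccFoo); // prints MCC(Foo) = 5
        System.out.println("WMC(Foo) = " + wmcFoo); // prints WMC(Foo) = 7
    }
}
```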
Note also that the MCC number does tend to correspond reasonably to the number of test cases required to achieve coverage. In this example, you'd need:
1 test for the constructor (in the case where your constructor did something more interesting than the example above)
4 tests for methodA, because you need 2 values of 'y' so you can exercise both branches of methodB, and 2 values of 'x' so you can exercise both branches of the code directly within methodA.
tests(subgraph1)+tests(subgraph2)=1+2*2=5.
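A minimal sketch of those five tests, using plain assertions rather than a test framework (Foo is the class from the example above, with methodB given its missing int return type; the test class name is made up):

```java
// Five tests that cover Foo, matching the count above.
class Foo {
    public Foo() {}
    public int methodA(int x, int y) {
        if (x > 0) { return methodB(y); }
        return 0;
    }
    public int methodB(int y) {
        if (y > 0) { return 2 * y; }
        return 0;
    }
}

public class FooCoverageTests {
    static void check(boolean condition, String name) {
        if (!condition) { throw new AssertionError(name + " failed"); }
    }

    public static void main(String[] args) {
        Foo foo = new Foo();                           // test 1: constructor
        check(foo.methodA(1, 3) == 6, "x>0, y>0");     // test 2
        check(foo.methodA(1, -3) == 0, "x>0, y<=0");   // test 3
        check(foo.methodA(-1, 3) == 0, "x<=0, y>0");   // test 4
        check(foo.methodA(-1, -3) == 0, "x<=0, y<=0"); // test 5
        System.out.println("all 5 covering tests pass");
    }
}
```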

Ilja Preuss
author
Sheriff
Posts: 14112
Originally posted by Reid M. Pinchback:
MCC(Foo) = MCC(constructor)+MCC(methodA) = 5
WMC(Foo) = MCC(methodA)+MCC(methodB)+MCC(constructor) = 7
The problem is that WMC is treating every method as if it were a disconnected subgraph, and that isn't correct. The more methods you have in a class, and the more they call each other, the more WMC will diverge from MCC.

I agree. MCC(Foo) has its own problems, though - it doesn't take the size of an "edge" into account. I think I can write two methods, both of MCC 1, with quite different complexity. With methods of a reasonable size, I have the feeling that WMC gives me a better approximation of the class's complexity.

Note also that the MCC number does tend to correspond reasonably to the number of test cases required to achieve coverage.

I would expect this to only work in isolation. If you take the clients of the class into account, you might find that some methods don't need explicit testing.
But I have to admit that I don't have any experience with using MCC or WMC for estimation purposes. I use them mainly as "code smell detectors"...

Reid M. Pinchback
Ranch Hand
Posts: 775
Originally posted by Ilja Preuss:
MCC(Foo) has its own problems, though - it doesn't take the size of an "edge" into account. I think I can write two methods, both of MCC 1, with quite different complexity.

And how exactly does a WMC based on MCC constitute taking the size of an 'edge' into account? There is also a gaping hole in the logic; if complexity is measured by a complexity algorithm of your choice, and two functions yield the same number, you can't then turn around and say they can have different complexity. You can say that different metrics measure complexity in different ways, but then obviously it is trivial to say that you can create two input examples such that metric A behaves differently than metric B.
In any case, edge size could definitely be an issue if you were concerned with estimating single-statement complexity, computational performance, programming effort, or the total volume of code compared with its logical complexity. Edge size is not relevant if you are trying to calculate testing effort (you either have a test that traverses the edge, or you don't... what is actually on the edge does not change the minimal number of tests you need to exercise - i.e. cover - the code).
If you take the clients of the class into account, you might find that some methods don't need explicite testing.

Obviously anybody can choose not to write a test (although from an XP or refactoring standpoint I bet that untested-because-it-is-unused code shouldn't even exist). From a library development standpoint (which class testing is), the tests should always be written, because the *current* client isn't your only concern, it's *all* clients that could use the library.
Choosing not to write a test is a separate issue from understanding the characteristics of a metric and the original motivation behind its creation. MCC is a logical complexity metric that counts the minimal number of paths in code (in other words, ignores the dynamic complexity of recursion), and hence the minimum number of tests required to test that code (because unless you have a test to traverse a path, you haven't established that the path can work by exercising it). It isn't an isolated occurrence that the computation yields numbers with this correspondence, it is intentionally its fundamental nature.
One issue MCC is not suited for, but WMC would be better at, is estimating refactoring effort. The WMC numbers probably give you a better sense of the dependencies behind the API you are using.
[ January 19, 2004: Message edited by: Reid M. Pinchback ]

Ilja Preuss
author
Sheriff
Posts: 14112
Originally posted by Reid M. Pinchback:
And how exactly does a WMC based on MCC constitute taking the size of an 'edge' into account?

Not at all, if viewed alone. If used as a "smell detector" in combination with other metrics such as method size (in Number Of Statements), it can work quite well, though.
There is also a gaping hole in the logic; if complexity is measured by a complexity algorithm of your choice, and two functions yield the same number, you can't then turn around and say they can have different complexity. You can say that different metrics measure complexity in different ways, but then obviously it is trivial to say that you can create two input examples such that metric A behaves differently than metric B.

Yes, sorry, I was sloppy. I used the term "complexity" in a subjective sense, such as in "hard to understand", "probability of having a bug" etc.
Edge size is not relevant if you are trying to calculate testing effort (you either have a test that traverses the edge, or you don't... what is actually on the edge does not change the minimal number of tests you need to exercise - i.e. cover - the code).

That, of course, depends on how you are writing the tests. If you are testing a big edge, you might get a big test for it, too. Personally I would probably prefer splitting such a test into several smaller ones.
And if you don't, I am not convinced that the number of tests corresponds very strongly to testing effort...
Obviously anybody can choose not to write a test (although from an XP or refactoring standpoint I bet that untested-because-it-is-unused code shouldn't even exist).

Yes and yes. Nevertheless I might decide not to explicitly test a simple accessor. It probably will be appropriately tested by the tests for its clients.
Choosing not to write a test is a separate issue from understanding the characteristics of a metric and the original motivation behind its creation.

Yes, of course.
MCC is a logical complexity metric that counts the minimal number of paths in code (in other words, ignores the dynamic complexity of recursion), and hence the minimum number of tests required to test that code (because unless you have a test to traverse a path, you haven't established that the path can work by exercising it). It isn't an isolated occurrence that the computation yields numbers with this correspondence, it is intentionally its fundamental nature.

Yes, I agree. I only wanted to point out that it isn't as straightforward as it might have been understood from the initial posts.
BTW, thanks for the concise explanations - it actually made some things clearer to me.

harii haran
Greenhorn
Posts: 11
Thanks a lot for taking the time to explain..
It made many things clear to me..
One more question: what if we have the main program and we know what the flow is going to be?
Is it OK to draw a flow diagram for the whole program and calculate the complexity from that? Or am I still not understanding the concept? :-(
Please correct me if I am wrong..
harii

Ilja Preuss
author
Sheriff
Posts: 14112
The way I understand it, that would result in the Cyclomatic Complexity of the whole program, yes.