Tests for documentation ?

Frank Carver
Sheriff

Joined: Jan 07, 1999
Posts: 6920
As some of you may know, I am often an advocate for agile methodologies, and Extreme Programming (XP) in particular. I now find myself with the responsibility for producing some documentation for a project I have been working on, and would like to use XP practices as much as possible to accomplish this task.
I have to produce some "design and maintenance" documentation so that after I deliver the software and move on, future developers who need to work on the system will have the benefit of some of my understanding. Although I have tried my best to make the code self-documenting, I can see that there are many over-arching issues which don't make sense to document at any particular place in the code, so I'm "happy" to undertake the task.
The organization in which I am currently working tolerates my working habits, mostly. I have managed to get some of the parts of XP in place: The "customer" for this documentation sits just across a short partition from me; I have the freedom to deliver the documentation in whatever format I want, and deliver it in several small iterations etc. No pair programming on this one, though.
In my coding, I make extensive use of automated unit tests and refactoring. Not only can I not see how to provide a unit test "safety net" to allow me to refactor the documents as I wish, but I can't even see how to set up practical acceptance tests so I know when to stop!
The XP books say (glibly) that if you find you need documentation, simply schedule it like any other task. I'm finding that documentation is not like any other task.
Can anyone offer any help or suggestions ?


Read about me at frankcarver.me ~ Raspberry Alpha Omega ~ Frank's Punchbarrel Blog
Mark Herschberg
Sheriff

Joined: Dec 04, 2000
Posts: 6037
You've hit my big pet peeve--documentation. I'm really into documentation.
Here is the biggest problem I see with documentation: it's not easy to update. This is generally true. You write some design documents, then start coding, and the design changes. Developers then forget to update the documentation to reflect these changes.
My gut says this will be even worse in XP, just by the nature of the culture.

What you really want is something like Javadoc, something which can automatically generate documentation from a rapidly changing codebase. I wish I had a tool like that, but I think we're a while off.

--Mark
Frank Carver
Sheriff

Joined: Jan 07, 1999
Posts: 6920
In general, I think I agree. But.
I reckon most directly code-related documentation is best conveyed by the code itself, maybe with a little JavaDoc style markup here and there to clarify what something is for. But that still ignores all the alternative ways of looking at the solution, which might really help someone to get to grips with why things are done the way they are.
Often, when I have to fix a fault on some software I am familiar with, I can go directly to the code in question. I can do that not because I have some sort of class-lookup process in my head, but because I understand the "shape" of the software and the collective responsibilities of groups, subsystems and ad-hoc collaborations within the codebase.
Where I see the need for non-machine-generated documentation is in things like this. In some ways, the stuff UML is good for.
Ilja Preuss
author
Sheriff

Joined: Jul 11, 2001
Posts: 14112
Originally posted by Frank Carver:
In my coding, I make extensive use of automated unit tests and refactoring. Not only can I not see how to provide a unit test "safety net" to allow me to refactor the documents as I wish, but I can't even see how to set up practical acceptance tests so I know when to stop!

You are right - it is hard to come up with automatic tests for documentation outside of code. In fact, I have never heard of one.
I think that is the main reason why most agile advocates suggest to create these forms of documentation at the latest possible moment, so that you don't have to refactor it much...
Regarding acceptance tests - the best advice I can give you is to show your documentation to another developer and let him explain the documents to you. That should give you good insight into how well the documents convey the essentials of the system.

The XP books say (glibly) that if you find you need documentation, simply schedule it like any other task. I'm finding that documentation is not like any other task.

I think this advice in fact only regards scheduling...
What do you think?


The soul is dyed the color of its thoughts. Think only on those things that are in line with your principles and can bear the light of day. The content of your character is your choice. Day by day, what you do is who you become. Your integrity is your destiny - it is the light that guides your way. - Heraclitus
Ilja Preuss
author
Sheriff

Joined: Jul 11, 2001
Posts: 14112
Originally posted by Mark Herschberg:
What you really want is something like Javadoc, something which can automatically generate documentation from a rapidly changing codebase. I wish I had a tool like that, but I think we're a while off.

Mhh, automated unit tests can be seen as documentation which is automatically always up to date (if you happen to run them regularly, of course).
It would be interesting to try custom javadoc tags and an accompanying doclet, so that you could create a document with essential code examples from the unit tests...
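A rough sketch of that idea, in Java. It deliberately does not use the real Javadoc doclet API; as a simpler stand-in it scans test source text for marker comments and pulls out the lines between them. The marker names (@example-begin, @example-end) and the class name ExampleExtractor are made up for illustration.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper: collects the code between marker comments in a test source file.
final class ExampleExtractor {

    static final String BEGIN = "// @example-begin"; // assumed marker, not a real javadoc tag
    static final String END = "// @example-end";     // assumed marker, not a real javadoc tag

    /** Returns one string per marked region, with each line's surrounding whitespace stripped. */
    static List<String> extract(String source) {
        List<String> examples = new ArrayList<>();
        StringBuilder current = null; // non-null only while inside a marked region
        for (String line : source.split("\n")) {
            String trimmed = line.trim();
            if (trimmed.equals(BEGIN)) {
                current = new StringBuilder();
            } else if (trimmed.equals(END)) {
                if (current != null) {
                    examples.add(current.toString());
                    current = null;
                }
            } else if (current != null) {
                current.append(trimmed).append('\n');
            }
        }
        return examples;
    }
}
```

Because the examples are cut straight out of tests that are compiled and run, the generated document cannot drift away from the code the way hand-written samples can.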
Reid M. Pinchback
Ranch Hand

Joined: Jan 25, 2002
Posts: 775
Frank, if you are looking for a tool that will help you know if the comments in the source are up to date, you may want to take a look at iDoc
I haven't used it, so I'm not making any claims on how it works.
If you want a place to locate doc comments that aren't specific to a single class, I use the package page included by JavaDoc into the overview for that material.
As for UML diagrams, yes, I think those can be useful. As I understand it, Agile advocates tend to keep those to a minimum. Maybe just have a small number of diagrams that give the essential architecture of the system, and display them in the overview.
As for waiting until the last minute to write documentation, while I understand the logic I guess that makes me a bit nervous. In a large program, by the time I get to the end I've often forgotten why I did what I did. Also, I'd be really worried if I was the customer... I've been through too many situations where a consulting firm promised to write docs but just delivered some useless crud to get the client off their backs.


Reid - SCJP2 (April 2002)
Steve Fahlbusch
Bartender

Joined: Sep 18, 2000
Posts: 582
    

Frank,
I have a brief document at the system level that details system wide things, like roles, definitions, security issues, architecture (mostly the non-functional aspects of a system)
Then I have a - for lack of a better terminology - use case document for each feature that captures two things: the intent of a feature (or logical transaction) and test cases for the feature (or system level test if you would)
{note: these just might be links to my annotated test code - for system or feature level tests }.
Mostly a paragraph or two on the intent will do. But I also use these to document feature-specific stuff that is documented nowhere else.
I link the system doc to each of the feature docs and each feature doc to test source and a link to kick off each feature test.

Steve
[ July 08, 2002: Message edited by: Steve Fahlbusch ]
Mark Herschberg
Sheriff

Joined: Dec 04, 2000
Posts: 6037
Originally posted by Frank Carver:
In general, I think I agree. But.
I reckon most directly code-related documentation is best conveyed by the code itself, maybe with a little JavaDoc style markup here and there to clarify what something is for.

I wholeheartedly disagree. Code is complex, confusing, and can easily be misunderstood. Now I'm not for writing essays in code. Comments should be succinct. However, a few words interjected every few lines in the code, e.g. defining variables, describing the purpose of a for loop or non-trivial if statement, make the code more readable. People can go through it, i.e. maintain it, faster and with fewer mistakes. Method and class level comments help give a high level overview of the local code.

--Mark
Mark Herschberg
Sheriff

Joined: Dec 04, 2000
Posts: 6037
Originally posted by Ilja Preuss:

Mhh, automated unit tests can be seen as documentation which is automatically always up to date (if you happen to run them regularly, of course).

While I think automated unit tests are great, they are not documentation! When someone opens up your files to modify your code, the unit tests won't explain to them how it works, and why it was implemented this way, and not another way. Tests and documentation serve two different purposes.

--Mark
Mark Herschberg
Sheriff

Joined: Dec 04, 2000
Posts: 6037
Originally posted by Ilja Preuss:

I think that is the main reason why most agile advocates suggest to create these forms of documentation at the latest possible moment, so that you don't have to refactor it much...

I think this is inappropriate for the reasons Frank mentioned--two weeks after writing the code sometimes I can't understand it (assuming no comments), let alone 6 months later.
Mark's Rule of Documentation:

Code is not complete without comments.


--Mark
Ilja Preuss
author
Sheriff

Joined: Jul 11, 2001
Posts: 14112
Originally posted by Mark Herschberg:

I think this is inappropriate for the reasons Frank mentioned--two weeks after writing the code sometimes I can't understand it (assuming no comments), let alone 6 months later.

With all due respect, but perhaps that is telling us something about the complexity of your code?

Ilja's Rule of Refactoring:
Code is not completed when it seems to need additional documentation.


[ July 09, 2002: Message edited by: Ilja Preuss ]
Ilja Preuss
author
Sheriff

Joined: Jul 11, 2001
Posts: 14112
Originally posted by Mark Herschberg:
While I think automated unit tests are great, they are not documentation! When someone opens up your files to modify your code, the unit tests won't explain to them how it works, and why it was implemented this way, and not another way.

Well, they explain how the code is supposed to be used and what results to expect.
I think in almost any case, most of the *how* can best be documented by the code itself. You shouldn't need much more than a rough overview additionally, ime.
Can you give an example of what you mean by the "why", please?
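To make "how the code is supposed to be used and what results to expect" concrete, here is a small sketch in plain Java (no test framework assumed). HalfOpenRange and its checks are invented for illustration; the point is that each check name states a fact a reader would otherwise look for in a comment.

```java
// Hypothetical class under test: a range that includes its start and excludes its end.
final class HalfOpenRange {
    private final int start;
    private final int end;

    HalfOpenRange(int start, int end) {
        this.start = start;
        this.end = end;
    }

    boolean contains(int value) {
        return value >= start && value < end;
    }

    boolean isEmpty() {
        return start >= end;
    }
}

// Checks written so that their names read as statements about the behaviour.
final class HalfOpenRangeTest {
    static void startIsIncluded() {
        check(new HalfOpenRange(1, 5).contains(1));
    }

    static void endIsExcluded() {
        check(!new HalfOpenRange(1, 5).contains(5));
    }

    static void equalBoundsGiveAnEmptyRange() {
        HalfOpenRange range = new HalfOpenRange(3, 3);
        check(range.isEmpty());
        check(!range.contains(3));
    }

    static void runAll() {
        startIsIncluded();
        endIsExcluded();
        equalBoundsGiveAnEmptyRange();
    }

    private static void check(boolean condition) {
        if (!condition) throw new AssertionError("documented behaviour no longer holds");
    }
}
```

This shows the "how to use it" and "what to expect" side only; it says nothing about why a half-open range was chosen, which is the gap Mark is pointing at.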
Ilja Preuss
author
Sheriff

Joined: Jul 11, 2001
Posts: 14112
Originally posted by Mark Herschberg:
I wholeheartedly disagree. Code is complex, confusing, and can easily be misunderstood.

Does it have to be that way?
a few words interjected every few lines in the code, e.g. defining variables, describing the purpose of a for loop or non-trivial if statement, makes the code more readable.

I don't write methods bigger than a few lines of code, generally. The method name tends to describe the purpose of these lines adequately. And I don't accept non-trivial if statements in my code.
I use the extract-method refactoring regularly.
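As a hedged illustration of the extract-method refactoring mentioned here (not Ilja's actual code, which is not shown in the thread), the sketch below puts a commented version next to one where the comments have become method names. The class OrderTotals, the discount rule and all names are invented for the example.

```java
import java.util.List;

// Hypothetical example: the same calculation before and after extract-method.
final class OrderTotals {

    // Before: the loop and the condition each need a comment to explain their purpose.
    static int totalBefore(List<Integer> pricesInCents, boolean member) {
        int total = 0;
        // add up all item prices
        for (int price : pricesInCents) {
            total += price;
        }
        // members get 10% off orders of 100.00 or more
        if (member && total >= 10_000) {
            total -= total / 10;
        }
        return total;
    }

    // After: each commented fragment is extracted, and its name carries the explanation.
    static int totalAfter(List<Integer> pricesInCents, boolean member) {
        int subtotal = sumOf(pricesInCents);
        return qualifiesForMemberDiscount(member, subtotal)
                ? applyTenPercentDiscount(subtotal)
                : subtotal;
    }

    static int sumOf(List<Integer> pricesInCents) {
        int sum = 0;
        for (int price : pricesInCents) {
            sum += price;
        }
        return sum;
    }

    static boolean qualifiesForMemberDiscount(boolean member, int subtotalInCents) {
        return member && subtotalInCents >= 10_000;
    }

    static int applyTenPercentDiscount(int amountInCents) {
        return amountInCents - amountInCents / 10;
    }
}
```

The trade-off is the one argued over in this thread: the second version has no comments to go stale, but a reader has to follow four methods instead of one.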
Frank Carver
Sheriff

Joined: Jan 07, 1999
Posts: 6920
I keep trying to come back to the sort of documentation I am looking for. In some ways it is like what Mark describes :- the "why", but in a lot of cases (as Ilja points out), the local "why" can be expressed in the code as well as the how with careful naming and grouping.
But the information I'm trying to pass on to future developers using the code is all the stuff I know about the software which is orthogonal to the code itself. Things like: under what kinds of circumstances which associations of cache classes might be populated, how to find which piece of code is at fault when something goes wrong, external factors which might cause the build process to fail, and so on, and so on.
I really can't imagine any way of documenting this with the code or the unit tests, however clever the naming. I also can't think of anywhere in the (hundreds of) source files to put this sort of stuff which would stand any chance of being found by the people who might need it.
Reid M. Pinchback
Ranch Hand

Joined: Jan 25, 2002
Posts: 775
Originally posted by Frank Carver:
I really can't imagine any way of documenting this with the code or the unit tests, however clever the naming. I also can't think of anywhere in the (hundreds of) source files to put this sort of stuff which would stand any chance of being found by the people who might need it.

I agree, these are important things to document. They aren't going to show up in tests because often they reflect integration-level issues, not unit test issues. It is the one thing I really hated about getting stuff dumped on me from a consulting company; they would document things at the unit level - a lot of which I could have figured out on my own by looking at the code for 30 seconds. The problems that spanned code modules drove me nuts, because I'd have to learn a huge hunk of the app and reverse-engineer the design philosophy before I could figure out what was going on. Documentation to help you triage problems is very different from unit-test information.
Mark Herschberg
Sheriff

Joined: Dec 04, 2000
Posts: 6037
Originally posted by Ilja Preuss:

With all due respect, but perhaps that is telling us something about the complexity of your code?

I should have expected a comment like this from you :-) All my statements are subject to scrutiny and I welcome it when they are questioned.
Perhaps I just haven't seen the light and my code is overly complex. I think it's reasonably well designed, although maybe I'm not quite in that top 1% of the field.
However, I've found when I've got 20 some lines of code, the occasional comment will make it much more readable. Here's some code I wrote a while back


It's 5 lines of code, but I think those 2 lines of comments help clarify what's going on.
Originally posted by Ilja Preuss:

Does it have to be that way? {referring to complex code}
...
I don't write methods bigger than a few lines of code, generally. The method name tends to describe the purpose of these lines adequately. And I don't accept non-trivial if statements in my code.
I use the extract-method refactoring regularly.


And there-in lies the complexity. Again, maybe I'm missing something in all this refactoring, but the result is it makes a large number of methods and there is a lot of indirection. Unless you have 80 character method names for all your methods, it can get confusing what exactly each method does. Often I have to jump back and forth between methods to figure out what's going on. It's all this indirection and trying to hold a half dozen methods in my head (or anyone's head) at once, which is likely to cause misunderstanding and mistakes to be made.
Need code be complex to the point that such comments are unnecessary? Perhaps. But I don't see any current tools or methodologies which will help us to that end.
Originally posted by Ilja Preuss:

Well, they explain how the code is supposed to be used and what results to expect. {wrt unit tests}

So you're saying that if you need to understand some code, instead of reading comments embedded in the code, you should go read the test case code?

--Mark
Frank Carver
Sheriff

Joined: Jan 07, 1999
Posts: 6920
Mark Herschberg wrote: So you're saying that if you need to understand some code, instead of reading comments embedded in the code, you should go read the test case code?
In many ways, yes. Being brutally honest, I'm sure we've all encountered comments which were misleading, out of date or just plain wrong. Unit tests, on the other hand, have to be right (at least if they are actually ever run). The compiler and the code won't let them get out of step with reality.
Let's take your code snippet, for example. I won't snipe at your coding style or commenting style (I'm sure my own is worse), but take a look at this:

This is the same code a few weeks later. While testing a deadlock problem, a developer put a line of code in to force this method to only allocate one thread in the pool. Being a good commenter, he updated the comment to show what he was doing.
Then, after that and a bundle of other bugs were fixed, the code was due for release. Part of the release tests is to run a relatively slow thread pool and load test. Oops it failed! Suddenly he remembers the thread pool hack, and quickly puts back the old line of code. The tests now pass, and the code is released.
We now have a comment which is completely wrong. And worse, it's not an easy one to spot, confusing Math.min with Math.max is common at the best of times.
A few months later, our programmer is working on this code again to track down a performance problem. While browsing this file he sees his comment, and vaguely remembers putting in a thread pool limiting hack. "Oops, left it in there by mistake. No wonder it runs like a dog", he thinks to himself, and deletes both the comment and the Math.max line. He's a cautious sort of chap, so he re-runs all the tests. They all pass, but the performance is not noticeably better, so he keeps on looking and forgets about it.
Now it happens that this routine is almost always called with a parameter greater than one. Except after a timeout, when the worker pool has been emptied and needs to be started from scratch. Guess what, on Monday morning, a week after deployment, support starts getting loads of weird bug reports. "I hate working on this flaky piece of $#!?", thinks our developer, and rolls up his sleeves again.
After getting burned a few times like this, the crusty old programmers in the team learn to ignore the comments. I've seen people build automatic filters to remove all comments when trying to understand a complex bit of code.
My questions are these. Given the obvious time pressure most of us work under, would it have been more valuable to the project to spend the time commenting this limit case, or writing a unit test for it? Would this problem (and, admit it, it's not a particularly unlikely scenario) have been made better or worse if there were no comments at all?
[ July 10, 2002: Message edited by: Frank Carver ]
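The code from this post did not survive in the thread, so the following is only a guess at the shape of the limit case Frank describes: a pool that must start at least one worker even when asked for fewer. WorkerPoolSizing and workersToStart are hypothetical names, and the check is plain Java standing in for a unit test.

```java
// Hypothetical reconstruction of the limit case; not the original thread-pool code.
final class WorkerPoolSizing {

    /** Never start fewer than one worker, even when the requested count is zero or negative. */
    static int workersToStart(int requested) {
        // Math.max keeps the LARGER value, so 1 acts as a floor.
        // Math.min(1, requested) would instead cap the pool at one worker.
        return Math.max(1, requested);
    }

    /** The check that would have failed as soon as the Math.max line was deleted. */
    static void checkEmptyPoolRestartsWithOneWorker() {
        if (workersToStart(0) != 1) {
            throw new AssertionError("an emptied pool must restart with one worker");
        }
        if (workersToStart(5) != 5) {
            throw new AssertionError("a normal request must not be capped");
        }
    }

    public static void main(String[] args) {
        checkEmptyPoolRestartsWithOneWorker();
        System.out.println("limit case holds");
    }
}
```

In the story above, the developer re-ran the tests after deleting the line and they all passed; a check like this one is what was missing from that suite.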
Ilja Preuss
author
Sheriff

Joined: Jul 11, 2001
Posts: 14112
Originally posted by Frank Carver:
We now have a comment which is completely wrong. And worse, it's not an easy one to spot, confusing Math.min with Math.max is common at the best of times.

In fact, I wondered whether you accidentally posted the unchanged code until you mentioned the difference... :roll:
Ilja Preuss
author
Sheriff

Joined: Jul 11, 2001
Posts: 14112
Originally posted by Ilja Preuss:

In fact, I wondered whether you accidentally posted the unchanged code until you mentioned the difference... :roll:

Hell, it *was* the unchanged code, wasn't it?
Ilja Preuss
author
Sheriff

Joined: Jul 11, 2001
Posts: 14112
Originally posted by Mark Herschberg:
Perhaps I just haven't seen the light and my code is overly complex. I think it's reasonably well designed, although maybe I'm not quite in that top 1% of the field.

From what you have presented it doesn't seem to be overly complex to me. OTOH it might probably still be simpler. Well, at least I think *my* code *always* could...
I don't think that it matters whether you are "in the top 1% of the field", it just matters that you don't stop striving to get even better.

It's 5 lines of code, but I think those 2 lines of comments help clarify what's going on.

Well, here is what I would have done to get rid of the comments:

Of course, if the ThreadPool got used by outside clients without access to the source code, I would still want to add some javadoc.

What do you think?

So you're saying that if you need to understand some code, instead of reading comments embedded in the code, you should go read the test case code?

No, I am saying that *I* would rather read the test case code than the embedded comments.
Regards, Ilja
Ilja Preuss
author
Sheriff

Joined: Jul 11, 2001
Posts: 14112
Originally posted by Mark Herschberg:
And there-in lies the complexity. Again, maybe I'm missing something in all this refactoring, but the result is it makes a large number of methods and there is a lot of indirection.

Yes. And a lot of classes, I might add.
Unless you have 80 character method names for all your methods, it can get confusing what exactly each method does.

That is not my experience. A long method name might be an indication that the method wants to reside on another class, though.
Often I have to jump back and forth between methods to figure out what's going on. It's all this indirection and trying to hold a half dozen methods in my head (or anyone's head) at once, which is likely to cause misunderstanding and mistakes to be made.

I probably needed some time to get used to it, but I find that I don't need to hold half a dozen methods in my head. Generally I tend to trust objects and methods to do what their names tell me they do.
Of course, an extensive test suite which will immediately tell me about a misunderstanding, and modern tools like a decent class browser help a lot.
Here is a typical class as I tend to write them (from a current hobby project). I think it is rather easy to understand despite the lack of comments, though it probably could be made even simpler.
See also http://c2.com/cgi/wiki?LotsOfLittleMethods for another discussion of this subject.
Regards, Ilja
Frank Carver
Sheriff

Joined: Jan 07, 1999
Posts: 6920
It was brave of you to post that Ilja. I'm usually too ashamed of my own code to post it here. But now, both you and Mark have "put your code where your mouth is". I'll have to find some I'm happy enough with.
I sure felt a huge compulsion to "extract method" when I saw just how many times you use "game.getField()", though.
Frank Carver
Sheriff

Joined: Jan 07, 1999
Posts: 6920
Oh, and could you show us the tests for that behaviour, to show how much we can learn from them about this code. Thanks.
Ilja Preuss
author
Sheriff

Joined: Jul 11, 2001
Posts: 14112
Originally posted by Frank Carver:
It was brave of you to post that Ilja. I'm usually too ashamed of my own code to post it here. But now, both you and Mark have "put your code where your mouth is". I'll have to find some I'm happy enough with.



I sure felt a huge compulsion to "extract method" when I saw just how many times you use "game.getField()", though.

Mhh, yes - though I am not sure I am that happy with that game field at all...
This project in fact started as a test project to try some different design (and other development) practices. You should have seen it in its early state - we had just started reading the GOF book and Singletons were scattered all over the place... (in fact, the game and the field are in some way still suffering from this period, imo)
The good thing is, it is getting better and better through time - while adding functionality!
Ilja Preuss
author
Sheriff

Joined: Jul 11, 2001
Posts: 14112
Originally posted by Frank Carver:
Oh, and could you show us the tests for that behaviour, to show how much we can learn from them about this code. Thanks.

Yes, they are here and here. Please take into account that we are still learning how to best write these tests (and slowly recovering from our early mistakes - well, besides our current ones... ).
BTW, you can even browse the whole CVS repository of the project...
[ July 10, 2002: Message edited by: Ilja Preuss ]
Mark Herschberg
Sheriff

Joined: Dec 04, 2000
Posts: 6037
Originally posted by Frank Carver:

{franks counter-points}

Well therein lies the problem. Always assume comments lie, at least to the point that you assume the code is buggy. So then how can we use them?
Well, drawing on comments from Cockburn's Agile Software Development which I was reading last night, we only need to be able to parse things "well enough." We know that a bug could potentially have origins anywhere in the code. However, we skim through the classes and packages, trusting them to be correct, until we get to a class/method/section of code which seems like it's closely related. That code, we inspect more carefully.
We do the same thing with code comments. Given a class in which you think there's a bug, you skip the methods which you think are unrelated, and focus on where you think the area is. You inspect that code line by line. You assume the other methods are correct, at least until the Sherlock Holmes deductive reasoning requires you to recheck your assumptions.
We add documentation because code is hard to read, and we make mistakes reading it. Trying to understand code by reading more code sounds dubious.
I can't imagine anyone who's had to maintain uncommented code, suggesting that we don't bother commenting.

Originally posted by Frank Carver:

My questions are these. Given the obvious time pressure most of us work under, would it have been more valuable to the project to spend the time commenting this limit case, or writing a unit test for it? Would this problem (and, admit it, it's not a particularly unlikely scenario) have been made better or worse if there were no comments at all?

Good point. That's why when you're under time pressure you should always skip the unit tests and documentation. I mean, the code is more important, right? It'll probably work so doesn't need the tests, and it should be clear enough so save some time on documentation. Besides, I'm sure engineers will be all too happy to go back and write the tests and documentation later. Engineers are very thorough like that! Ooohhh, ooohhh, better yet, if you use single letter variable and method names, you'll code faster because you'll be pressing fewer keys!
For the same reason you cringed when I told you to skip the tests, you should similarly cringe when you skip the documentation. Just as code is not complete until it's documented, it's also not complete until it's been unit tested. Skipping either one is a recipe for disaster.

--Mark
Ilja Preuss
author
Sheriff

Joined: Jul 11, 2001
Posts: 14112
Originally posted by Mark Herschberg:
We add documentation because code is hard to read, and we make mistakes reading it.

Well, that is one way to react to hard to read code. The other is to make the code easier to read.
I like the metaphor Martin Fowler uses in his Refactoring book. He talks about indicators for code that should be refactored as "smells". In this metaphor, comments are "perfume" - they don't remove the smell, but they cover it.
Of course, sometimes there is no other way than commenting code (or at least, we don't see another way at the moment). Then it is certainly better to have the comment than to have nothing.
Nevertheless, I would always see a comment as a missed opportunity to bring the code in a better state.
I can't imagine anyone who's had to maintain uncommented code, suggesting that we don't bother commenting.

Oh, I can!

Good point. That's why when you're under time pressure you should always skip the unit tests and documentation. I mean, the code is more important, right? It'll probably work so doesn't need the tests, and it should be clear enough so save some time on documentation.

Well, in contrast to writing the comments, writing the tests will make me go faster, so it would be really stupid to skip them, wouldn't it? The same with refactoring the code to a clear state. (Mhh, that doesn't mean that I never fall into this trap myself... :roll: )
[ July 10, 2002: Message edited by: Ilja Preuss ]
Mark Herschberg
Sheriff

Joined: Dec 04, 2000
Posts: 6037
Originally posted by Ilja Preuss:

Can you give an example of what you mean by the "why", please?

The why can be for many reasons:
1) There may be multiple implementations.
Consider the choice of a vector or array. You choose one, and you mention the reason why, e.g. we can't know how many records will be returned by the DB. This is probably a fairly low-level implementation detail, which may not be covered in higher level docs. However, if the requirements change, e.g. the DB will always return 20 records at a time, then maybe an array would be a better choice. (We'll ignore other issues for why it won't be, this is a trivial example.)
2) The why may have to do with constraints.
For example, maybe there's a pre-condition, post-condition, or invariant, which should be met by the code or method.
3) The why may simply be there for clarity.
It will explain what the author was thinking.

--Mark
Mark Herschberg
Sheriff

Joined: Dec 04, 2000
Posts: 6037
Originally posted by Ilja Preuss:

Of course, if the ThreadPool got used by outside clients without access to the source code, I would still want to add some javadoc.

What do you think?

I think Frank's example of how coders muck with each other's code makes my point.
You don't need to comment because the class is internal. But then, some developer learning the value of reuse, makes the methods less private and starts calling them. Perhaps the methods get moved to new classes. In any case, now the methods are being used by other people. Don't say it will never happen, it happens more often than not. Now you start to have undocumented production code.
But hey, code maintenance is always the easy part, and I doubt the lack of comments will hinder it much. :-p

--Mark
Mark Herschberg
Sheriff

Joined: Dec 04, 2000
Posts: 6037
Originally posted by Ilja Preuss:

Well, that is one way to react to hard to read code. The other is to make the code easier to read.

I think you missed my point. Again, maybe I'm missing something, but I don't think that's realistic. I think code is inherently hard to read. With all due respect, I see that statement as equivalent to:
Code is error prone.
Well, maybe you should write higher quality code.
It's not so easy to do.
Originally posted by Ilja Preuss:

Well, in contrast to writing the comments, writing the tests will make me go faster, so it would be really stupid to skip them, wouldn't it? The same with refactoring the code to a clear state. (Mhh, that doesn't mean that I never fall into this trap myself... )

That was my point. I go faster when I write comments, for two reasons...
1) I write comments before I write code. It gets me thinking about what I'm going to do, and structures my thoughts.
2) I feel more confident about the code I do write, because I know I can come back to it and quickly understand what's going on. I can also better integrate with existing code, because I can see what's happening there, too.
Do these arguments sound familiar? ;-)

--Mark
Ilja Preuss
author
Sheriff

Joined: Jul 11, 2001
Posts: 14112
Originally posted by Mark Herschberg:
1) There may be multiple implementations.
Consider the choice of a vector or array. You choose one, and you mention the reason why, e.g. we can't know how many records will be returned by the DB. This is probably a fairly low-level implementation detail, which may not be covered in higher level docs. However, if the requirements change, e.g. the DB will always return 20 records at a time, then maybe an array would be a better choice. (We'll ignore other issues for why it won't be, this is a trivial example.)

I see two options here:
a) the choice is driven by a requirement
In this case, the choice should be documented by at least one test - if I change the implementation so that the requirement isn't fulfilled any longer, a test should fail.
b) the choice is arbitrary
Well, why should we care about it?
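A sketch of option a) applied to Mark's vector-versus-array example: the requirement "we can't know how many records the DB returns" written as a check that fails if someone swaps in a fixed-size buffer. RecordCollector is hypothetical, and a plain Iterator stands in for whatever the real database cursor would be.

```java
import java.util.ArrayList;
import java.util.Iterator;
import java.util.List;

// Hypothetical collector; the Iterator is a stand-in for a real database cursor.
final class RecordCollector {

    /** Collects every record the cursor yields, however many there are. */
    static List<String> collect(Iterator<String> cursor) {
        List<String> records = new ArrayList<>(); // growable, because the count is unknown up front
        while (cursor.hasNext()) {
            records.add(cursor.next());
        }
        return records;
    }

    /** The requirement as a check: no record count may be assumed in advance. */
    static void checkHandlesAnyNumberOfRecords() {
        for (int count : new int[] {0, 1, 20, 1000}) {
            List<String> input = new ArrayList<>();
            for (int i = 0; i < count; i++) {
                input.add("record-" + i);
            }
            if (collect(input.iterator()).size() != count) {
                throw new AssertionError("records lost at count " + count);
            }
        }
    }
}
```

If the requirement later changes to "always 20 records", the check changes with it, which records the new decision in the same place.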

2) The why may have to do with constraints.
For example, maybe there's a pre-condition, post-condition, or invariant, which should be met by the code or method.

Can you come up with such a constraint that can't be documented by the code itself or in unit tests?
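
For illustration only, this is roughly what a precondition and an invariant look like when they are stated in the code and pinned down by a check. It is a sketch with a hypothetical Account class, and it does not settle whether every constraint can be expressed this way.

```java
// Hypothetical class, used only to illustrate the point.
class Account {
    private long balanceInCents;

    Account(long openingBalanceInCents) {
        if (openingBalanceInCents < 0) {
            throw new IllegalArgumentException("opening balance must not be negative");
        }
        balanceInCents = openingBalanceInCents;
    }

    void withdraw(long amountInCents) {
        // Precondition, stated in code rather than in a comment.
        if (amountInCents <= 0) {
            throw new IllegalArgumentException("amount must be positive");
        }
        // Invariant: the balance never drops below zero.
        if (amountInCents > balanceInCents) {
            throw new IllegalStateException("insufficient funds");
        }
        balanceInCents -= amountInCents;
    }

    long balanceInCents() {
        return balanceInCents;
    }
}

class AccountCheck {
    // Pins the invariant: an overdraft is rejected and the balance is unchanged.
    static boolean rejectsOverdraft() {
        Account account = new Account(500);
        try {
            account.withdraw(501);
            return false;
        } catch (IllegalStateException expected) {
            return account.balanceInCents() == 500;
        }
    }
}
```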

3) The why may simply be there for clarity.
It will explain what the author was thinking.

Which thoughts are critical for understanding the code but cannot be documented by the code itself? Do you have an example?
Ilja Preuss
author
Sheriff

Joined: Jul 11, 2001
Posts: 14112
Originally posted by Mark Herschberg:
You don't need to comment because the class is internal. But then some developer, learning the value of reuse, makes the methods less private and starts calling them. Perhaps the methods get moved to new classes. In any case, now the methods are being used by other people.

You mean you have collective code ownership across team boundaries? (And I mean team in the sense of XP here - with tight collaboration and heavy communication.) I'm not surprised that such a thing wouldn't work well...
Ilja Preuss
author
Sheriff

Joined: Jul 11, 2001
Posts: 14112
Originally posted by Mark Herschberg:
I think you missed my point. Again, maybe I'm missing something, but I don't think that's realistic. I think code is inherently hard to read. With all due respect, I see that statement as equivalent to:
Code is error prone.
Well, maybe you should write higher quality code.
It's not so easy to do.

Well, it's not so hard to do, once you know how to do it.
But if you don't believe it can be done, you will probably be right...

That was my point. I get faster when I write comments, for two reasons...
1) I write comments before I write code. It gets me thinking about what I'm going to do, and structures my thoughts.
2) I feel more confident about the code I do write, because I know I can come back to it and quickly understand what's going on. I can also better integrate with existing code, because I can see what's happening there, too.
Do these arguments sound familiar? ;-)

*Very* familiar, indeed. In fact I discovered "comment-first development" in high school and was a big fan of it. I dropped it when I started test-first development - it just became redundant.
If I had to drop tests for some reason, I probably would resort to comments. I hope I never have to, as it would make me less efficient.
[ July 10, 2002: Message edited by: Ilja Preuss ]
Frank Carver
Sheriff

Joined: Jan 07, 1999
Posts: 6920
In a flicker of inspiration today, a model occurred to me of why I see code and unit tests as being more important than comments.
I don't know if any of you guys have done much accounting, but vital to all modern accounting is the idea of double-entry bookkeeping. In a double-entry system, every transaction appears in two places. You can gain a massive amount of confidence in the accounts by ensuring that the totals "balance" - the sums of the two places are the same.
Of course, it's still possible to miss something completely, so it never appears in either column, but that's what auditing (customer acceptance testing) is for. An implementation error on either side of a transaction will show up immediately; if I transpose two digits in a number, the two totals are different.
With comments, this is simply not the case. There is no counter-check for the validity of a comment. If a developer forgets, loses concentration, or just plain screws up a bit of tested code, or a bit of implemented test, the system goes out of balance, and the next test run will show it. If a developer forgets, loses concentration, or just plain screws up a comment, nothing happens at all, the screw-up is delivered.
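
A small sketch of the "two entries" idea, in plain Java so it stands alone. The names and the 17.5% rate are assumptions for illustration only.

```java
// Hypothetical example; the rate and the names are assumptions for illustration.
class Surcharge {
    // Intended: 17.5% of the net amount, in whole cents (fractions are dropped).
    // Nothing verifies this comment; only the check below verifies the code.
    static long surchargeInCents(long netInCents) {
        return netInCents * 175 / 1000;
    }
}

class SurchargeCheck {
    // The "second entry": if 175 is mistyped as 157 above, this no longer balances.
    static boolean balances() {
        return Surcharge.surchargeInCents(1000) == 175;
    }
}
```

Transposing the digits in the code breaks the check on the next run. Transposing them in the comment changes nothing that any tool will notice.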
If we take Mark's viewpoint, that comments are an important part of the software, this is a worrying situation. We have just delivered an unknowingly faulty system.
If we take Ilja's viewpoint, that comments are largely a waste of space, then there was no comment to screw up, so the system is still in good shape.
Don't get too smug, though, Ilja.
What happens if a developer forgets, loses concentration, or just plain screws up choosing a name for something, or which class a method should belong to, or whatever? The tests are coded to test the same names and responsibilities, so they pass.
If we take Ilja's viewpoint, that naming and responsibilities are important parts of the software, then we've just delivered an unknowingly faulty system.
If we take Mark's viewpoint that commented, working, code is what matters, there's no issue. The system is still in good shape.
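
A sketch of the naming case, with a deliberately wrong, hypothetical method name. The check is written against the same wrong name, so it passes anyway.

```java
// Hypothetical example of a naming slip that no test can catch.
class Temperature {
    // Misnamed on purpose: this actually converts Fahrenheit to Celsius.
    static double toFahrenheit(double fahrenheit) {
        return (fahrenheit - 32.0) * 5.0 / 9.0;
    }
}

class TemperatureCheck {
    // Uses the same wrong name as the code, so the behaviour is verified
    // while the misleading name goes unnoticed.
    static boolean passesDespiteWrongName() {
        return Math.abs(Temperature.toFahrenheit(212.0) - 100.0) < 1e-9;
    }
}
```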
Hmm. Anyone want to disagree that naming and grouping of code are just comments, with all the advantages and disadvantages thereof?
Ilja Preuss
author
Sheriff

Joined: Jul 11, 2001
Posts: 14112
Originally posted by Frank Carver:
Don't get too smug, though, Ilja.

Well, I always endeavor to not get too smug. If I fail (which I certainly do sometimes), please nudge me...

What happens if a developer forgets, loses concentration, or just plain screws up choosing a name for something, or which class a method should belong to, or whatever? The tests are coded to test the same names and responsibilities, so they pass.
If we take Ilja's viewpoint, that naming and responsibilities are important parts of the software, then we've just delivered an unknowingly faulty system.
If we take Mark's viewpoint that commented, working, code is what matters, there's no issue. The system is still in good shape.

Well, only if that same developer *didn't* get inattentive regarding the comments. Is that a reasonable thing to expect?
Also, let's suppose we just *did* screw up - which one is more likely to get noticed (and fixed) later on, the screwed up code or the screwed up comment?
But it seems to me that Mark *would* agree that readable code would be better than comments - after all, resorting to comments to understand code also adds an indirection. Additionally, keeping the code clean also makes it easier to extend, so I would be surprised to hear Mark say that "working code with good comments" equals "good enough code". I didn't read anything like that in his posts, at least...
So for me the debate seems to be more about whether it would be *possible* to write code that doesn't need comments than whether it would be preferable.

Hmm. Anyone want to disagree that naming and grouping of code are just comments, with all the advantages and disadvantages thereof?

Yes.
IME the naming of code is far less easy to ignore, because I simply *have* to read it when I want to work on it.
And the grouping of code also has some impact on reuse - smaller, more focused units tend to be easier to reuse.
Mark Herschberg
Sheriff

Joined: Dec 04, 2000
Posts: 6037
Originally posted by Ilja Preuss:

a) the choice is driven by a requirement
In this case, the choice should be documented by at least one test - if I change the implementation so that the requirement isn't fulfilled any longer, a test should fail.

I agree.
Originally posted by Ilja Preuss:

b) the choice is arbitrary
Well, why should we care about it?

Well, let's say performance is an issue. In my example you choose an array over a vector for performance reasons. Now performance is a requirement and there are specific tests for it, but those tests will not delve into this level. Performance is one of those cross-system soft requirements which is not implemented in a single location. So you need notes as to why. The same applies to decisions for security, scalability, etc.
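
Something like the following is the kind of note meant here. It is only a sketch: RecentScores is a hypothetical class, and the rationale in the comment is an assumed example, not a measured result.

```java
// Hypothetical class; the rationale in the comment is an assumed example of a "why" note.
class RecentScores {
    // Why a plain int[] rather than a Vector: this buffer is read in a hot loop,
    // the capacity is fixed at 20 by the (assumed) paging requirement, and a
    // primitive array avoids boxing and synchronization overhead.
    private static final int CAPACITY = 20;
    private final int[] scores = new int[CAPACITY];
    private int count;

    boolean add(int score) {
        if (count == CAPACITY) {
            return false; // full: the caller has to start a new page
        }
        scores[count++] = score;
        return true;
    }

    int total() {
        int sum = 0;
        for (int i = 0; i < count; i++) {
            sum += scores[i];
        }
        return sum;
    }
}
```

A functional test can confirm the capacity and the totals, but not the reason the array was chosen, which is why the note sits next to the field.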
--Mark
Mark Herschberg
Sheriff

Joined: Dec 04, 2000
Posts: 6037
Rather than respond to all the individual points, I'm going to surface back up to the higher issues.
Code is hard to write, full of bugs, and hard to maintain. I believe this to be fundamentally true in our current paradigm. Perhaps some day we'll have sufficient tools and/or methodologies to write bug-free code, but not today. Look at the bug rates and maintenance costs of current software to confirm this.
I do not believe we can write "simple code." That is to say, any and all code we write will be complex to the point that someone else looking at it will have to spend non-trivial time trying to understand it and is likely to misread and/or misunderstand the points.
Using more code to clarify the situation, i.e. looking at the test code to understand what's going on, in my opinion, is heading in the wrong direction. Can bad documentation cause trouble? Damn straight it can. Can poor variable names cause trouble? Right again. Does this mean we should get rid of variables? Of course not, it means we should create guidelines and follow them.
Documentation, if used properly, can make code easier to understand and maintain. Oh, but wait, people don't like to write documentation and won't update it. Now take that previous sentence and replace "documentation" with "tests." If you can get programmers to do one, you can get them to do the other.

--Mark
Mark Herschberg
Sheriff

Joined: Dec 04, 2000
Posts: 6037
Originally posted by Ilja Preuss:

You don't need to comment because the class is internal. But then some developer, learning the value of reuse, makes the methods less private and starts calling them. Perhaps the methods get moved to new classes. In any case, now the methods are being used by other people.
You mean you have collective code ownership across team boundaries? (And I mean team in the sense of XP here - with tight collaboration and heavy communication.) I'm not surprised that such a thing wouldn't work well...


How many slapped together pieces of demo code end up in production environments?
How many code hacks never got fixed the next day?
How many comments were never written later?
How many tests are going to be written tomorrow?
How many things will be cleaned up after release?


I promise you this, your code will be used by people you never met, in ways you never intended. Thinking you don't need to comment the code because no one else will see it/use it is a Bad Idea(TM).
(This presumes you buy my previous argument that comments are necessary.)

But to answer your question directly... at my last company, we had roughly 4 teams, although people and teams were very dynamic. People often would modify each other's code. Not unilaterally, they would usually check first with the original author. You think the original author remembers what he wrote 4 months back, if it's uncommented? I certainly don't.

--Mark
Ilja Preuss
author
Sheriff

Joined: Jul 11, 2001
Posts: 14112
Originally posted by Mark Herschberg:
I promise you this, your code will be used by people you never met, in ways you never intended. Thinking you don't need to comment the code because no one else will see it/use it is a Bad Idea(TM).
(This presumes you buy my previous argument that comments are necessary.)

I don't buy that many comments are necessary as long as the person working on the code can resort to
a) the well formed code itself
b) its unit tests
c) and a person familiar with the code

But to answer your question directly... at my last company, we had roughly 4 teams, although people and teams were very dynamic. People often would modify each other's code. Not unilaterally, they would usually check first with the original author. You think the original author remembers what he wrote 4 months back, if it's uncommented? I certainly don't.

I think that he would remember very fast once he sits down with the partner to browse through the code to explain it.
Ilja Preuss
author
Sheriff

Joined: Jul 11, 2001
Posts: 14112
Originally posted by Mark Herschberg:
Code is hard to write, full of bugs, and hard to maintain. I believe this to be fundamentally true in our current paradigm.

Well, I don't know your paradigm very well, so it may be true.
Perhaps some day we'll have sufficient tools and/or methodologies to write bug-free code, but not today. Look at the bug rates and maintenance costs of current software to confirm this.

Well, there are XP teams reporting very low bug rates. I have heard of one whose members didn't know whether the bug database was still working, because they didn't use it anymore (they had dozens of bugs per week before adopting XP, iirc)...
And there are people claiming that adopting XP does in fact involve a paradigm shift.

Documentation, if used properly, can make code easier to understand and maintain.

The same is true for the code itself.
It is *my* strong belief (and experience) that concentrating on the code pays back much more than concentrating on documentation - so much so that with really well crafted code, most additional documentation brings far fewer benefits than costs.
Oh, but wait, people don't like to write documentation and won't update it. Now take that previous sentence and replace "documentation" with "tests." If you can get programmers to do one, you can get them to do the other.

I don't think so - you will have a hard time getting me to write inline documentation, but I willingly write tests upon tests. That is because I personally gain so much from the tests - they make my development life much easier *at the time I write them*. (http://c2.com/cgi/wiki?TestInfected)
 
 