
when are broken unit tests ok?

Jeanne Boyarsky
author & internet detective
Marshal

Joined: May 26, 2003
Posts: 31079
    
163

Page 117 of "Next Generation Java Testing: TestNG and Advanced Concepts" discusses broken tests in the "real world" and states "every day, probably a small percentage of tests are broken for a variety of reasons." This confuses me. On my team, we only have one reason for an end-to-end test to be broken - and even then it can only be for a couple of days straight, and the developers responsible have to inform everyone working on the app. Our reason is a large underlying database refactoring that breaks the end-to-end tests. Back-end tests get fixed right away and higher-level tests get fixed over the next couple of days.

What other reasons have people experienced in practice for having broken tests (that don't get fixed within a day of being noticed)?

The book lists:
1) More important things to do - "sometimes deadlines need to be met, even if it's at the expense of tests"
2) Depending on another developer's code before the test can be fixed
3) Code owner not available to address test breakage
4) Testing a feature that is currently being implemented or improved

And my comments on them (lest my teammates read this and think I am endorsing them):
1) I disagree with this. I'm sure the authors don't mean this the way it sounds. To me, it sounds like it is ok to meet deadlines at the cost of quality. The goal is to get the code into production, not to meet a date and fix subtle bugs later.
2) I think our team's exception is a variant of this. The first developer intentionally causes the test to break and announces it. Then we go and fix the remaining tests.
3) They were available to break the tests. Seriously, either someone else can fix the code/test or they can roll back the offending code. If someone releases code and goes on vacation for 3 weeks, do we sit on our hands until they come back? I have made an exception to this twice (in five years). Both times I was able to see that the problem was in the tests, not the code, and let it sit two days (rather than one) until the developer came back. Once was for tests they forgot to commit and once was a training exercise.
4) If the feature isn't implemented yet, why would the test be in the test suite? And if it is being improved, it should at least do what it did before.

I'm particularly interested in this because I want to make sure our team policy isn't too restrictive.


Craig Bayley
Ranch Hand

Joined: Sep 27, 2007
Posts: 46
I'd be interested to read whether that was intended as an endorsement or simply a statement of how things often are. My workplace is currently going through a (sometimes painful) re-education process so that testing becomes a core part of people's development thought process, rather than something to do to stop team leaders complaining.

Is there a section on 'Psychology of testing - pumping up your teammates for a brighter future!'

Now that's a section I could get excited about...
Jeff Langr
author
Ranch Hand

Joined: May 14, 2003
Posts: 762
Greetings Jeanne,

Originally posted by Jeanne Boyarsky:
What other reasons have people experienced in practice for having broken tests (that don't get fixed within a day of being noticed)?


I'm currently at a shop where they had one failing unit test not terribly long ago. "Deadline!" was of course the reason. The number of failing tests grew to over 100 within a few months.

Once one test is "allowed to" fail, developers quickly figure that it's ok for other tests to fail too, particularly those who don't see much value in tests. Laziness takes hold quickly. It's like warnings--as soon as there are regularly more than zero, there will quickly be thousands.

Perhaps it's ok to have failing tests from time to time; in fact, it's reality. What's not ok is failing to put in place some sort of process for what to do when a test is broken at ship time. I've found a triage board, with representatives that include not just programmers but testers, management, etc., to be reasonably effective.

If it's a broken test because there's a real problem, we should do some quick analysis of the potential impact, and admit that we're shipping a defect if we choose to do nothing. If it's a broken test because it's a crappy test, maybe we should just toss the test. But I think there are much better places to expend effort than unit tests if we're going to toss them when they might be most valuable.

1) I disagree with [the authors' notion that "sometimes deadlines need to be met, even if it's at the expense of tests"]. I'm sure the authors don't mean this the way it sounds. To me, it sounds like it is ok to meet deadlines at the cost of quality. The goal is to get the code into production, not to meet a date and fix subtle bugs later.


I think that depends on how much disdain we have for our customer. :-) Or what customer expectations are for software. I work in an environment where a single significant defect costs millions of dollars within minutes. Hopefully they're not "just tests" but a real reflection of whether or not the software actually works.

We should be so honest as to say "sometimes deadlines need to be met, even if it's at the expense of shipping defects."

Regards,
Jeff


Cedric Beust
author
Ranch Hand

Joined: Oct 12, 2004
Posts: 46
Hi Jeff,


Perhaps it's ok to have failing tests from time to time; in fact, it's reality. What's not ok is failing to put in place some sort of process for what to do when a test is broken at ship time.


Agreed, and you can use TestNG's groups for that.

People typically put tests that are currently broken in a specific group, say "broken", and they always exclude this group from all runs.

When the deadline approaches, you take a look at the latest TestNG reports, read the list of methods that are in that group and start fixing them one by one. This gives you a very clear exit criterion for when to ship: for some companies it means "no tests left in the 'broken' group", while others decide this on a case-by-case basis, with a committee of people like the one you describe.
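For example, a known-broken test might be parked like this (just a sketch; the class, method, and group names are made up for illustration):

    import org.testng.annotations.Test;

    public class InvoiceTotalTest {

        // Currently broken: parked in the "broken" group so the regular runs
        // exclude it, but it still shows up in the TestNG reports.
        @Test(groups = { "broken" })
        public void totalIncludesDiscountedLineItems() {
            // ...
        }

        // Healthy test, runs every time.
        @Test
        public void totalOfEmptyInvoiceIsZero() {
            // ...
        }
    }

The regular suite then excludes the group - for instance with <exclude name="broken"/> inside the <groups>/<run> section of testng.xml, or with -excludegroups broken on the command line - so the build stays green while the report still lists exactly what has to be fixed before the ship date.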


If it's a broken test because there's a real problem, we should do some quick analysis of the potential impact, and admit that we're shipping a defect if we choose to do nothing.


A broken test doesn't necessarily mean there's a defect.


To me, it sounds like it is ok to meet deadlines at the cost of quality. The goal is to get the code into production, not to meet a date and fix subtle bugs later.


Again, a broken test doesn't mean the quality went down. What if a unit test is failing but a functional test that covers this part of the code is passing? I'd be fine shipping in this condition, because functional tests serve users while unit tests serve programmers, and users always come first. After shipping, it's up to me to fix my unit test, but in the meantime, I'm pretty sure I shipped code without any regressions.

--
Cedric
Hani Suleiman
author
Greenhorn

Joined: Nov 18, 2007
Posts: 22
As an aside, my personal experience (so the usual disclaimer applies) is that when tests break, a significant percentage of the time it's not indicative of any actual bugs, other than a broken test. Sometimes the code changes but the test isn't updated. Obviously this isn't a good thing, but it isn't the end of the world for the developer to mark this as something that should be addressed.

As Jeff mentioned though, this approach can only work if you have regular triage and a process where these are not allowed to accumulate beyond a certain point.

Basically, nobody likes broken tests, but the reality of the work we do means that they do happen (sadly), and that they shouldn't get in the way of delivering software!
Lasse Koskela
author
Sheriff

Joined: Jan 23, 2002
Posts: 11962
    
    5
Originally posted by Cedric Beust:
Again, a broken test doesn't mean the quality went down. What if a unit test is failing but a functional test that covers this part of the code is passing? I'd be fine shipping in this condition, because functional tests serve users while unit tests serve programmers, and users always come first. After shipping, it's up to me to fix my unit test, but in the meantime, I'm pretty sure I shipped code without any regressions.

What if your unit tests are simulating conditions that are likely to happen every now and then in production but weren't tested by the functional tests?

The only times I've found a broken test take so long to fix that it couldn't be done before shipping, the test has either exposed a defect or been so unfathomable that it's worthless (because you can't tell what it's actually testing).

What I'm saying is, in my experience, fixing the tests is so quick and the cost of the broken window syndrome so damaging that I'd be hard pressed to suggest shipping with broken tests.


Jeff Langr
author
Ranch Hand

Joined: May 14, 2003
Posts: 762
Greetings Hani and Cedric,

A broken test doesn't necessarily mean there's a defect.


Correct, that's what I said. It's generally either a defect (happens once in a rare while), a configuration problem (happens often), or a crappy test (happens too often). I spend a lot of time getting people to understand how to turn crappy tests into more useful, less painful ones.

Basically, nobody likes broken tests, but the reality of the work we do means that they do happen (sadly), and that they shouldn't get in the way of delivering software!


In general, yes, but substitute the word "defect" for "test," and many people will disagree. It depends on the nature of the software. Most defects in my current place mean that the software should not and cannot ship. Ignoring potential failure points is not a consideration given the extreme cost of failure. Our triage first determines if the estimated cost of *potential* failure is significant; if so, we fix either the test or the defect, whichever it happens to be. If not, we scrap the test.

At some point, the investment in unit testing is of questionable value if the tests don't mean much at critical junctures.

Regards,
Jeff
Hani Suleiman
author
Greenhorn

Joined: Nov 18, 2007
Posts: 22
Jeff, I'm curious, did you always have automated tests?

In my experience this is a relatively new development, where developers are empowered to write tests. Previously (and still by far the most common case), testing was done by QA engineers, people whose sole job is to do testing. In such cases, a failure is not dismissed, because the science of QA has progressed enough that you know there's a real issue (also due to factors such as real people testing the actual product, instead of treating it like a white box the way tests do).

On the other hand, when developers are writing the tests themselves, we definitely have different categories and different 'alarm' levels. None of us ship products that are 100% bug free; the more you know about the bugs you're shipping, the better off you are. So for this case, isn't it nice to have tests that fail to show that something is a known bug? Then, when someone comes around to fixing it, there's a test to verify the fix already in place. This to me suggests a 'knownbug' group, for example, and a triage group could regularly review these and see if it's feasible to start fixing them for the next release, and so on.

Traditionally, people would comment out such tests, or even not write them until the fix is ready to be implemented. Having a test that is known to break is far more valuable and informative than either of those solutions.

I don't think it's ever OK to have a random test failure and shrug it off. It's important to always know why a test is failing. What we're saying is that you don't have to drop everything to fix it; you already have additional information provided to you by a failing test, so you are better off than not having it at all.
Jeanne Boyarsky
author & internet detective
Marshal

Joined: May 26, 2003
Posts: 31079
    
163

Originally posted by Jeff Langr:
Once one test is "allowed to" fail, developers quickly figure that it's ok for other tests to fail too, particularly those who don't see much value in tests. Laziness takes hold quickly. It's like warnings--as soon as there are regularly more than zero, there will quickly be thousands.

This is one of the things that is important to me. We seem to have a boolean state - not just for testing, but for other metrics (like coverage and static analysis). If the build passes, everyone is happy and will ignore any metrics. A "failure" calls attention to a drop in the metrics.

Perhaps it's ok to have failing tests from time to time; in fact, it's reality. What's not ok is failing to put in place some sort of process for what to do when a test is broken at ship time.

I recognize it happens. What I'm not ok with is things staying broken. It reminds me of the sentiment that it's ok to move on to the next task and leave the first one half completed. If the tests are really so useless that they don't need to pass, we shouldn't be writing them. And if they aren't useless, building a huge pile of technical debt leaves an overwhelming task to get them working. At which point, it is likely to never happen - just like the thousands of warnings that Jeff mentioned.

We should be so honest as to say "sometimes deadlines need to be met, even if it's at the expense of shipping defects."

Assuming the customer agrees that those defects can wait (which happened for us recently). The catch is that our defects weren't things the tests caught - which meant we found out about them right before deployment. Had we found out about them earlier in the game through unit tests, they never would have been committed as defects.


Originally posted by Cedric Beust:
When the deadline approaches, you take a look at the latest TestNG reports, read the list of methods that are in that group and start fixing them one by one.

I think this is part of the reason why we aren't in this boat. We do continuous integration when it comes to compiling and unit tests (and a couple of other things). It isn't ok for us to defer tests to right before the release. We want the code to be in a good state throughout. I've discussed this with a few people when they were new to our team. It took a while to get into the mindset of completing a feature before going on to the next thing. It also took a while to convince people that they benefited more from writing the tests with the code than afterwards.

A broken test doesn't necessarily mean there's a defect.

I disagree. A broken test doesn't necessarily mean there is a defect in the code. It does mean there is a defect somewhere, because the test is stale. If the developer who left it in that state goes on vacation/wins the lotto/gets hit by a truck, the developer taking over the task loses a valuable safety net. They may know that the test was failing, but they don't have the knowledge that the first developer had. The second developer has to guess why the test is broken and hope they aren't making things worse. Being on the receiving end of this is scary. And it's avoidable by completing the task (including unit tests) in the first place.

What if a unit test is failing but a functional test that covers this part of the code is passing?

And playing devil's advocate: why write the unit test at all rather than doing everything through functional tests? The reason we write unit tests is the reason they should work. This also falls into the maintenance/technical debt category I mentioned above. New employees/consultants/contractors also have to deal with this. If a test is broken for an unknown reason, it takes longer to get up to speed. Or the bad practice gets copied. I have to admire how new people manage to find the worst practice we have and use it as an example.

After shipping, it's up to me to fix my unit test,

And everyone finds the time to do that? What tends to happen to us is that we get a new set of tests and don't have time to recover the technical debt.

And from my recent personal experience: I found a potential defect in code review. The author decided to defer fixing it and went on vacation. During that period, I spent 2 hours hunting down an error in that code base, which wound up having as its root cause the very issue I had identified in that code review a few weeks earlier. I want those two hours of my time back! The parallel here is that we can judge a unit test fix to be deferrable, but we don't always know the impact, especially if it is something subtle. So we say we will fix it "later." Now the user reports a seemingly unrelated defect. We investigate and fix it, because a defect is clearly higher priority than something that might be one. At this point, the test is STILL broken and the cycle is free to continue.

Originally posted by Jeff Langr:
At some point, the investment in unit testing is of questionable value if the tests don't mean much at critical junctures

This reminds me of something that my co-worker says: "If we throw the process out the window when we are busy/stressed/under pressure/deadline, it isn't really our process." On our team, we have reached the point where the tests and build are considered valuable and we don't stop doing them because there is a deadline. We don't stop doing them for production problems either. We perceive the risk of something slipping in to be even higher when we are under that kind of pressure/deadline.

Originally posted by Hani Suleiman:
This to me suggests a 'knownbug' group, for example, and a triage group could regularly review these and see if it's feasible to start fixing them for the next release, and so on.

If I'm understanding this, it means we would declare a task done having knowingly introduced a defect? Tests mainly break when we change code. Allowing tasks to be fuzzily completed could be hiding other defects. Which means we don't even have all the known bugs.

Traditionally, people would comment out such tests, or even not write them until the fix is ready to be implemented

I think there are two categories here. My intent in starting the thread was regressions - the ones that shouldn't be commented out (or left broken, in my mind). New tests to illustrate a bug are definitely useful.
Andrew Och
Ranch Hand

Joined: Mar 19, 2004
Posts: 32
One of our more important customers has a problem, and our PM in the US calls us and says this problem must be resolved immediately (7.30am Shanghai). So we must drop everything we are doing. "But couldn't I just finish this..."
"NO!"

So I leave a task half done and move onto the higher priority issue.

Posted by Jeanne Boyarsky :
What I'm not ok with is things staying broken. It reminds me of the sentiment that it's ok to move on to the next task and leave the first one half completed.


And then it's 7pm in the evening and I want to go home and my block of code has broken several tests. It's been a long, long day. So it stays broken until things calm down and I get a chance to fix them. I don't want to roll back, because there are just little things left to do.

Staying broken means that each regression run nags me to fix what I broke.
Jeanne Boyarsky
author & internet detective
Marshal

Joined: May 26, 2003
Posts: 31079
    
163

Originally posted by Andrew Och:
And then it's 7pm in the evening and I want to go home and my block of code has broken several tests. It's been a long, long day. So it stays broken until things calm down and I get a chance to fix them. I don't want to roll back, because there are just little things left to do.

Isn't there a middle ground here - namely, leaving the code and tests uncommitted until the next day when you have time to complete them? While I don't think this practice should span multiple days, I don't see the harm in having a bit of uncommitted work, especially when I start a task an hour before my day ends. I'd rather leave the code uncommitted (and risk the slight chance of losing it to a crash) than commit a task that is clearly in progress.

Do you have a policy that says you have to roll back every night no matter what?
Hani Suleiman
author
Greenhorn

Joined: Nov 18, 2007
Posts: 22
The problem with allowing significant chunks of uncommitted code is that the habit becomes easy to fall into, and in my experience, you very quickly end up with people not committing for days.

It's much better (and safer) to commit regularly, and just make sure your tests are tagged appropriately.
Jeanne Boyarsky
author & internet detective
Marshal

Joined: May 26, 2003
Posts: 31079
    
163

Originally posted by Hani Suleiman:
The problem with allowing significant chunks of uncommitted code

Who said anything about significant chunks? I'm talking about a couple of hours of work. I agree that significant chunks of uncommitted code are a problem; if this happens, it is a smell that the developer should think about how to break the task down into smaller committable pieces.



It's much better (and safer) to commit regularly

Surely you aren't saying it is safer to commit code in a random broken state to the HEAD for others to pull in, just for the sake of committing? This seems dangerous to the other developers. What if someone needs to rebuild their workspace and can't trust the HEAD? It's not just the unit tests being broken that matters. It's the whole workspace being in a potentially bad state.
Pradeep bhatt
Ranch Hand

Joined: Feb 27, 2002
Posts: 8919

If we are doing TTD, wouldn't there be many broken tests because the features are still not implemented?


Jeff Langr
author
Ranch Hand

Joined: May 14, 2003
Posts: 762
Originally posted by Hani Suleiman:
I don't think it's ever OK to have a random test failure and shrug it off. It's important to always know why a test is failing. What we're saying is that you don't have to drop everything to fix it; you already have additional information provided to you by a failing test, so you are better off than not having it at all.


I don't think we're saying that much that is different. However, the way the team reacts to failing tests may be different depending on circumstances and what the team values.

The problem with broken tests is that they waste time, they cost time to develop in the first place, and they represent increased risk. The strategy I've seen work better is that the team gets in a regular habit of minimizing broken tests as soon as they occur, and they do learn how to get better at building them. The strategy I've seen waste more time is the phasist approach where we try to integrate everything close to the point of delivery, and find out that there are problems and broken tests, and then, due to time pressure we have little choice but to ignore them. YMMV.

Jeff
Jeff Langr
author
Ranch Hand

Joined: May 14, 2003
Posts: 762
Originally posted by Pradip Bhat:
If we are doing TTD, wouldn't there be many broken tests because the features are still not implemented?


Greetings Pradip,

I presume you mean TDD. If doing TDD, you build tests incrementally, not in chunks up front. One test (or even a portion of a test) is coded, then appropriate implementation for that test is built, before proceeding to coding the next test. In contrast, test-after development affords the freedom of writing the tests at any time, in any order, and to any extent, after the code has been built.

There are other tradeoffs that have been discussed elsewhere. In this forum, we're not making a presumption about TDD vs. test-after. What we're talking about instead is the inevitability that sometimes, people will check in code that makes existing tests break.

Jeff
Jeff Langr
author
Ranch Hand

Joined: May 14, 2003
Posts: 762
Originally posted by Hani Suleiman:
Jeff, I'm curious, did you always have automated tests?

In my experience this is a relatively new development, where developers are empowered to write tests. Previously (and still by far the most common case), testing was done by QA engineers, people whose sole job is to do testing. In such cases, a failure is not dismissed, because the science of QA has progressed enough that you know there's a real issue (also due to factors such as real people testing the actual product, instead of treating it like a white box the way tests do).


I've been writing automated tests on a personal basis for about 15 years, so no, not always. (I've been using a more holistic automated testing approach for about eight years.)

You haven't worked with the same QA people I have. :-) Many of them don't treat it like a science, unfortunately. We sometimes get failure reports that turn out to be operator error. But I hear your point. My point is that we can get better, much better in fact, at the "science" of automating (non-exploratory) testing. Perhaps TDD is not the solution, but if we're going to make software development closer to an engineering science, neither is the idea of a non-disciplined approach that says developers are smart enough to figure out what, when, and how things get verified.

Jeff
Cedric Beust
author
Ranch Hand

Joined: Oct 12, 2004
Posts: 46
Originally posted by Pradip Bhat:
If we are doing TTD, wouldn't there be many broken tests because the features are still not implemented?

Yes, although you usually don't commit anything until you have tests that don't break (even if it means they're still empty or not testing much yet).

Having said that, I still confess a certain skepticism toward TDD, which you will find echoed in the book.

While I agree that TDD is a good practice to make someone more aware of what "testable code" is (if they have to write the tests first, they will be forced to make their code testable), I am still concerned that the "dark side" of TDD is hardly ever brought up by its proponents.

Some of the problems I see with TDD are:

- It's not intuitive. This is not necessarily a bad thing, but it's not always easy to ask developers to do something against their training.

- It forces you to spend a lot of time with broken code. By definition, you write a lot of code that at first doesn't even compile and then invokes code that is empty. Doing that basically means you are forgoing a lot of the powerful features that IDEs offer (automatic completion, browsing, etc.) since IDEs do very poorly with files that don't compile.

- It's sometimes much less practical than testing last (try doing TDD on graphical user interfaces or mobile phones).

- It can generate a lot of churn (testing code that is quickly going to be scrapped and replaced).

- I haven't seen any evidence that in the hands of an experienced developer, code that is written test first is necessarily of better quality than code written test last.

- TDD tends to promote micro-design (designing at the method level) over macro-design (thinking ahead of time about a class hierarchy and starting to implement it even though it might not be needed at this early stage).

- It can make developers oblivious to the fact that the code they are writing is for users and not for themselves.

There is much more to say on this subject and we dive into more details in our book, but overall, I tend to recommend learning about TDD but considering it a tool that can *sometimes* (but not always) lead to better code.

Testing first or testing last doesn't matter too much as long as you do write tests.

--
Cedric
Jeff Langr
author
Ranch Hand

Joined: May 14, 2003
Posts: 762
Greetings Cedric,

Originally posted by Cedric Beust:
- It's not intuitive. This is not necessarily a bad thing, but it's not always easy to ask developers to do something against their training.


Like learning OO, for example... Usually the next generation of how we view things is not intuitive.

It forces you to spend a lot of time with broken code. By definition, you write a lot of code that at first doesn't even compile and then invokes code that is empty. Doing that basically means you are forgoing a lot of the powerful features that IDEs offer (automatic completion, browsing, etc.) since IDEs do very poorly with files that don't compile.


I'm not following. The production code has to get there one way or another. Either you type the production code in first (at which point you have the same problem), or you write a test where you type it in as presumed, then use the IDE to generate the stubs.

It's sometimes much less practical than testing last (try doing TDD on graphical user interfaces or mobile phones).


Not sure about the mobile phones, or even why; you'd need to explain this claim. I've done TDD with GUIs, and it's more effective. Most GUI code written using test-after development (TAD) is several times larger than it needs to be. Doing TDD pushes you into the mold of incremental, continuous refactoring. I've ended up building nice abstraction layers atop things like Swing and as a result dramatically simplified the GUI code. This wouldn't have happened with test-after (because I would've just not bothered writing the tests, and therefore wouldn't have felt compelled to refactor).

The other interesting thing that happens is that if developers learn that GUIs don't really need to be tested, they tend to let more logic ooze into them, to the detriment of the overall design and testability.

It can generate a lot of churn (testing code that is quickly going to be scrapped and replaced).


True. One tradeoff is in minimization of debugging sessions. The other notion is that TDD promotes a more incrementalist approach to software development, whereby at any point in time the system is considered shipworthy. It's a different philosophy.

I haven't seen any evidence that in the hands of an experienced developer, code that is written test first is necessarily of better quality than code written test last.


I have seen the evidence, although I'm sure we could debate the point of being "better quality."

TDD tends to promote micro-design (designing at the method level) over macro-design (thinking ahead of time about a class hierarchy and starting to implement it even though it might not be needed at this early stage).


Correct. That's the philosophy distinction again. TDD promotes the notion of incremental completion. Sometimes this results in additional churn, as mentioned before, but sometimes it results in costs savings and minimization of unnecessary complexity in the system. It's a tradeoff. TDD is the slow-and-steady tortoise and ad hoc test-after is the hare.

It can make developers oblivious to the fact that the code they are writing is for users and not for themselves.


That's interesting, and I'd love to hear more. I haven't seen this happen in practice. If it does, it means that developers are emphasizing TDD as a silver bullet, to the detriment of other good practices. Things like this can certainly happen on a dysfunctional team, but then again that's the case with anything.

There is much more to say on this subject and we dive into more details in our book, but overall, I tend to recommend learning about TDD but considering it a tool that can *sometimes* (but not always) lead to better code.


I think that's a reasonable approach. Skepticism is good unless it prevents you from opening up to other approaches. I've done both approaches, TDD and TAD, and I'm still seeking something better.

Testing first or testing last doesn't matter too much as long as you do write tests.


In theory, I agree. In practice, it usually makes a big difference in what really happens.

Jeff
Eric Nielsen
Ranch Hand

Joined: Dec 14, 2004
Posts: 194
Originally posted by Jeff Langr:

Originally posted by Cedric Beust:
It can make developers oblivious to the fact that the code they are writing is for users and not for themselves.



That's interesting, and I'd love to hear more. I haven't seen this happen in practice. If it does, it means that developers are emphasizing TDD as a silver bullet, to the detriment of other good practices. Things like this can certainly happen on a dysfunctional team, but then again that's the case with anything.


Well, in part I think it's also a strength -- when the developer of a class/module/service is also a consumer of the service, as in TDD, the API tends to be much cleaner and easier to use. If you're a mixed agile+waterfall type shop with rather extensive requirements documents/APIs spec'd up front, then at least the developers have a chance to realize the API is wrong before going too far, and to elevate it back to the "powers that be". If they merely coded the API as spec'd and tested afterward, you wouldn't get this earlier chance to revisit the design...
Jeanne Boyarsky
author & internet detective
Marshal

Joined: May 26, 2003
Posts: 31079
    
163

I think TDD is useful for some types of tasks. I tend to use it when developing library code (because it gets me thinking about the API like a user) and classes behind a public API/service layer. This keeps the high-level design that Cedric mentioned and gives me the benefits of TDD for testing and quality.
Ilja Preuss
author
Sheriff

Joined: Jul 11, 2001
Posts: 14112
Originally posted by Cedric Beust:

Testing first or testing last doesn't matter too much as long as you do write tests.


At XP Day Germany this year, there was an interesting lightning talk by Johannes Link, a consultant who trains a lot of teams in unit testing (among other things).

He reported that when he has trained a team and comes back a couple of weeks later, he typically finds the team in one of two situations:

1) almost everyone is still writing tests, code coverage 80-90%, or

2) most developers stopped writing tests, only a few are still doing it, code coverage ~50%

Strikingly often, the teams in situation 1) are those who adopted TDD, the ones in situation 2) those who practiced test-last.

Of course there are several possible interpretations of this observation. It's still interesting to think about...


Jeanne Boyarsky
author & internet detective
Marshal

Joined: May 26, 2003
Posts: 31079
    
163

Ilja,
That's interesting. I've noticed two (or three) patterns on my team for test writing. Everyone writes tests, but some people find it more useful than others. And I find some people write better tests.

Pattern A:
1) Write code
2) Run app
3) Debug app going back to step 1 as needed
4) Write unit tests now that code is "done"

Pattern B (and C) :
1) Write code and tests (or tests and code) until code works
2) Run app
3) App likely works, fix if results not as expected

As you can guess, the people following pattern A see less value in the process. And they are right. They are only writing the tests because they are needed for regression. They aren't getting any benefit from them at the current time. They are also the people who try to give a status by saying they are done and "just need to test." I usually respond by asking if that means they have broken code and how they know otherwise.

Personally, I don't care if the tests are written before or after the code. I do think it matters whether they are done before debugging the feature. I find testing saves me time on debugging, especially when integrating. With pattern A, people say that testing is "overhead." And they are right - from their point of view.
Lasse Koskela
author
Sheriff

Joined: Jan 23, 2002
Posts: 11962
    
    5
Originally posted by Cedric Beust:
It forces you to spend a lot of time with broken code. By definition, you write a lot of code that at first doesn't even compile and then invokes code that is empty. Doing that basically means you are forgoing a lot of the powerful features that IDEs offer (automatic completion, browsing, etc.) since IDEs do very poorly with files that don't compile.

I actually find myself getting more mileage out of tools like Eclipse because of the programming-by-intention that goes with TDD. Basically I create a new test class by right-clicking on the package explorer and from there on, I do something like this:

Type "Test" and [ctrl-space] to generate a new test method
Write the body of the test as *syntactically valid* Java code (that sometimes doesn't compile because I'm using a method/class that doesn't exist yet)
[Ctrl-1] on the compilation errors, generating default implementations for the missing methods/classes
etc.
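For instance, a brand-new test might start out like this (a made-up example: ShoppingCart and Item don't exist when I type the test, and the stubs below are roughly what the quick fixes then generate):

    import static org.junit.Assert.assertEquals;

    import org.junit.Test;

    public class ShoppingCartTest {

        // Typed as syntactically valid Java before the production code exists;
        // Ctrl-1 on each compilation error offers to create the missing class
        // or method for me.
        @Test
        public void totalIsTheSumOfItemPrices() {
            ShoppingCart cart = new ShoppingCart();
            cart.add(new Item("book", 30));
            cart.add(new Item("pen", 5));
            assertEquals(35, cart.total());
        }
    }

    // Generated stubs: the code now compiles, the test runs red, and I can
    // start filling in the real behavior.
    class ShoppingCart {
        void add(Item item) {
        }

        int total() {
            return 0;
        }
    }

    class Item {
        Item(String name, int price) {
        }
    }

(JUnit shown here just to keep the snippet short; the workflow is the same with TestNG's annotations.)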

Originally posted by Cedric Beust:
It's sometimes much less practical than testing last (try doing TDD on graphical user interfaces or mobile phones).

Yes, some APIs are not that testing-friendly, but code is code, regardless of the application domain. You can, if you want to, isolate your stuff from the givens such as the Symbian C++ API or Java Swing, for example. That is, add enough abstraction to keep the nasty parts in a few places and have the majority of your code isolated from the nasties.

TDD has the potential to bring even more drastic improvements to your design in such settings compared to, say, "plain old Java". For example, GUI programmers who haven't been test-infected often have a long tradition of building GUIs in a very top-down, cannot-construct-any-component-in-isolation-from-the-rest-of-the-system way. When that programmer gets test-infected, and especially if she goes test-driven, this stereotypical design quickly moves towards a more MVC/MVP type of architecture.
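A minimal sketch of that kind of isolation (all names invented for the example): the presenter depends only on a narrow view interface, so it can be test-driven with a fake view, and the Swing class shrinks to implementing that interface and forwarding events.

    // The presenter knows nothing about Swing, only this small interface.
    interface LoginView {
        String username();
        String password();
        void showError(String message);
    }

    // All the interesting logic lives here and can be unit tested by
    // passing in a hand-rolled fake LoginView.
    class LoginPresenter {
        private final LoginView view;

        LoginPresenter(LoginView view) {
            this.view = view;
        }

        void loginClicked() {
            if (view.username().length() == 0 || view.password().length() == 0) {
                view.showError("Please enter a username and password");
                return;
            }
            // ... delegate to the real authentication service ...
        }
    }

The JPanel (or Symbian screen) then implements LoginView and does nothing but read widgets and delegate clicks to the presenter, so there is very little logic left in it that would need GUI-level testing.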

I wonder if Ilja's team at disy started developing their GUI app test-first, test-last, or test-nothing and whether they've observed such progress?

Originally posted by Cedric Beust:
TDD tends to promote micro-design (designing at the method level) over macro-design (thinking ahead of time about a class hierarchy and starting to implement it even though it might not be needed at this early stage).

Well, the implicit 0th step before the test-code-refactor cycle is called "think"... Granted, people learning TDD on their own from a short online article are often drawn into the pit of mindless test-code-refactor and it's unfortunate how many of those articles give the impression that you're not allowed to design up front at all in order to "do TDD".

The way I test-drive is that once I have an idea of what I want to have in the system -- some small piece of functionality -- I envision a design for it (sometimes sketching boxes and lines on paper or whiteboard), start test-driving towards that design, and find myself with a design that may or may not match that original, envisioned design.
Ilja Preuss
author
Sheriff

Joined: Jul 11, 2001
Posts: 14112
Originally posted by Lasse Koskela:
Well, the implicit 0th step before the test-code-refactor cycle is called "think"...


I even seem to remember that some descriptions actually explicitly include a quick design session at the start. Don't know where I read about that...
Ilja Preuss
author
Sheriff

Joined: Jul 11, 2001
Posts: 14112
Originally posted by Lasse Koskela:
I wonder if Ilja's team at disy started developing their GUI app test-first, test-last, or test-nothing and whether they've observed such progress?


The application was started test-nothing, four years before I joined the team.

Today, six years after I joined, half of the team is very much into TDD. Most of our GUI code is quite trivial, though, because we have a good library of reusable components that most often just need to be plugged together and connected to the model. Many of those components have been developed test-first.

Our GUI code certainly has moved from a monolithic design to a pluggable MVP-style one. TDD was one driver for that move, though I'm not sure whether it was the main driver.

Does that answer your question?
Pradeep bhatt
Ranch Hand

Joined: Feb 27, 2002
Posts: 8919

In TDD, we test->code->refactor (Lasse's book)
I am getting the impression that in TDD we don't list all the tests initially. If we did, all the tests would fail (and hence be broken unit tests) because we do not have any implementation.

Thanks
Pradip
Jeanne Boyarsky
author & internet detective
Marshal

Joined: May 26, 2003
Posts: 31079
    
163

Originally posted by Prad Dip:
If we did, all the tests would fail (and hence be broken unit tests) because we do not have any implementation.

I don't consider a test 'broken' until it is in the repository. I am going to have one or more red bar tests while I am working. Even in TDD, the test is not passing for a few moments while I write the code.
Pradeep bhatt
Ranch Hand

Joined: Feb 27, 2002
Posts: 8919

Originally posted by Jeanne Boyarsky:

I don't consider a test 'broken' until it is in the repository. I am going to have one or more red bar tests while I am working. Even in TDD, the test is not passing for a few moments while I write the code.


Are you saying that tests should not be checked in until they pass? What if it takes months to make a test pass, i.e. I don't implement the logic to make the test work because it is a low priority?

OR

Should I not code the test in the first place because I am not going to implement the feature now?
Thanks
Eric Nielsen
Ranch Hand

Joined: Dec 14, 2004
Posts: 194
Originally posted by Pradip Bhat:

Should I not code the test in the first place because I am not going to implement the feature now?
Thanks


This is the case. Under TDD you want to have at most one failing test at a time. Having hundreds of failing tests (i.e. writing all your tests first before doing any development) is counter-productive, IMO.
Pradeep bhatt
Ranch Hand

Joined: Feb 27, 2002
Posts: 8919

Originally posted by Eric Nielsen:


This is the case. Under TDD you want to have at most one failing test at a time. Having hundreds of failing tests (i.e. writing all your tests first before doing any development) is counter-productive, IMO.


That means you think about what you are going to implement, write tests for it, and then finally code.
Lasse Koskela
author
Sheriff

Joined: Jan 23, 2002
Posts: 11962
    
    5
Originally posted by Prad Dip:
That means you think about what you are going to implement, write tests for it, and then finally code.

No. It means you...
  • think about what you are going to implement,
  • write one test that takes you closer to that goal,
  • implement the change that makes the test pass,
  • refactor your code (both test code and production code) to exhibit a good, simple design.

That's the "TDD mantra" you may have heard of: Test-Code-Refactor. I just like to make the additional step of thinking explicit: (Think-)Test-Code-Refactor.
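To make that concrete, one trip around the cycle might look something like this (a made-up example using JUnit):

    import static org.junit.Assert.assertEquals;

    import org.junit.Test;

    public class RomanNumeralTest {

        // One small test that takes me toward the goal; it is red until the
        // implementation below exists.
        @Test
        public void oneIsI() {
            assertEquals("I", RomanNumeral.from(1));
        }
    }

    // The simplest change that makes the test pass (green).
    class RomanNumeral {
        static String from(int arabic) {
            return "I";
        }
    }

    // Then refactor both the test code and the production code, pick the
    // next test (say, twoIsII), and go around again.

The point is that at no time do you have a pile of failing tests sitting around; there's at most the one you're currently working on.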
    Jeanne Boyarsky
    author & internet detective
    Marshal

    Joined: May 26, 2003
    Posts: 31079
        
    163

    Originally posted by Pradip Bhat:
    Are you saying that tests should not be checked in until they pass? What if it takes months to make a test pass, i.e. I don't implement the logic to make the test work because it is a low priority?

    Further, if the logic is low priority, so is the test. They go together and have the same priority.
    Pradeep bhatt
    Ranch Hand

    Joined: Feb 27, 2002
    Posts: 8919

    Thanks Lasse and JB. Now I understand TDD much better.
    Jeanne Boyarsky
    author & internet detective
    Marshal

    Joined: May 26, 2003
    Posts: 31079
        
    163

    Originally posted by Prad Dip:
    Thanks Lasse and JB. Now I understand TDD much better.

    Pradip: You mean Jeanne. "JB" is taken by JB Rainsberger
    Pradeep bhatt
    Ranch Hand

    Joined: Feb 27, 2002
    Posts: 8919

    Originally posted by Jeanne Boyarsky:

    Pradip: You mean Jeanne. "JB" is taken by JB Rainsberger


    Sorry for that. Thanks, Jeanne, for your help. Who is JB Rainsberger?
    Jeanne Boyarsky
    author & internet detective
    Marshal

    Joined: May 26, 2003
    Posts: 31079
        
    163

    Originally posted by Prad Dip:
    Who is JB Rainsberger?

    The author of JUnit Recipes. He's even on Wikipedia.
    Ilja Preuss
    author
    Sheriff

    Joined: Jul 11, 2001
    Posts: 14112
    Coming back to the original topic, it's interesting to compare the question to the Lean principle of "stop the line". It states (and remember that this is how Toyota builds cars) that when there is a problem, the whole team should stop their work immediately, and together not only solve the immediate problem, but also remove the root cause of the problem.
    Jeanne Boyarsky
    author & internet detective
    Marshal

    Joined: May 26, 2003
    Posts: 31079
        
    163

    Originally posted by Ilja Preuss:
    Coming back to the original topic, it's interesting to compare the question to the Lean principle of "stop the line".

    Interesting. My current project is a bit quieter now. We're essentially in the QA phase, so we have more time to clean things up. Some of the cleanups were refactorings/additional tests to make things more maintainable in the future.

    One was particularly interesting. We had a JSP with nested HTML forms. A number of times we talked about getting rid of them, but it never happened because "there wasn't time" and it wasn't "causing a problem." Somehow we found the multiple days to fix bugs caused by the nested forms. Last week, after having yet another defect reported against the nested forms (one that only occurred in one environment and not locally), I finally insisted we get rid of the nested forms. It took less than 7 hours to entirely get rid of the nested forms, test, and make sure we didn't create any new defects (2 hours pairing and 3 hours alone). I should have insisted earlier. "Stop the line" is a good way of describing why not to make a problem worse. I'll have to read up on it and use it as an example next time we get into this situation!
    Eric Nielsen
    Ranch Hand

    Joined: Dec 14, 2004
    Posts: 194
    Correct me if I'm wrong, but that story sounds more like a case of "technical debt" that should've been addressed earlier, rather than a broken test that was tolerated? Or was there a broken unit test that was allowed to persist for a while? (Your statement that it "wasn't causing a problem" makes it sound like either there wasn't a failing unit test, or a unit test was created to wrap a reported defect.)

    Which means it sounds a little like this is an example of the thin line between cleaning up "code/technical debt" (i.e. the refactoring stage of TDD) versus YAGNI?
     