Estimating Test Development and Test Coverage

Kevin Sprague
Greenhorn

Joined: Jun 20, 2005
Posts: 8
Welcome Lisa and Janet! My question has two parts: 1) During iteration planning, are there any "rules of thumb" for estimating the time that should be allocated to developing the test cases? Some have suggested tying it to the number of functions built during the iteration; however, that approach does not seem able to account for implementation decisions that may affect how and what we test. 2) What are your thoughts on using test coverage tools? Thanks very much!
Ilja Preuss
author
Sheriff

Joined: Jul 11, 2001
Posts: 14112
How do you estimate the time to implement the functionality? Why wouldn't you use the same approach for estimating implementing the tests?


The soul is dyed the color of its thoughts. Think only on those things that are in line with your principles and can bear the light of day. The content of your character is your choice. Day by day, what you do is who you become. Your integrity is your destiny - it is the light that guides your way. - Heraclitus
Kevin Sprague
Greenhorn

Joined: Jun 20, 2005
Posts: 8
Thanks. That is how we estimate currently; however, as our team gains more experience writing tests, we are beginning to cover more than function testing alone (e.g. performance, error recovery, coding standards, etc.). How do we know when we're done?
Anna Baik
Greenhorn

Joined: Dec 08, 2006
Posts: 9
Kevin wrote:
Thanks. That is how we estimate currently; however, as our team gains more experience writing tests, we are beginning to cover more than function testing alone (e.g. performance, error recovery, coding standards, etc.). How do we know when we're done?


Why are you covering the non-functional tests?

If it's to solve a problem for the customer (e.g. performance issues on past iterations), then did the tests help you resolve the problem? I would say you're done then. (Caveat: I have no experience of Agile testing, just trad testing. So I'm curious to see what answers people come up with, but that's how I would approach it).
Kevin Sprague
Greenhorn

Joined: Jun 20, 2005
Posts: 8
Your reply has touched on the heart of the matter. Our client is pushing back on the extra testing time we've started including in the iteration planning. You have reminded me of a key Agile principle that is easy to lose sight of: Simplicity. Why look ahead to a "potential" performance issue? Have the Courage to deal with it when the customer places it on the iteration backlog. Thanks!
Lisa Crispin
Ranch Hand

Joined: Feb 03, 2009
Posts: 43
Kevin Sprague wrote: Welcome Lisa and Janet! My question has two parts: 1) During iteration planning, are there any "rules of thumb" for estimating the time that should be allocated to developing the test cases? Some have suggested tying it to the number of functions built during the iteration; however, that approach does not seem able to account for implementation decisions that may affect how and what we test. 2) What are your thoughts on using test coverage tools? Thanks very much!


Hi Kevin,
When we first started doing iteration planning on my current team and we weren't very experienced at estimating the testing tasks, I did use a rule of thumb that testing time for a story was generally 30-50% of the coding time. Our application is pretty testing-intensive. It can take a lot of time, for example, to come up with realistic test data. Also, we always automate regression tests for every story, and need a lot of exploratory testing time since we're working on a financial application and nobody wants to lose their money! Sometimes, testing takes longer than coding.

After five years, we have a pretty good feel for how long a testing task might take, so I usually don't think in terms of how much time is estimated for coding task cards. However, if the coding time is a lot less or a lot more than the testing time, that's a red flag and we want to make sure we feel good about estimates for both types of tasks.

Test coverage tools are useful, but of course, they only measure the code that your team remembered to write! They won't catch missed functionality. My team uses test coverage to set goals for improvement. For example, when we were at 70% test coverage, we set a goal to improve that to 73% in six months. But what counts the most to us is the trend. If the coverage number goes down suddenly, we want to see why that is. It might just be that code covered by tests was removed.
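Watching the trend rather than the absolute number might look something like the sketch below. The coverage history, the `max_drop` threshold, and the function name are all made up for illustration; a real team would feed in the percentages reported by whatever coverage tool it uses.

```python
def sudden_drops(history, max_drop=2.0):
    """Return (index, previous, current) for each build-to-build drop in
    coverage that is larger than max_drop percentage points."""
    drops = []
    for i in range(1, len(history)):
        previous, current = history[i - 1], history[i]
        if previous - current > max_drop:
            drops.append((i, previous, current))
    return drops


# Hypothetical coverage percentages from successive builds.
history = [70.0, 70.4, 71.1, 68.2, 71.5, 73.0]
goal = 73.0

for index, previous, current in sudden_drops(history):
    print(f"Build {index}: coverage fell from {previous}% to {current}%, worth a look")

print("Goal reached" if history[-1] >= goal else "Goal not reached yet")
```

A flagged drop is only a reason to look; as noted above, it may simply mean that well-covered code was deleted.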

-- Lisa


Co-author, with Janet Gregory: Agile Testing: A Practical Guide for Testers and Agile Teams (Addison-Wesley, 2009) http://lisacrispin.com
Lisa Crispin
Ranch Hand

Joined: Feb 03, 2009
Posts: 43
Kevin Sprague wrote: Your reply has touched on the heart of the matter. Our client is pushing back on the extra testing time we've started including in the iteration planning. You have reminded me of a key Agile principle that is easy to lose sight of: Simplicity. Why look ahead to a "potential" performance issue? Have the Courage to deal with it when the customer places it on the iteration backlog. Thanks!


The problem with saying that the customers ought to think of the non-functional requirements is that most customers just don't think about that. One big contribution that professional testers can make on an agile team is to ask questions about non-functional requirements. "How many concurrent users will there be? What response time is needed? Are there any special security concerns?" We raise these issues with the customers - usually they have simply assumed we would take care of those types of things, so we want to make them visible and include them in our estimates.

Here's an example of how we have kept non-functional requirements in mind. For a while we had issues with security, so we included both developer and tester task cards for security. Security is definitely critical, but it's not something our customers think about on a daily basis; they just take it for granted.
-- Lisa
Kevin Sprague
Greenhorn

Joined: Jun 20, 2005
Posts: 8
Lisa,

Thanks for taking the time to reply. Your insights are extremely helpful; our team has been struggling with these issues for a while now. Ensuring that relative coding and testing times are in balance makes sense, and we will continue to push for visibility and inclusion of non-functional concerns in our estimates. We're not quite ready for test coverage tools, but based on your feedback I should revisit our expectations.

Thanks again for the great feedback!
Lance Zant
Greenhorn

Joined: Jan 15, 2008
Posts: 3
Lisa Crispin wrote: Test coverage tools are useful, but of course, they only measure the code that your team remembered to write! They won't catch missed functionality.
-- Lisa


When we say "test coverage" we almost always mean "code coverage", which as you point out is inherently "white box". Customer tests, missed functionality and otherwise, on the other hand are inherently "black box". To me that means coverage needs to be understood differently, probably in terms of the input space instead of the code. Does your book treat the question of what "coverage" means in the context of customer tests?

As with code coverage, it would seem necessary to devise some way to tame the combinatoric explosion of possible inputs. Unlike code paths, these seem to arise in two dimensions - combinations of values within each operation and sequential combinations of operations. One approach I have encountered that seems to address the former is Orthogonal Array Testing. I haven't seen anything addressing the latter. Does the book have any recommendations in these areas?

<sidebar>
Orthogonal Array Testing essentially does for input values what Basis Path Testing does for code paths. Basis Path Testing covers a subset of paths such that all untested paths are combinations of tested paths. Orthogonal Array testing exploits the fact that most value-interaction bugs involve binary interactions to define a minimal set of test cases that covers all the binary combinations of input values.
</sidebar>
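To make the idea in the sidebar concrete, here is a minimal sketch of all-pairs (pairwise) test selection. Note that this is a simple greedy heuristic, not a true orthogonal array construction, and it does not guarantee a minimal suite; it only guarantees that every pair of values from two different parameters appears in at least one chosen case. The parameter names and values are hypothetical.

```python
from itertools import combinations, product


def pairs_of(test_case):
    """All ((position, value), (position, value)) pairs covered by one case."""
    return {
        ((i, test_case[i]), (j, test_case[j]))
        for i, j in combinations(range(len(test_case)), 2)
    }


def all_pairs_suite(parameters):
    """Greedily pick test cases until every value pair is covered.

    parameters is a list of value lists, one list per input parameter.
    The result is usually small, but it is not guaranteed to be minimal.
    """
    candidates = list(product(*parameters))
    uncovered = set()
    for case in candidates:
        uncovered |= pairs_of(case)

    suite = []
    while uncovered:
        # Take the candidate that covers the most still-uncovered pairs.
        best = max(candidates, key=lambda case: len(pairs_of(case) & uncovered))
        suite.append(best)
        uncovered -= pairs_of(best)
    return suite


# Hypothetical input space: payment type, order state, requested action.
parameters = [["cash", "credit"], ["open", "canceled"], ["pay", "refund"]]
suite = all_pairs_suite(parameters)
print(f"{len(list(product(*parameters)))} exhaustive cases, {len(suite)} pairwise cases")
for case in suite:
    print(case)
```

Because the sketch enumerates every combination before choosing, it only suits small input spaces; dedicated pairwise tools avoid that enumeration.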

On a different but related note, in automating customer tests, what can you tell us about managing test fixtures (DB state, etc.)? That's where my current team is hung up.

Thanks,
Lance
Lisa Crispin
Ranch Hand

Joined: Feb 03, 2009
Posts: 43

Lance Zant wrote:
When we say "test coverage" we almost always mean "code coverage", which as you point out is inherently "white box". Customer tests, missed functionality and otherwise, on the other hand are inherently "black box". To me that means coverage needs to be understood differently, probably in terms of the input space instead of the code. Does your book treat the question of what "coverage" means in the context of customer tests?

As with code coverage, it would seem necessary to devise some way to tame the combinatoric explosion of possible inputs. Unlike code paths, these seem to arise in two dimensions - combinations of values within each operation and sequential combinations of operations. One approach I have encountered that seems to address the former is Orthogonal Array Testing. I haven't seen anything addressing the latter. Does the book have any recommendations in these areas?

<sidebar>
Orthogonal Array Testing essentially does for input values what Basis Path Testing does for code paths. Basis Path Testing covers a subset of paths such that all untested paths are combinations of tested paths. Orthogonal Array testing exploits the fact that most value-interaction bugs involve binary interactions to define a minimal set of test cases that covers all the binary combinations of input values.
</sidebar>

On a different but related note, in automating customer tests, what can you tell us about managing test fixtures (DB state, etc.)? That's where my current team is hung up.

Thanks,
Lance

Hi Lance,
You're right, my team's code coverage numbers are for unit level tests only. We keep meaning to run code coverage for our functional and GUI tests, but we've never found it important enough.

Because we drive coding with our business-facing tests, coverage is, in a way, built in. The tests are there first, and the code has to be written to make them pass. Writing tests first (which of course is not a new idea; people have been doing that for decades), especially executable tests that programmers are willing to run, helps with risk mitigation also. While we first want to get the happy path working, our next priority is the high risk/high impact areas, so we get those tests passing early on.

Our book doesn't cover the testing techniques you mention. They are just as appropriate in agile projects as in any other; however, our book's focus is on special challenges for testing and testers on agile teams. With my own team, we work closely with the business experts to write the test cases in a domain-specific manner.

I'm not sure what you mean by managing test fixtures. You mention db state. We talk about different approaches to test data in the book. When possible, we like to avoid using the actual db - tests run faster, and if our purpose is to test an algorithm, we don't need the db. But of course at some point, you have to test the database layer. My preference is for each test to set up and tear down its own test data, so that each test is independent and rerunnable. The application I work on has a complex data structure, and for our GUI smoke tests, it's not feasible for the tests to set up all their own data, so we also use a "canonical data" approach where the build process first refreshes the test schema with "seed" data before running the suite of tests. This is a pain because the tests have to be run in a particular order.
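The "each test sets up and tears down its own data" preference could look roughly like this. It is a sketch only: it uses Python's standard `unittest` and an in-memory `sqlite3` database as stand-ins, and the `accounts` table and its contents are invented for the example rather than taken from the application discussed here.

```python
import sqlite3
import unittest


class AccountBalanceTest(unittest.TestCase):
    def setUp(self):
        # Each test gets a fresh in-memory database with its own data,
        # so tests can run in any order and be rerun freely.
        self.db = sqlite3.connect(":memory:")
        self.db.execute(
            "CREATE TABLE accounts (id INTEGER PRIMARY KEY, balance REAL)"
        )
        self.db.execute("INSERT INTO accounts (id, balance) VALUES (1, 100.0)")
        self.db.commit()

    def tearDown(self):
        # Closing the connection discards the in-memory database.
        self.db.close()

    def _balance(self):
        row = self.db.execute(
            "SELECT balance FROM accounts WHERE id = 1"
        ).fetchone()
        return row[0]

    def test_deposit_increases_balance(self):
        self.db.execute("UPDATE accounts SET balance = balance + 50 WHERE id = 1")
        self.assertEqual(self._balance(), 150.0)

    def test_starting_balance_is_unaffected_by_other_tests(self):
        # Passes regardless of whether the deposit test ran first.
        self.assertEqual(self._balance(), 100.0)
```

Saved as a module, this can be run with `python -m unittest`. The "canonical data" approach trades this independence for a shared seed schema, which is why test order starts to matter.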

If that's not what you meant, let me know and I'll try to answer.
-- Lisa
Mike Farnham
Ranch Hand

Joined: Sep 25, 2001
Posts: 76
Lisa Crispin wrote:
The application I work on has a complex data structure, and for our GUI smoke tests, it's not feasible for the tests to set up all their own data, so we also use a "canonical data" approach where the build process first refreshes the test schema with "seed" data before running the suite of tests. This is a pain because the tests have to be run in a particular order.


So, is this "canonical data" stored in a database schema entirely outside the path to production?

We have dev, test, qa, and prod environments.
The code migrates from dev to test to qa and finally to production.
Each environment has its own schema.

I would be interested to know whether the "canonical data" you are talking about resides
in a separate schema.

This might be a big help to our situation,
at least for the data we actually maintain.

Our biggest challenge, however, is the data we get from other schemas
that we do not maintain, plus the fact that our data is cyclical in nature.

Do you have any suggestions for testing data that has a cyclical nature?
(I work for a University and our applications primarily deal with data for the current semester.)

Lisa Crispin
Ranch Hand

Joined: Feb 03, 2009
Posts: 43
Mike Farnham wrote:
The application I work on has a complex data structure, and for our GUI smoke tests, it's not feasible for the tests to set up all their own data, so we also use a "canonical data" approach where the build process first refreshes the test schema with "seed" data before running the suite of tests. This is a pain because the tests have to be run in a particular order.


So, is this "canonical data" stored in a database schema entirely outside the path to production?

We have dev, test, qa, and prod environments.
The code migrates from dev to test to qa and finally to production.
Each environment has its own schema.

I would be interested to know whether the "canonical data" you are talking about resides
in a separate schema.

This might be a big help to our situation,
at least for the data we actually maintain.

Our biggest challenge, however, is the data we get from other schemas
that we do not maintain, plus the fact that our data is cyclical in nature.

Do you have any suggestions for testing data that has a cyclical nature?
(I work for a University and our applications primarily deal with data for the current semester.)



We also have dev, test, staging and prod environments. In dev, test and staging, we have a recent copy of the production data so that we can do realistic exploratory testing.

We also have several "canonical" schemas that have a tiny subset of production-like data. Our different suites of regression tests each have their own schema, which is refreshed before the tests run. The unit tests also have their own schema.

We have two "seed" schemas - one for the unit tests, and one for the business-facing regression tests. The other test schemas get refreshed from these two schemas. The confusing part is that each test schema may have data that just lives there and doesn't get refreshed (this drives our DBA crazy). Lookup tables, for example, don't get refreshed; they just stick around.

I started out my career at a university so I wish I could remember some examples from that time! But it was too long ago. My current team's business is somewhat cyclical in that we have date-dependent activities in the application. For example, right now, our actual customers are having to run tests that prove their compliance with IRS rules governing retirement plans. Our canonical test schemas are frozen in time, so at the end of each year, we have to decide what data to "roll forward" in time or what to change in our regression tests so that our regression tests will still pass. Our production-like schemas can be refreshed whenever needed so that they reflect what's going on right now in production.
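One way to picture "rolling forward" frozen test data is the sketch below. It is purely illustrative: the record layout, field names, and the choice to map February 29 to February 28 are assumptions for the example, not a description of how the team in this thread actually does it.

```python
from datetime import date


def roll_forward(d, years=1):
    """Shift a date forward by whole years.

    February 29 is mapped to February 28 when the target year is not a
    leap year (an assumption; some domains may prefer March 1).
    """
    try:
        return d.replace(year=d.year + years)
    except ValueError:
        return d.replace(year=d.year + years, day=28)


def roll_records_forward(records, date_fields, years=1):
    """Return copies of the records with the named date fields rolled forward."""
    rolled = []
    for record in records:
        updated = dict(record)
        for field in date_fields:
            updated[field] = roll_forward(record[field], years)
        rolled.append(updated)
    return rolled


# Hypothetical seed rows for a date-dependent regression test.
seed = [
    {"plan_id": 1, "plan_year_end": date(2008, 12, 31)},
    {"plan_id": 2, "plan_year_end": date(2008, 2, 29)},
]
for row in roll_records_forward(seed, ["plan_year_end"]):
    print(row)
```

The alternative mentioned above, changing the regression tests instead of the data, avoids touching the seed schema but means the expected results have to be revisited each year.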
-- Lisa
Lance Zant
Greenhorn

Joined: Jan 15, 2008
Posts: 3
Lisa Crispin wrote:
Because we drive coding with our business-facing tests, coverage is, in a way, built in. The tests are there first, and the code has to be written to make them pass....
-- Lisa


The question I was trying to get to is "coverage of what?" in the case of business-facing tests. Writing them first is great, but seems orthogonal to the question of how many are enough (or better, which ones are needed). The goal is to cover the requirements. Using tests to document requirements might turn the question back to the customer/product owner. If there's no test where x=1 && y=-1, you can argue that there's no "requirement" to handle that condition. If you can make that work, I'd love to know how you do so.

In my experience, tests identified by business product owners tend to be indicative rather than exhaustive. They tend to come up with a sunny day case and stop there. Prodded for error cases, they give me a couple of obvious missing or bad values. A second round of prodding may or may not produce a couple of interaction exceptions (no cash refund for a credit purchase), but it definitely begins to raise the frustration level. ("I just need it to work, dammit!") Unfortunately, when a subtle interaction bug arises, the fact that there was no test for that combination is cold comfort, and the blame game begins. ("Of COURSE, we need to process payments against canceled orders!")

So the question is, how do you assess the adequacy of your business-facing tests, if it's not based on some kind of coverage of the possible input combinations and sequences? If the answer is "heuristically", fair enough. The follow up in that case is whether any of the heuristics are general across projects and domains, and how do you get the business types to really engage them?

thanks again,
Lance
Lisa Crispin
Ranch Hand

Joined: Feb 03, 2009
Posts: 43
Lance Zant wrote:
Lisa Crispin wrote:
Because we drive coding with our business-facing tests, coverage is, in a way, built in. The tests are there first, and the code has to be written to make them pass....
-- Lisa


The question I was trying to get to is "coverage of what?" in the case of business-facing tests. Writing them first is great, but seems orthogonal to the question of how many are enough (or better, which ones are needed). The goal is to cover the requirements. Using tests to document requirements might turn the question back to the customer/product owner. If there's no test where x=1 && y=-1, you can argue that there's no "requirement" to handle that condition. If you can make that work, I'd love to know how you do so.

In my experience, tests identified by business product owners tend to be indicative rather than exhaustive. They tend to come up with a sunny day case and stop there. Prodded for error cases, they give me a couple of obvious missing or bad values. A second round of prodding may or may not produce a couple of interaction exceptions (no cash refund for a credit purchase), but it definitely begins to raise the frustration level. ("I just need it to work, dammit!") Unfortunately, when a subtle interaction bug arises, the fact that there was no test for that combination is cold comfort, and the blame game begins. ("Of COURSE, we need to process payments against canceled orders!")

So the question is, how do you assess the adequacy of your business-facing tests, if it's not based on some kind of coverage of the possible input combinations and sequences? If the answer is "heuristically", fair enough. The follow up in that case is whether any of the heuristics are general across projects and domains, and how do you get the business types to really engage them?

thanks again,
Lance


Our focus as a team has been on learning the business quite well ourselves, as well as working closely with the customers to identify not only tests for the story at hand, but potential ripple effects on other parts of the system. Our product owner has a "story checklist" which helps him research whether a story impacts things like reports, plan documents (we manage 401(k) plans), external partners, vendors, legal concerns, government regulations, other parts of the system, training and the like. He writes high level test cases and because he's been doing this so long with us, he does think of negative and edge cases. We go over these checklists, then we add our own test cases, and go over those with the PO. We're communicating about the stories all the time.

I confess that I have never used the techniques you described on either non-agile or agile projects. No doubt they'd be helpful, but I haven't had a problem that those would seem to address. If we had a lot of bugs getting out to production, it would be worthwhile. But thanks to diligent TDD and ATDD, our code emerges fairly defect-free.
-- Lisa
Janet Gregory
Author
Ranch Hand

Joined: Jan 25, 2009
Posts: 31
Lance Zant wrote:
In my experience, tests identified by business product owners tend to be indicative rather than exhaustive. They tend to come up with a sunny day case and stop there. Prodded for error cases, they give me a couple of obvious missing or bad values. A second round of prodding may or may not produce a couple of interaction exceptions (no cash refund for a credit purchase), but it definitely begins to raise the frustration level. ("I just need it to work, dammit!") Unfortunately, when a subtle interaction bug arises, the fact that there was no test for that combination is cold comfort, and the blame game begins. ("Of COURSE, we need to process payments against canceled orders!")

So the question is, how do you assess the adequacy of your business-facing tests, if it's not based on some kind of coverage of the possible input combinations and sequences? If the answer is "heuristically", fair enough. The follow up in that case is whether any of the heuristics are general across projects and domains, and how do you get the business types to really engage them?


Lance,

You are right that many product owners only see the happy path. That is one of the reasons we advocate professional testers on a team - to help identify all the other cases. How each team determines what is enough is usually based on the skills of the testers. I find many experienced testers use heuristics without knowing that is what they are doing. If you ask them why they decided to do that, they can usually explain and it starts with "Because ....", and ends with "in my experience".

When I do find teams that don't seem to have any kind of process for figuring out "what is good enough", I encourage the use of simple tools like truth tables, decision trees, or maybe even a flow diagram. I also recommend they take testing courses that teach more in-depth methods. Often, though, the simple tools provide 'good enough' coverage.
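As a small illustration of the truth-table idea, the sketch below lists every combination of a few yes/no conditions so that none is silently skipped. The condition names are hypothetical, loosely borrowed from the refund and canceled-order examples earlier in the thread; the expected outcome for each row is deliberately left for the team and product owner to fill in.

```python
from itertools import product

# Hypothetical yes/no conditions for a payment story.
CONDITIONS = ["paid_by_credit", "order_canceled", "refund_requested"]


def truth_table(conditions):
    """Yield one dict per combination of True/False condition values."""
    for values in product([True, False], repeat=len(conditions)):
        yield dict(zip(conditions, values))


rows = list(truth_table(CONDITIONS))
print(f"{len(rows)} combinations to review with the product owner")
for row in rows:
    print(row)
```

With three conditions there are only eight rows to walk through, which is usually quick enough to do in a planning conversation; rows where nobody knows the expected outcome are the ones worth a test.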

Janet
 