
GenRocket: Extrapolating Test Data

Henry Wong
author
Sheriff

Joined: Sep 28, 2004
Posts: 18843
    
  40

Gregg,

Question: Given a small subset of production data, can GenRocket extrapolate that data into a much larger set? And if so, is there a way to configure the rules that govern the extrapolation?

Thanks,
Henry


Books: Java Threads, 3rd Edition, Jini in a Nutshell, and Java Gems (contributor)
Gregg Bolinger
GenRocket Founder
Ranch Hand

Joined: Jul 11, 2001
Posts: 15299
    
    6

Hey Henry. GenRocket's stance on extrapolating production data into test data is "Don't Do It". We've found that scrubbing production data to remove or mask sensitive data so that it is "safe" for test purposes is an anti-pattern. It usually isn't any quicker or easier than creating synthetic test data, especially with GenRocket.

That said, we do have Generators that can query a database for their data. You would need some custom Receivers that would do the scrubbing prior to writing out the final data. But again, our position is that this is a bad idea and should be avoided.


GenRocket - Experts at Building Test Data
Henry Wong


I guess that makes my intended follow-up question (regarding data masking) moot...


However, I would like to know why it is an anti-pattern. Is it merely that GenRocket has really good tools that make extrapolating production data unnecessary? Or are there other issues that I should be concerned about?

Henry
Hycel Taylor III
GenRocket Founder
Greenhorn

Joined: Feb 24, 2014
Posts: 10
Hey Henry,

Great question...

When you want test data, or want to generate it, the first question to ask is: what are you trying to test? Are you trying to do load testing, functional testing, integration testing, unit testing, negative testing? It is important to know because each of these requires a specific type of data.

If you're load testing, then you don't really care what the data looks like, as long as it consists of the right data types, it has referential integrity and it is generated in massive quantities.
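To make the load-testing case concrete, here is a minimal plain-Java sketch of generating parent and child rows that keep referential integrity. It does not use GenRocket's API; the `Customer` and `Order` types, the method names and the seed are all hypothetical, chosen only for illustration.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Random;

public class SyntheticLoadData {
    // Hypothetical row types for illustration only; not part of GenRocket.
    record Customer(int id, String name) {}
    record Order(int id, int customerId, long amountCents) {}

    // Parent rows get sequential ids, so every id in 1..count exists.
    static List<Customer> customers(int count) {
        List<Customer> rows = new ArrayList<>(count);
        for (int i = 1; i <= count; i++) {
            rows.add(new Customer(i, "customer-" + i));
        }
        return rows;
    }

    // Each order picks its customer from the generated parents,
    // so every foreign key resolves to an existing row.
    static List<Order> orders(int count, List<Customer> parents, long seed) {
        Random random = new Random(seed); // fixed seed makes the run repeatable
        List<Order> rows = new ArrayList<>(count);
        for (int i = 1; i <= count; i++) {
            Customer parent = parents.get(random.nextInt(parents.size()));
            rows.add(new Order(i, parent.id(), 1 + random.nextInt(100_000)));
        }
        return rows;
    }

    public static void main(String[] args) {
        List<Customer> customers = customers(1_000);
        List<Order> orders = orders(100_000, customers, 42L);
        System.out.println(customers.size() + " customers, " + orders.size() + " orders");
    }
}
```

The values themselves are meaningless, which is fine for load testing: only the types, the key relationships and the volume matter. Scaling up is a matter of changing the two counts.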

If you want to perform functional testing, then the data needs to look and feel like the real data that would be consumed by your application.

If you want to do integration, unit or negative testing, then you may want to condition the data so that you can test for specific results and you want to generate a unique set of data for each test.
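As a rough illustration of "conditioning" data, the plain-Java sketch below builds a valid record by default and then deliberately breaks one field for each negative test, with a counter that keeps every record unique. The `Signup` type and the helper names are hypothetical and are not GenRocket's API.

```java
import java.util.concurrent.atomic.AtomicInteger;

public class ConditionedTestData {
    // Hypothetical record under test; invented for this example.
    record Signup(String username, String email, int age) {}

    // Shared counter so each test gets its own distinct record.
    private static final AtomicInteger SEQUENCE = new AtomicInteger();

    // Baseline: a record that should pass validation.
    static Signup validSignup() {
        int n = SEQUENCE.incrementAndGet();
        return new Signup("user" + n, "user" + n + "@example.test", 30);
    }

    // Conditioned variant: only the email is wrong.
    static Signup withMalformedEmail() {
        Signup base = validSignup();
        return new Signup(base.username(), "not-an-email", base.age());
    }

    // Conditioned variant: only the age is wrong.
    static Signup withNegativeAge() {
        Signup base = validSignup();
        return new Signup(base.username(), base.email(), -1);
    }

    public static void main(String[] args) {
        System.out.println(validSignup());
        System.out.println(withMalformedEmail());
        System.out.println(withNegativeAge());
    }
}
```

Because each variant differs from the valid baseline in exactly one field, a failing test points at that field, which is hard to arrange with data pulled from production.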

In all of the above instances you want to control the data that you are creating. You want to determine and generate exactly the test data you need, and you do that by generating synthetic test data.

None of the instances above should be using data from a production environment. No time should be spent or wasted in pruning production data. This is why we see scrubbing production data to remove or mask sensitive data so that it is "safe" for test purposes as an anti-pattern.

Scrubbing data is done in the absence of a good test data generation platform. GenRocket is designed to allow you to model and generate the specific test data you need to perform tests on the challenges you need to solve.
Henry Wong

Thanks. I guess I had a unique situation. It was indeed load testing, but it only failed with one customer's data. So, it was really debugging with a specific (large) set of data.

In the end, it was important enough that, along with my promise that I would be the only one seeing the data (and would delete it right afterwards), I got the data. I did feel uncomfortable having it though, especially since it was not scrubbed.


Regardless, I am not even sure that extrapolating (or scrubbing) would have helped here, since at the time we didn't know which part of the mass of data was triggering the issue.

Henry
 