
GenRocket: Extrapolating Test Data

 
author
Gregg,

Question: Given a small subset of production data, can GenRocket extrapolate it into a much larger set? And if so, is there a way to configure the rules that govern the extrapolation?

Thanks,
Henry
 
Ranch Hand
Hey Henry. GenRocket's stance on extrapolating production data into test data is "Don't Do It". We've found that scrubbing production data to remove or mask sensitive data so that it is "safe" for test purposes is an anti-pattern. It usually isn't any quicker or easier than creating synthetic test data, especially with GenRocket.

That said, we do have Generators that can query a database for their data. You would need some custom Receivers that would do the scrubbing prior to writing out the final data. But again, our position is that this is a bad idea and should be avoided.
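For readers who want to picture the "query, scrub, then write" flow described above, here is a minimal generic sketch in Python. It does not use GenRocket's Generator or Receiver API; the `customers` table, its columns, and the functions `mask_email`, `read_rows` and `scrub_rows` are all hypothetical names made up for illustration.

```python
import hashlib
import sqlite3


def mask_email(email):
    """Replace an email with a deterministic placeholder.

    This is an unsalted hash, so it is NOT strong anonymization;
    it only illustrates where a masking step would sit.
    """
    digest = hashlib.sha256(email.encode("utf-8")).hexdigest()[:12]
    return f"user_{digest}@example.test"


def read_rows(conn):
    """Query step: pull source rows from a (hypothetical) customers table."""
    query = "SELECT id, email, plan FROM customers ORDER BY id"
    return conn.execute(query).fetchall()


def scrub_rows(rows):
    """Scrub step: mask the sensitive column before anything is written out."""
    return [(row_id, mask_email(email), plan) for row_id, email, plan in rows]


if __name__ == "__main__":
    # In-memory database standing in for the real source system.
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE customers (id INTEGER PRIMARY KEY, email TEXT, plan TEXT)")
    conn.execute("INSERT INTO customers VALUES (1, 'someone@real-domain.com', 'pro')")
    for row in scrub_rows(read_rows(conn)):
        print(row)
```

Even in this toy version, every sensitive column needs its own masking rule, which is part of why scrubbing tends to be no quicker than generating synthetic data.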
 
Henry Wong
author
I guess that makes my intended follow up question (regarding data masking) moot...


However, I would like to know why it is an anti-pattern. Is it merely because GenRocket has really good tools that make extrapolating production data unnecessary? Or are there other issues that I should be concerned about?

Henry
 
GenRocket Founder
Hey Henry,

Great question...

When you want test data, or want to generate it, the first question to ask is: what are you trying to test? Are you doing load testing, functional testing, integration testing, unit testing, negative testing? It is important to know, because each of these requires a specific type of data.

If you're load testing, then you don't really care what the data looks like, as long as it consists of the right data types, has referential integrity, and is generated in massive quantities.
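As a rough illustration of those three properties (right types, referential integrity, large volume), here is a small Python sketch. It is not GenRocket code; the customer/order schema and the functions `generate_customers` and `generate_orders` are assumptions made up for this example.

```python
import random


def generate_customers(count):
    """Yield synthetic customer rows with sequential primary keys 1..count."""
    for customer_id in range(1, count + 1):
        yield {"id": customer_id, "name": f"customer_{customer_id}"}


def generate_orders(count, customer_count, seed=0):
    """Yield synthetic order rows that always reference an existing customer."""
    rng = random.Random(seed)  # seeded so a load-test run can be repeated
    for order_id in range(1, count + 1):
        yield {
            "id": order_id,
            # Foreign key drawn from the known customer id range.
            "customer_id": rng.randint(1, customer_count),
            # Integer amount, matching the column type the schema expects.
            "amount_cents": rng.randint(100, 100_000),
        }


if __name__ == "__main__":
    customer_count = 1_000
    customers = list(generate_customers(customer_count))
    # Orders are streamed, so the volume can grow without holding rows in memory.
    order_total = sum(1 for _ in generate_orders(100_000, customer_count))
    print(len(customers), "customers,", order_total, "orders")
```

Because both functions are generators, the row count is just a parameter; the content of each row is deliberately uninteresting.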

If you want to perform functional testing, then the data needs to look and feel like the real data that would be consumed by your application.

If you want to do integration, unit or negative testing, then you may want to condition the data so that you can test for specific results and you want to generate a unique set of data for each test.
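One way to picture "conditioning" data for unit or negative tests is a small builder that produces a valid, unique record and lets each test override only the field it wants to break. The sketch below is a generic Python illustration, not GenRocket's approach; `make_user` and `validate_user` are hypothetical names, and `validate_user` merely stands in for whatever code is under test.

```python
import uuid


def make_user(**overrides):
    """Build a valid synthetic user, unique per call, then apply overrides."""
    unique = uuid.uuid4().hex
    user = {
        "username": f"user_{unique}",
        "email": f"{unique}@example.test",
        "age": 30,
    }
    user.update(overrides)  # the test conditions only the fields it cares about
    return user


def validate_user(user):
    """Stand-in for the code under test: return the names of invalid fields."""
    errors = []
    if "@" not in user["email"]:
        errors.append("email")
    if not 0 <= user["age"] <= 150:
        errors.append("age")
    return errors


if __name__ == "__main__":
    print(validate_user(make_user()))                      # baseline valid record
    print(validate_user(make_user(age=-1)))                # negative test on age
    print(validate_user(make_user(email="not-an-email")))  # negative test on email
```

Each test gets its own fresh record, so tests do not interfere with one another, and the expected result follows directly from the override that was applied.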

In all of the above instances you want to control the data that you are creating. You decide what test data you need, and you get it by generating synthetic test data.

None of the instances above should be using data from a production environment. No time should be spent or wasted pruning production data. This is why we see scrubbing production data to remove or mask sensitive data so that it is "safe" for test purposes as an anti-pattern.

Scrubbing data is what people do in the absence of a good test data generation platform. GenRocket is designed to let you model and generate the specific test data you need for the problems you are trying to solve.
 
Henry Wong
author
Thanks. I guess I had a unique situation. It was indeed load testing, but it only failed with one customer's data. So it was really debugging with a specific (and large) set of data.

In the end, it was important enough that, along with my promise that I would be the only one seeing the data (and would delete it right afterwards), I got the data. I did feel uncomfortable having it though, especially since it was not scrubbed.


Regardless, I am not even sure that extrapolating (or scrubbing) would have helped here, since at the time we didn't know which part of that mass of data was triggering the issue.

Henry