We have tables to capture user activities over the internet, such as USERS, ACTIVITIES, ACTIVITY-PARTICIPANTS, MESSAGES, FILES, etc.
Our requirement is to insert data for multiple users into the database across multiple days.
Let’s assume we need to insert data for 10,000 users spread across 6 months.
Each user's data will differ depending on some MODEL.
We have different types of activities (e.g. group chat, voice chat, 1-1 chat, etc.).
The content/messages for each user will also vary: messages will come in different sizes, with the percentage of each size depending on some MODEL.
Same is true for FILES also.
Number of participants in activities will be different.
For all of the above, the percentage distribution of the data will be passed in as configuration, e.g. number of users, number of days of data to populate, number of activities, messages per activity, % of activities containing files, percentage of different sizes & types of files, percentage of different message sizes, percentage of different activity types, etc.
For a single day's data, we may expect 60-100k activities, i.e. rows in the ACTIVITIES table (ACTIVITY-PARTICIPANTS & MESSAGES will have more, depending on the MODEL). The size of the data inserted into the DB for all tables is approx. 3-4 GB per day.
Now my questions are:
1. Will GenRocket be able to insert data like this into the DB?
2. How long will it take to insert 6 months of data (considering a total data size of approx. 500 GB)?
Please get back to me if you need more clarification.
Joined: Mar 20, 2014
Also, how easy is it to implement DB schema changes for the same set of tables, such as column-level changes or the addition/deletion of a few tables?
There's a lot of information here, so I'll do my best to answer your questions one at a time over multiple posts and as I have time to answer them.
This first answer concerns the speed of GenRocket to produce 500GB of data.
We measure how much data GenRocket can generate in rows per second, rows per minute or rows per hour. On a decent computer with a quad processor, GenRocket can generate on average 15,000 rows of data per second, which translates to 900,000 rows per minute or 54,000,000 rows per hour. GenRocket can also produce data on multiple GenRocket runtime instances to increase the amount of data produced over a shorter period of time (you may like to check out how GenRocket produced 100,000,000 rows of user data in 24 minutes, http://support.genrocket.com/customer/portal/articles/1489429-scalable-test-data-for-big-data).
Thus, if you can calculate or approximate the amount of data each row will produce, you can then approximate how many rows you need to generate and deduce approximately how much time it will take GenRocket to generate 500GB of test data.
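As a rough illustration of that calculation, here is a minimal Python sketch. The 15,000 rows per second figure is the average quoted above; the average row size of 1,000 bytes is purely an assumption for the example, as is the idea that throughput scales linearly with the number of runtime instances. It estimates generation time only and says nothing about how fast the database can accept inserts.

```python
# Back-of-the-envelope estimate of data generation time.
ROWS_PER_SECOND = 15_000  # average rate quoted above for a single instance


def estimate_hours(total_bytes, avg_row_bytes, instances=1,
                   rows_per_second=ROWS_PER_SECOND):
    """Return (rows needed, hours to generate them).

    Assumes throughput scales linearly with the number of instances,
    which is a simplification.
    """
    rows = total_bytes / avg_row_bytes
    seconds = rows / (rows_per_second * instances)
    return rows, seconds / 3600


total_bytes = 500 * 10**9  # approx. 500 GB, from the question
avg_row_bytes = 1_000      # ASSUMPTION: measure this from your own tables

rows, hours = estimate_hours(total_bytes, avg_row_bytes)
print(f"{rows:,.0f} rows, about {hours:.1f} hours on one instance")
```

With these assumed numbers the estimate comes to 500,000,000 rows and roughly 9.3 hours on a single instance; a different average row size changes the result proportionally.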
Hycel Taylor Iii
Joined: Feb 24, 2014
This answer concerns reading parameters from a configuration file.
Thus, you could create a CSV file containing parameters from which to initialize certain Generator values within a given Scenario Domain Attribute. The preferred method is to create multiple GenRocket Scenarios with each one having specific Scenario Domain Attribute Generators set to different values.
This also goes to the GenRocket philosophy of having each Scenario concentrate on one specific task of test data generation. You can then execute multiple GenRocket Scenarios one after the other to generate all of the data needed. Creating multiple Scenarios decouples and breaks down complex test data solutions into smaller subtasks, making it easier to create and modify the type of test data needed.
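To make the CSV parameter file idea more concrete, here is a small sketch in plain Python. It does not use any GenRocket API; the column names (`activity_type`, `percent`), the percentages and the helper functions are all hypothetical. It only shows what a file describing a percentage distribution of activity types might look like and how such a distribution could drive weighted selection.

```python
import csv
import io
import random

# Hypothetical parameter file: column names and values are illustrative only.
CSV_TEXT = """activity_type,percent
group_chat,30
voice_chat,20
one_to_one_chat,50
"""


def load_distribution(text):
    """Parse the CSV text into {activity_type: percent} and validate it."""
    reader = csv.DictReader(io.StringIO(text))
    dist = {row["activity_type"]: float(row["percent"]) for row in reader}
    total = sum(dist.values())
    if abs(total - 100.0) > 1e-9:
        raise ValueError(f"percentages sum to {total}, expected 100")
    return dist


def sample_types(dist, n, seed=0):
    """Pick n activity types at random, weighted by their percentages."""
    rng = random.Random(seed)  # seeded so repeated runs give the same picks
    return rng.choices(list(dist), weights=list(dist.values()), k=n)


dist = load_distribution(CSV_TEXT)
print(sample_types(dist, 5))
```

The same pattern would extend to message sizes or file types by adding one such table per distribution.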
Hycel Taylor Iii
Joined: Feb 24, 2014
This answer concerns changing database schema.
If you want to populate data directly into database tables, then you would use GenRocket Domains to model your tables. To do this cleanly, you could import each table's DDL to create its Domain and afterwards add a Generator for each Attribute.
As your database tables change, you can modify, add or delete Attributes within a given Domain. Most often you will change the behavior of a Domain's Attributes in a given Scenario. You can even create new versions of your entire project as you move from one release to another.
GenRocket Domains do not necessarily need to be tied directly to database schema. It all depends on how you choose to use GenRocket generated test data.
For example, generating data that is then used by service methods containing business logic to populate one or more database tables greatly decouples GenRocket Domains from a database schema. It also allows the same Domains and Scenarios to be used for load testing, functional testing and integration testing.