The application I work on has a complex data structure, and for our GUI smoke tests, it's not feasible for the tests to set up all their own data, so we also use a "canonical data" approach where the build process first refreshes the test schema with "seed" data before running the suite of tests. This is a pain because the tests have to be run in a particular order.
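As a rough illustration of the seed-data refresh described above (not the poster's actual build step), here is a minimal Python sketch. It assumes an in-memory SQLite database, and the `course` table, the `SEED_COURSES` rows, and both function names are hypothetical stand-ins for a real test schema.

```python
import sqlite3

# Hypothetical seed rows; a real suite would likely load these from versioned files.
SEED_COURSES = [
    (1, "MATH101", "Calculus I"),
    (2, "HIST210", "Modern Europe"),
]


def refresh_test_schema(conn):
    """Drop and recreate the (hypothetical) course table, then reload the seed rows."""
    with conn:  # commits on success, rolls back on error
        conn.execute("DROP TABLE IF EXISTS course")
        conn.execute(
            "CREATE TABLE course ("
            "id INTEGER PRIMARY KEY, code TEXT NOT NULL, title TEXT NOT NULL)"
        )
        conn.executemany(
            "INSERT INTO course (id, code, title) VALUES (?, ?, ?)",
            SEED_COURSES,
        )


def count_courses(conn):
    """Return the number of rows currently in the course table."""
    return conn.execute("SELECT COUNT(*) FROM course").fetchone()[0]


if __name__ == "__main__":
    conn = sqlite3.connect(":memory:")
    refresh_test_schema(conn)
    conn.execute("DELETE FROM course WHERE id = 1")  # a test mutates the data
    refresh_test_schema(conn)  # the next refresh restores the canonical state
    print(count_courses(conn))
```

Running a refresh like this before each test, rather than once per build, is one possible way to loosen the ordering constraint, at the cost of a longer test run.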
So, is this "canonical data" stored in a database schema entirely outside the path to production?
We have dev, test, qa, and prod environments.
The code migrates from dev to test to qa and finally to production.
Each environment has its own schema.
I would be interested to know whether the "canonical data" you are talking about resides
in a separate schema.
This might be a big help to our situation,
at least for the data we actually maintain.
Our biggest challenge, however, is the data we get from other schemas
that we do not maintain, plus the fact that our data is cyclical in nature.
Do you have any suggestions for testing data that has a cyclical nature?
(I work for a university, and our applications primarily deal with data for the current semester.)