freedom to structure your tests as you wish (the package structure of the tests does not need to follow the production code, which is a huge win: in practice we often need to organize tests a little differently from the production code to make them more expressive and to cover different angles of testing such as unit, integration, performance, thread-safety, acceptance, ...)
convention over technical limitation (instead of relying on language-enforced access modifiers, we simply acknowledge the naming convention; API clients are explicitly warned not to rely on the internal stuff)
does not prevent an API client from making an informed trade-off and using the internal classes (let's face it: such a trade-off can occasionally be beneficial, and it is a good thing that it can be made)
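To show how lightweight such a convention can be, here is a minimal Java sketch. It assumes the convention is a package segment literally named internal; that segment name, the InternalConvention class and the org.example names are hypothetical and used only for illustration. A check like this could be used in a build step or a test to flag references to internal classes, while still leaving the final decision to the API client.

```java
import java.util.Arrays;

// Hypothetical helper: the class name and the "internal" package segment
// are assumptions for illustration, not part of any real library.
public final class InternalConvention {

    private InternalConvention() {
        // utility class, not meant to be instantiated
    }

    /**
     * Returns true when any package segment of the fully qualified
     * class name is exactly "internal".
     */
    public static boolean isInternal(String fullyQualifiedClassName) {
        int lastDot = fullyQualifiedClassName.lastIndexOf('.');
        if (lastDot < 0) {
            return false; // default package: there are no package segments
        }
        String packageName = fullyQualifiedClassName.substring(0, lastDot);
        return Arrays.asList(packageName.split("\\.")).contains("internal");
    }

    public static void main(String[] args) {
        // Public API class: clients may depend on it freely.
        System.out.println(isInternal("org.example.lib.Parser"));
        // Internal class: usable, but the client has been warned.
        System.out.println(isInternal("org.example.lib.internal.ParserImpl"));
    }
}
```

Nothing here stops a client from using org.example.lib.internal.ParserImpl; the check only makes the trade-off visible, which is the point of preferring a convention over a hard limitation.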
I've been playing around with this idea for a while and decided to ask for your feedback. RFC!