Steve,
Thanks for the clarification. In the development environment, you'll probably run it in single-user mode and can afford the overhead of a JVMPI/JVMTI-based profiler. Something that integrates well with your IDE, like the NetBeans IDE Profiler, would work well.
In a test environment, the overhead of JVMPI/AOP-based profilers becomes significant, especially when running a multi-user load test. If you are willing to live with 20-30% overhead, you can try bytecode instrumentation. For true scalability tests, where you want the overhead of the analysis to be below 1%, you could consider Auptyma's Java Application Monitor.
With bytecode instrumentation you can get the overhead lower, but you'll need experts focused solely on optimizing the instrumentation points without sacrificing visibility.
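To make "optimizing the instrumentation points" a bit more concrete, here is a minimal sketch of the selection side of an agent, using the standard java.lang.instrument API. It only decides which classes would get probes and counts them; it does not rewrite any bytecode (that would need a bytecode library, which I've left out). The names SelectiveAgent and PrefixFilterTransformer and the com/example/ prefix are made up for illustration and not taken from any of the tools mentioned above.

```java
import java.lang.instrument.ClassFileTransformer;
import java.lang.instrument.Instrumentation;
import java.security.ProtectionDomain;
import java.util.Collections;
import java.util.List;
import java.util.concurrent.atomic.AtomicInteger;

public class SelectiveAgent {

    /** Hypothetical filter: only classes under the given prefixes are probe candidates. */
    static final class PrefixFilterTransformer implements ClassFileTransformer {
        private final List<String> prefixes;
        final AtomicInteger selected = new AtomicInteger();

        PrefixFilterTransformer(List<String> prefixes) {
            this.prefixes = prefixes;
        }

        /** Class names arrive in internal form, e.g. "com/example/OrderService". */
        boolean shouldInstrument(String internalName) {
            if (internalName == null) {
                return false; // the JVM may pass null for some generated classes
            }
            for (String prefix : prefixes) {
                if (internalName.startsWith(prefix)) {
                    return true;
                }
            }
            return false;
        }

        @Override
        public byte[] transform(ClassLoader loader, String className,
                                Class<?> classBeingRedefined,
                                ProtectionDomain protectionDomain,
                                byte[] classfileBuffer) {
            if (shouldInstrument(className)) {
                // A real agent would rewrite classfileBuffer here to add probes.
                // This sketch only records that the class was selected.
                selected.incrementAndGet();
            }
            return null; // null tells the JVM to keep the class bytes unchanged
        }
    }

    /** Entry point the JVM calls when started with -javaagent. */
    public static void premain(String agentArgs, Instrumentation inst) {
        // Assumption: the agent argument, if present, is a single package prefix.
        String prefix = (agentArgs == null || agentArgs.isEmpty())
                ? "com/example/"
                : agentArgs;
        inst.addTransformer(new PrefixFilterTransformer(Collections.singletonList(prefix)));
    }
}
```

To try it, you would package the class in a jar whose manifest has a Premain-Class: SelectiveAgent entry and start the JVM with -javaagent:agent.jar=com/example/ (the jar name is just a placeholder). The narrower the prefix list, the fewer classes pay for probes, which is the overhead versus visibility trade-off described above.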
Hope that helps.