My first time posting here, so a quick related advert. If you have a corporate budget for test tools, check out my company's VoiceXML testing tool: Voiyager at www.Voiyager.com. While not obvious in most of the online material, it has a full Java and .NET API that can be used to drive code-level tests through a VoiceXML call flow. Its main purpose is to automatically explore and test. I'm very much in the CI (Continuous Integration) world, so I find the API to be incredibly powerful, but it doesn't play to the marketing message. I am the product's development manager and architect.
As for some advice that doesn't involve buying things:
With VoiceXML, there isn't an easy free way to do end-to-end system testing. HTTPUnit and the like will let you drive the HTTP requests, but validating the responses beyond a simple automated spot check isn't going to be easy. Since there isn't an easy answer, as suggested above, divide and conquer.
Business logic: If in the browser, isolate your ECMAScript so that it can be tested with those frameworks. If server side (e.g. Java), again, isolate so that you can use existing test frameworks.
Host interfaces: This is probably server side or using VoiceXML's <data> or <script> element. In any case, server-side code should be isolated so it can be tested with normal test frameworks (JUnit) or HTTPUnit (for <data> elements).
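To make the "isolate it" advice in the two items above concrete, here is a minimal sketch of server-side call-flow logic pulled out into a plain Java class with no servlet or VoiceXML dependencies. The class name MenuRouter, the dialog names and the digit mappings are all hypothetical, made up for illustration; the point is only that logic in this shape can be exercised directly from JUnit or any other test framework.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical example: decides which dialog to render next for a DTMF digit.
// It knows nothing about HTTP or VoiceXML, so it can be unit tested in isolation.
final class MenuRouter {
    private final Map<String, String> routes = new LinkedHashMap<>();
    private final String fallbackDialog;

    MenuRouter(String fallbackDialog) {
        this.fallbackDialog = fallbackDialog;
    }

    // Registers a digit -> dialog mapping; returns this so calls can be chained.
    MenuRouter route(String digit, String dialog) {
        routes.put(digit, dialog);
        return this;
    }

    // Returns the mapped dialog, or the fallback dialog for unmapped input.
    String nextDialog(String digit) {
        return routes.getOrDefault(digit, fallbackDialog);
    }

    public static void main(String[] args) {
        MenuRouter router = new MenuRouter("mainMenu")
                .route("1", "balance")
                .route("2", "transfer");

        System.out.println(router.nextDialog("1")); // prints "balance"
        System.out.println(router.nextDialog("9")); // prints "mainMenu"
    }
}
```

The servlet or JSP that actually emits the VoiceXML would then only call nextDialog and render the result, which leaves very little untested code in the web layer.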
If you've done the work above, you can take a broader stroke with the VoiceXML, as you've reduced a lot of the common failure points that won't be caught with traditional manual testing. Using HTTPUnit with VoiceXML can be done. Perform some spot checks on form names, audio clip names and the like. A good VoiceXML generation tool or code framework will hopefully be plumbed with either <log> or comment blocks that you can use as landmarks to ensure the flow is moving correctly. Additionally, you could write a VoiceXML application to test your VoiceXML application. To keep the call flows in sync, replace key landmark sound clips with recordings of DTMF tones that can be easily detected with your scripts (this technique may not work in some VoIP configurations, because the tones are no longer in-band audio but out-of-band signaling messages).
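As a rough sketch of the spot-check idea, the following parses a VoiceXML page with the XML parser that ships with the JDK and pulls out form ids, audio clip names and <log> landmarks. In a real test the page would come from HTTPUnit or another HTTP client; here it is an inline string so the example runs on its own. The sample page, the "LANDMARK:" prefix and the class name VxmlSpotCheck are assumptions for illustration, not something any particular framework produces.

```java
import java.io.IOException;
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.ParserConfigurationException;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;
import org.xml.sax.SAXException;

// Hypothetical helper for spot-checking a generated VoiceXML page.
final class VxmlSpotCheck {

    // Parses the page; a page that is not well-formed XML fails here.
    static Document parse(String vxml) {
        try {
            DocumentBuilder builder =
                    DocumentBuilderFactory.newInstance().newDocumentBuilder();
            return builder.parse(new InputSource(new StringReader(vxml)));
        } catch (ParserConfigurationException | SAXException | IOException e) {
            throw new IllegalArgumentException("Not well-formed VoiceXML", e);
        }
    }

    // Collects one attribute from every element with the given tag name,
    // e.g. ("form", "id") or ("audio", "src").
    static List<String> attributeValues(Document doc, String tag, String attr) {
        List<String> values = new ArrayList<>();
        NodeList nodes = doc.getElementsByTagName(tag);
        for (int i = 0; i < nodes.getLength(); i++) {
            Element element = (Element) nodes.item(i);
            if (element.hasAttribute(attr)) {
                values.add(element.getAttribute(attr));
            }
        }
        return values;
    }

    // Collects the trimmed text of every <log> element to use as landmarks.
    static List<String> logLandmarks(Document doc) {
        List<String> landmarks = new ArrayList<>();
        NodeList logs = doc.getElementsByTagName("log");
        for (int i = 0; i < logs.getLength(); i++) {
            landmarks.add(logs.item(i).getTextContent().trim());
        }
        return landmarks;
    }

    public static void main(String[] args) {
        // Made-up sample page standing in for a fetched HTTP response.
        String page =
                "<vxml version=\"2.1\" xmlns=\"http://www.w3.org/2001/vxml\">"
              + "<form id=\"mainMenu\"><block>"
              + "<log>LANDMARK:mainMenu</log>"
              + "<audio src=\"audio/welcome.wav\"/>"
              + "</block></form></vxml>";

        Document doc = parse(page);
        System.out.println("forms:     " + attributeValues(doc, "form", "id"));
        System.out.println("audio:     " + attributeValues(doc, "audio", "src"));
        System.out.println("landmarks: " + logLandmarks(doc));
    }
}
```

A test would then assert that the expected form id and landmark are present after each simulated caller input, which is usually enough to notice when the flow has gone somewhere it shouldn't.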
I hope this response has been helpful and not just spam.