The current implementation of JAXP is in compliance with the DOM Level 2 specification, which lacks many features, including a way to specify the encoding. The new DOM Level 3 spec addresses many of the shortcomings of DOM 2, including specifying encodings and standalone documents. Here is an excellent summary of new features in DOM 3. You have to understand that JAXP merely provides a layer of abstraction so that the developer can plug in any JAXP-compliant parser without having to reconfigure the application or change the code. JAXP itself is not a parser, and hence Sun is not under great pressure to keep up with changing DOM specs. Here's a quote from Sun's JAXP FAQ page: "The reason for the existence of JAXP is to facilitate the use of XML on the Java platform. For example, current APIs such as DOM Level 2 do not provide a method to bootstrap a DOM Document object from an XML input document, JAXP does. (When DOM Level 3 provides this functionality, a new version of the JAXP specification will probably support the new Level 3 scheme also.) Other parts of JAXP such as the javax.xml.transform portion do not have any other equivalent XSLT processor-independent APIs."
The best way to get around this problem is to unplug your application from JAXP and use a parser implementation such as Xerces directly, since those are likely to be upgraded faster and more frequently. Alternatively, you can serialize the DOM tree and parse it again using org.xml.sax.InputSource. You can set any encoding using InputSource.setEncoding() just before you parse the document. This will give you the exact same document with a different encoding!
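A minimal sketch of that serialize-and-reparse round trip, in case it helps. It assumes a JAXP version that ships the javax.xml.transform package (the post above does not say how to serialize, so using a Transformer with OutputKeys.ENCODING is my assumption). The class name EncodingRoundTrip and its helper methods are made up for illustration. Note that in this sketch the output encoding is chosen when serializing, while InputSource.setEncoding() tells the parser how to decode the bytes it reads back.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.transform.OutputKeys;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.dom.DOMSource;
import javax.xml.transform.stream.StreamResult;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.xml.sax.InputSource;

// Hypothetical helper class, not part of JAXP or Xerces.
public class EncodingRoundTrip {

    // Builds a tiny in-memory document containing German characters.
    static Document sampleDocument() throws Exception {
        DocumentBuilder builder =
                DocumentBuilderFactory.newInstance().newDocumentBuilder();
        Document doc = builder.newDocument();
        Element root = doc.createElement("greeting");
        root.appendChild(doc.createTextNode("Gr\u00fc\u00dfe, \u00e4"));
        doc.appendChild(root);
        return doc;
    }

    // Serializes the DOM tree to bytes in the requested encoding
    // using an identity transform (assumption: TrAX is available).
    static byte[] serialize(Document doc, String encoding) throws Exception {
        Transformer transformer =
                TransformerFactory.newInstance().newTransformer();
        transformer.setOutputProperty(OutputKeys.ENCODING, encoding);
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        transformer.transform(new DOMSource(doc), new StreamResult(out));
        return out.toByteArray();
    }

    // Parses the bytes again, telling the parser which encoding to assume.
    static Document parse(byte[] xml, String encoding) throws Exception {
        InputSource source = new InputSource(new ByteArrayInputStream(xml));
        source.setEncoding(encoding);
        DocumentBuilder builder =
                DocumentBuilderFactory.newInstance().newDocumentBuilder();
        return builder.parse(source);
    }

    public static void main(String[] args) throws Exception {
        byte[] xml = serialize(sampleDocument(), "ISO-8859-1");
        Document again = parse(xml, "ISO-8859-1");
        // Only DOM Level 2 calls are used to read the text back.
        String text = again.getDocumentElement().getFirstChild().getNodeValue();
        System.out.println(new String(xml, "ISO-8859-1"));
        System.out.println(text);
    }
}
```

The serialized bytes carry the chosen encoding in their XML declaration, so you can also write them straight to a file instead of parsing them again.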
Hope that gives you something to think about.
Open Group Certified Distinguished IT Architect. Open Group Certified Master IT Architect. Sun Certified Architect (SCEA).
Joined: Jan 08, 2001
Thanks, Ajith. So I switch back to Xerces again. I can't understand it. They always say that XML is great for internationalization, yet with UTF-8 I have problems with my German ä. So I found that the above encodings suit my needs well. And then... changing the encoding with this huge DOM 2.0 API is impossible. The <?xml version="1.0" encoding="UTF-8" standalone="yes"?> declaration is simply not accessible. O.K., in DOM 3.0 we have access through the Document interface. Hey, that's where I searched first when exploring the DOM 2.0 API. The article mentioned by Ajith is really very good. Axel, just a little bit shocked. [ March 13, 2002: Message edited by: Axel Janssen ]