
Bug in JRE XML parser

Mattew Force
Greenhorn

Joined: May 03, 2007
Posts: 16
Hi,

I am making this post about a critical bug I have found in the internal XML parser classes in the JRE. I found the bug in JRE 1.6; I guess it may exist in earlier versions of the JRE as well, though I have not confirmed that. The bug has been reported to Sun twice, but they keep ignoring it even though it is obvious how critical and recurring it is. It is very frustrating for me and my client!

I will not keep you waiting: my latest bug report is posted below, and I hope some of you will also push this bug forward to Sun so that a patch is made as soon as possible. This bug has a huge impact on my client's business, since the default XML parser of Oracle WebLogic Server 10.3 is the one included in the JRE. We have solved the issue in some cases by switching to another parser at runtime, but that is not standard procedure at the moment. Otherwise my only hope is to wait for Oracle to fix the bug now that they will take over Java with the purchase of Sun.

My latest bug report to Sun:

Date Created: Tue May 05 10:15:17 MDT 2009
Type: bug
Customer Name: Mattias Forss
Customer Email: xxx
SDN ID: xxx
status: Waiting
Category: jaxp
Subcategory: dom
Company: xxx
release: 6
hardware: x86
OSversion: win_xp
priority: 4
Synopsis: Parsing error occurs when using JRE's internal XML parser
Description:
FULL PRODUCT VERSION :
Java(TM) SE Runtime Environment (build 1.6.0_12-b04) Java HotSpot(TM) Client VM (build 11.2-b01, mixed mode, sharing)

ADDITIONAL OS VERSION INFORMATION :
Microsoft Windows XP [Version 5.1.2600]

EXTRA RELEVANT SYSTEM CONFIGURATION :
To verify the bug I downloaded the latest Xerces and Xalan libraries:
Xerces-J-bin.2.9.1.zip from http://www.apache.org/dist/xerces/j/
xalan-j_2_7_1-bin-2jars.zip from http://apache.jumper.nu/xml/xalan-j/

We need to have xercesImpl.jar and xalan.jar on the class path.



A DESCRIPTION OF THE PROBLEM :
The following source code is built with jdk160_05.


Test.java
---------



---------

The code parses an XML-file using the DocumentBuilder and then transforms the document back to string representation using the Transformer.
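
The original Test.java listing did not survive in this post. As a rough idea of what such a round trip looks like, here is a minimal sketch that assumes only the standard JAXP classes. The class name RoundTripSketch, its helper methods, and the inline sample document are made up for illustration and are not the code from the bug report.

```java
import java.io.File;
import java.io.StringReader;
import java.io.StringWriter;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.dom.DOMSource;
import javax.xml.transform.stream.StreamResult;
import org.w3c.dom.Document;
import org.xml.sax.InputSource;

// Hypothetical stand-in for the missing Test.java; not the author's code.
class RoundTripSketch {

    // Parse the input into a DOM using whichever DocumentBuilderFactory JAXP selects.
    static Document parse(InputSource source) throws Exception {
        DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
        DocumentBuilder builder = factory.newDocumentBuilder();
        return builder.parse(source);
    }

    // Turn the DOM back into a string with an identity Transformer.
    static String serialize(Document doc) throws Exception {
        Transformer transformer = TransformerFactory.newInstance().newTransformer();
        StringWriter out = new StringWriter();
        transformer.transform(new DOMSource(doc), new StreamResult(out));
        return out.toString();
    }

    public static void main(String[] args) throws Exception {
        // With a file path argument (for example C:\schema.xml) parse that file,
        // otherwise fall back to a small made-up sample document.
        InputSource source = args.length > 0
                ? new InputSource(new File(args[0]).toURI().toString())
                : new InputSource(new StringReader("<node a=\"%one\" b=\"%two\"> </node>"));
        System.out.println("XML is:");
        System.out.println(serialize(parse(source)));
    }
}
```

Because the factories are obtained through newInstance(), the same class can be run with the -Djavax.xml.parsers.DocumentBuilderFactory and -Djavax.xml.transform.TransformerFactory properties to compare implementations.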


schema.xml
----------



----------


Consider the XML file above, which has an element with eight attributes whose values contain the percent sign (%). Parsing the file with the JRE's internal parser classes causes the error shown below. The problem was detected when upgrading an application from WebLogic Server 8.1 to version 10.3. The latter version uses the JRE's internal XML parser classes, while the former uses classes provided with the WebLogic library, and the two do not produce the same internal Document Object Model after parsing.

I have experienced strange problems from the bug: some attribute values were being overwritten with other attributes' values while keeping their original length. An attribute's value is completely overwritten if the original value is shorter than the value used to overwrite it, but the original length of the value remains. If the original value is longer than the value used to overwrite it, we can see the original value being chopped off at the end.

In the example file above the error occurs once eight attributes of an element use the percent sign in their values. There are several other combinations: for example, six attributes with XPath brackets "[]" in their values followed by two attributes with a percent sign in their values also cause the same replacement bug. There are probably more ways to trigger the bug.
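
To hunt for other triggering combinations, one could generate elements with a varying number of attributes and compare what the parser returns with what was written. The sketch below is only an assumption about how such a probe might look: AttributeProbe and its methods are hypothetical names, it checks only the DOM right after parsing (not the Transformer output), and it is not confirmed to reproduce the reported behaviour.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Element;
import org.xml.sax.InputSource;

// Hypothetical probe; the attribute names and values are invented for illustration.
class AttributeProbe {

    // Build n attributes a0..a(n-1) whose values start with '%' and grow in length.
    static Map<String, String> percentAttributes(int n) {
        Map<String, String> attrs = new LinkedHashMap<String, String>();
        StringBuilder value = new StringBuilder("%v");
        for (int i = 0; i < n; i++) {
            value.append((char) ('a' + i % 26));
            attrs.put("a" + i, value.toString());
        }
        return attrs;
    }

    // Write the attributes onto a single <node> element (values need no escaping here).
    static String toXml(Map<String, String> attrs) {
        StringBuilder xml = new StringBuilder("<node");
        for (Map.Entry<String, String> e : attrs.entrySet()) {
            xml.append(' ').append(e.getKey()).append("=\"").append(e.getValue()).append('"');
        }
        return xml.append("> </node>").toString();
    }

    // Parse the generated XML and list every attribute whose parsed value differs.
    static List<String> mismatches(Map<String, String> expected) throws Exception {
        Element root = DocumentBuilderFactory.newInstance().newDocumentBuilder()
                .parse(new InputSource(new StringReader(toXml(expected))))
                .getDocumentElement();
        List<String> bad = new ArrayList<String>();
        for (Map.Entry<String, String> e : expected.entrySet()) {
            String actual = root.getAttribute(e.getKey());
            if (!e.getValue().equals(actual)) {
                bad.add(e.getKey() + ": expected " + e.getValue() + " but got " + actual);
            }
        }
        return bad;
    }

    public static void main(String[] args) throws Exception {
        // Try element sizes around the eight attributes mentioned in the report.
        for (int n = 1; n <= 12; n++) {
            System.out.println(n + " attributes, mismatches: " + mismatches(percentAttributes(n)));
        }
    }
}
```

A correct parser should report an empty mismatch list for every size; any non-empty list would be a candidate for a smaller test case.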

My guess is that the parser thinks it is reading a DTD and then stores the attributes that use the percent sign as variables, but I am really not sure what to think of this.

This is the result when using Sun's internal Xerces DocumentBuilderFactory and Xalan TransformerFactory:

>java Test
JAXP: find factoryId =javax.xml.parsers.DocumentBuilderFactory
JAXP: loaded from fallback value: com.sun.org.apache.xerces.internal.jaxp.DocumentBuilderFactoryImpl
JAXP: created new instance of class com.sun.org.apache.xerces.internal.jaxp.DocumentBuilderFactoryImpl using ClassLoader: null
JAXP: find factoryId =javax.xml.transform.TransformerFactory
JAXP: loaded from fallback value: com.sun.org.apache.xalan.internal.xsltc.trax.TransformerFactoryImpl
JAXP: created new instance of class com.sun.org.apache.xalan.internal.xsltc.trax.TransformerFactoryImpl using ClassLoader: null
XML is:
<?xml version="1.0" encoding="iso-8859-1" standalone="no"?><node a="%atleasteight" b="%argumentswith" c="%percentsigns" d="%willmakesuns" e="%internalxmlparser" f="%tomessupthedom" g="%HelloWorld" h="%Hello">


This is the result when using external Xerces DocumentBuilderFactory and Xalan TransformerFactory:


java -Djavax.xml.parsers.DocumentBuilderFactory=org.apache.xerces.jaxp.DocumentBuilderFactoryImpl -Djavax.xml.transform.TransformerFactory=org.apache.xalan.processor.TransformerFactoryImpl -classpath .;C:/xercesImpl.jar;C:/xalan.jar Test
JAXP: find factoryId =javax.xml.parsers.DocumentBuilderFactory
JAXP: found system property, value=org.apache.xerces.jaxp.DocumentBuilderFactoryImpl
JAXP: created new instance of class org.apache.xerces.jaxp.DocumentBuilderFactoryImpl using ClassLoader: sun.misc.Launcher$AppClassLoader@133056f
JAXP: find factoryId =javax.xml.transform.TransformerFactory
JAXP: found system property, value=org.apache.xalan.processor.TransformerFactoryImpl
JAXP: created new instance of class org.apache.xalan.processor.TransformerFactoryImpl using ClassLoader: sun.misc.Launcher$AppClassLoader@133056f
XML is:
<?xml version="1.0" encoding="iso-8859-1"?><node a="%atleasteight" b="%arguments with" c="%percentsigns" d="%willmakesuns" e="%internalxmlparser" f="%tomessupthe dom" g="%WorldWorld" h="%Hello"> </node>

STEPS TO FOLLOW TO REPRODUCE THE PROBLEM :
Create the XML file and the Java class from the description. Save the XML file as C:\schema.xml.

Compile the Java class using jdk160_05. Then run the application using the different variants described at the end of the description.

EXPECTED VERSUS ACTUAL BEHAVIOR :
EXPECTED -
XML is:
<?xml version="1.0" encoding="iso-8859-1"?><node a="%atleasteight" b="%arguments with" c="%percentsigns" d="%willmakesuns" e="%internalxmlparser" f="%tomessupthe dom" g="%WorldWorld" h="%Hello"> </node>
ACTUAL -
XML is:
<?xml version="1.0" encoding="iso-8859-1" standalone="no"?><node a="%atleasteight" b="%argumentswith" c="%percentsigns" d="%willmakesuns" e="%internalxmlparser" f="%tomessupthedom" g="%HelloWorld" h="%Hello">

REPRODUCIBILITY :
This bug can be reproduced always.

---------- BEGIN SOURCE ----------
Test.java
---------



---------
---------- END SOURCE ----------

CUSTOMER SUBMITTED WORKAROUND :
Use an external parser or rearrange the order of the attributes. Another option may be to insert some dummy attributes whose values can be overwritten without causing problems. The latter option is however not an acceptable long-term solution.
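
For the first workaround, the external parser can also be requested in code instead of through -D system properties. The following is a sketch under the assumption that xercesImpl.jar is on the class path; ParserSelection and preferExternal are hypothetical names, and the fallback to the JRE default when the class cannot be loaded is a design choice of this sketch rather than anything stated in the report.

```java
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.FactoryConfigurationError;

// Hypothetical helper; not part of the original bug report.
class ParserSelection {

    // Factory class name taken from the command line shown in the report.
    static final String EXTERNAL_XERCES = "org.apache.xerces.jaxp.DocumentBuilderFactoryImpl";

    // Ask JAXP for the named factory class; if it cannot be loaded,
    // fall back to the default factory lookup.
    static DocumentBuilderFactory preferExternal(String className) {
        try {
            // A null class loader means the thread context class loader is used.
            return DocumentBuilderFactory.newInstance(className, null);
        } catch (FactoryConfigurationError e) {
            return DocumentBuilderFactory.newInstance();
        }
    }

    public static void main(String[] args) {
        DocumentBuilderFactory factory = preferExternal(EXTERNAL_XERCES);
        System.out.println("Using factory: " + factory.getClass().getName());
    }
}
```

Note that the silent fallback would hide a missing jar, so an application that depends on the external parser might prefer to log or rethrow the error instead.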
workaround:
comments: (company - xxx, email - xxx)




Hope you guys have some comments on this. I'd love to get this fixed in the JRE!

Sorry for some poor language (English is not my mother tongue).

Cheers,

Matt
Ulf Dittmer
Marshal

Joined: Mar 22, 2005
Posts: 41603
    
I can't reproduce it in Java 1.4 and Java 5, so it may be a more recent regression. Similar-sounding bugs have been reported already. Make sure you vote for those as well as the one you reported.

