Mattew Force

since May 03, 2007

Recent posts by Mattew Force


I am making this post about a critical bug I have found in the internal XML parser classes in the JRE. I found the bug in JRE 1.6; I guess it may also exist in earlier versions of the JRE, though I have not confirmed that. The bug has been reported to Sun twice, but they keep ignoring it even though it is obvious how critical and recurring it is. It is very frustrating for me and my client!

I will not keep you waiting but will post my latest bug report here, in the hope that some of you will also push this bug forward to Sun so that a patch is made as soon as possible. This bug has a huge impact on my client's business, since the default XML parser of Oracle WebLogic Server 10.3 is the one included in the JRE. We have solved the issue in some cases by switching to another parser at runtime, but that is not standard procedure at the moment. Otherwise my only hope is to wait for Oracle to fix the bug now that they will take over Java with the purchase of Sun.

My latest bug report to Sun:

Date Created: Tue May 05 10:15:17 MDT 2009
Type: bug
Customer Name: Mattias Forss
Customer Email: xxx
SDN ID: xxx
status: Waiting
Category: jaxp
Subcategory: dom
Company: xxx
release: 6
hardware: x86
OSversion: win_xp
priority: 4
Synopsis: Parsing error occurs when using JRE's internal XML parser
Java(TM) SE Runtime Environment (build 1.6.0_12-b04) Java HotSpot(TM) Client VM (build 11.2-b01, mixed mode, sharing)

Microsoft Windows XP [Version 5.1.2600]

To verify the bug I downloaded the latest Xerces and Xalan libraries from

We need to have xercesImpl.jar and xalan.jar on the class path.

The following source code is built with jdk160_05.


The code parses an XML-file using the DocumentBuilder and then transforms the document back to string representation using the Transformer.
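The source listing itself is not preserved in this copy of the report. As a rough idea of the round trip described above, here is a minimal sketch that uses only the standard JAXP classes. The class name `RoundTrip` and the method `roundTrip` are made up for illustration, and the sketch parses a string instead of a file so that it runs anywhere; it is not the original test program.

```java
import java.io.StringReader;
import java.io.StringWriter;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.dom.DOMSource;
import javax.xml.transform.stream.StreamResult;
import org.w3c.dom.Document;
import org.xml.sax.InputSource;

public class RoundTrip {
    /** Parses the XML text into a DOM and serializes the DOM back to a string. */
    public static String roundTrip(String xml) throws Exception {
        DocumentBuilder builder = DocumentBuilderFactory.newInstance().newDocumentBuilder();
        Document doc = builder.parse(new InputSource(new StringReader(xml)));

        // A Transformer created without a stylesheet performs an identity transform
        Transformer transformer = TransformerFactory.newInstance().newTransformer();
        StringWriter out = new StringWriter();
        transformer.transform(new DOMSource(doc), new StreamResult(out));
        return out.toString();
    }

    public static void main(String[] args) throws Exception {
        // Hypothetical two-attribute input, not the eight-attribute file from the report
        System.out.println("XML is:");
        System.out.println(roundTrip("<node a=\"%one\" b=\"%two\"> </node>"));
    }
}
```

The "JAXP:" lines in the outputs further down look like the factory lookup trace that the `jaxp.debug` system property switches on, so running with `-Djaxp.debug=1` should show which factory classes are picked up.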



Consider the XML-file above, which has an element with eight attributes whose values contain the percentage sign (%). When parsing the file using JRE's internal parser classes they will cause the error shown below. It was detected when upgrading an application from Weblogic Application Server version 8.1 to version 10.3. The latter version of Weblogic uses JRE's internal XML parser classes and the former uses some classes provided with the Weblogic library and these do not produce the same internal Document Object Model after parsing.

I have experienced strange problems from the bug: some attribute values were overwritten by the values of other attributes, while keeping their own length. An attribute's value will be totally overwritten if the original value is shorter than the one that is used to overwrite it, but the original length of the value will remain. If the original attribute's value is longer than the one that is used to overwrite it, we can see the original value being chopped off at the end.
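One way to read this description is as a small model of the symptom. The sketch below is not the parser's code, only an illustration that is consistent with the `g` attribute in the two outputs in this report (`%WorldWorld` versus `%HelloWorld`). The class name, the method name and the value `%Hi` are invented for the example.

```java
public class OverwriteIllustration {
    /**
     * Models the symptom only: the start of 'original' is replaced by
     * 'other', while the length of 'original' stays the same.
     */
    public static String overwrite(String original, String other) {
        int n = Math.min(original.length(), other.length());
        return other.substring(0, n) + original.substring(n);
    }

    public static void main(String[] args) {
        // Longer original: only the beginning is replaced, the tail survives
        System.out.println(overwrite("%WorldWorld", "%Hello")); // prints %HelloWorld
        // Shorter original: fully replaced, but still three characters long
        System.out.println(overwrite("%Hi", "%Hello"));         // prints %He
    }
}
```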

In the example file above the error occurs once eight attributes of an element use the percentage sign in their values. There are several other combinations: for example, six attributes with XPath brackets "[]" in their values followed by two attributes with the percentage sign in their values will also cause the same replacement bug. There are probably more ways to trigger the bug.

I have an idea that the parser thinks it is reading a DTD and then stores the attributes that use the percentage sign as variables, but I am really not sure what to think of this.

This is the result when using Sun's internal Xerces DocumentBuilderFactory and Xalan TransformerFactory:

>java Test
JAXP: find factoryId =javax.xml.parsers.DocumentBuilderFactory
JAXP: loaded from fallback value:
JAXP: created new instance of class using ClassLoader: null
JAXP: find factoryId =javax.xml.transform.TransformerFactory
JAXP: loaded from fallback value:
JAXP: created new instance of class using ClassLoader: null
XML is:
<?xml version="1.0" encoding="iso-8859-1" standalone="no"?><node a="%atleasteight" b="%argumentswith" c="%percentsigns" d="%willmakesuns" e="%internalxmlparser" f="%tomessupthedom" g="%HelloWorld" h="%Hello">

This is the result when using external Xerces DocumentBuilderFactory and Xalan TransformerFactory:

java -Djavax.xml.parsers.DocumentBuilderFactory=org.apache.xerces.jaxp.DocumentBuilderFactoryImpl -Djavax.xml.transform.TransformerFactory=org.apache.xalan.processor.TransformerFactoryImpl -classpath .;C:/xercesImpl.jar;C:/xalan.jar Test
JAXP: find factoryId =javax.xml.parsers.DocumentBuilderFactory
JAXP: found system property, value=org.apache.xerces.jaxp.DocumentBuilderFactoryImpl
JAXP: created new instance of class org.apache.xerces.jaxp.DocumentBuilderFactoryImpl using ClassLoader: sun.misc.Launcher$AppClassLoader@133056f
JAXP: find factoryId =javax.xml.transform.TransformerFactory
JAXP: found system property, value=org.apache.xalan.processor.TransformerFactoryImpl
JAXP: created new instance of class org.apache.xalan.processor.TransformerFactoryImpl using ClassLoader: sun.misc.Launcher$AppClassLoader@133056f
XML is:
<?xml version="1.0" encoding="iso-8859-1"?><node a="%atleasteight" b="%arguments with" c="%percentsigns" d="%willmakesuns" e="%internalxmlparser" f="%tomessupthe dom" g="%WorldWorld" h="%Hello"> </node>

Create the XML-file and the Java class from the description. Save the XML-file on C:\schema.xml.

Compile the Java class using the desired JDK (jdk160_05). Then run the application using the different variants described at the end of the description.

EXPECTED - XML is:
<?xml version="1.0" encoding="iso-8859-1"?><node a="%atleasteight" b="%arguments with" c="%percentsigns" d="%willmakesuns" e="%internalxmlparser" f="%tomessupthe dom" g="%WorldWorld" h="%Hello"> </node>
ACTUAL - XML is:
<?xml version="1.0" encoding="iso-8859-1" standalone="no"?><node a="%atleasteight" b="%argumentswith" c="%percentsigns" d="%willmakesuns" e="%internalxmlparser" f="%tomessupthedom" g="%HelloWorld" h="%Hello">

This bug can be reproduced always.

---------- BEGIN SOURCE ----------

---------- END SOURCE ----------

Use an external parser or rearrange the ordering of the attributes. Another option may be to insert some dummy attributes whose values can be overwritten without causing problems. The latter option is, however, not an acceptable long-term solution.
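The first workaround can also be applied from code instead of the command line, by setting the same system property that the `-D` option sets. The following is a minimal sketch; the class `ParserSelection` and its methods are invented for illustration, and the factory class passed in must actually be on the class path (for example `org.apache.xerces.jaxp.DocumentBuilderFactoryImpl` with xercesImpl.jar).

```java
import javax.xml.parsers.DocumentBuilderFactory;

public class ParserSelection {
    private static final String DBF_PROPERTY = "javax.xml.parsers.DocumentBuilderFactory";

    /** Returns the class name of the factory that JAXP currently resolves to. */
    public static String activeDocumentBuilderFactory() {
        return DocumentBuilderFactory.newInstance().getClass().getName();
    }

    /** Points JAXP at a specific factory class for all later newInstance() calls. */
    public static void useDocumentBuilderFactory(String factoryClassName) {
        System.setProperty(DBF_PROPERTY, factoryClassName);
    }

    public static void main(String[] args) {
        System.out.println("Default: " + activeDocumentBuilderFactory());
        if (args.length > 0) {
            // args[0] is the fully qualified name of the factory to switch to
            useDocumentBuilderFactory(args[0]);
            System.out.println("Now:     " + activeDocumentBuilderFactory());
        }
    }
}
```

Because the property is JVM-wide, this affects every component in the same process, which may or may not be acceptable inside an application server.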
comments: (company - xxx, email - xxx)

Hope you guys have some comments on this. I'd love to get this fixed in the JRE!

Sorry for some poor language (English is not my mother tongue).



Originally posted by Gabriel Vargas:
Hi Mattew,

I think you could store both the host and the server port and, if you like, the name of the service. Normally I see that people store the host and server port and keep the service name fixed (I do this). I do that because the specification tells me that there is only one server at a time, so I only need to ask where the server is and which port is available (I think asking the client for the service name is a bit confusing for him), but maybe with good arguments you can do it your way. I'm not worried about the firewall because this is out of scope of the application; it is configuration of the machines.

I hope it helps you.

Hi Gabriel,

Thanks for your reply.

Do you ask the client users for the RMI service port? I don't see the need to ask client users for the server port, which is specified when you call UnicastRemoteObject.exportObject(remoteService, serverPort), because the RMI registry will find the server regardless of which port it uses. The clients only need to reach the server's RMI registry, which runs on another port (the default is 1099).


Hi all,

I'm done with most of the coding on the B&S project now, but I have some questions.

First, should the users be able to specify which ports the RMI registry and the server run on, e.g.

Once the client does a lookup on the service it doesn't need to provide a port to the server, only to the remote registry, e.g.

So there is no need to ask the client for the server port, but only the host and registry port if we allow it to be other than the default 1099. It would nonetheless be good to let the server run on a user-specified port if the server runs behind a firewall.
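The snippets from the original post are not preserved here. The point can be sketched as follows, with a made-up remote interface `Greeter` and made-up class and method names: the server chooses both ports, while the client passes only the host, the registry port and the service name. This is an illustration under those assumptions, not the code from the B&S project.

```java
import java.rmi.Remote;
import java.rmi.RemoteException;
import java.rmi.registry.LocateRegistry;
import java.rmi.registry.Registry;
import java.rmi.server.UnicastRemoteObject;

public class RmiPortsSketch {
    /** Hypothetical remote interface, standing in for the real service. */
    public interface Greeter extends Remote {
        String greet(String name) throws RemoteException;
    }

    static class GreeterImpl implements Greeter {
        @Override
        public String greet(String name) {
            return "Hello, " + name;
        }
    }

    // Strong references so the exported objects are not garbage collected
    private static Greeter service;
    private static Registry registry;

    /** Server side: the registry and the remote object listen on separate ports. */
    public static void startServer(int registryPort, int serverPort, String serviceName)
            throws RemoteException {
        service = new GreeterImpl();
        Greeter stub = (Greeter) UnicastRemoteObject.exportObject(service, serverPort);
        registry = LocateRegistry.createRegistry(registryPort);
        registry.rebind(serviceName, stub);
    }

    /** Client side: only host, registry port and service name are needed. */
    public static String callService(String host, int registryPort, String serviceName)
            throws Exception {
        Registry remoteRegistry = LocateRegistry.getRegistry(host, registryPort);
        Greeter greeter = (Greeter) remoteRegistry.lookup(serviceName);
        // The stub returned by the registry already carries the server port
        return greeter.greet("client");
    }

    /** Unexports both objects so that the JVM can exit. */
    public static void stopServer() throws RemoteException {
        UnicastRemoteObject.unexportObject(registry, true);
        UnicastRemoteObject.unexportObject(service, true);
    }
}
```

Making the service name a parameter on both sides, as here, is what would allow several services to share one registry without name conflicts.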

I guess it would also be good to allow changing the name of the RMI service in both the client and server mode. This way we can connect to different services and we would also avoid RMI service name conflicts.

Thoughts on this?


[ August 21, 2007: Message edited by: Mattew Force ]

Originally posted by Jason Moors:
Have a look at this post it may help.

Extend the DBMain Interface


Thanks Jason, I have the schema and the records stored in the data class now since it seems perfectly valid to extend the DB interface.



Originally posted by Jason Moors:

Why are you creating the schema in your Database class? In my application the schema is read from the header section of the db file in the Data class using RandomAccessFile.

Well, I think the schema should belong in the database class so it is possible to create a flexible gui which is based on the constraints of it. Also, I'm not allowed to add a getSchema method in the data class because then I would break the DB interface :-/



In order for the examiners to test the data class they should be able to get an instance of it in a way that is documented, right? After reading the instructions carefully, I assume there's no need for a default constructor.

The issue I'm having is that I have a database class which has the data class as a variable (no setters or getters though, it's a kind of delegate) and it passes itself (during instantiation) to the data class as a parameter so that the data class knows how it should read/write data since the database class knows the schema etc.

There is a way to test the data class separately, but it also requires creating an instance of the database class (which has the schema, remember), and thus we will end up with two instances of the data class (one which the tester holds and one which is in the database class).

If the examiners only test the methods through the data instance they get directly and not indirectly (delegation) through the database instance, I should be ok. However, if they test the two instances of the data class with different threads, I don't know how things will work. The data class uses an instance of RandomAccessFile to read/write the data file. Is this a bad design?

I don't want the database class to expose the data class via getters. Can anyone give me a hint on how I should continue? It seems to me that I'm running in circles.


I also realized that the schema should be instantiated in my database class and then passed to the data class constructor as a parameter. By doing it this way I won't need to add a getSchema method in the data class which would break the interface.
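A minimal sketch of that arrangement, with invented class shapes: the real DB interface, field names and record layout come from the assignment and are not shown in this thread, so `Schema`, `offsetOf`, and the example fields are assumptions made only to illustrate passing the schema through the constructor.

```java
import java.util.List;

public class SchemaSketch {
    /** Hypothetical schema: field names and fixed field lengths in bytes. */
    public static final class Schema {
        final List<String> fieldNames;
        final List<Integer> fieldLengths;

        public Schema(List<String> fieldNames, List<Integer> fieldLengths) {
            this.fieldNames = fieldNames;
            this.fieldLengths = fieldLengths;
        }

        public int recordLength() {
            int total = 0;
            for (int length : fieldLengths) {
                total += length;
            }
            return total;
        }
    }

    /** Stand-in for the data class: it receives the schema, so no getSchema() is needed. */
    public static final class Data {
        private final Schema schema;

        public Data(Schema schema) {
            this.schema = schema;
        }

        /** Byte offset of a record, given where the record section starts. */
        public long offsetOf(int recNo, long dataStart) {
            return dataStart + (long) recNo * schema.recordLength();
        }
    }

    /** Stand-in for the database class: owns the schema and delegates to Data. */
    public static final class Database {
        private final Schema schema;
        private final Data data;

        public Database(Schema schema) {
            this.schema = schema;
            this.data = new Data(schema);
        }

        public Schema getSchema() {
            return schema;
        }

        public long offsetOf(int recNo, long dataStart) {
            return data.offsetOf(recNo, dataStart);
        }
    }
}
```

A tester could still construct `Data` directly by handing it a `Schema`, which is the property the question above is about. The sketch ignores details such as a deleted-record flag byte.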

Originally posted by Jason Moors:
I think you need to ensure the methods defined by the DBMain interface work as described, otherwise you are breaking the contract of the interface.

However, that doesn't mean you can't add additional methods in your Database class, i.e. the offset/schema can be stored in your data class to enable the read/update methods, but only exposed to your business layer by your Database interface.

Thank you for your reply, Jason. Your response helped me realize that the schema should probably be stored in the data class, otherwise I will have problems following the contract of the interface.



I, too, have a schema class (and several classes and interfaces); however, they are all accessed by the data class (which is thus simply a delegate). This seems sensible, as the data class has quite high-level functions in it (for a data access layer).

Won't you break the interface if your Data class acts as a delegate? If you have additional public methods, shouldn't they be defined in the interface as well?

Hi guys,

I'm wondering if the examiners/assessors require that the Data class can be instantiated separately so they can test it that way.

My code doesn't allow this. I have another class, called Database, that extends the Data class and reads the metadata, schema information, etc. of the data file before it is possible to read/write/delete records through it. It is not possible to read/write/delete records only by instantiating the Data class because it needs information about data offset, record length and deleted records from the other class.

I don't want the Data class to hold schema information and cached records because that should exist at a higher level which is accessible by the business layer.

Do you think my solution will pass?

I am trying to find out how the *beep* my servlet can authenticate a HttpURLConnection but I only end up in frustration. The servlet is running on the intranet but sometimes it needs to access data e.g. from URLs outside the net and then the browser should ask the user for authorization to the proxy server in order to get the input stream from the connection. See the following code:

But of course this fails. I have seen examples where they do:

And it actually worked, but I don't want to hard-code a username and password, because the servlet should restrict users who don't have access.

I also tried solving this by redirecting to the url we want to read from to trigger the proxy authentication in the browser but that didn't work.
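The code from the post is not preserved, so the following is only a guess at the general shape of such a request, not the original. It sketches one commonly seen pattern for plain HTTP through a proxy: build a Basic `Proxy-Authorization` header from credentials that are passed in (for example, obtained from the user) rather than hard-coded. The class and method names are invented, and whether the proxy accepts Basic credentials this way is an assumption.

```java
import java.net.HttpURLConnection;
import java.net.InetSocketAddress;
import java.net.Proxy;
import java.net.URL;
import java.nio.charset.StandardCharsets;
import java.util.Base64;

public class ProxyAuthSketch {
    /** Builds the value of a Basic "Proxy-Authorization" header. */
    public static String basicCredentials(String user, String password) {
        String pair = user + ":" + password;
        return "Basic " + Base64.getEncoder().encodeToString(pair.getBytes(StandardCharsets.UTF_8));
    }

    /** Prepares (but does not open) a connection that goes through the given proxy. */
    public static HttpURLConnection prepare(URL url, String proxyHost, int proxyPort,
                                            String user, String password) throws Exception {
        Proxy proxy = new Proxy(Proxy.Type.HTTP,
                InetSocketAddress.createUnresolved(proxyHost, proxyPort));
        HttpURLConnection conn = (HttpURLConnection) url.openConnection(proxy);
        // Credentials come from the caller, not from constants in the servlet
        conn.setRequestProperty("Proxy-Authorization", basicCredentials(user, password));
        return conn;
    }
}
```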

This is probably a very stupid question, but this is only my second servlet, so I'm sure there are people here who know better.

Hope someone can help me.

14 years ago

Originally posted by Ernest Friedman-Hill:
Why the "with Java" part? Since you've already got a Microsoft technology front end as an example, and the back end is MS technology, I'd think it'd be a lot simpler to come up with an MS-tech based web service, no?

Well there is already a Java-based web service which is not complete. It must be extended to summarise search results provided from other systems and I thought it would be good if the other systems used Java-based web services as well. :roll:
14 years ago

I would like to set up a web service that provides search results (preferably using Microsoft Indexing Service) from a webpage to another web service that summarises search results from several web services.

I now have an ASP page which performs a search on a webpage using Microsoft Indexing Service. I would like to have a summary search page, accessible through a web service, which summarises the results from the ASP page, but I would prefer the ASP search to be ported to a web service that uses this MS indexing service.

Does anyone have suggestions on how to solve this?

Sorry if my language is bad - I was in a hurry!

14 years ago

Originally posted by Henry Wong:
My question is ... what is the purpose of this exercise? If it is to understand two's complement, or if it is a homework assignment, then by all means, what you are doing is perfectly fine. There is no better way to understand bits than to do low-level manipulation of them.

However, if the goal is merely to extract the individual 4 bytes from an int, or to create an int from 4 bytes, as part of an actual program in production, this may be the long way to do it. Converting an int to a byte array, and back, doesn't take bit manipulation, and it is built into the core Java libraries (as of Java 1.4).


I used BigInteger.valueOf(int i).toByteArray() but it gives different sized arrays, so I think using your code would be better.
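The code Henry posted is not preserved here. One library-based approach that has been in the core libraries since Java 1.4 is `java.nio.ByteBuffer`; whether that is what he meant is an assumption. The sketch below (class and method names invented) shows the fixed four-byte result next to the variable-length arrays that `BigInteger` produces.

```java
import java.math.BigInteger;
import java.nio.ByteBuffer;

public class IntBytes {
    /** Always four bytes, most significant byte first. */
    public static byte[] toBytes(int value) {
        return ByteBuffer.allocate(4).putInt(value).array();
    }

    /** Inverse of toBytes: expects exactly four bytes. */
    public static int toInt(byte[] fourBytes) {
        return ByteBuffer.wrap(fourBytes).getInt();
    }

    public static void main(String[] args) {
        // BigInteger uses the minimal two's complement length, so sizes differ
        System.out.println(BigInteger.valueOf(1).toByteArray().length);    // prints 1
        System.out.println(BigInteger.valueOf(-765).toByteArray().length); // prints 2
        // ByteBuffer always yields four bytes for an int
        System.out.println(toBytes(1).length);                             // prints 4
        System.out.println(toInt(toBytes(-765)));                          // prints -765
    }
}
```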


14 years ago

Originally posted by Campbell Ritchie:
Welcome to the Ranch.

One of the more interesting questions I have seen recently.

Hi Campbell,

Thank you ;-)

Part of your problem has to do with binary number formats.
Get yourself a standard principles of computing book,
eg Alan Clements the Principles of Computer Hardware [3/e]
Oxford: Oxford University Press (2000) page 175-184
(there is a 4th edition which came out in 2006).

I've actually browsed through the Clements book when I took a class in computer hardware a couple of years ago when I was pursuing my M.Sc. in computer science :-)

So I know about two's complement and how it works. Well, I had to remind myself how it works, and I can follow your extensive answer (thanks again) quite easily and remember how these things actually work.

This stuff is not currently in my line of work, but I wanted to remind myself about bitwise manipulation etc. and study some more about how primitives are dealt with in Java.

So, -48 in a byte, and +208 in an int use the same bits.
That is why you are converting -48 to 208, and if you try casting 208 (int)
to a byte, (as far as I remember) you will get -48.

I think you are correctly converting your int number to a byte[]
array; you are converting it back into a different format.

Aha, some bits must have been lost in translation then. Casting does work for 208 and it becomes -48 again, but it doesn't work for larger integers which don't fit into a single byte, e.g. -765, which becomes 64771 when I convert from byte[] to int and then becomes 3 when cast back to byte.

But if you take -(2^16 - 64771) you get -765 :-)

So when the MSB of a byte is one I think I must 'or' the byte with 0xFFFFFF00. I updated my code and now it seems to give me negative integers as well. Here's the method which converts a byte array to an int:
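The method from the original post is not preserved. A minimal sketch of the idea, assuming a big-endian two's complement array of one to four bytes (the class and method names are invented): start with all ones when the first byte is negative, which has the same effect as OR-ing in the 0xFFFFFF00 mask before shifting, and mask every byte with 0xFF as it is shifted in.

```java
public class BytesToInt {
    /**
     * Converts a big-endian two's complement byte array of 1 to 4 bytes
     * to an int, sign-extending from the first byte.
     */
    public static int toInt(byte[] bytes) {
        if (bytes.length == 0 || bytes.length > 4) {
            throw new IllegalArgumentException("expected 1 to 4 bytes, got " + bytes.length);
        }
        // All ones for a negative value, so the unused high bits stay set
        int result = (bytes[0] < 0) ? -1 : 0;
        for (byte b : bytes) {
            // b & 0xFF keeps only the low eight bits of the widened byte
            result = (result << 8) | (b & 0xFF);
        }
        return result;
    }

    public static void main(String[] args) {
        System.out.println(toInt(new byte[] {(byte) 0xD0}));                // prints -48
        System.out.println(toInt(new byte[] {(byte) 0xFD, 0x03}));          // prints -765
        System.out.println(toInt(new byte[] {0, 0, (byte) 0xFD, 0x03}));    // prints 64771
    }
}
```

The last call shows why a leading zero byte matters: with the sign bit clear, the same low 16 bits read as 64771 instead of -765.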

Not sure now how to convert the -48 which is 1101 0000 as a byte to what it should be
as an int, ie this:-
1111 1111 1111 1111 1111 1111 1101 0000.

That's what I do in the method above.

Thank you again for straightening out my ideas!
14 years ago