Trouble parsing XML document received via http (not SOAP)

Luke Porter
Greenhorn

Joined: Jan 30, 2006
Posts: 1
Hi there,

I'm having some issues parsing an xml document that is streamed from an http source.

My program sends an XML request via http (to an ASP Page) like so (conn is the HttpURLConnection):


conn.setDoOutput(true);
conn.setUseCaches(false);
conn.setDoInput(true);
conn.setRequestMethod("POST");
conn.setRequestProperty("Content-type", "application/x-www-form-urlencoded");

OutputStreamWriter outStream = new OutputStreamWriter(new BufferedOutputStream(conn.getOutputStream()),"UTF-8");
outStream.write(xmlRequestString);
outStream.close();


I've left out some of the finer details, but the request works fine.

When a CDATA node is returned with a large amount of formatted binary data in the response XML document, the text is always squashed onto 2 lines, essentially losing the layout of the CDATA contents (in this case, a text report).

e.g. Correct formatting:

<![CDATA[
report

company 45393987398
figures 983983983983
blah blah 1023848484
hello
]]>

e.g. Actual formatting:

<![CDATA[report company 45393987398 figures 983983983983blah blah 1023848484 hello]]>

I have verified that the XML being sent to me is OK: when I pull it using the ServerXMLHTTP object from ASP, the report comes back fine, with all formatting correct.

Here are the two ways I have tried to retrieve the XML response in Java:

1)


InputStream is = conn.getInputStream();
InputStreamReader isr = new InputStreamReader(is,"UTF-8");
BufferedReader in = new BufferedReader(isr);
StringBuffer contents = new StringBuffer();

boolean finished = false;
while(!finished)
{
    String aLine = in.readLine();
    if(aLine == null)
    {
        finished = true;
    }
    else
    {
        contents.append(aLine);
    }
}


2) Using a SAX parser:


XMLReader parser = XMLReaderFactory.createXMLReader("org.apache.xerces.parsers.SAXParser");

org.xml.sax.ContentHandler handler = new MySAXHandler();
parser.setContentHandler(handler);

InputStream in = conn.getInputStream();
InputSource source = new InputSource(in);
parser.parse(source);


In both cases, the entire XML document is received over http, but the formatting in the CDATA node is lost. I have done a test whereby I save the correctly formatted XML document to a file on the disk and try to parse from the local file instead of a stream, and IT WORKS FINE! Seems to be an issue with how the stream is pulling in the XML document.

I am really out of ideas and wondered if anybody had some suggestions on what I can try?

Many thanks for your time.
wise owen
Ranch Hand

Joined: Feb 02, 2006
Posts: 2023
Peer Reynders
Bartender

Joined: Aug 19, 2005
Posts: 2921
    
Avoid hard-coding either "\n" or "\r\n" (they're platform dependent); use the line.separator property instead, i.e. System.getProperty("line.separator").

Anyway, to elaborate on the previous post see java.io.BufferedReader.readLine().
Returns: A String containing the contents of the line, not including any line-termination characters, or null if the end of the stream has been reached
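
To make the readLine() behaviour concrete, here is a minimal sketch under a few assumptions: the class name LineReadingSketch and both method names are made up for illustration, and a StringReader stands in for the HTTP response stream from the question. The first method mirrors the loop in the original post; the second appends a separator after every line it reads.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.Reader;
import java.io.StringReader;

// Hypothetical helper class, not part of the original post.
final class LineReadingSketch {

    // Mirrors the loop in the question: readLine() returns each line without
    // its terminator, so consecutive lines are joined with nothing between them.
    static String readDroppingTerminators(Reader source) throws IOException {
        BufferedReader in = new BufferedReader(source);
        StringBuilder contents = new StringBuilder();
        String line;
        while ((line = in.readLine()) != null) {
            contents.append(line);
        }
        return contents.toString();
    }

    // Same loop, but a separator is put back after every line that was read.
    // Note: this also adds a separator after the last line, even if the
    // input did not end with one.
    static String readKeepingLines(Reader source, String separator) throws IOException {
        BufferedReader in = new BufferedReader(source);
        StringBuilder contents = new StringBuilder();
        String line;
        while ((line = in.readLine()) != null) {
            contents.append(line).append(separator);
        }
        return contents.toString();
    }

    public static void main(String[] args) throws IOException {
        // Stand-in for the report text inside the CDATA section.
        String report = "report\n\ncompany 45393987398\nfigures 983983983983\n";
        String separator = System.getProperty("line.separator");

        // Layout lost: all lines run together.
        System.out.println(readDroppingTerminators(new StringReader(report)));
        // Layout kept: one report line per output line.
        System.out.print(readKeepingLines(new StringReader(report), separator));
    }
}
```

In the code from the question, the equivalent change would be appending the separator right after contents.append(aLine). If the exact terminators sent by the server matter, reading with Reader.read(char[]) instead of readLine() leaves them untouched.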

[ February 03, 2006: Message edited by: Peer Reynders ]
 