Error while parsing XML with PDF attachment

Sunil Trivedi

Joined: Feb 28, 2004
Posts: 8

I am generating an XML file and then retrieving all of its elements (including the attachment). The steps are listed below:

1) Generate an XML file based on a schema.
2) Attach a file (e.g. PDF, doc, image) to the XML.
3) Sign the generated XML file using XMLDSig.
4) Send the file to the client.
5) Parse the XML file to retrieve all elements, including the attached file.

I am facing a problem in step (5) while parsing the attachment tag. The error is as follows:

org.xml.sax.SAXParseException: The content of elements must consist of well-formed character data or markup.
at org.apache.xerces.framework.XMLParser.reportError(
at org.apache.xerces.framework.XMLDocumentScanner.reportFatalXMLError(
at org.apache.xerces.framework.XMLDocumentScanner.abortMarkup(
at org.apache.xerces.framework.XMLDocumentScanner$ContentDispatcher.dispatch(
at org.apache.xerces.framework.XMLDocumentScanner.parseSome(
at org.apache.xerces.framework.XMLParser.parse(
at org.apache.xerces.jaxp.DocumentBuilderImpl.parse(
at javax.xml.parsers.DocumentBuilder.parse(Unknown Source)
at XMLSignGen.main(

Do you know how to resolve this problem?

Thanks in advance
Paul Clapham

Joined: Oct 14, 2005
Posts: 19973

The error message explains how to resolve the problem: it says "The content of elements must consist of well-formed character data or markup" so you need to make sure that is the case.

I don't understand step (2) in your algorithm though. Does that mean you just tack the file onto the XML after the closing tag of the root element? Or does it actually go into a text node inside the element? And do you realize that XML is a text format that cannot contain arbitrary binary data such as that found in images and PDF and MS Word documents?
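
To illustrate the point about binary data: one common way to carry bytes inside a text node is to Base64-encode them first. The following is only a minimal sketch using the standard JDK; the element names `message` and `attachment` and the class name are hypothetical and do not come from the original post.

```java
import java.io.ByteArrayInputStream;
import java.nio.charset.StandardCharsets;
import java.util.Arrays;
import java.util.Base64;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;

public class Base64Attachment {

    /** Wraps arbitrary bytes in a (hypothetical) attachment element as Base64 text. */
    static String wrap(byte[] data) {
        // The Base64 alphabet (A-Z, a-z, 0-9, +, /, =) contains no XML markup characters.
        String encoded = Base64.getEncoder().encodeToString(data);
        return "<message><attachment>" + encoded + "</attachment></message>";
    }

    /** Parses the XML and decodes the attachment text back into the original bytes. */
    static byte[] unwrap(String xml) throws Exception {
        DocumentBuilder builder = DocumentBuilderFactory.newInstance().newDocumentBuilder();
        Document doc = builder.parse(
                new ByteArrayInputStream(xml.getBytes(StandardCharsets.UTF_8)));
        Element attachment = (Element) doc.getElementsByTagName("attachment").item(0);
        return Base64.getDecoder().decode(attachment.getTextContent());
    }

    public static void main(String[] args) throws Exception {
        // A few bytes that would be illegal or ambiguous as raw XML content.
        byte[] binary = {0x25, 0x50, 0x44, 0x46, 0x00, (byte) 0xFF, 0x3C, 0x26};
        byte[] roundTripped = unwrap(wrap(binary));
        System.out.println(Arrays.equals(binary, roundTripped));
    }
}
```

The receiver has to know that the element holds Base64 text and decode it after parsing; the parser itself only ever sees ordinary characters.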
Sunil Trivedi

Joined: Feb 28, 2004
Posts: 8
Hi Paul,

Thanks for the prompt reply.
The steps I listed in my question are my actual work requirement.
Currently I am trying to run a stand-alone program. I have created a SOAP message, and then I am attaching a PDF file to the SOAP message. After that, when I try to parse the message, I get the error mentioned in my first post.

The attachment is added in an encoding format, so "the content of elements" (only for the attachment tag) gets changed.

Do you know why this error occurs?

Thanks & Regards
Paul Clapham

Joined: Oct 14, 2005
Posts: 19973

Nope. Because I don't understand what it means to "attach" a file to a SOAP message. And I don't understand what that "encoding format" is. However, you have done it incorrectly in some way, and that's all I can really say because I know very little about SOAP. I would look at the "encoding format" and see what it's doing. Or better still, I would look at the XML being produced as the SOAP message and see what is wrong with it.
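
One way to inspect the produced XML is to parse it and report where the parser gives up. The sketch below uses only the standard JDK parser; the class name, the sample documents, and the message format are hypothetical. The second sample assumes the raw PDF text was written into the element unencoded, which is a guess at the cause, not something the thread confirms.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import org.xml.sax.InputSource;
import org.xml.sax.SAXParseException;
import org.xml.sax.helpers.DefaultHandler;

public class WellFormedCheck {

    /** Returns "well-formed", or a description of where parsing failed. */
    static String check(String xml) throws Exception {
        DocumentBuilder builder = DocumentBuilderFactory.newInstance().newDocumentBuilder();
        // DefaultHandler rethrows fatal errors without printing them to stderr.
        builder.setErrorHandler(new DefaultHandler());
        try {
            builder.parse(new InputSource(new StringReader(xml)));
            return "well-formed";
        } catch (SAXParseException e) {
            return "not well-formed at line " + e.getLineNumber()
                    + ", column " + e.getColumnNumber() + ": " + e.getMessage();
        }
    }

    public static void main(String[] args) throws Exception {
        // Hypothetical sample: plain character data inside the element.
        System.out.println(check("<message><attachment>text</attachment></message>"));
        // Hypothetical sample: raw PDF syntax such as "<<" placed directly in the element.
        System.out.println(check(
                "<message><attachment>%PDF-1.4 <</Type /Catalog>></attachment></message>"));
    }
}
```

The reported line and column point at the first character the parser could not accept, which is usually the quickest way to see what went into the attachment element.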
Sunil Trivedi

Joined: Feb 28, 2004
Posts: 8
I have never worked with SOAP, so I also know very little about it.

First I am creating a SOAP message and then attaching a PDF file using the following code:

MessageFactory msgFactory = MessageFactory.newInstance();
SOAPMessage message = msgFactory.createMessage();
FileDataSource file = new FileDataSource("ch14.pdf");
DataHandler dataHandler = new DataHandler(file);
AttachmentPart attachment = message.createAttachmentPart(dataHandler);

After adding the attachment, I write the message to an XML file ....

OutputStream os = new FileOutputStream("C:/GenTest.xml");

Now I am signing the XML file using XMLDSig...

String providerName = System.getProperty("jsr105Provider", "");
XMLSignatureFactory fac = XMLSignatureFactory.getInstance("DOM", (Provider) Class.forName(providerName).newInstance());

// Reference to the whole document ("") with the enveloped-signature transform.
Reference ref = fac.newReference
("", fac.newDigestMethod(DigestMethod.SHA1, null),
Collections.singletonList(fac.newTransform
(Transform.ENVELOPED, (TransformParameterSpec) null)),
null, null);

// The canonicalization method was missing from the post as captured; INCLUSIVE is an assumption.
SignedInfo si = fac.newSignedInfo
(fac.newCanonicalizationMethod(CanonicalizationMethod.INCLUSIVE,
(C14NMethodParameterSpec) null),
fac.newSignatureMethod(SignatureMethod.DSA_SHA1, null),
Collections.singletonList(ref));

// Generate a DSA key pair and wrap the public key in a KeyInfo.
KeyPairGenerator kpg = KeyPairGenerator.getInstance("DSA");
KeyPair kp = kpg.generateKeyPair();
KeyInfoFactory kif = fac.getKeyInfoFactory();
KeyValue kv = kif.newKeyValue(kp.getPublic());

KeyInfo ki = kif.newKeyInfo(Collections.singletonList(kv));
DocumentBuilderFactory dbf = DocumentBuilderFactory.newInstance();
Document doc = dbf.newDocumentBuilder().parse(new FileInputStream("C:/GenTest.xml")); // (1)
DOMSignContext dsc = new DOMSignContext(kp.getPrivate(), doc.getDocumentElement());
XMLSignature signature = fac.newXMLSignature(si, ki);

At step (1), while parsing the XML file, I get the error mentioned in my first post.

Please let me know where I am going wrong. If you think there are other alternatives to achieve the goal, please let me know.

Thanks & Regards
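
As one possible alternative, the attachment could be Base64-encoded into an ordinary element and the resulting document signed with an enveloped XMLDSig signature. The sketch below is a self-contained illustration under several assumptions: it skips SOAP entirely, the element names `message` and `attachment` and the class name are hypothetical, and it swaps the DSA/SHA-1 algorithms from the post for RSA with SHA-256, because recent JDKs may reject SHA-1 during validation.

```java
import java.io.ByteArrayInputStream;
import java.nio.charset.StandardCharsets;
import java.security.KeyPair;
import java.security.KeyPairGenerator;
import java.util.Base64;
import java.util.Collections;
import javax.xml.crypto.dsig.CanonicalizationMethod;
import javax.xml.crypto.dsig.DigestMethod;
import javax.xml.crypto.dsig.Reference;
import javax.xml.crypto.dsig.SignedInfo;
import javax.xml.crypto.dsig.Transform;
import javax.xml.crypto.dsig.XMLSignature;
import javax.xml.crypto.dsig.XMLSignatureFactory;
import javax.xml.crypto.dsig.dom.DOMSignContext;
import javax.xml.crypto.dsig.dom.DOMValidateContext;
import javax.xml.crypto.dsig.spec.C14NMethodParameterSpec;
import javax.xml.crypto.dsig.spec.TransformParameterSpec;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Node;

public class SignBase64Attachment {

    /** Embeds the bytes as Base64, signs the document, and validates the signature. */
    static boolean signAndValidate(byte[] attachment) throws Exception {
        // Hypothetical document layout: the attachment travels as Base64 text.
        String xml = "<message><attachment>"
                + Base64.getEncoder().encodeToString(attachment)
                + "</attachment></message>";

        DocumentBuilderFactory dbf = DocumentBuilderFactory.newInstance();
        dbf.setNamespaceAware(true); // XMLDSig processing needs a namespace-aware DOM
        Document doc = dbf.newDocumentBuilder().parse(
                new ByteArrayInputStream(xml.getBytes(StandardCharsets.UTF_8)));

        XMLSignatureFactory fac = XMLSignatureFactory.getInstance("DOM");

        // Reference to the whole document ("") with the enveloped-signature transform.
        Reference ref = fac.newReference("",
                fac.newDigestMethod(DigestMethod.SHA256, null),
                Collections.singletonList(
                        fac.newTransform(Transform.ENVELOPED, (TransformParameterSpec) null)),
                null, null);

        SignedInfo si = fac.newSignedInfo(
                fac.newCanonicalizationMethod(CanonicalizationMethod.INCLUSIVE,
                        (C14NMethodParameterSpec) null),
                // RSA-SHA256 algorithm URI, used here instead of the post's DSA_SHA1.
                fac.newSignatureMethod("http://www.w3.org/2001/04/xmldsig-more#rsa-sha256", null),
                Collections.singletonList(ref));

        KeyPairGenerator kpg = KeyPairGenerator.getInstance("RSA");
        kpg.initialize(2048);
        KeyPair kp = kpg.generateKeyPair();

        // KeyInfo is optional; it is omitted because the verifier is given the key directly.
        XMLSignature signature = fac.newXMLSignature(si, null);
        // sign() appends a Signature element to the document element.
        signature.sign(new DOMSignContext(kp.getPrivate(), doc.getDocumentElement()));

        Node sigNode = doc.getElementsByTagNameNS(XMLSignature.XMLNS, "Signature").item(0);
        DOMValidateContext vc = new DOMValidateContext(kp.getPublic(), sigNode);
        return fac.unmarshalXMLSignature(vc).validate(vc);
    }

    public static void main(String[] args) throws Exception {
        byte[] fakePdf = "%PDF-1.4 <</Type /Catalog>>".getBytes(StandardCharsets.ISO_8859_1);
        System.out.println("signature valid: " + signAndValidate(fakePdf));
    }
}
```

Because the attachment is plain text by the time the document is parsed, the parse step before signing sees only well-formed content. Whether this fits the actual requirement depends on details the thread does not give, such as whether the client expects a SOAP message with a separate attachment part.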