java.io.UTFDataFormatException: Invalid byte 1 of 1-byte UTF-8 sequence

 
Jignesh Gohel
Ranch Hand
Posts: 276
Hello,

I have two questions:
1) What is the difference between the response content types text/pdf and application/pdf?
2) In my application I am generating an XML document. The generated XML already specifies its encoding as "UTF-8".

Using this XML, I want to generate a PDF. To generate the PDF I am using the JasperReports class net.sf.jasperreports.engine.data.JRXmlDataSource.

The code snippet for the same is as follows:



But when the following line is executed:


I am getting this exception:



Can anybody please explain why this is happening and how to resolve it?

Thanks,
Jignesh
 
Nicholas Jordan
Ranch Hand
Posts: 1282
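A Unicode file may begin with a marker code, the byte order mark (BOM), that identifies the encoding and byte ordering of the file. As a preliminary guess, it appears that you are taking xmlDataBuffer and feeding it to a JRXmlDataSource. The UTFDataFormatException tells us that the parser hit a byte that cannot start a UTF-8 sequence, so the bytes it received are not the UTF-8 that the XML declaration promises. Note that the 0xFE 0xFF pair is the BOM for big-endian UTF-16; in UTF-8 the BOM is optional and is encoded as 0xEF 0xBB 0xBF. The obvious place to look is deep in the documentation for the two data types, for any and all information on how they handle encodings and the BOM (BOM == byte order mark).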
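One way to see what a byte stream really starts with is to dump its first few bytes and compare them against the known BOM patterns. The following is only a minimal sketch: the class name BomSniffer and its two helper methods are hypothetical names made up for this illustration, not part of JasperReports or the JDK.

```java
import java.nio.charset.StandardCharsets;

// Hypothetical helper for illustration only.
class BomSniffer {

    /** Returns the encoding implied by a leading BOM, or "none" if no BOM is present. */
    static String detectBom(byte[] data) {
        if (data.length >= 3 && (data[0] & 0xFF) == 0xEF
                && (data[1] & 0xFF) == 0xBB && (data[2] & 0xFF) == 0xBF) {
            return "UTF-8";
        }
        if (data.length >= 2 && (data[0] & 0xFF) == 0xFE && (data[1] & 0xFF) == 0xFF) {
            return "UTF-16BE";
        }
        // Note: FF FE is also the start of a UTF-32LE BOM; this sketch ignores UTF-32.
        if (data.length >= 2 && (data[0] & 0xFF) == 0xFF && (data[1] & 0xFF) == 0xFE) {
            return "UTF-16LE";
        }
        return "none";
    }

    /** Formats the first n bytes as space-separated hex, e.g. "3C 3F 78 6D". */
    static String hexPrefix(byte[] data, int n) {
        StringBuilder sb = new StringBuilder();
        for (int i = 0; i < Math.min(n, data.length); i++) {
            if (i > 0) {
                sb.append(' ');
            }
            sb.append(String.format("%02X", data[i] & 0xFF));
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        String xml = "<?xml version=\"1.0\" encoding=\"UTF-8\"?><r/>";
        byte[] bytes = xml.getBytes(StandardCharsets.UTF_8);
        System.out.println("BOM: " + detectBom(bytes));
        System.out.println("First bytes: " + hexPrefix(bytes, 4));
    }
}
```

A String encoded with getBytes normally carries no BOM in UTF-8, so a result of "none" is expected for XML built in memory; in that case the mismatch is more likely in how the characters were turned into bytes.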

Byte Order Mark. The Unicode character U+FEFF when used to indicate the byte order of a text


Source: Glossary of Unicode Terms

The exception's message tells us:




MIME == Multipurpose Internet Mail Extensions
described in - [RFC2045,RFC2046]

See: MIME Media Types
  •   text/pdf
    The PDF format has become a standard for document transfer between computer architectures. A PDF file retains formatting for the file being transmitted. (...snip...)
    SOURCE: FILExt - The File Extension Source
    Note that text/pdf is not a media type registered with IANA.
  •   application/pdf
    The registered media type for PDF documents, and the one to use as the response content type.


[ March 16, 2008: Message edited by: Nicholas Jordan ]
     
    Paul Clapham
    Sheriff
    Posts: 20716
    Here's your problem:

    Your XML document declares that it is encoded in UTF-8. But you disregarded that, and encoded it to bytes using your system's default encoding, which is not UTF-8. So if the document contained non-ASCII characters, they would have been mangled. Here's what you want instead:
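The original snippets did not survive in this thread, so what follows is only a sketch of the idea, assuming xmlDataBuffer is a StringBuffer holding the generated XML as in the question. The class EncodingDemo and the method parses are hypothetical names for this illustration. It encodes the same document once with ISO-8859-1 (standing in for a non-UTF-8 system default) and once with UTF-8, then hands the bytes to the JDK's XML parser.

```java
import java.io.ByteArrayInputStream;
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;
import javax.xml.parsers.DocumentBuilderFactory;

// Hypothetical demo class for illustration only.
class EncodingDemo {

    /** Encodes xml with the given charset and reports whether the parser accepts the bytes. */
    static boolean parses(String xml, Charset charset) {
        try {
            byte[] bytes = xml.getBytes(charset);
            DocumentBuilderFactory.newInstance()
                    .newDocumentBuilder()
                    .parse(new ByteArrayInputStream(bytes));
            return true;
        } catch (Exception e) {
            // A malformed byte sequence surfaces here as an exception from the parser.
            return false;
        }
    }

    public static void main(String[] args) {
        // Assumed stand-in for the poster's buffer; \u00a3 is the pound sign, a non-ASCII character.
        StringBuffer xmlDataBuffer = new StringBuffer(
                "<?xml version=\"1.0\" encoding=\"UTF-8\"?><price>\u00a3 10</price>");
        String xml = xmlDataBuffer.toString();

        // Bytes that contradict the declared encoding: the parser rejects them.
        System.out.println("ISO-8859-1 bytes parse: " + parses(xml, StandardCharsets.ISO_8859_1));

        // Bytes that match the declared encoding: the parser accepts them.
        System.out.println("UTF-8 bytes parse: " + parses(xml, StandardCharsets.UTF_8));
    }
}
```

The key line is xml.getBytes(charset) with an explicit charset. Calling getBytes() with no argument uses the platform default, which is where the mismatch with the XML declaration comes from.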

    Actually I would try to avoid what you are doing, which is to convert chars to bytes and then have the parser convert the bytes back to chars. Even if you do it right, it's wasteful. If JRXmlDataSource has a constructor that takes a Reader, or an InputSource, then use a StringReader containing xmlDataBuffer.toString().
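As a rough sketch of the Reader-based route, the JDK's own parser can read characters directly through an InputSource wrapped around a StringReader, so no byte encoding is involved at all. The class ReaderParseDemo and the method parseFromChars are hypothetical names for this illustration. Whether the resulting Document can be handed to JRXmlDataSource depends on the constructors available in your JasperReports version, so the comment mentioning that is an assumption to check against its Javadoc.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.xml.sax.InputSource;

// Hypothetical demo class for illustration only.
class ReaderParseDemo {

    /** Parses XML straight from characters, skipping the chars -> bytes -> chars round trip. */
    static Document parseFromChars(String xml) {
        try {
            InputSource source = new InputSource(new StringReader(xml));
            return DocumentBuilderFactory.newInstance()
                    .newDocumentBuilder()
                    .parse(source);
        } catch (Exception e) {
            throw new IllegalStateException("XML could not be parsed", e);
        }
    }

    public static void main(String[] args) {
        // Assumed stand-in for xmlDataBuffer.toString() from the question.
        String xml = "<?xml version=\"1.0\" encoding=\"UTF-8\"?><price>\u00a3 10</price>";
        Document document = parseFromChars(xml);
        System.out.println(document.getDocumentElement().getTextContent());

        // Assumption: if your JasperReports version offers a JRXmlDataSource
        // constructor that accepts an org.w3c.dom.Document, pass 'document' to it here.
    }
}
```

Because the parser receives characters rather than bytes, the encoding named in the XML declaration no longer has to match anything, which removes this class of error entirely.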