
Invalid byte 1 of 1-byte UTF-8 sequence

 
Santiago Rodriguez
Greenhorn
Posts: 10
Hi
I get the following error when I try to transform an XML file with an XSL stylesheet into a PDF (FOP):
javax.xml.transform.TransformerException: javax.xml.transform.TransformerException: com.sun.org.apache.xml.internal.utils.WrappedRuntimeException: Invalid byte 1 of 1-byte UTF-8 sequence.|#]

Please help me...
Thanks
Santiago
 
Paul Clapham
Sheriff
Posts: 20725
That means your document isn't actually encoded in UTF-8, but you are reading it as though it were. This is often because whoever created the document failed to specify its encoding in the prolog.

So send it back to whoever created it and ask them to fix it up. If you don't feel you have the technical background to back up that claim yourself (and you probably shouldn't) then read this tutorial first:

http://skew.org/xml/tutorial/
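
To make the point about the prolog concrete, here is a minimal sketch using the JDK's built-in DOM parser. The class name `PrologDemo` and the sample document are made up for illustration and are not from the original poster's setup. The same ISO-8859-1 bytes are parsed twice: once without an encoding declaration, so the parser assumes UTF-8, and once with one.

```java
import java.io.ByteArrayInputStream;
import java.nio.charset.StandardCharsets;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;

// Hypothetical demo class, not taken from the thread.
final class PrologDemo {

    // Returns true if the bytes parse as well-formed XML, false otherwise.
    static boolean parses(byte[] xml) {
        try {
            DocumentBuilder builder =
                    DocumentBuilderFactory.newInstance().newDocumentBuilder();
            builder.parse(new ByteArrayInputStream(xml));
            return true;
        } catch (Exception e) {
            // A malformed byte sequence surfaces here as a parser or I/O exception.
            return false;
        }
    }

    // Encodes the text as ISO-8859-1, the way a non-UTF-8 producer might.
    static byte[] latin1(String text) {
        return text.getBytes(StandardCharsets.ISO_8859_1);
    }

    public static void main(String[] args) {
        // \u00E9 is e-acute: a single byte (0xE9) in ISO-8859-1, which is
        // not a valid UTF-8 sequence when followed by '<'.
        String body = "<name>Jos\u00E9</name>";

        byte[] noDeclaration = latin1("<?xml version=\"1.0\"?>" + body);
        byte[] withDeclaration =
                latin1("<?xml version=\"1.0\" encoding=\"ISO-8859-1\"?>" + body);

        System.out.println("without encoding declaration: " + parses(noDeclaration));
        System.out.println("with encoding declaration:    " + parses(withDeclaration));
    }
}
```

With a standard JDK parser the first document should be rejected and the second accepted. The exact wording of the error depends on which bytes follow the offending one.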
 
Ramamoorthy Govindaraj
Greenhorn
Posts: 5
I got the same exception. Fortunately, I resolved it as follows; this code may help others.

String output = "some contents...go here."; // or input from another source
// Re-decode the string's platform-default bytes as UTF-8; this addressed
// "Invalid byte 1 of 1-byte UTF-8 sequence" for me
String s = new String(output.getBytes(), "UTF-8");
Writer writer = new BufferedWriter(new FileWriter("c:/temp/Jasper/invoice.html"));
try {
    writer.write(s);
} finally {
    writer.close();
}
 
Paul Clapham
Sheriff
Posts: 20725
That may work in the sense that it won't throw the exception any more. It may not prevent damage to the data caused by failing to read the document using UTF-8 in the first place.

You may well have seen pages on the web with things like Euro signs and A-with-a-hat characters where there should have been quotes or dashes. This is the sort of thing that happens if you don't use the right encodings.
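
A small sketch of both effects, using only standard `java.nio.charset` classes. The class and method names are invented for the example, and windows-1252 is assumed as the "wrong" charset because it is a common culprit. It shows how a curly quote written as UTF-8 turns into the familiar A-with-a-hat and Euro sign sequence, and how a byte that is not valid UTF-8 is silently replaced rather than reported.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

// Hypothetical demo class, not taken from the thread.
final class MojibakeDemo {

    // Encodes text with one charset and decodes the resulting bytes with another.
    static String misread(String text, Charset writtenAs, Charset readAs) {
        return new String(text.getBytes(writtenAs), readAs);
    }

    // Formats a string as code points so the console encoding cannot confuse matters.
    static String codePoints(String text) {
        StringBuilder sb = new StringBuilder();
        text.codePoints().forEach(cp -> {
            if (sb.length() > 0) sb.append(' ');
            sb.append(String.format("U+%04X", cp));
        });
        return sb.toString();
    }

    public static void main(String[] args) {
        Charset cp1252 = Charset.forName("windows-1252");

        // A right single quotation mark (U+2019) is three bytes in UTF-8.
        // Read as windows-1252, each byte becomes its own character.
        String garbled = misread("\u2019", StandardCharsets.UTF_8, cp1252);
        System.out.println("quote read as windows-1252: " + codePoints(garbled));

        // e-acute written as ISO-8859-1 is the single byte 0xE9, which is not
        // valid UTF-8 on its own. new String(...) does not throw; it substitutes
        // the replacement character U+FFFD, so the original letter is lost.
        String lost = misread("\u00E9", StandardCharsets.ISO_8859_1, StandardCharsets.UTF_8);
        System.out.println("e-acute read as UTF-8:      " + codePoints(lost));
    }
}
```

The second case is the damage described above: the exception goes away, but so does the original character.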
 
Michael Angstadt
Ranch Hand
Posts: 277
I thought I would share my thoughts, since I was having the same problem (even though this thread is very old).

I did what Ramamoorthy Govindaraj suggested (except my input/output streams used files instead of Strings because my XML document was very large and storing the entire document in memory would have been inefficient):



But that still didn't work. When I opened the file in a text editor (Notepad++), I saw a question mark character at the very beginning of the file. After I deleted that character, I could parse the file successfully.

Working with encodings is annoying because text files are supposed to be simple.
 
James Boswell
Bartender
Posts: 1051
Hi Michael

I think you may need to define the encoding for the output stream. Something like the following:
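
A minimal sketch of what specifying the encoding on both streams could look like, using the standard `InputStreamReader` and `OutputStreamWriter` constructors that take a `Charset`. The class name, method name, and the choice of ISO-8859-1 as the source encoding are assumptions for illustration; the real source encoding has to be found out from whoever produced the file.

```java
import java.io.BufferedReader;
import java.io.BufferedWriter;
import java.io.File;
import java.io.FileInputStream;
import java.io.FileOutputStream;
import java.io.IOException;
import java.io.InputStreamReader;
import java.io.OutputStreamWriter;
import java.io.Reader;
import java.io.Writer;
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

// Hypothetical helper class, not taken from the thread.
final class Transcode {

    // Streams the source file to the target file, decoding and encoding with
    // explicitly named charsets instead of the platform default.
    static void copy(File source, Charset sourceCharset,
                     File target, Charset targetCharset) throws IOException {
        try (Reader in = new BufferedReader(
                     new InputStreamReader(new FileInputStream(source), sourceCharset));
             Writer out = new BufferedWriter(
                     new OutputStreamWriter(new FileOutputStream(target), targetCharset))) {
            char[] buffer = new char[8192];
            int n;
            // Copy in chunks so a very large document never sits in memory at once.
            while ((n = in.read(buffer)) != -1) {
                out.write(buffer, 0, n);
            }
        }
    }

    public static void main(String[] args) throws IOException {
        // Usage: java Transcode input.xml output.xml
        // ISO-8859-1 as the source encoding is only an assumption here.
        copy(new File(args[0]), StandardCharsets.ISO_8859_1,
             new File(args[1]), StandardCharsets.UTF_8);
    }
}
```

If the document carries an encoding declaration in its prolog, that declaration has to agree with the charset actually used for the output, otherwise the parser will still be misled.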
 
Brylle Lee
Greenhorn
Posts: 1
Character encodings differ from system to system. Common ones include ISO-8859-1 and UTF-8, along with platform-specific encodings such as those used on Mac OS.
 
Louie Poll
Greenhorn
Posts: 1
Paul Clapham wrote: That means your document isn't actually encoded in UTF-8, but you are reading it as though it were. This is often because whoever created the document failed to specify its encoding in the prolog.

So send it back to whoever created it and ask them to fix it up. If you don't feel you have the technical background to back up that claim yourself (and you probably shouldn't) then read this tutorial first:

http://skew.org/xml/tutorial/


I'm actually having the same problem, and it is really stressing me out. I hope this solves it.
 
Raju Sharmas
Greenhorn
Posts: 1
I also had the same problem and was looking for solutions on the internet. This thread helped me a lot. Thanks to all.
 
john wise
Greenhorn
Posts: 1
Brylle Lee wrote: Character encodings differ from system to system. Common ones include ISO-8859-1 and UTF-8, along with platform-specific encodings such as those used on Mac OS.

Does the encoding also differ between versions of Windows?
 
unspoken hermit
Greenhorn
Posts: 1
I have an XML document and a Java class that should process it (on Windows XP).

Unfortunately I am getting an exception:

"Invalid byte 1 of 1-byte UTF-8 sequence"

What's wrong?

Because I don't have the Java source, I can only change the XML document.
The XML document starts:

<?xml version="1.0"?>
<xsd:schema xmlns:xsd="http://www.w3.org/2001/XMLSchema">
....

It could be that some line-ending conversion errors were introduced when I downloaded the XML file from a Linux server.

Could this be the problem?
 
William Brogden
Author and all-around good cowpoke
Rancher
Posts: 13055
The first thing I would do is examine the start of that document with an editor that can display HEX values to see what it really starts with.

Personally I use UltraEdit-32.

Do you know how the XML document was created?

Bill
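
For anyone without a hex editor to hand, the same inspection can be sketched in a few lines of Java. The class and method names are invented for the example, and the labels it prints are only hints based on well-known byte-order marks, not a full encoding detector.

```java
import java.io.FileInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.util.Arrays;

// Hypothetical helper class, not taken from the thread.
final class HexPeek {

    // Formats bytes as space-separated two-digit hex values, e.g. "3C 3F 78".
    static String hex(byte[] bytes) {
        StringBuilder sb = new StringBuilder();
        for (int i = 0; i < bytes.length; i++) {
            if (i > 0) sb.append(' ');
            sb.append(String.format("%02X", bytes[i] & 0xFF));
        }
        return sb.toString();
    }

    // Gives a rough hint about what the document starts with.
    static String guessPrefix(byte[] head) {
        if (startsWith(head, 0xEF, 0xBB, 0xBF)) return "UTF-8 byte-order mark";
        if (startsWith(head, 0xFF, 0xFE)) return "little-endian byte-order mark (likely UTF-16)";
        if (startsWith(head, 0xFE, 0xFF)) return "big-endian byte-order mark (likely UTF-16)";
        if (startsWith(head, '<', '?', 'x', 'm', 'l')) return "XML declaration with no byte-order mark";
        return "something else; compare the hex values by hand";
    }

    private static boolean startsWith(byte[] data, int... prefix) {
        if (data.length < prefix.length) return false;
        for (int i = 0; i < prefix.length; i++) {
            if ((data[i] & 0xFF) != prefix[i]) return false;
        }
        return true;
    }

    public static void main(String[] args) throws IOException {
        // Usage: java HexPeek document.xml
        byte[] buffer = new byte[16];
        int n;
        try (InputStream in = new FileInputStream(args[0])) {
            n = in.read(buffer); // the first few bytes are enough for this check
        }
        byte[] head = Arrays.copyOf(buffer, Math.max(n, 0));
        System.out.println(hex(head));
        System.out.println(guessPrefix(head));
    }
}
```

Anything that appears before the `3C 3F 78 6D 6C` of `<?xml` other than a byte-order mark is worth investigating first.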
 
steven scortez
Greenhorn
Posts: 1
William Brogden wrote: The first thing I would do is examine the start of that document with an editor that can display HEX values to see what it really starts with.

Personally I use UltraEdit-32.

Do you know how the XML document was created?

Bill


Will, is UltraEdit-32 easy to use?

 
William Brogden
Author and all-around good cowpoke
Rancher
Posts: 13055
UltraEdit is always open on my desktop.
I organize all projects, including my personal papers, using the UE Project/Workspace concepts.
I edit all Java and XML with the keyword-sensitive editor.
I compile all programs using UE's Project Tool Customization in combination with Ant.
Getting started with things like viewing files in HEX is easy.

UltraEdit is a commercial program, but I feel good about spending money on good tools.

Bill
 