| Author |
Invalid byte 1 of 1-byte UTF-8 sequence
|
Santiago Rodriguez
Greenhorn
Joined: Aug 16, 2006
Posts: 10
|
|
Hi, I get the following error when I try to transform an XML file with an XSL stylesheet into a PDF (FOP): javax.xml.transform.TransformerException: javax.xml.transform.TransformerException: com.sun.org.apache.xml.internal.utils.WrappedRuntimeException: Invalid byte 1 of 1-byte UTF-8 sequence. Please help me... Thanks, Santiago
|
 |
Paul Clapham
Bartender
Joined: Oct 14, 2005
Posts: 13842
|
|
That means your document isn't actually encoded in UTF-8, but you are reading it as though it were. This is often because whoever created the document failed to specify its encoding in the prolog. So send it back to whoever created it and ask them to fix it up. If you don't feel you have the technical background to back up that claim yourself (and you probably shouldn't) then read this tutorial first: http://skew.org/xml/tutorial/
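To illustrate the point about the prolog, here is a minimal sketch using the JDK's built-in DOM parser. The sample document, the choice of ISO-8859-1 and the class name PrologEncodingDemo are assumptions made up for the demonstration. The same bytes parse when the prolog declares their encoding and fail when the parser has to fall back to UTF-8.

```java
import java.io.ByteArrayInputStream;
import java.nio.charset.StandardCharsets;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;

// Hypothetical demo class, not part of any library.
final class PrologEncodingDemo {

    // Encodes the XML text as ISO-8859-1 bytes and parses those bytes.
    // The parser only sees bytes, so it relies on the prolog (or the
    // UTF-8 default) to decide how to decode them.
    static String parseRootText(String xml) throws Exception {
        byte[] bytes = xml.getBytes(StandardCharsets.ISO_8859_1);
        DocumentBuilder builder =
                DocumentBuilderFactory.newInstance().newDocumentBuilder();
        Document doc = builder.parse(new ByteArrayInputStream(bytes));
        return doc.getDocumentElement().getTextContent();
    }

    public static void main(String[] args) throws Exception {
        String body = "<name>Jos\u00E9</name>"; // contains one non-ASCII character

        // Encoding declared in the prolog: the parser decodes the bytes correctly.
        System.out.println(parseRootText(
                "<?xml version=\"1.0\" encoding=\"ISO-8859-1\"?>" + body));

        // No encoding declared: the parser assumes UTF-8 and rejects the bytes.
        try {
            parseRootText("<?xml version=\"1.0\"?>" + body);
        } catch (Exception e) {
            System.out.println("Parse failed: " + e.getMessage());
        }
    }
}
```

The exact wording of the failure message depends on which byte is invalid, so it will not necessarily match the one in the original question.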
|
 |
Ramamoorthy Govindaraj
Greenhorn
Joined: Dec 31, 2009
Posts: 5
|
|
I got the same exception. Fortunately, I resolved it as follows; this code may help others.
String output = "some contents...go here."; // or input from another source
// Force conversion to UTF-8 to address "Invalid byte 1 of 1-byte UTF-8 sequence"
String s = new String(output.getBytes(), "UTF-8");
Writer writer = new BufferedWriter(new FileWriter("c:/temp/Jasper/invoice.html"));
try {
    writer.write(s);
} finally {
    writer.close();
}
|
 |
Paul Clapham
Bartender
Joined: Oct 14, 2005
Posts: 13842
|
|
That may work in the sense that it won't throw the exception any more. It may not prevent damage to the data caused by failing to read the document using UTF-8 in the first place.
You may well have seen pages on the web with things like Euro signs and A-with-a-hat characters where there should have been quotes or dashes. This is the sort of thing that happens if you don't use the right encodings.
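A small sketch of how that kind of damage arises, assuming the text was written as UTF-8 and then read back as windows-1252 (the class and method names are made up for the example). A single curly opening quote turns into an a-with-a-hat, a Euro sign and a third stray character.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

// Hypothetical demo class, not part of any library.
final class MojibakeDemo {

    // Encodes the text as UTF-8, then decodes those bytes with the wrong
    // charset, which is what a misconfigured reader effectively does.
    static String misdecode(String text) {
        byte[] utf8 = text.getBytes(StandardCharsets.UTF_8);
        return new String(utf8, Charset.forName("windows-1252"));
    }

    public static void main(String[] args) {
        String original = "\u201C"; // left double quotation mark
        String damaged = misdecode(original);
        // One character in, three characters out.
        System.out.println(original.length() + " -> " + damaged.length());
        for (char c : damaged.toCharArray()) {
            System.out.printf("U+%04X%n", (int) c);
        }
    }
}
```

Once the text has been decoded with the wrong charset and saved again, forcing it to UTF-8 afterwards only preserves the damaged characters.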
|
 |
Michael Angstadt
Ranch Hand
Joined: Jun 17, 2009
Posts: 269
|
|
I thought I would share my thoughts, since I was having the same problem (even though this thread is very old).
I did what Ramamoorthy Govindaraj suggested (except my input/output streams used files instead of Strings because my XML document was very large and storing the entire document in memory would have been inefficient):
But that still didn't work. When I opened the file in a text editor (Notepad++), I saw a question mark character at the very beginning of the file. After I deleted that character, I could parse the file successfully.
Working with encodings is annoying because text files are supposed to be simple.
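A stray character at the very start of a file is often a byte order mark (BOM). Whether that was the case here is only a guess, but if it is, the BOM can be skipped in code instead of edited out by hand. A minimal sketch, assuming a UTF-8 BOM (bytes EF BB BF); BomSkipper and skipUtf8Bom are names made up for the example.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.PushbackInputStream;

// Hypothetical helper class, not part of any library.
final class BomSkipper {

    // Returns a stream positioned after a leading UTF-8 BOM, if there is one.
    // If the first three bytes are not a BOM they are pushed back unchanged.
    static InputStream skipUtf8Bom(InputStream in) throws IOException {
        PushbackInputStream pushback = new PushbackInputStream(in, 3);
        byte[] head = new byte[3];
        int n = 0;
        while (n < head.length) {
            int r = pushback.read(head, n, head.length - n);
            if (r < 0) {
                break; // stream shorter than three bytes
            }
            n += r;
        }
        boolean hasBom = n == 3
                && (head[0] & 0xFF) == 0xEF
                && (head[1] & 0xFF) == 0xBB
                && (head[2] & 0xFF) == 0xBF;
        if (!hasBom && n > 0) {
            pushback.unread(head, 0, n);
        }
        return pushback;
    }

    public static void main(String[] args) throws IOException {
        byte[] withBom = {(byte) 0xEF, (byte) 0xBB, (byte) 0xBF, '<', 'a', '/', '>'};
        InputStream in = skipUtf8Bom(new ByteArrayInputStream(withBom));
        System.out.println((char) in.read()); // first byte after the BOM
    }
}
```

The returned stream can be handed to the XML parser in place of the original one.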
|
SCJP 6 || SCWCD 5
|
 |
James Boswell
Ranch Hand
Joined: Nov 09, 2011
Posts: 341
|
|
Hi Michael
I think you may need to define the encoding for the output stream. Something like the following:
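A minimal sketch of that idea, assuming the goal is to write text to a file as UTF-8. The class name Utf8FileWriterDemo and the file name are made up for the example. FileWriter would use the platform default encoding, whereas an OutputStreamWriter lets you name the charset explicitly.

```java
import java.io.BufferedWriter;
import java.io.IOException;
import java.io.OutputStreamWriter;
import java.io.Writer;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;

// Hypothetical demo class, not part of any library.
final class Utf8FileWriterDemo {

    // Writes the text to the target file, always encoding it as UTF-8
    // regardless of the platform default charset.
    static void writeUtf8(Path target, String text) throws IOException {
        try (Writer writer = new BufferedWriter(new OutputStreamWriter(
                Files.newOutputStream(target), StandardCharsets.UTF_8))) {
            writer.write(text);
        }
    }

    public static void main(String[] args) throws IOException {
        // "output.xml" is a placeholder file name.
        writeUtf8(Paths.get("output.xml"),
                "<?xml version=\"1.0\" encoding=\"UTF-8\"?><name>Jos\u00E9</name>");
    }
}
```

The encoding named in the XML prolog should match the charset passed to the writer.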
|
 |
Brylle Lee
Greenhorn
Joined: Nov 14, 2011
Posts: 1
|
|
|
Character encoding differs from system to system, with some common standards including ISO-8859-1, UTF-8 plus other encodings such as Mac OS.
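To see which encoding a particular machine uses by default, a few lines of Java are enough. This is only a sketch; the class name is made up, and "x-MacRoman" is the JDK's name for the classic Mac OS encoding, which may or may not be available in a given runtime.

```java
import java.nio.charset.Charset;

// Hypothetical demo class, not part of any library.
final class DefaultCharsetDemo {

    // Name of the charset the JVM uses when none is specified.
    static String defaultCharsetName() {
        return Charset.defaultCharset().name();
    }

    public static void main(String[] args) {
        System.out.println("Default charset: " + defaultCharsetName());
        System.out.println("ISO-8859-1 supported: " + Charset.isSupported("ISO-8859-1"));
        System.out.println("x-MacRoman supported: " + Charset.isSupported("x-MacRoman"));
    }
}
```

Code that relies on the default will behave differently from one machine to the next, which is why naming the charset explicitly is usually safer.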
|
 |
Louie Poll
Greenhorn
Joined: Nov 19, 2011
Posts: 1
|
|
Paul Clapham wrote:That means your document isn't actually encoded in UTF-8, but you are reading it as though it were. This is often because whoever created the document failed to specify its encoding in the prolog.
So send it back to whoever created it and ask them to fix it up. If you don't feel you have the technical background to back up that claim yourself (and you probably shouldn't) then read this tutorial first:
http://skew.org/xml/tutorial/
I'm actually having the same problem, and it is really stressing me out. I hope this solves it.
|
 |
Raju Sharmas
Greenhorn
Joined: Nov 21, 2011
Posts: 1
|
|
|
I also had the same problem and was looking for solutions on the internet. This thread helped me a lot. Thanks to all.
|
 |
john wise
Greenhorn
Joined: Nov 30, 2011
Posts: 1
|
|
Brylle Lee wrote:Character encoding differs from system to system, with some common standards including ISO-8859-1, UTF-8 plus other encodings such as Mac OS.
Does the encoding also differ between versions of Windows?
|
|
 |
unspoken hermit
Greenhorn
Joined: Dec 14, 2011
Posts: 1
|
|
I have an XML doc and a Java class that should process it (on Windows XP).
Unfortunately I am getting an exception:
"Invalid byte 1 of 1-byte UTF-8 sequence"
What's wrong?
Because I don't have the Java source, I can only change the XML doc.
The XML doc starts:
<?xml version="1.0"?>
<xsd:schema xmlns:xsd="http://www.w3.org/2001/XMLSchema">
....
It could be that there were some line-ending conversion errors when I downloaded the XML file from the Linux server.
Could this be the problem?
|
 |
William Brogden
Author and all-around good cowpoke
Rancher
Joined: Mar 22, 2000
Posts: 11862
|
|
The first thing I would do is examine the start of that document with an editor that can display HEX values to see what it really starts with.
Personally I use UltraEdit-32.
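If no hex editor is at hand, a few lines of Java can show the same thing. This is a minimal sketch; the class name HexHead and the file name are placeholders. It prints the first bytes of a file in hex so you can see whether it starts with `3C 3F 78 6D 6C` (the ASCII bytes of `<?xml`), with a BOM such as `EF BB BF`, or with something else.

```java
import java.io.IOException;
import java.io.InputStream;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;

// Hypothetical helper class, not part of any library.
final class HexHead {

    // Returns up to `count` leading bytes of the file as space-separated hex.
    static String hexHead(Path file, int count) throws IOException {
        byte[] buf = new byte[count];
        int n = 0;
        try (InputStream in = Files.newInputStream(file)) {
            while (n < count) {
                int r = in.read(buf, n, count - n);
                if (r < 0) {
                    break; // file shorter than requested
                }
                n += r;
            }
        }
        StringBuilder sb = new StringBuilder();
        for (int i = 0; i < n; i++) {
            if (i > 0) {
                sb.append(' ');
            }
            sb.append(String.format("%02X", buf[i] & 0xFF));
        }
        return sb.toString();
    }

    public static void main(String[] args) throws IOException {
        // "schema.xsd" is a placeholder; pass the real path as the first argument.
        Path file = Paths.get(args.length > 0 ? args[0] : "schema.xsd");
        System.out.println(hexHead(file, 16));
    }
}
```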
Do you know how the XML document was created?
Bill
|
Java Resources at www.wbrogden.com
|
 |
steven scortez
Greenhorn
Joined: Dec 16, 2011
Posts: 1
|
|
William Brogden wrote:The first thing I would do is examine the start of that document with an editor that can display HEX values to see what it really starts with.
Personally I use UltraEdit-32.
Do you know how the XML document was created?
Bill
William, is UltraEdit-32 easy to use?
|
Life is short live it up
|
 |
William Brogden
Author and all-around good cowpoke
Rancher
Joined: Mar 22, 2000
Posts: 11862
|
|
UltraEdit is always open on my desktop.
I organize all projects, including my personal papers, using the UE Project/Workspace concepts.
I edit all Java and XML with the keyword-sensitive editor.
I compile all programs using UE's Project Tool Customization in combination with Ant capabilities.
Getting started with things like viewing files in HEX is easy.
UltraEdit is a commercial program, but I feel good about spending money on good tools.
Bill
|
 |