process from request to response(UTF-8)

Yi Si
Ranch Hand

Joined: Nov 12, 2003
Posts: 54
I was wondering if you could help clarify something for me. Sorry in advance for the long post.
The following is the process from sending an HTTP request to receiving the HTTP response.
1. The browser, running in the client's native locale, encodes the form parameter data in the HTTP request so that it is in a readable format for the web application.
2. When the application receives the data, it is in Unicode format.
3. Read the string with UTF-8 encoding from the HTTP request.
4. Do something...
5. Write Unicode characters with UTF-8 encoding to the HTTP response.
6. The characters should be displayed in the browser.
I cannot read the Japanese characters correctly in step 3. Please help me.
Another question:
With the following conversion, why can some of the characters not be displayed?
Japanese characters -> encode with UTF-8 -> bytes -> decode with ISO8859-1 -> characters (malformed)
-> getString(oldstring.getBytes(),"UTF-8").


Sun(java)-SCJA SCJP SCWCD SCBCD SCDJWS SCEA
Sun(solaris)-SCSA SCSN
IBM-486 Object-Oriented Analysis and Design with UML Test
IBM Certified System Administrator - WebSphere Application Server Network Deployment V6.0
Oracle-OCA
Jim Yingst
Wanderer
Sheriff

Joined: Jan 30, 2000
Posts: 18671
With the following conversion, why can some of the characters not be displayed?
Japanese characters -> encode with UTF-8 -> bytes -> decode with ISO8859-1 -> characters (malformed)

If the chars were encoded with UTF-8, why would you decode with ISO-8859-1 and expect to get something meaningful? ISO-8859-1 is only good for Western European languages (Roman alphabet) anyway; it can't be used for something like Japanese. UTF-8, however, can represent any Unicode character, including Japanese and other Asian languages (it's not necessarily designed well for them, but that's another discussion). If you encode and decode with UTF-8 you should be fine.
[ November 21, 2003: Message edited by: Jim Yingst ]
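
To make the conversion chain from the question concrete, here is a minimal sketch using only the JDK. The class and method names (MojibakeDemo, misdecode, repair) are made up for illustration, and StandardCharsets assumes Java 7 or later. It shows that decoding UTF-8 bytes as ISO-8859-1 yields one garbage char per byte, and that the mistake can be undone only if the bytes are recovered with the same charset that was wrongly used. A bare getBytes() uses the platform default charset instead, which is one plausible reason some characters come back malformed.

```java
import java.nio.charset.StandardCharsets;

class MojibakeDemo {
    // Reproduce the mistake: encode as UTF-8, then decode the bytes as ISO-8859-1.
    static String misdecode(String original) {
        byte[] utf8 = original.getBytes(StandardCharsets.UTF_8);
        return new String(utf8, StandardCharsets.ISO_8859_1);
    }

    // Undo the mistake: get the raw bytes back with ISO-8859-1, then decode as UTF-8.
    static String repair(String garbled) {
        byte[] raw = garbled.getBytes(StandardCharsets.ISO_8859_1);
        return new String(raw, StandardCharsets.UTF_8);
    }

    public static void main(String[] args) {
        String japanese = "\u65E5\u672C\u8A9E"; // three Japanese characters
        String garbled = misdecode(japanese);
        System.out.println(garbled.length());                 // one char per UTF-8 byte
        System.out.println(repair(garbled).equals(japanese)); // round trip restores the text
    }
}
```

The repair works here because ISO-8859-1 maps every byte value to exactly one char, so no information is lost. That is not true of every charset, so this is a workaround for one specific mix-up and not a general fix.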

"I'm not back." - Bill Harding, Twister
Yi Si
Ranch Hand

Joined: Nov 12, 2003
Posts: 54
Jim Yingst, thank you for your help.
Do you agree with the following idea?
1. The browser, running in the client's native locale, encodes the form parameter data in the HTTP request so that it is in a readable format for the web application.
2. When the web application receives the data, it is in Unicode format; in other words, it is encoded with UTF-8.
3. The web application extracts the data using the OS default encoding (so it may be one of the ISO-8859 series or Shift_JIS).
4. We need to use getString(string, encoding) to get useful info.
Jim Yingst
Wanderer
Sheriff

Joined: Jan 30, 2000
Posts: 18671
1. The browser, running in the client's native locale, encodes the form parameter data in the HTTP request so that it is in a readable format for the web application.
Well, the browser can encode data however it wants. A browser running in Japan might encode in Shift-JIS. If my web app server is running in the US, I wouldn't necessarily call that a "readable format" unless it's decoded correctly.
2. When the web application receives the data, it is in Unicode format; in other words, it is encoded with UTF-8.
Mmmm... is this data being sent as parameters of an HTTP request? Or (more rarely) as part of the body of a request? If it's a parameter, and you're processing the request with a Servlet, then getParameter(name) returns a String. This String has already been decoded from whatever encoding was used by the browser.
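
One caveat, as far as I know: the container can only decode a parameter correctly if it knows which charset the browser used. When the request does not declare one, the Servlet spec has the container fall back to ISO-8859-1, and calling request.setCharacterEncoding("UTF-8") before the first getParameter() is the usual way to override that for form bodies. The sketch below avoids the Servlet API and just imitates the percent-encoding step with java.net.URLEncoder and URLDecoder. The class and method names (FormParamDecodeDemo, browserEncode, serverDecode) are hypothetical.

```java
import java.io.UnsupportedEncodingException;
import java.net.URLDecoder;
import java.net.URLEncoder;

class FormParamDecodeDemo {
    // Roughly what a browser does to a form value on a page served in the given charset.
    static String browserEncode(String value, String charset) {
        try {
            return URLEncoder.encode(value, charset);
        } catch (UnsupportedEncodingException e) {
            throw new IllegalArgumentException("Unknown charset: " + charset, e);
        }
    }

    // Roughly what the server side does: percent-decode the bytes, then decode them as text.
    static String serverDecode(String wire, String charset) {
        try {
            return URLDecoder.decode(wire, charset);
        } catch (UnsupportedEncodingException e) {
            throw new IllegalArgumentException("Unknown charset: " + charset, e);
        }
    }

    public static void main(String[] args) {
        String original = "\u65E5\u672C"; // two Japanese characters
        String wire = browserEncode(original, "UTF-8");
        System.out.println(wire);
        // Same charset on both sides: the text survives.
        System.out.println(serverDecode(wire, "UTF-8").equals(original));
        // Server guesses ISO-8859-1: each byte becomes its own char and the text is garbled.
        System.out.println(serverDecode(wire, "ISO-8859-1").equals(original));
    }
}
```

If step 3 in the original question produces garbage, a mismatch like the last line of main is a likely suspect.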
If the data is in the body, you need to use getCharacterEncoding() to learn what encoding was used. Then use something like

Now you can convert the request body to Unicode chars.
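
A minimal sketch of that body-decoding step, using only the JDK so it can be run outside a container. In a servlet the stream would come from request.getInputStream() and the declared encoding from request.getCharacterEncoding(), which may return null when the client did not send one. BodyReaderDemo, readBody and the fallback parameter are made-up names, and the choice of fallback is an assumption you would have to make for your own application.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.InputStreamReader;
import java.io.Reader;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;

class BodyReaderDemo {
    // Decode the whole stream with the declared encoding, or the fallback if none was declared.
    static String readBody(InputStream in, String declaredEncoding, String fallback) {
        String encoding = (declaredEncoding != null) ? declaredEncoding : fallback;
        StringBuilder sb = new StringBuilder();
        try (Reader reader = new InputStreamReader(in, encoding)) {
            char[] buf = new char[1024];
            int n;
            while ((n = reader.read(buf)) != -1) {
                sb.append(buf, 0, n);
            }
        } catch (IOException e) {
            // Also covers UnsupportedEncodingException for an unknown encoding name.
            throw new UncheckedIOException(e);
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        String original = "\u65E5\u672C\u8A9E";
        byte[] body = original.getBytes(StandardCharsets.UTF_8);
        // Encoding declared by the client: the text is decoded correctly.
        System.out.println(readBody(new ByteArrayInputStream(body), "UTF-8", "ISO-8859-1").equals(original));
        // Nothing declared and a wrong fallback: the text is garbled.
        System.out.println(readBody(new ByteArrayInputStream(body), null, "ISO-8859-1").equals(original));
    }
}
```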
3. The web application extracts the data using the OS default encoding (so it may be one of the ISO-8859 series or Shift_JIS).
No, it's either a String, or it's an InputStream encoded in whatever encoding the browser chose to use. The web application's default encoding may be completely different from the default that was used by the browser. Never use default encodings when you're exchanging data between different machines which may have different defaults.
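
The same advice applies on the way out (step 5 in the original question). A small sketch, assuming Java 7+ for StandardCharsets: name the charset explicitly when turning chars into bytes so the platform default is never consulted. ExplicitEncodingWriteDemo and encodeForResponse are hypothetical names. In a real servlet you would also declare the charset to the browser, for example with response.setContentType("text/html; charset=UTF-8") before calling getWriter().

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.OutputStreamWriter;
import java.io.UncheckedIOException;
import java.io.Writer;
import java.nio.charset.StandardCharsets;

class ExplicitEncodingWriteDemo {
    // Encode text as UTF-8 through a Writer, without touching the platform default charset.
    static byte[] encodeForResponse(String text) {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        try (Writer writer = new OutputStreamWriter(out, StandardCharsets.UTF_8)) {
            writer.write(text);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return out.toByteArray();
    }

    public static void main(String[] args) {
        byte[] bytes = encodeForResponse("\u65E5\u672C\u8A9E");
        // Each of these three characters takes three bytes in UTF-8.
        System.out.println(bytes.length);
    }
}
```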
4. We need to use getString(string, encoding) to get useful info.
Or a new InputStreamReader(InputStream, encoding). Or some other classes in java.nio if we prefer.
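
As one possible java.nio route, a CharsetDecoder can be told to report malformed input instead of silently substituting replacement characters, which makes a wrong encoding guess visible right away. This is only a sketch. StrictDecodeDemo and decodeStrict are made-up names, and returning null on failure is a simplification for the example.

```java
import java.nio.ByteBuffer;
import java.nio.charset.CharacterCodingException;
import java.nio.charset.Charset;
import java.nio.charset.CodingErrorAction;
import java.nio.charset.StandardCharsets;

class StrictDecodeDemo {
    // Returns the decoded text, or null if the bytes are not valid in the named charset.
    static String decodeStrict(byte[] bytes, String charsetName) {
        try {
            return Charset.forName(charsetName)
                    .newDecoder()
                    .onMalformedInput(CodingErrorAction.REPORT)
                    .onUnmappableCharacter(CodingErrorAction.REPORT)
                    .decode(ByteBuffer.wrap(bytes))
                    .toString();
        } catch (CharacterCodingException e) {
            return null;
        }
    }

    public static void main(String[] args) {
        byte[] utf8 = "\u65E5\u672C\u8A9E".getBytes(StandardCharsets.UTF_8);
        System.out.println(decodeStrict(utf8, "UTF-8") != null);    // valid UTF-8 decodes
        byte[] notUtf8 = { (byte) 0x93, (byte) 0xFA };               // not a valid UTF-8 sequence
        System.out.println(decodeStrict(notUtf8, "UTF-8") == null);  // rejected, not garbled
    }
}
```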
Yi Si
Ranch Hand

Joined: Nov 12, 2003
Posts: 54
Thank you very much.