Problem with java reading a webpage
Colin A Thompson
Joined: Dec 08, 2009
Dec 08, 2009 12:28:41
I am having a problem copying a webpage to a text file. Every time I run my program against this one site, the copied text contains Asian characters, even though there are no Asian characters on the page.
I tried my code on other websites and it works fine. Are there security measures that prevent a web page from being copied?
The website I am having problems with is public information and I am not reselling anything of theirs.
Joined: Oct 14, 2005
Dec 08, 2009 13:22:56
Possibly you are using the wrong charset to convert the downloaded data from bytes to chars. That's just my first guess, though; I'm sure there could be dozens of other things wrong. You don't provide many details for us to comment on.
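To illustrate the charset guess, here is a minimal sketch. The original program was never posted, so the class name CharsetDemo, its helper methods, and the UTF-8 fallback are all assumptions made for illustration. It reads the charset from the Content-Type response header instead of relying on the platform default, and main shows how plain ASCII bytes decoded as UTF-16 come out as CJK characters, which is one possible source of the symptom described.

```java
import java.io.IOException;
import java.io.InputStream;
import java.io.InputStreamReader;
import java.io.Reader;
import java.net.URL;
import java.net.URLConnection;
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;
import java.util.Locale;

// Hypothetical helper class; none of these names come from the original post.
final class CharsetDemo {

    /** Extracts the charset from a header such as "text/html; charset=ISO-8859-1". */
    static Charset charsetFromContentType(String contentType, Charset fallback) {
        if (contentType == null) {
            return fallback;
        }
        for (String part : contentType.split(";")) {
            String p = part.trim();
            if (p.toLowerCase(Locale.ROOT).startsWith("charset=")) {
                String name = p.substring("charset=".length()).trim().replace("\"", "");
                try {
                    return Charset.forName(name);
                } catch (IllegalArgumentException e) {
                    // Covers both illegal and unsupported charset names.
                    return fallback;
                }
            }
        }
        return fallback;
    }

    /** Decodes the whole stream with an explicit charset, never the platform default. */
    static String readAll(InputStream in, Charset charset) throws IOException {
        StringBuilder sb = new StringBuilder();
        try (Reader reader = new InputStreamReader(in, charset)) {
            char[] buf = new char[4096];
            int n;
            while ((n = reader.read(buf)) != -1) {
                sb.append(buf, 0, n);
            }
        }
        return sb.toString();
    }

    /** Downloads a page using the charset the server declares (assumed fallback: UTF-8). */
    static String fetch(String address) throws IOException {
        URLConnection conn = new URL(address).openConnection();
        Charset cs = charsetFromContentType(conn.getContentType(), StandardCharsets.UTF_8);
        return readAll(conn.getInputStream(), cs);
    }

    public static void main(String[] args) {
        byte[] raw = "Hello!".getBytes(StandardCharsets.UTF_8);
        // Wrong charset: each pair of ASCII bytes becomes one CJK code point.
        String wrong = new String(raw, StandardCharsets.UTF_16BE);
        String right = new String(raw, StandardCharsets.UTF_8);
        System.out.println(right);          // prints Hello!
        System.out.println(wrong.length()); // prints 3
    }
}
```

This sketch ignores a charset declared only in an HTML meta tag, so a page whose server sends no charset in the header would still need a closer look.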
Joined: Oct 13, 2005
Dec 09, 2009 03:30:43
Also, some text editors or terminal windows may only be able to display ASCII or Latin-1 characters.
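One way to tell a display problem from a decoding problem is to print the code points rather than the characters themselves. The following is only a sketch: the class NonAsciiReport and its methods are hypothetical names, not anything from the original program. It lists every character above U+007F and writes the output file as UTF-8 explicitly, so the result does not depend on the platform default encoding.

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.ArrayList;
import java.util.List;

// Hypothetical diagnostic helper, not part of the original poster's code.
final class NonAsciiReport {

    /** Lists every code point above U+007F as "U+XXXX", in order of appearance. */
    static List<String> nonAsciiCodePoints(String text) {
        List<String> found = new ArrayList<>();
        text.codePoints()
            .filter(cp -> cp > 0x7F)
            .forEach(cp -> found.add(String.format("U+%04X", cp)));
        return found;
    }

    /** Writes the text as UTF-8 regardless of the platform default encoding. */
    static void writeUtf8(Path file, String text) throws IOException {
        Files.write(file, text.getBytes(StandardCharsets.UTF_8));
    }

    public static void main(String[] args) throws IOException {
        String sample = "caf\u00e9";
        System.out.println(nonAsciiCodePoints(sample)); // prints [U+00E9]
        Path out = Files.createTempFile("page", ".txt");
        writeUtf8(out, sample);
        System.out.println("Wrote " + out);
    }
}
```

If the report shows code points in the CJK ranges, the string itself was decoded wrongly. If it shows only the expected characters, the editor or terminal is the more likely culprit.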
All times are in JavaRanch time: GMT-6 in summer, GMT-7 in winter