Problem with java reading a webpage

Colin A Thompson

Joined: Dec 08, 2009
Posts: 2
I am having a problem copying a webpage to a text file. Every time I run my program against one particular site, the copied text contains Asian characters, even though there are no Asian characters on the page.

I tried my code on other websites and it works fine. Are there security measures that prevent a web page from being copied?

The website I am having problems with is public information and I am not reselling anything of theirs.
Paul Clapham

Joined: Oct 14, 2005
Posts: 19973

Possibly you are using the wrong charset to convert the downloaded data from bytes to chars. That's just my first guess, though; there could be dozens of other things wrong. You haven't provided many details for us to comment on.
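
To illustrate the charset guess, here is a minimal sketch. It is not the original poster's code, which was never shown; the class name CharsetMismatchDemo and both helper methods are made up for this example. It decodes the same bytes twice: once with the charset they were written in, and once with a wrong one. Plain English text stored as UTF-8 but decoded as UTF-16 comes out mostly as CJK characters, which would match the symptom described. The second helper assumes the server reports its charset in the Content-Type header, which not every site does.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;
import java.util.Locale;

public class CharsetMismatchDemo {

    // Decode raw bytes with an explicitly chosen charset
    // instead of relying on the platform default.
    public static String decode(byte[] raw, Charset charset) {
        return new String(raw, charset);
    }

    // Hypothetical helper: extract the charset from a Content-Type header
    // value such as "text/html; charset=ISO-8859-1". Returns the fallback
    // when the header is missing or names an unknown charset.
    public static Charset charsetFromContentType(String contentType, Charset fallback) {
        if (contentType == null) {
            return fallback;
        }
        for (String part : contentType.split(";")) {
            String p = part.trim();
            if (p.toLowerCase(Locale.ROOT).startsWith("charset=")) {
                String name = p.substring("charset=".length()).trim().replace("\"", "");
                try {
                    return Charset.forName(name);
                } catch (IllegalArgumentException e) {
                    // Covers both illegal and unsupported charset names.
                    return fallback;
                }
            }
        }
        return fallback;
    }

    public static void main(String[] args) {
        byte[] raw = "Hello, world".getBytes(StandardCharsets.UTF_8);

        // Decoded with the charset the bytes were written in: readable text.
        System.out.println(decode(raw, StandardCharsets.UTF_8));

        // Decoded with the wrong charset: each pair of bytes becomes one
        // char, most of which fall in the CJK ranges.
        System.out.println(decode(raw, StandardCharsets.UTF_16BE));
    }
}
```

With a real download, the header value would typically come from URLConnection.getContentType(), and the resulting Charset would be passed to the InputStreamReader that wraps the connection's input stream. Whether that is what is going wrong here is only a guess without seeing the code.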
Campbell Ritchie

Joined: Oct 13, 2005
Posts: 46352
Also, some text editors and terminal windows may only be able to display ASCII or Latin-1 characters.
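
One rough way to check whether the display side could be the culprit is to ask a charset whether it can represent the text at all. The sketch below uses a made-up class name, DisplayableCheck, and assumes the viewer is limited to the charset being tested; it says nothing about which fonts the viewer has installed.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

public class DisplayableCheck {

    // True when every character in the text can be encoded in the given
    // charset, i.e. a viewer limited to that charset could represent it.
    public static boolean canEncode(String text, Charset charset) {
        return charset.newEncoder().canEncode(text);
    }

    public static void main(String[] args) {
        String text = "caf\u00e9"; // "cafe" with an accented final e

        System.out.println("ASCII:   " + canEncode(text, StandardCharsets.US_ASCII));
        System.out.println("Latin-1: " + canEncode(text, StandardCharsets.ISO_8859_1));
        System.out.println("UTF-8:   " + canEncode(text, StandardCharsets.UTF_8));
    }
}
```

If the downloaded text passes this check for Latin-1 but still shows up garbled, the problem is more likely in how the bytes were decoded than in the editor or terminal.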