JavaRanch » Java Forums » Java » Beginning Java
Problem with java reading a webpage

Colin A Thompson
Greenhorn

Joined: Dec 08, 2009
Posts: 2
I am having a problem copying a webpage to a text file. Every time I run my program against this one site, the text that gets copied contains Asian characters, even though there are no Asian characters on the page.

I tried my code on other websites and it works fine. Are there security measures that prevent a web page from being copied?

The website I am having problems with is public information and I am not reselling anything of theirs.
Paul Clapham
Bartender

Joined: Oct 14, 2005
Posts: 18121
    

Possibly you are using the wrong charset to convert the downloaded data from bytes to chars. That's just my first guess, though; I'm sure there could be dozens of other things wrong. You haven't provided many details for us to comment on.
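
To illustrate the charset guess, here is a minimal sketch. The original poster's code was not shown, so the class name CharsetDemo, the helper readAll, and the sample bytes are all hypothetical. It decodes the same bytes twice: once as UTF-8 and once as UTF-16BE. With the wrong charset, each pair of ASCII bytes is read as a single 16-bit character, and those values land in the CJK ranges.

```java
import java.io.BufferedReader;
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.InputStreamReader;
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

// Hypothetical demo class; not the original poster's code.
class CharsetDemo {

    // Reads the whole stream, decoding bytes to chars with the given charset.
    static String readAll(InputStream in, Charset charset) throws IOException {
        StringBuilder text = new StringBuilder();
        try (BufferedReader reader = new BufferedReader(new InputStreamReader(in, charset))) {
            char[] buffer = new char[4096];
            int n;
            while ((n = reader.read(buffer)) != -1) {
                text.append(buffer, 0, n);
            }
        }
        return text.toString();
    }

    public static void main(String[] args) throws IOException {
        // Stand-in for downloaded page data: six ASCII bytes.
        byte[] pageBytes = "Hello!".getBytes(StandardCharsets.UTF_8);

        // Correct charset: the text comes back unchanged.
        String right = readAll(new ByteArrayInputStream(pageBytes), StandardCharsets.UTF_8);

        // Wrong charset: the bytes are consumed two at a time, so "He" (0x48 0x65)
        // becomes the single character U+4865, which is a CJK ideograph.
        String wrong = readAll(new ByteArrayInputStream(pageBytes), StandardCharsets.UTF_16BE);

        System.out.println(right + " has " + right.length() + " chars");
        System.out.println("misdecoded text has " + wrong.length() + " chars");
    }
}
```

For a real page, the same readAll could be handed the stream from the connection, with the charset taken from the Content-Type header or the page's meta tag rather than the platform default. That is an assumption about how the program is structured, not something stated in the thread.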
Campbell Ritchie
Sheriff

Joined: Oct 13, 2005
Posts: 36478
    
Also, some text editors or terminal windows may only be able to display ASCII or Latin-1 characters.
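
One way to rule out the display as the culprit is to print the code points of the text rather than the text itself. This is a minimal sketch; the class name CodePointDump and the helper toCodePoints are hypothetical. The hex values are plain ASCII, so they look the same in any editor or terminal.

```java
// Hypothetical diagnostic helper; not part of any library.
class CodePointDump {

    // Formats each code point of the text as U+XXXX, separated by spaces.
    static String toCodePoints(String text) {
        StringBuilder out = new StringBuilder();
        text.codePoints().forEach(cp -> {
            if (out.length() > 0) {
                out.append(' ');
            }
            out.append(String.format("U+%04X", cp));
        });
        return out.toString();
    }

    public static void main(String[] args) {
        // prints "U+0048 U+0069 U+00E9"
        System.out.println(toCodePoints("Hi\u00e9"));
    }
}
```

If the values printed for the downloaded text are in the expected range (for example U+0020 to U+007E for plain English), the data is fine and the viewer is at fault. If they are up in the U+3400 to U+9FFF region, the text was already wrong before it reached the screen.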
 