
How To Clean Up All The Formats Of The Web Pages On The Internet?

 
JiaPei Jen
Ranch Hand
Posts: 1309
The web pages that we see on the internet are usually formatted, i.e. laid out in tables, styled with fonts, etc.
Is there a way to strip all of that formatting and print the pages as plain text? Take the interfaces and classes of the Java 1.4 API, for example: I am trying to read those pages and print each method, its parameters, return type, etc. in plain text using Java. Where can I find an explanation of how to do it?
[ January 02, 2003: Message edited by: JiaPei Jen ]
 
Arun Boraiah
Ranch Hand
Posts: 233
A simple way is to copy and paste the content into a plain text editor such as Notepad, and then print it.
-arun
 
JiaPei Jen
Ranch Hand
Posts: 1309
I have a lot of this kind of cleanup work to do, so I cannot afford to copy and paste by hand. That is why I would like to write a Java program to do it. I would appreciate any guidance.
 
Jim Yingst
Wanderer
Sheriff
Posts: 18671
Try looking at JTidy.
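
To give a rough idea of what such a program could look like, here is a minimal sketch. Note that it does not use JTidy: it swaps in the HTML parser that ships with Swing in the JDK, so that it runs with no extra libraries. The class name HtmlToText and the method extractText are made up for illustration, and putting each run of text on its own line is just one possible choice of output format.

```java
import java.io.IOException;
import java.io.Reader;
import java.io.StringReader;
import javax.swing.text.html.HTMLEditorKit;
import javax.swing.text.html.parser.ParserDelegator;

// Hypothetical helper: collects the text of an HTML page and drops the tags.
public class HtmlToText {

    // Reads HTML from 'in' and returns its text, one text run per line.
    public static String extractText(Reader in) throws IOException {
        final StringBuilder out = new StringBuilder();

        // The parser calls handleText for each run of text it finds between tags;
        // tags themselves (table, font, b, ...) are simply ignored here.
        HTMLEditorKit.ParserCallback callback = new HTMLEditorKit.ParserCallback() {
            @Override
            public void handleText(char[] data, int pos) {
                out.append(data).append('\n');
            }
        };

        // 'true' tells the parser to ignore any charset declared inside the page.
        new ParserDelegator().parse(in, callback, true);
        return out.toString();
    }

    public static void main(String[] args) throws IOException {
        // Small inline sample; a real run would pass a FileReader or a
        // reader over a URL stream instead.
        String html = "<html><body><table><tr><td><b>String</b></td>"
                    + "<td>toString()</td></tr></table></body></html>";
        System.out.print(extractText(new StringReader(html)));
    }
}
```

For the API pages, a next step might be to override handleStartTag as well, so the program can tell which table cell a piece of text came from. JTidy is still worth a look if the pages need to be turned into a clean DOM first.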
 