How to read text content not source code from webpage in java ?

Marimuthu Udayakumar
Greenhorn

Joined: Jun 17, 2008
Posts: 16
Hi guys,
How can I read the text content, not the source code, of a web page using Java?

Thanks,
http://teknoturfian.blogspot.com


Thanks and Regards,
P.Marimuthu Udayakumar
Venkateswara Rao Desu
Greenhorn

Joined: Apr 13, 2009
Posts: 7
The java.net package has a URLConnection class. You can use it to connect to a URL, send a request, and read the response.

-- Venkateswara Rao Desu
Marimuthu Udayakumar
Greenhorn

Joined: Jun 17, 2008
Posts: 16
Hi Venkateswara,
Thanks for your reply.
I tried this:


import java.io.BufferedReader;
import java.io.InputStreamReader;
import java.net.URL;
import java.net.URLConnection;

public class URLExp {

    public static void main(String[] args) {
        try {
            URL google = new URL("http://www.google.com/");
            URLConnection yc = google.openConnection();
            // Read the response line by line; this yields the raw HTML source.
            BufferedReader in = new BufferedReader(
                    new InputStreamReader(yc.getInputStream()));
            String inputLine;
            while ((inputLine = in.readLine()) != null) {
                System.out.println(inputLine);
            }
            in.close();
        } catch (Exception e) {
            e.printStackTrace();
        }
    }

}


But what happens is that I get the HTML source code of the web page. I need the actual text content. What should I do?
Jesper de Jong
Java Cowboy
Saloon Keeper

Joined: Aug 16, 2005
Posts: 14074
    

Marimuthu Udayakumar wrote: But what happens is that I get the HTML source code of the web page. I need the actual text content. What should I do?

You'd have to parse the HTML in your program and get the text out of it yourself.
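As a rough illustration of "parse the HTML and get the text out", here is a minimal sketch that uses the HTML parser bundled with the JDK (javax.swing.text.html). It is not code from this thread; the class name HtmlTextExtractor and the method extractText are made up for the example, and the built-in parser only understands fairly old HTML, so treat it as a starting point rather than a robust solution.

```java
import java.io.IOException;
import java.io.Reader;
import java.io.StringReader;
import java.io.UncheckedIOException;
import javax.swing.text.html.HTMLEditorKit;
import javax.swing.text.html.parser.ParserDelegator;

// Hypothetical helper class, named for this example only.
class HtmlTextExtractor {

    /** Parses HTML from the reader and collects the text the parser reports. */
    static String extractText(Reader html) throws IOException {
        final StringBuilder text = new StringBuilder();
        HTMLEditorKit.ParserCallback callback = new HTMLEditorKit.ParserCallback() {
            @Override
            public void handleText(char[] data, int pos) {
                // Called for each run of text found between tags.
                if (text.length() > 0) {
                    text.append(' ');
                }
                text.append(data);
            }
        };
        // The third argument (true) tells the parser to ignore charset declarations.
        new ParserDelegator().parse(html, callback, true);
        return text.toString();
    }

    /** Convenience overload for HTML that is already held in a String. */
    static String extractText(String html) {
        try {
            return extractText(new StringReader(html));
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        String html = "<html><body><h1>Hello</h1><p>Plain <b>text</b> only.</p></body></html>";
        System.out.println(extractText(html));
    }
}
```

To use it on a live page, you could pass the InputStreamReader obtained from URLConnection.getInputStream() to extractText(Reader) instead of printing each line.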


Rob Spoor
Sheriff

Joined: Oct 27, 2005
Posts: 19649
    

And next time, please use code tags: http://faq.javaranch.com/java/UseCodeTags


Marimuthu Udayakumar
Greenhorn

Joined: Jun 17, 2008
Posts: 16
Hello Jesper Young,
Thanks for your reply, I got it working.

Hi Rob Prime,
Thanks for your suggestion about the code tags. I have used them here too.

I used the NekoHTML parser.
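The code that was posted here did not survive, so the following is not the poster's code. NekoHTML's parsers produce a standard W3C DOM tree, and the text-extraction step over such a tree might look like the sketch below. To keep it runnable with the JDK alone, it builds the tree from a well-formed XHTML string with DocumentBuilder as a stand-in for NekoHTML; the class name DomTextCollector and the methods collectText and textOf are made up for the example.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Node;
import org.xml.sax.InputSource;

// Hypothetical helper class, named for this example only.
class DomTextCollector {

    /** Walks the DOM tree depth-first and appends every non-blank text node. */
    static void collectText(Node node, StringBuilder out) {
        if (node.getNodeType() == Node.TEXT_NODE) {
            String value = node.getNodeValue().trim();
            if (!value.isEmpty()) {
                if (out.length() > 0) {
                    out.append(' ');
                }
                out.append(value);
            }
            return;
        }
        // Skip script and style elements; their content is not readable text.
        // The comparison ignores case because HTML parsers may upper-case element names.
        String name = node.getNodeName();
        if ("script".equalsIgnoreCase(name) || "style".equalsIgnoreCase(name)) {
            return;
        }
        for (Node child = node.getFirstChild(); child != null; child = child.getNextSibling()) {
            collectText(child, out);
        }
    }

    /** Stand-in for an HTML parser: only accepts well-formed XHTML. */
    static String textOf(String xhtml) {
        try {
            Document doc = DocumentBuilderFactory.newInstance()
                    .newDocumentBuilder()
                    .parse(new InputSource(new StringReader(xhtml)));
            StringBuilder out = new StringBuilder();
            collectText(doc.getDocumentElement(), out);
            return out.toString();
        } catch (Exception e) {
            throw new IllegalArgumentException("Could not parse input as XML", e);
        }
    }

    public static void main(String[] args) {
        String page = "<html><head><style>p{}</style></head>"
                + "<body><p>Hello <b>world</b></p></body></html>";
        System.out.println(textOf(page)); // prints "Hello world"
    }
}
```

With real-world pages, which are rarely well-formed XML, the Document would come from an HTML parser such as NekoHTML instead of DocumentBuilder, and collectText would be applied to it unchanged.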




I used the JAR files nekohtml.jar and xercesImpl.jar for the parser.
I can't attach those JAR files here, but you can download them from the web.
If you can't find them, just mail me at teknoturfian@gmail.com
and I will send them to you.
Thanks guys. Have a good day.
http://www.wix.com/muthu_tek/Marimuthu-at-Teknoturf
http://teknoturfian.blogspot.com

" I aim to bring Passion and Quality to every relationship"
 