Reading website text

Farakh khan
Ranch Hand

Joined: Mar 22, 2008
Posts: 726
I wrote a program using a Java Servlet that can extract emails from text and Word files.

I want to improve it so that it can also read emails from the web, e.g.:

1) The user enters the keyword(s)

2) The user chooses whether to search for the keyword(s) through a search engine or at a specific URL

3) The extracted emails are written out as a *.txt file

Can you please help me understand how I can read emails on the web with a Java Servlet?

Thanks & best regards
Ashish Hareet
Ranch Hand

Joined: Jul 14, 2001
Posts: 375
Farakh,

Reading webpages is essentially the same as reading any other file, except for the IO classes involved. You might want to use the java.net.URL class to obtain a stream to the resource.
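
As a rough illustration of the java.net.URL suggestion, here is a minimal sketch that reads everything behind a URL into a String. The class and method names (PageReader, readText) are made up for this example, and it assumes the page is UTF-8 text; a real page may declare a different encoding.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.InputStreamReader;
import java.net.URL;
import java.nio.charset.StandardCharsets;

// Hypothetical helper for illustration; not part of any library.
class PageReader {

    /** Reads the whole resource behind the URL into a String, assuming UTF-8 text. */
    static String readText(URL url) throws IOException {
        StringBuilder text = new StringBuilder();
        // url.openStream() works for http:, https: and file: URLs alike
        try (BufferedReader in = new BufferedReader(
                new InputStreamReader(url.openStream(), StandardCharsets.UTF_8))) {
            String line;
            while ((line = in.readLine()) != null) {
                text.append(line).append('\n'); // line endings are normalised to '\n'
            }
        }
        return text.toString();
    }
}
```

From a servlet you could call something like PageReader.readText(new URL(userSuppliedAddress)) and pass the result to the same extraction code you already use for text files. Timeouts, redirects and error pages are not handled in this sketch.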

To make things easier with the parsing of webpages, you can treat the webpages as XML resources and then use DOM or SAX parsers. Have a look at the javax.xml.parsers.SAXParser class as a starting point.
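
A minimal sketch of the SAX idea, under the assumption that the page is well-formed XML (XHTML). Many real pages are not, and the parser will then throw a SAXException. The class name, method name and the email pattern are all made up for this example, and the pattern is deliberately simple rather than a complete address grammar.

```java
import java.io.IOException;
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;
import javax.xml.parsers.ParserConfigurationException;
import javax.xml.parsers.SAXParser;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.InputSource;
import org.xml.sax.SAXException;
import org.xml.sax.helpers.DefaultHandler;

// Hypothetical helper for illustration; not part of any library.
class EmailSaxExtractor {

    // Simplified pattern (an assumption); it does not cover every valid address.
    private static final Pattern EMAIL =
            Pattern.compile("[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\\.[A-Za-z]{2,}");

    /** Collects the character data of a well-formed document and returns the addresses found in it. */
    static List<String> extractEmails(String xhtml)
            throws ParserConfigurationException, SAXException, IOException {
        final StringBuilder text = new StringBuilder();
        DefaultHandler handler = new DefaultHandler() {
            @Override
            public void characters(char[] ch, int start, int length) {
                // May be called several times per text node, so accumulate
                text.append(ch, start, length);
            }

            @Override
            public void endElement(String uri, String localName, String qName) {
                text.append(' '); // keep text from adjacent elements apart
            }
        };

        SAXParser parser = SAXParserFactory.newInstance().newSAXParser();
        parser.parse(new InputSource(new StringReader(xhtml)), handler);

        List<String> emails = new ArrayList<>();
        Matcher m = EMAIL.matcher(text);
        while (m.find()) {
            emails.add(m.group());
        }
        return emails;
    }
}
```

This only looks at element text, so an address that appears solely inside an attribute such as href="mailto:..." would be missed. For pages that are not well-formed, running the pattern over the raw page text is a cruder fallback.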

HTH
Ashish Hareet
Norm Radder
Ranch Hand

Joined: Aug 10, 2005
Posts: 685
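By reading emails, do you mean connecting to an email server and reading the emails it holds for a user, the way Outlook Express or Thunderbird does? To do this I think you need to understand the mail protocols. SMTP, which handles sending, is documented in RFC 821 and RFC 1869; retrieving mail from a server uses POP3 (RFC 1939) or IMAP (RFC 3501).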
Farakh khan
Ranch Hand

Joined: Mar 22, 2008
Posts: 726
Thanks a lot Ashish Hareet
 