basic tools needed create substrings

Adam Confino
Ranch Hand

Joined: Sep 03, 2009
Posts: 48
Hey Java Gurus,

I am trying to write a program that will go to a website, grab the HTML source code, and parse sections out of it. So far I can return the HTML source as one giant string. My question is: what basic classes and methods should I learn in order to scan through this string and eventually save snippets of data?

I've looked at the split and substring methods of the String class. Other tutorials have suggested the Pattern, Scanner, and Matcher classes. Your thoughts?

As always, thanks for your time.
Adam


Just Another Guy Hooked on Java
John de Michele
Rancher

Joined: Mar 09, 2009
Posts: 600
Adam:

This is a good place to start.

John.
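
As a rough illustration of the classes named in the question (not taken from the page John linked), here is a minimal sketch that pulls snippets out of an HTML string with String.indexOf/substring and with Pattern/Matcher. The class name SnippetExtractor, the helper names between and hrefs, and the sample HTML are all made up for this example, and the regex assumes href values are always double-quoted, which real pages do not guarantee.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper class for illustration only.
class SnippetExtractor {

    // Returns the text between the first occurrence of start and the next
    // occurrence of end, or null if either marker is missing.
    static String between(String text, String start, String end) {
        int from = text.indexOf(start);
        if (from < 0) return null;
        from += start.length();
        int to = text.indexOf(end, from);
        if (to < 0) return null;
        return text.substring(from, to);
    }

    // Collects every double-quoted href value using Pattern and Matcher.
    // Assumption: attribute values are double-quoted; other forms are skipped.
    static List<String> hrefs(String html) {
        Pattern p = Pattern.compile("href=\"([^\"]*)\"", Pattern.CASE_INSENSITIVE);
        Matcher m = p.matcher(html);
        List<String> out = new ArrayList<>();
        while (m.find()) {
            out.add(m.group(1)); // group 1 is the part inside the quotes
        }
        return out;
    }

    public static void main(String[] args) {
        // Made-up sample input standing in for the downloaded page source.
        String html = "<html><head><title>Demo page</title></head>"
                + "<body><a href=\"/one\">One</a> <a HREF=\"/two\">Two</a></body></html>";
        System.out.println(between(html, "<title>", "</title>")); // prints Demo page
        System.out.println(hrefs(html));                          // prints [/one, /two]
    }
}
```

This kind of matching is fine for small, predictable snippets, but it breaks easily on nested or malformed markup.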
Aneesh Vijendran
Ranch Hand

Joined: Jun 29, 2008
Posts: 125
Hi Adam,

Do you intend to fetch the HTML and parse it yourself? I would call that crazy. You might end up with unexpected results and out-of-memory errors, and waste a good deal of time.

Try a well-tested HTML parser instead. I would personally recommend Jericho HTML Parser (jerichohtml.sourceforge.net).

It's clean and friendly.

Cheers
Aneesh


Adam Confino
Ranch Hand

Joined: Sep 03, 2009
Posts: 48
Thanks guys.
 
 