
basic tools needed to create substrings

Adam Confino
Ranch Hand

Joined: Sep 03, 2009
Posts: 48
Hey Java Gurus,

I am trying to write a program that will go to a website, grab the HTML source code, and parse sections out of it. So far I can return the HTML source as one giant string. My question is, what basic classes and methods should I learn to scan through this string and eventually save snippets of data?

I've looked at the split and substring methods on the String class. Other tutorials have suggested using the Pattern, Matcher, and Scanner classes. Your thoughts?
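For reference, here is a minimal sketch of how those classes could be combined. The class name SnippetExtractor, the helper methods between and linkTargets, and the sample HTML are all made up for illustration. It only copes with simple, well-formed markup, so treat it as a starting point for learning indexOf/substring and Pattern/Matcher rather than a general HTML parser.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper class, not part of any library.
public class SnippetExtractor {

    // Returns the text between the first occurrence of 'start' and the
    // next occurrence of 'end' after it, or null if either is missing.
    static String between(String source, String start, String end) {
        int from = source.indexOf(start);
        if (from < 0) {
            return null;
        }
        from += start.length();
        int to = source.indexOf(end, from);
        if (to < 0) {
            return null;
        }
        return source.substring(from, to);
    }

    // Collects the values of double-quoted href attributes.
    // Assumption: attributes look like href="..."; single-quoted or
    // unquoted values are not matched.
    static List<String> linkTargets(String html) {
        Pattern p = Pattern.compile("href\\s*=\\s*\"([^\"]*)\"",
                                    Pattern.CASE_INSENSITIVE);
        Matcher m = p.matcher(html);
        List<String> targets = new ArrayList<>();
        while (m.find()) {
            targets.add(m.group(1)); // group 1 is the text inside the quotes
        }
        return targets;
    }

    public static void main(String[] args) {
        // Invented sample input standing in for the downloaded page.
        String html = "<html><head><title>Demo page</title></head>"
                + "<body><a href=\"http://example.com/a\">A</a> "
                + "<a HREF=\"/b\">B</a></body></html>";

        System.out.println(between(html, "<title>", "</title>"));
        System.out.println(linkTargets(html));
    }
}
```

The indexOf/substring pair is enough when you know the exact markers around one snippet. Pattern and Matcher are more convenient when the same kind of snippet repeats throughout the page.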

As always, thanks for your time.

Just Another Guy Hooked on Java
John de Michele

Joined: Mar 09, 2009
Posts: 600

This is a good place to start.

Aneesh Vijendran
Ranch Hand

Joined: Jun 29, 2008
Posts: 125
Hi Adam,

Do you intend to fetch the HTML and parse it yourself? I would call that crazy. You might end up with unexpected results and out-of-memory errors, and waste a good deal of time.

Try using a well-tested HTML parser. I would personally recommend

It's clean and friendly.


Adam Confino
Ranch Hand

Joined: Sep 03, 2009
Posts: 48
Thanks guys.