java parsing using regular expression

shan rast
Greenhorn

Joined: Jan 25, 2009
Posts: 7
http://www.csc.liv.ac.uk/teaching/modules/year3s1/comp304.html

I need to parse this page and extract the member of staff listed on it using a regular expression, with java.util.regex.

I only need the regular expression; I have already written the rest of the code.

//////
import java.io.BufferedReader;
import java.io.InputStreamReader;
import java.net.URL;
import java.net.URLConnection;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class Spider {
    public static void main(String[] argv) {
        try {
            URL url = new URL("http://www.csc.liv.ac.uk/teaching/modules/year3s1/comp304.html");
            URLConnection urlConnection = url.openConnection();
            StringBuilder sb = new StringBuilder();
            // Read all HTML source from the given URL.
            // BufferedReader replaces the deprecated DataInputStream.readLine().
            try (BufferedReader in = new BufferedReader(
                    new InputStreamReader(urlConnection.getInputStream()))) {
                String tmp;
                while ((tmp = in.readLine()) != null) {
                    sb.append(' ').append(tmp);
                }
            }

            // Replace every run of whitespace with a single space.
            String html = sb.toString().replaceAll("\\s+", " ");

            // Build the pattern using a regular expression.
            // This is where I have to define a regular expression to find the name of the author on the page.
            // Please reply with the regular expression needed in Pattern.compile
            // for the link http://www.csc.liv.ac.uk/teaching/modules/year3s1/comp304.html
            Pattern p = Pattern.compile("");
            // Match the pattern against the HTML source.
            Matcher m = p.matcher(html);
            // Loop over every match of the pattern.
            while (m.find()) {
                // Print the first capturing group of each match
                // (the pattern must define one, otherwise group(1) throws).
                System.out.println(m.group(1));
            }
        } catch (Exception e) {
            System.out.println(e);
        }
    }
}

/////
Joe Ess
Bartender

Joined: Oct 29, 2001
Posts: 8834

Please Do Your Own Homework
I am certain Dr K Atkinson would not want us to give you the answer.


Rob Spoor
Sheriff

Joined: Oct 27, 2005
Posts: 19649

Also, please Use Code Tags.


shan rast
Greenhorn

Joined: Jan 25, 2009
Posts: 7
Rob Prime wrote: Also, please Use Code Tags.

Thanks, I got it, but I have a new problem. I have posted a new topic; please give a solution for that.
Campbell Ritchie
Sheriff

Joined: Oct 13, 2005
Posts: 37884
shan rast wrote: Thanks, I got it, but I have a new problem. I have posted a new topic; please give a solution for that.
Please tell us how you solved the problem, so others can learn from your experience.
shan rast
Greenhorn

Joined: Jan 25, 2009
Posts: 7
Campbell Ritchie wrote:
shan rast wrote: Thanks, I got it, but I have a new problem. I have posted a new topic; please give a solution for that.
Please tell us how you solved the problem, so others can learn from your experience.


Pattern p1 = Pattern.compile("<h2[^>]*>"+".*?<a[^>]*>"+"([^<]+)"+"</a[^>]*>");
This is the pattern. Now you just need to call group(1), the pattern's only capturing group, which gives the names of the authors.
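
For readers following along, here is a minimal sketch of how that pattern behaves on a small piece of markup. The class name, the helper method and the sample HTML are all made up for illustration; the real layout of the module page is not shown in this thread, so treat the input as an assumption.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class StaffNameDemo {
    // The pattern from the post: an <h2> tag, then (lazily) anything up to the
    // next <a ...> tag, then the link text (captured), then the closing </a>.
    static final Pattern STAFF = Pattern.compile(
            "<h2[^>]*>" + ".*?<a[^>]*>" + "([^<]+)" + "</a[^>]*>");

    // Hypothetical helper: collects group(1) of every match.
    static List<String> extractNames(String html) {
        List<String> names = new ArrayList<>();
        Matcher m = STAFF.matcher(html);
        while (m.find()) {
            names.add(m.group(1).trim());
        }
        return names;
    }

    public static void main(String[] args) {
        // Made-up markup, already collapsed to one line as in the Spider code.
        String html = "<h2>Member of staff</h2> <p><a href=\"staff.html\">Dr A Example</a></p>";
        System.out.println(extractNames(html)); // prints [Dr A Example]
    }
}
```

Two caveats: "." does not match line breaks by default, so this relies on the whitespace having been collapsed to single spaces first, and "<a[^>]*>" also matches any other tag whose name starts with "a", so it may need tightening for a real page.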
shan rast
Greenhorn

Joined: Jan 25, 2009
Posts: 7
shan rast wrote:
Campbell Ritchie wrote:
shan rast wrote: Thanks, I got it, but I have a new problem. I have posted a new topic; please give a solution for that.
Please tell us how you solved the problem, so others can learn from your experience.


Pattern p1 = Pattern.compile("<h2[^>]*>"+".*?<a[^>]*>"+"([^<]+)"+"</a[^>]*>");
This is the pattern. Now you just need to call group(1), the pattern's only capturing group, which gives the names of the authors.


I have parsed the data, and now I need to store it in an .xml file, that is, write it out as XML.
That is my second problem. I wrote code for it, but it is not working. Can you please give me a link to a tutorial? I need it urgently.
Martijn Verburg
author
Bartender

Joined: Jun 24, 2003
Posts: 3274

Please Ease Up. There are many tutorials on writing XML documents using Java; have you tried a Google search?
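
As one possible starting point, here is a minimal sketch that writes a list of names as XML using only the JDK's built-in DOM and Transformer classes. The class name, the method name and the element names "staff" and "member" are assumptions chosen for illustration; nothing in this thread specifies the required XML layout.

```java
import java.io.StringWriter;
import java.util.List;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.transform.OutputKeys;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.dom.DOMSource;
import javax.xml.transform.stream.StreamResult;
import org.w3c.dom.Document;
import org.w3c.dom.Element;

class StaffXmlWriter {
    // Hypothetical helper: builds <staff><member>...</member>...</staff>
    // and returns it as a string.
    static String toXml(List<String> names) {
        try {
            Document doc = DocumentBuilderFactory.newInstance()
                    .newDocumentBuilder().newDocument();
            Element root = doc.createElement("staff");
            doc.appendChild(root);
            for (String name : names) {
                Element member = doc.createElement("member");
                // Text set this way is escaped when serialized (& becomes &amp;).
                member.setTextContent(name);
                root.appendChild(member);
            }
            Transformer t = TransformerFactory.newInstance().newTransformer();
            t.setOutputProperty(OutputKeys.OMIT_XML_DECLARATION, "yes");
            StringWriter out = new StringWriter();
            t.transform(new DOMSource(doc), new StreamResult(out));
            return out.toString();
        } catch (Exception e) {
            throw new IllegalStateException("Could not build XML", e);
        }
    }

    public static void main(String[] args) {
        System.out.println(toXml(List.of("Dr A Example", "Smith & Jones")));
    }
}
```

To write to a file instead of a string, the StreamResult can wrap a java.io.File, for example new StreamResult(new java.io.File("staff.xml")). Building the document through the DOM, instead of concatenating strings, leaves the escaping of special characters to the library.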

