| Author |
java parsing using regular expression
|
shan rast
Greenhorn
Joined: Jan 25, 2009
Posts: 7
|
|
http://www.csc.liv.ac.uk/teaching/modules/year3s1/comp304.html
I need to parse this page and extract the member of staff listed on it using a regular expression,
using java.util.regex.
I only need the regular expression; I have already written the rest of the code.
//////
import java.io.*;
import java.net.*;
import java.util.regex.*;
class Spider{
public static void main(String []argv){
try {
URL url = new URL("http://www.csc.liv.ac.uk/teaching/modules/year3s1/comp304.html");
URLConnection urlConnection = url.openConnection();
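// DataInputStream.readLine() is deprecated, so read lines through a BufferedReader instead
BufferedReader reader = new BufferedReader(new InputStreamReader(urlConnection.getInputStream()));
StringBuilder sb = new StringBuilder();
String tmp;
// read all HTML source from the given URL
while ((tmp = reader.readLine()) != null) {
sb.append(' ').append(tmp);
}
reader.close();
String html = sb.toString();
// collapse each run of whitespace into a single space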
html = html.replaceAll("\\s+", " ");
// build the pattern using regular expression
// this is where I need a regular expression that finds the name of the author on the page
// please reply with the regular expression to pass to Pattern.compile
// for the link http://www.csc.liv.ac.uk/teaching/modules/year3s1/comp304.html
Pattern p = Pattern.compile("");
// Match the pattern with given html source
Matcher m = p.matcher(html);
// Get all matches that matched my pattern
while (m.find()) {
// Print the first capturing group of each match
System.out.println(m.group(1));
}
}catch (Exception e) {
System.out.println(e);
}
}
}
/////
|
 |
Joe Ess
Bartender
Joined: Oct 29, 2001
Posts: 8291
|
|
Please Do Your Own Homework
I am certain Dr K Atkinson would not want us to give you the answer.
|
"blabbing like a narcissistic fool with a superiority complex" ~ N.A.
[How To Ask Questions On JavaRanch]
|
 |
Rob Spoor
Sheriff
Joined: Oct 27, 2005
Posts: 19232
|
|
|
Also, please Use Code Tags.
|
SCJP 1.4 - SCJP 6 - SCWCD 5
How To Ask Questions How To Answer Questions
|
 |
shan rast
Greenhorn
Joined: Jan 25, 2009
Posts: 7
|
|
Thanks, I got it, but I have a new problem. I have posted a new thread; please give a solution for that.
|
 |
Campbell Ritchie
Sheriff
Joined: Oct 13, 2005
Posts: 32833
|
|
shan rast wrote: Thanks, I got it, but I have a new problem. I have posted a new thread; please give a solution for that.
Please tell us how you solved the problem, so others can learn from your experience.
|
 |
shan rast
Greenhorn
Joined: Jan 25, 2009
Posts: 7
|
|
Campbell Ritchie wrote:
shan rast wrote: Thanks, I got it, but I have a new problem. I have posted a new thread; please give a solution for that.
Please tell us how you solved the problem, so others can learn from your experience.
Pattern p1 = Pattern.compile("<h2[^>]*>"+".*?<a[^>]*>"+"([^<]+)"+"</a[^>]*>");
This is the pattern. Now you just need to call m.group(1), the pattern's only capturing group, which will give the names of the author.
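For anyone reading along, here is a minimal sketch of how a pattern like this behaves. The HTML string, the name "Dr J Example", and the class and method names are made up for illustration; the real page's markup may differ, and the `<a[^>]*>` part of the pattern is assumed, since the forum formatting mangled it in the post.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class StaffNameDemo {
    // Pattern as posted, with the opening-link part assumed to be <a[^>]*>.
    // It has exactly one capturing group: the text between <a ...> and </a>.
    static final Pattern STAFF = Pattern.compile(
            "<h2[^>]*>" + ".*?<a[^>]*>" + "([^<]+)" + "</a[^>]*>");

    // Collects the link text of the first <a> that follows each <h2>.
    static List<String> extractNames(String html) {
        List<String> names = new ArrayList<>();
        Matcher m = STAFF.matcher(html);
        while (m.find()) {
            names.add(m.group(1).trim());
        }
        return names;
    }

    public static void main(String[] args) {
        // Hypothetical markup, already collapsed to a single line of text
        String html = "<h2>Member of staff</h2> <p><a href=\"/~jane\">Dr J Example</a></p>";
        System.out.println(extractNames(html)); // prints [Dr J Example]
    }
}
```

Note that `.` does not match line breaks by default, so this relies on the earlier `replaceAll("\\s+", " ")` step having collapsed the page onto one line. It also picks up only the first link after each `<h2>`, whatever that heading is.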
|
 |
shan rast
Greenhorn
Joined: Jan 25, 2009
Posts: 7
|
|
shan rast wrote:
Campbell Ritchie wrote:
shan rast wrote: Thanks, I got it, but I have a new problem. I have posted a new thread; please give a solution for that.
Please tell us how you solved the problem, so others can learn from your experience.
Pattern p1 = Pattern.compile("<h2[^>]*>"+".*?<a[^>]*>"+"([^<]+)"+"</a[^>]*>");
This is the pattern. Now you just need to call m.group(1), the pattern's only capturing group, which will give the names of the author.
I have parsed the data, and now I need to store it in an .xml file, that is, write it out as XML.
That is my second problem: I wrote code for it, but it is not working. Could you please give me a link to a tutorial? I need it urgently.
|
 |
Martijn Verburg
author
Bartender
Joined: Jun 24, 2003
Posts: 3268
|
|
|
Please Ease Up. There are many tutorials on writing XML documents using Java; have you tried a Google search?
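As a starting point, here is a minimal sketch of one common approach such tutorials cover: building a DOM tree and serializing it with the JDK's built-in `javax.xml` APIs. The element names `staff` and `member` and the class `StaffXmlWriter` are made up for illustration, since the thread does not say what the XML should look like.

```java
import java.io.StringWriter;
import java.util.List;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.transform.OutputKeys;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.dom.DOMSource;
import javax.xml.transform.stream.StreamResult;
import org.w3c.dom.Document;
import org.w3c.dom.Element;

class StaffXmlWriter {
    // Builds <staff><member>...</member>...</staff> and returns it as a string.
    static String toXml(List<String> names) {
        try {
            Document doc = DocumentBuilderFactory.newInstance()
                    .newDocumentBuilder().newDocument();
            Element root = doc.createElement("staff"); // hypothetical element name
            doc.appendChild(root);
            for (String name : names) {
                Element member = doc.createElement("member"); // hypothetical element name
                member.setTextContent(name); // special characters are escaped on output
                root.appendChild(member);
            }
            Transformer t = TransformerFactory.newInstance().newTransformer();
            t.setOutputProperty(OutputKeys.OMIT_XML_DECLARATION, "yes");
            StringWriter out = new StringWriter();
            t.transform(new DOMSource(doc), new StreamResult(out));
            return out.toString();
        } catch (Exception e) {
            throw new IllegalStateException("could not build XML", e);
        }
    }

    public static void main(String[] args) {
        System.out.println(toXml(List.of("Dr J Example", "A & B")));
    }
}
```

To write to a file rather than a string, pass `new StreamResult(new java.io.File("staff.xml"))` to `transform`; the file name here is also just an example. Going through the DOM instead of concatenating strings means characters such as `&` and `<` in the names are escaped for you.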
|
Cheers, Martijn - Blog,
Twitter, PCGen, Ikasan, My The Well-Grounded Java Developer book!,
My start-up.
|
 |
 |