Creating a HTML parser in java

Neaman Shafiq
Greenhorn

Joined: Feb 07, 2001
Posts: 5
I am currently producing an online tutorial that children can use to learn HTML. I need a mechanism that lets the child enter his or her HTML code into a window or applet, which I can then process to return the output. The processing involves parsing the HTML code the child provides. Does anyone know how this can be accomplished using Java and its libraries? I do not wish to use string tokenizers for this purpose because, although effective, they seem tedious to program and inefficient.
help! please!
neamz.
Carl Trusiak
Sheriff

Joined: Jun 13, 2000
Posts: 3340
Read the Documentation on javax.swing.text.html.HTMLEditorKit.Parser
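For anyone who wants a starting point, here is a minimal sketch of driving that parser. It assumes you use javax.swing.text.html.parser.ParserDelegator (the concrete parser shipped with Swing) and a HTMLEditorKit.ParserCallback subclass. The class name TagLister and the "start:" / "end:" / "text:" labels are made up for illustration and are not part of any library.

```java
import java.io.IOException;
import java.io.StringReader;
import java.io.UncheckedIOException;
import java.util.ArrayList;
import java.util.List;
import javax.swing.text.MutableAttributeSet;
import javax.swing.text.html.HTML;
import javax.swing.text.html.HTMLEditorKit;
import javax.swing.text.html.parser.ParserDelegator;

// Hypothetical helper: records the parser's callbacks as plain strings.
class TagLister {

    /** Parses the given HTML and returns the callback events in the order they arrive. */
    static List<String> listEvents(String html) {
        final List<String> events = new ArrayList<>();

        HTMLEditorKit.ParserCallback callback = new HTMLEditorKit.ParserCallback() {
            @Override
            public void handleStartTag(HTML.Tag tag, MutableAttributeSet attrs, int pos) {
                events.add("start:" + tag);
            }

            @Override
            public void handleEndTag(HTML.Tag tag, int pos) {
                events.add("end:" + tag);
            }

            @Override
            public void handleSimpleTag(HTML.Tag tag, MutableAttributeSet attrs, int pos) {
                // Tags without an end tag, such as <br> or <img>.
                events.add("simple:" + tag);
            }

            @Override
            public void handleText(char[] data, int pos) {
                events.add("text:" + new String(data));
            }
        };

        try {
            // The last argument tells the parser to ignore any charset declared in the page.
            new ParserDelegator().parse(new StringReader(html), callback, true);
        } catch (IOException e) {
            // Not expected for a StringReader; rethrown unchecked to keep callers simple.
            throw new UncheckedIOException(e);
        }
        return events;
    }

    public static void main(String[] args) {
        String html = "<html><body><p>Hello <b>world</b></p></body></html>";
        for (String event : listEvents(html)) {
            System.out.println(event);
        }
    }
}
```

Note that the parser may also report tags it implies on its own (for example a head element), so the list can contain more events than the tags the child actually typed.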

------------------
Hope This Helps
Carl Trusiak


Omar IRAQI
Ranch Hand

Joined: Jul 06, 2001
Posts: 54
Hi Shafiq,
Start by creating an applet. This applet will contain two panes:
The first one will contain a javax.swing.JTextArea where kids enter the HTML text. Let us call this text area inputHtmlText.
The second pane will contain a javax.swing.JEditorPane; let us call it htmlViewer:
javax.swing.JTextArea inputHtmlText = new javax.swing.JTextArea();
javax.swing.JEditorPane htmlViewer = null;
Suppose that the text entered by a kid in the text area is stored in a String:
String htmlString = inputHtmlText.getText();
Now you will construct a new JEditorPane:
htmlViewer = new javax.swing.JEditorPane("text/html", htmlString);
And you are done.
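Put together, the steps above could look roughly like the sketch below. It keeps the names inputHtmlText and htmlViewer from this post. Everything else is an assumption for illustration: the class name HtmlPreviewPanel, the "Show" button, the render() method, and the choice to reuse one JEditorPane and call setText on it instead of constructing a new pane each time.

```java
import java.awt.BorderLayout;
import java.awt.GridLayout;
import javax.swing.JButton;
import javax.swing.JEditorPane;
import javax.swing.JPanel;
import javax.swing.JScrollPane;
import javax.swing.JTextArea;

// Hypothetical panel: HTML source on the left, rendered result on the right.
class HtmlPreviewPanel extends JPanel {

    final JTextArea inputHtmlText = new JTextArea(10, 40);
    final JEditorPane htmlViewer = new JEditorPane("text/html", "");

    HtmlPreviewPanel() {
        super(new BorderLayout());

        // The viewer only displays the result; the kid types in the text area.
        htmlViewer.setEditable(false);

        JPanel panes = new JPanel(new GridLayout(1, 2));
        panes.add(new JScrollPane(inputHtmlText));
        panes.add(new JScrollPane(htmlViewer));

        JButton show = new JButton("Show");
        show.addActionListener(e -> render());

        add(panes, BorderLayout.CENTER);
        add(show, BorderLayout.SOUTH);
    }

    /** Copies the typed HTML into the viewer, which parses and displays it. */
    void render() {
        String htmlString = inputHtmlText.getText();
        htmlViewer.setText(htmlString);
    }
}
```

The panel can then be added to the applet's content pane, or to a JFrame if you test it as a standalone application.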
I assume that you are familiar with event handling and Swing.
You should also hope that the browser used by the kid supports JRE 1.2.2.
Take care
Omar IRAQI


Jan Sauerwein
Greenhorn

Joined: Jul 08, 2001
Posts: 6
If you want to implement the whole HTML 4.0 standard, I wish you a lot of fun, and I hope you have enough time over the next year.
For very simple HTML, the
javax.swing.text.html.HTMLEditorKit.Parser
is enough. But when you want to use some kind of style sheets or JavaScript, it falls short.
Writing a real parser isn't easy. There is a lot of mathematics involved. Look at the W3C's language definitions and you won't want to do it any longer. The grammar for a really good HTML parser is very complex.
I also recommend that you use another programming language for this. With C/C++ you can use the classes of the Mozilla project, so you don't have to identify the tags and check their correctness yourself.
I hope I have shown you that programming a complete HTML parser is not a good idea.
j.a.n.s
Omar IRAQI
Ranch Hand

Joined: Jul 06, 2001
Posts: 54
Hi Jan Sauerwein,
I appreciate your contribution, but I think you didn't read my solution.
The javax.swing.JEditorPane uses the default parser provided by the HTMLEditorKit, and unfortunately this is the same parser used to implement Sun's HotJava browser!
What is great about the JEditorPane class is that it hides all these parsing details from the programmer: he just passes the HTML String to the JEditorPane constructor and the JEditorPane object does the rest.
So I think that if one just wants to display the content of an HTML file, then the solution that I have provided is the easiest one.
[This message has been edited by Omar IRAQI (edited July 08, 2001).]
 