
Java Html parser

 
Greenhorn
Posts: 5
Hi friends,
I want to make an HTML parser that will take a .doc file as input and parse it.
Please help me if anyone knows how to do this.
 
Sheriff
Posts: 4313
You want to take a .doc file and parse it? As in Microsoft Word?

Check out the Jakarta POI project. It has an API for manipulating Microsoft documents with Java.
 
Zeena Shah
Greenhorn
Posts: 5
Thanks for your reply. By .doc I mean any MS Word document. In fact, I want to write a program that reads a Word file and pulls out all the keywords a user could use to search for that document, like the Google search engine does. I have built a search engine, but manually feeding all the related keywords into the database so that a document can be found when searched for would be too tiring a process.

I hope you understand what I want.
Bye.
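
For the keyword part of the question, here is a minimal sketch of one possible approach. It assumes the document text has already been extracted into a plain `String` by some other means (a library such as POI, for instance), and it simply ranks words by how often they occur. The class name `KeywordExtractor`, the method `topKeywords`, the three-letter minimum, and the tiny stop-word list are all hypothetical choices made for illustration, not part of any library mentioned in this thread.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Locale;
import java.util.Map;
import java.util.Set;

final class KeywordExtractor {

    // Hypothetical, deliberately tiny stop-word list; a real one would be much longer.
    private static final Set<String> STOP_WORDS =
            Set.of("and", "are", "for", "from", "that", "the", "this", "was", "with");

    /** Returns up to limit words, most frequent first; ties are broken alphabetically. */
    static List<String> topKeywords(String text, int limit) {
        Map<String, Integer> counts = new HashMap<>();

        // Lower-case the text and split on anything that is not a letter or digit.
        for (String token : text.toLowerCase(Locale.ROOT).split("[^\\p{L}\\p{Nd}]+")) {
            // Skip empty tokens, very short words, and stop words.
            if (token.length() < 3 || STOP_WORDS.contains(token)) {
                continue;
            }
            counts.merge(token, 1, Integer::sum);
        }

        // Sort by descending count, then alphabetically so the order is deterministic.
        List<Map.Entry<String, Integer>> entries = new ArrayList<>(counts.entrySet());
        entries.sort((a, b) -> {
            int byCount = Integer.compare(b.getValue(), a.getValue());
            return byCount != 0 ? byCount : a.getKey().compareTo(b.getKey());
        });

        List<String> keywords = new ArrayList<>();
        for (Map.Entry<String, Integer> entry : entries) {
            if (keywords.size() >= limit) {
                break;
            }
            keywords.add(entry.getKey());
        }
        return keywords;
    }
}
```

Calling `KeywordExtractor.topKeywords(text, 10)` on the extracted text would give the ten most frequent candidate words, which could then be stored in the search database instead of typing keywords in by hand. Plain frequency counting is a crude measure; it is only meant to show the shape of the idea.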
 
Jessica Sant
Sheriff
Posts: 4313
Did you look at the Jakarta POI project? It should allow you to parse Word documents.
 
Greenhorn
Posts: 7
Hi all,
I tried to read a .doc file and display it in a browser, but I am unable to get the tables and images from the .doc file.
Does anyone know how to convert MS Office Word files (.doc and .docx) to HTML using the POI jar?
Please reply.
 
Rancher
Posts: 43028
POI has no facilities for creating HTML. You could look into the JODConverter library - it uses OpenOffice under the hood to convert between many of the formats OO supports.
 
Marshal
Posts: 75842
And welcome to JavaRanch, jetti madhu.
 
Java Cowboy
Posts: 16084
Welcome to JavaRanch, Jetti.

Please note that you've added your question to a very old topic from 2004 - it would have been better if you just started your own new topic, especially since your question isn't the same as the original one.
 