Word 2003 to XML via XSLT

Eric Pascarello
author
Rancher

Joined: Nov 08, 2001
Posts: 15376
    
Has anyone here tried to convert a Word 2003 document into XML via XSLT? I may soon have a requirement to grab data from a Word doc and put it into a database. If it could be done with an XSLT, it would make the process easier for me to change in the future.

I am finding poor documentation on the process. Hopefully someone has some insight into this matter.

Eric
Paul Clapham
Bartender

Joined: Oct 14, 2005
Posts: 18910
    

I just did that a couple of days ago. First I saved the document as XML (I don't believe that the .doc format is XML itself). Then I eyeballed the XML to find the bits I wanted to extract, and messed around with the XSLT until it extracted only those bits.

Okay, that's not very professional. A quick hack, but it did what I needed. But I know Microsoft has schemas for the XML version of Word 2003. Have you seen this page yet? Looks like a good place to start.
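
For anyone who wants a starting point, here is a rough sketch of the approach described above: take a document saved as Word 2003 XML and run an XSLT over it that keeps only the paragraph text. The class name WordMlExtract, the output element names (paragraphs, para) and the tiny sample document are made up for illustration, and the stylesheet assumes the text lives in w:t elements inside w:p elements in the Word 2003 "wordml" namespace. A real document will have far more markup, so expect to adjust the match patterns after eyeballing your own file.

```java
import java.io.StringReader;
import java.io.StringWriter;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.stream.StreamResult;
import javax.xml.transform.stream.StreamSource;

public class WordMlExtract {
    // Namespace used by the Word 2003 XML format for its main elements.
    static final String W_NS = "http://schemas.microsoft.com/office/word/2003/wordml";

    // Illustrative stylesheet: one <para> per w:p, holding the joined text of its w:t runs.
    // The output vocabulary (paragraphs/para) is hypothetical; pick whatever suits your database load.
    static final String XSLT =
          "<xsl:stylesheet version='1.0'"
        + " xmlns:xsl='http://www.w3.org/1999/XSL/Transform'"
        + " xmlns:w='" + W_NS + "' exclude-result-prefixes='w'>"
        + "<xsl:output method='xml' omit-xml-declaration='yes'/>"
        + "<xsl:template match='/'>"
        + "<paragraphs>"
        + "<xsl:for-each select='//w:p'>"
        + "<para><xsl:for-each select='.//w:t'><xsl:value-of select='.'/></xsl:for-each></para>"
        + "</xsl:for-each>"
        + "</paragraphs>"
        + "</xsl:template>"
        + "</xsl:stylesheet>";

    // Applies the stylesheet to a Word 2003 XML string and returns the result as a string.
    static String transform(String wordXml) throws Exception {
        Transformer t = TransformerFactory.newInstance()
                .newTransformer(new StreamSource(new StringReader(XSLT)));
        StringWriter out = new StringWriter();
        t.transform(new StreamSource(new StringReader(wordXml)), new StreamResult(out));
        return out.toString();
    }

    public static void main(String[] args) throws Exception {
        // Hand-written stand-in for a document saved via "Save As > XML Document".
        String sample =
              "<w:wordDocument xmlns:w='" + W_NS + "'><w:body>"
            + "<w:p><w:r><w:t>Hello</w:t></w:r><w:r><w:t> world</w:t></w:r></w:p>"
            + "</w:body></w:wordDocument>";
        System.out.println(transform(sample));
    }
}
```

In practice you would read the saved .xml file with new StreamSource(new File(...)) instead of a string, and keep the stylesheet in its own .xsl file so it can be changed without recompiling.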
Madhav Lakkapragada
Ranch Hand

Joined: Jun 03, 2000
Posts: 5040
Glad to note that something is "free" from M$.

- m


Take a Minute, Donate an Hour, Change a Life
http://www.ashanet.org/workanhour/2006/?r=Javaranch_ML&a=81
Prabha Enjeti
Greenhorn

Joined: Oct 30, 2005
Posts: 2
Hi,

I have a similar task: converting an MS Word document to XML.
The Word document has images and graphs. Someone suggested that I use the Apache POI framework for this task. Can someone please suggest how to go about it?

Thanks,
Prabha
 